151
|
Bhattacharjee S, Ikromjanov K, Carole KS, Madusanka N, Cho NH, Hwang YB, Sumon RI, Kim HC, Choi HK. Cluster Analysis of Cell Nuclei in H&E-Stained Histological Sections of Prostate Cancer and Classification Based on Traditional and Modern Artificial Intelligence Techniques. Diagnostics (Basel) 2021; 12:diagnostics12010015. [PMID: 35054182 PMCID: PMC8774423 DOI: 10.3390/diagnostics12010015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/04/2021] [Revised: 12/14/2021] [Accepted: 12/20/2021] [Indexed: 11/16/2022] Open
Abstract
Biomarker identification is very important for differentiating the grade groups in histopathological sections of prostate cancer (PCa). Assessing clusters of cell nuclei is essential for pathological investigation. In this study, we present a computer-based method for cluster analysis of cell nuclei and apply traditional (i.e., unsupervised) and modern (i.e., supervised) artificial intelligence (AI) techniques to distinguish the grade groups of PCa. Two PCa datasets were collected to carry out this research. Histopathology samples were obtained from whole slides stained with hematoxylin and eosin (H&E). In this research, state-of-the-art approaches were proposed for color normalization, cell nuclei segmentation, feature selection, and classification. A traditional minimum spanning tree (MST) algorithm was employed to identify the clusters and better capture the proliferation and community structure of cell nuclei. K-medoids clustering and stacked ensemble machine learning (ML) approaches were used to perform traditional and modern AI-based classification, respectively. Binary and multiclass classifications were performed to compare model quality and results between the grades of PCa. Furthermore, a comparative analysis was carried out between traditional and modern AI techniques using different performance metrics (i.e., statistical parameters). Cluster features of the cell nuclei can be useful information for cancer grading. However, further validation of cluster analysis is required to achieve strong classification results.
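As a rough illustration of the MST-based clustering step described in this abstract, the following minimal sketch builds a minimum spanning tree over nucleus centroids with Prim's algorithm and then cuts long edges to obtain clusters. The centroid coordinates, the `max_edge` threshold, and the edge-cutting rule are assumptions made for illustration; the abstract does not state how the authors derive clusters or features from the MST.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, length) edges connecting all points.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link into the tree
    best_from = [-1] * n         # tree vertex providing that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def clusters_from_mst(points, max_edge):
    """Drop MST edges longer than max_edge; return connected components."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j, length in minimum_spanning_tree(points):
        if length <= max_edge:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical nucleus centroids (pixel coordinates) forming two groups.
centroids = [(10, 10), (12, 11), (11, 13), (80, 82), (83, 80), (81, 85)]
clusters = clusters_from_mst(centroids, max_edge=10.0)
print(clusters)
```

Per-cluster statistics such as the number of nuclei or the mean edge length could then serve as cluster features for a classifier; which features the study actually used is not given here.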
Affiliation(s)
| | - Kobiljon Ikromjanov
- Department of Digital Anti-Aging Healthcare, u-AHRC, Inje University, Gimhae 50834, Korea; (K.I.); (K.S.C.); (Y.-B.H.); (R.I.S.); (H.-C.K.)
| | - Kouayep Sonia Carole
- Department of Digital Anti-Aging Healthcare, u-AHRC, Inje University, Gimhae 50834, Korea; (K.I.); (K.S.C.); (Y.-B.H.); (R.I.S.); (H.-C.K.)
| | - Nuwan Madusanka
- School of Computing & IT, Sri Lanka Technological Campus, Paduka 10500, Sri Lanka;
| | - Nam-Hoon Cho
- Department of Pathology, Yonsei University Hospital, Seoul 03722, Korea;
| | - Yeong-Byn Hwang
- Department of Digital Anti-Aging Healthcare, u-AHRC, Inje University, Gimhae 50834, Korea; (K.I.); (K.S.C.); (Y.-B.H.); (R.I.S.); (H.-C.K.)
| | - Rashadul Islam Sumon
- Department of Digital Anti-Aging Healthcare, u-AHRC, Inje University, Gimhae 50834, Korea; (K.I.); (K.S.C.); (Y.-B.H.); (R.I.S.); (H.-C.K.)
| | - Hee-Cheol Kim
- Department of Digital Anti-Aging Healthcare, u-AHRC, Inje University, Gimhae 50834, Korea; (K.I.); (K.S.C.); (Y.-B.H.); (R.I.S.); (H.-C.K.)
| | - Heung-Kook Choi
- Department of Computer Engineering, u-AHRC, Inje University, Gimhae 50834, Korea;
- Correspondence: ; Tel.: +82-10-6733-3437
| |
|
152
|
Ma ZF, Liu Z, Luo C, Song L. Evidential classification of incomplete instance based on K-nearest centroid neighbor. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-210991] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
Abstract
Classification of incomplete instances is a challenging problem because missing features generally cause uncertainty in the classification result. A new evidential classification method for incomplete instances, based on adaptive imputation, is proposed within the framework of evidence theory. Specifically, the missing values of different incomplete instances in the test set are adaptively estimated based on Shannon entropy and the K-nearest centroid neighbors (KNCNs) technique. The single or multiple edited instances (with estimations) are then classified by the chosen classifier to get single or multiple classification results for the instances with different discounting (weighting) factors, and a new adaptive global fusion method is finally proposed to unify the different discounted results. The proposed method can capture the imprecision degree of classification well by assigning instances that are difficult to classify into a specific class to the associated meta-class, and it effectively reduces the classification error rates. The effectiveness and robustness of the proposed method have been tested through four experiments with artificial and real datasets.
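The K-nearest centroid neighbors idea named in this abstract can be sketched as follows. This is the generic KNCN search only: the first neighbor is the ordinary nearest neighbor, and each later neighbor is the candidate that brings the centroid of the selected set closest to the query. The entropy-based adaptive imputation and the evidential fusion described in the abstract are not reproduced, and the sample data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(rows):
    """Component-wise mean of a non-empty list of points."""
    dims = len(rows[0])
    return [sum(r[d] for r in rows) / len(rows) for d in range(dims)]


def k_nearest_centroid_neighbors(query, training, k):
    """Return the indices of the k nearest centroid neighbors of `query`."""
    chosen = []
    remaining = list(range(len(training)))
    while remaining and len(chosen) < k:
        def centroid_distance(idx):
            # Distance from the query to the centroid of the already
            # chosen neighbors plus this candidate.
            rows = [training[i] for i in chosen] + [training[idx]]
            return euclidean(centroid(rows), query)

        best = min(remaining, key=centroid_distance)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical complete training instances and a query point.
training = [(1.0, 0.0), (1.1, 0.0), (-1.2, 0.0), (0.0, 3.0)]
query = (0.0, 0.0)
kncn = k_nearest_centroid_neighbors(query, training, k=2)
print(kncn)
```

Plain k-NN with k = 2 would pick the two points on the right of the query (indices 0 and 1), whereas the centroid criterion prefers neighbors that surround it (indices 0 and 2). That spatial balance is the usual motivation for using KNCN when estimating missing values.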
Affiliation(s)
- Zong-fang Ma
- School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an, China
| | - Zhe Liu
- School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an, China
- Department of Computer Science, St. Francis Xavier University, Antigonish, NS B2G 2W5, Canada
| | - Chan Luo
- School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an, China
| | - Lin Song
- School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an, China
| |
|
153
|
Modulation Recognition of Communication Signal Based on Convolutional Neural Network. Symmetry (Basel) 2021. [DOI: 10.3390/sym13122302] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022] Open
Abstract
In the noncooperation communication scenario, digital signal modulation recognition will help people to identify the communication targets and have better management over them. To solve problems such as high complexity, low accuracy and cumbersome manual extraction of features by traditional machine learning algorithms, a kind of communication signal modulation recognition model based on convolution neural network (CNN) is proposed. In this paper, a convolution neural network combines bidirectional long short-term memory (BiLSTM) with a symmetrical structure to successively extract the frequency domain features and timing features of signals and then assigns importance weights based on the attention mechanism to complete the recognition task. Seven typical digital modulation schemes including 2ASK, 4ASK, 4FSK, BPSK, QPSK, 8PSK and 64QAM are used in the simulation test, and the results show that, compared with the classical machine learning algorithm, the proposed algorithm has higher recognition accuracy at low SNR, which confirmed that the proposed modulation recognition method is effective in noncooperation communication systems.
|
154
|
|
155
|
Wang H, Zhou S, Liu Y, Yu Y, Xu S, Peng L, Ni C. Exploration study on serum metabolic profiles of Chinese male patients with artificial stone silicosis, silicosis, and coal worker's pneumoconiosis. Toxicol Lett 2021; 356:132-142. [PMID: 34861340 DOI: 10.1016/j.toxlet.2021.11.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/09/2021] [Revised: 10/31/2021] [Accepted: 11/22/2021] [Indexed: 01/04/2023]
Abstract
Long-term exposure to inhaled silica dust induces pneumoconiosis, which remains a heavy burden in developing countries. Modern industry provides new resources of occupational SiO2 leading to artificial stone silicosis especially in developed countries. This study aimed to characterize the serum metabolic profile of pneumoconiosis and artificial stone silicosis patients. Our case-control study recruited 46 pairs of pneumoconiosis patients and dust-exposed workers. Nontargeted metabolomics and lipidomics by ultra-high-performance liquid chromatography-tandem mass spectrometry platform were conducted to characterize serum metabolic profile in propensity score-matched (PSM) pilot study. 54 differential metabolites were screened, 24 of which showed good screening efficiency through receiver operating characteristics (ROC) in pilot study and validation study (both AUC > 0.75). 4 of the 24 metabolites can predict pneumoconiosis stages, which are 1,2-dioctanoylthiophosphatidylcholine, phosphatidylcholine(O-18:1/20:1), indole-3-acetamide and l-homoarginine. Kynurenine, N-tetradecanoylsphingosine 1-phosphate, 5-methoxytryptophol and phosphatidylethanolamine(22:6/18:1) displayed the potential as specific biomarkers for artificial stone silicosis. Taken together, our results confirmed that tryptophan metabolism is closely related to pneumoconiosis and may be related to disease progression. Hopefully, our results could supplement the biomarkers of pneumoconiosis and provide evidence for the discovery of artificial stone silicosis-specific biomarkers.
Affiliation(s)
- Huanqiang Wang
- National Institute of Occupational Health and Poison Control, Chinese Center for Disease Control and Prevention, Beijing, 100000, PR China
| | - Siyun Zhou
- Department of Occupational Medical and Environmental Health, Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, 211166, PR China
| | - Yi Liu
- Gusu School, Nanjing Medical University, Nanjing, 211166, PR China
| | - Yihan Yu
- Hubei Provincial Hospital of Integrated Chinese & Western Medicine, Wuhan, 430000, PR China
| | - Sha Xu
- Hubei Provincial Hospital of Integrated Chinese & Western Medicine, Wuhan, 430000, PR China
| | - Lan Peng
- Department of Occupational Medical and Environmental Health, Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, 211166, PR China
| | - Chunhui Ni
- Department of Occupational Medical and Environmental Health, Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, 211166, PR China.
| |
|
156
|
Yang Y, Zeng Q. Impact-slip experiments and systematic study of coal gangue “category” recognition technology Part I: Impact-slip experiments between coal gangue mixture and top coal caving hydraulic support and the study of coal gangue “category” recognition technology. POWDER TECHNOL 2021. [DOI: 10.1016/j.powtec.2021.06.055] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/20/2022]
|
157
|
A Real-Time Car Towing Management System Using ML-Powered Automatic Number Plate Recognition. ALGORITHMS 2021. [DOI: 10.3390/a14110317] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Automatic Number Plate Recognition (ANPR) has been widely used in different domains, such as car park management, traffic management, tolling, and intelligent transport systems. Despite this technology’s importance, existing ANPR approaches struggle to identify number plates accurately because plate size, orientation, and shape differ across regions worldwide. In this paper, we study these challenges by implementing a case study for smart car towing management using Machine Learning (ML) models. The developed mobile-based system uses different approaches and techniques to enhance the accuracy of recognizing number plates in real time. First, we developed an algorithm to accurately detect the number plate’s location on the car body. Then, the bounding box of the plate is extracted and converted into a grayscale image. Second, we applied a series of filters to detect the alphanumeric characters’ contours within the grayscale image. Third, the detected alphanumeric characters’ contours are fed into a K-Nearest Neighbors (KNN) model to recognize the actual number plate. Our model achieves an overall classification accuracy of 95% in recognizing number plates across different regions worldwide. The user interface is developed as an Android mobile app, allowing law-enforcement personnel to capture a photo of the towed car, which is then recorded in the car towing management system automatically in real time. The app also allows owners to search for their cars, check the case status, and pay fines. Finally, we evaluated our system using various performance metrics such as classification accuracy, processing time, etc. We found that our model outperforms some state-of-the-art ANPR approaches in terms of the overall processing time.
|
158
|
Liu W, Fang X, Zhou Y, Dou L, Dou T. Machine learning-based investigation of the relationship between gut microbiome and obesity status. Microbes Infect 2021; 24:104892. [PMID: 34678464 DOI: 10.1016/j.micinf.2021.104892] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/03/2021] [Revised: 09/30/2021] [Accepted: 10/10/2021] [Indexed: 02/06/2023]
Abstract
Gut microbiota is believed to play a crucial role in obesity. However, consistent findings among published studies regarding the microbiome-obesity interaction are relatively rare, and one of the underlying causes could be the limited sample size of cohort studies. In order to identify gut microbiota changes between normal-weight individuals and obese individuals, fecal samples along with phenotype information from 2262 Chinese individuals were collected and analyzed. Compared with normal-weight individuals, the obese individuals exhibited lower diversity of species and higher diversity of metabolic pathways. In addition, various machine learning models were employed to quantify the relationship between obesity status and body mass index (BMI) values, of which the support vector machine model achieved the best performance, with 0.716 classification accuracy and a 0.485 R2 score. In addition to two well-established obesity-associated species, three species that have the potential to be obesity-related biomarkers, namely Bacteroides caccae, Odoribacter splanchnicus and Roseburia hominis, were identified. Further analyses of functional pathways also revealed some enriched pathways in obese individuals. Collectively, our data demonstrate a tight relationship between obesity and gut microbiota in a large-scale Chinese population. These findings may provide potential targets for the prevention and treatment of obesity.
Affiliation(s)
- Wanjun Liu
- School of Life and Pharmaceutical Sciences, Dalian University of Technology, Panjin 124221, China; Department of Scientific Research, KMHD, Shenzhen 518126, China
| | - Xiaojie Fang
- Guangdong Provincial Hospital of Chinese Medicine, Guangzhou 510120, China
| | - Yong Zhou
- Department of Scientific Research, KMHD, Shenzhen 518126, China
| | - Lihong Dou
- The First People's Hospital of Jiashan, Zhejiang 314100, China
| | - Tongyi Dou
- School of Life and Pharmaceutical Sciences, Dalian University of Technology, Panjin 124221, China.
| |
|
159
|
Cantoni V, Green R, Ricciardi C, Assante R, Donisi L, Zampella E, Cesarelli G, Nappi C, Sannino V, Gaudieri V, Mannarino T, Genova A, De Simini G, Giordano A, D'Antonio A, Acampa W, Petretta M, Cuocolo A. Comparing the Prognostic Value of Stress Myocardial Perfusion Imaging by Conventional and Cadmium-Zinc Telluride Single-Photon Emission Computed Tomography through a Machine Learning Approach. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:5288844. [PMID: 34697554 PMCID: PMC8541857 DOI: 10.1155/2021/5288844] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/08/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 11/18/2022]
Abstract
We compared the prognostic value of myocardial perfusion imaging (MPI) by conventional- (C-) single-photon emission computed tomography (SPECT) and cadmium-zinc-telluride- (CZT-) SPECT in a cohort of patients with suspected or known coronary artery disease (CAD) using machine learning (ML) algorithms. A total of 453 consecutive patients underwent stress MPI by both C-SPECT and CZT-SPECT. The outcome was a composite end point of all-cause death, cardiac death, nonfatal myocardial infarction, or coronary revascularization procedures, whichever occurred first. ML analysis performed through the implementation of random forest (RF) and k-nearest neighbors (KNN) algorithms proved that CZT-SPECT has greater accuracy than C-SPECT in detecting CAD. For both algorithms, the sensitivity of CZT-SPECT (96% for RF and 60% for KNN) was greater than that of C-SPECT (88% for RF and 53% for KNN). A preliminary univariate analysis was performed through Mann-Whitney tests separately on the features of each camera in order to understand which ones could distinguish patients who will experience an adverse event from those who will not. Then, a machine learning analysis was performed by using Matlab (v. 2019b). Tree, KNN, support vector machine (SVM), Naïve Bayes, and RF were implemented twice: first, the analysis was performed on the as-is dataset; then, since the dataset was imbalanced (fewer patients experienced an adverse event than did not), the analysis was performed again after balancing the classes through the Synthetic Minority Oversampling Technique. According to KNN and SVM with and without balancing the classes, the accuracy (p value = 0.02 and p value = 0.01) and recall (p value = 0.001 and p value = 0.03) of the CZT-SPECT were greater than those obtained by C-SPECT in a statistically significant way. The ML approach showed that although the prognostic value of stress MPI by C-SPECT and CZT-SPECT is comparable, CZT-SPECT seems to have higher accuracy and recall.
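The class-balancing step mentioned in this abstract, the Synthetic Minority Oversampling Technique, can be illustrated with a small sketch. It shows only the core idea: each synthetic sample is a random interpolation between a minority sample and one of its nearest minority neighbors. The study itself used Matlab; the function name `smote`, its parameters, and the toy data below are assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by interpolation.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base sample (excluding itself).
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: euclidean(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy imbalanced data: six majority samples, three minority samples.
majority = [[0.0, 0.0], [0.5, 1.0], [1.0, 0.2], [0.2, 0.8], [0.9, 0.9], [0.4, 0.1]]
minority = [[5.0, 5.0], [6.0, 5.5], [5.5, 6.5]]

synthetic = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + synthetic
print(len(majority), len(balanced_minority))
```

Oversampling of this kind is normally applied to the training portion only, so that the synthetic samples do not leak into the data used for evaluation.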
Affiliation(s)
- Valeria Cantoni
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Roberta Green
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Carlo Ricciardi
- Department of Electrical Engineering and Information Technology, University of Naples Federico II, Naples, Italy
- Bioengineering Unit, Institute of Care and Scientific Research Maugeri, Telese Terme, Campania, Italy
| | - Roberta Assante
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Leandro Donisi
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Emilia Zampella
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Giuseppe Cesarelli
- Bioengineering Unit, Institute of Care and Scientific Research Maugeri, Telese Terme, Campania, Italy
- Department of Chemical, Materials and Production Engineering, University of Naples Federico II, Naples, Italy
| | - Carmela Nappi
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Vincenzo Sannino
- Department of Electrical Engineering and Information Technology, University of Naples Federico II, Naples, Italy
| | - Valeria Gaudieri
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Teresa Mannarino
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Andrea Genova
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Giovanni De Simini
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Alessia Giordano
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Adriana D'Antonio
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| | - Wanda Acampa
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
- Institute of Biostructure and Bioimaging, National Council of Research, Naples, Italy
| | | | - Alberto Cuocolo
- Department of Advanced Biomedical Sciences, University of Naples Federico II, Naples, Italy
| |
|
160
|
Sinha VK, Patro KK, Pławiak P, Prakash AJ. Smartphone-Based Human Sitting Behaviors Recognition Using Inertial Sensor. SENSORS (BASEL, SWITZERLAND) 2021; 21:6652. [PMID: 34640971 PMCID: PMC8512024 DOI: 10.3390/s21196652] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/07/2021] [Revised: 09/22/2021] [Accepted: 09/28/2021] [Indexed: 11/21/2022]
Abstract
At present, people spend most of their time in passive rather than active mode. Sitting at a computer for a long time may lead to unhealthy conditions such as shoulder pain, numbness, and headache. To overcome this problem, posture should be changed at regular intervals. This paper deals with using an inertial sensor built into a smartphone to help overcome the unhealthy human sitting behaviors (HSBs) of office workers. For monitoring, six volunteers within the age band of 26 ± 3 years were considered, of whom four were male and two were female. Here, the inertial sensor is attached to the rear upper trunk of the body, and a dataset is generated for five different activities performed by the subjects while sitting in the chair in the office. Correlation-based feature selection (CFS) and particle swarm optimization (PSO) methods are jointly used to select feature vectors. The optimized features are fed to supervised machine learning classifiers such as naive Bayes, SVM, and KNN for recognition. Finally, the SVM classifier achieved 99.90% overall accuracy for different human sitting behaviors using accelerometer, gyroscope, and magnetometer sensors.
Affiliation(s)
- Vikas Kumar Sinha
- Department of Electronics and Communication Engineering, National Institute of Technology, Rourkela 769008, India; (V.K.S.); (A.J.P.)
| | - Kiran Kumar Patro
- Department of Electronics and Communication Engineering, Aditya Institute of Technology and Management (A), Tekkali 532201, India;
| | - Paweł Pławiak
- Department of Computer Science, Faculty of Computer Science and Telecommunications, Cracow University of Technology, Warszawska 24, 31-155 Krakow, Poland
- Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka 5, 44-100 Gliwice, Poland
| | - Allam Jaya Prakash
- Department of Electronics and Communication Engineering, National Institute of Technology, Rourkela 769008, India; (V.K.S.); (A.J.P.)
| |
|
161
|
Classification of Contaminated Insulators Using k-Nearest Neighbors Based on Computer Vision. COMPUTERS 2021. [DOI: 10.3390/computers10090112] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/16/2022]
Abstract
Contamination on insulators may increase the surface conductivity of the insulator, and as a consequence, electrical discharges occur more frequently, which can lead to interruptions in a power supply. To maintain reliability in an electrical distribution power system, components that have lost their insulating properties must be replaced. Identifying the components that need maintenance is a difficult task as there are several levels of contamination that are hard to notice during inspections. To improve the quality of inspections, this paper proposes using k-nearest neighbors (k-NN) to classify the levels of insulator contamination based on images of insulators at various levels of contamination simulated in the laboratory. Computer vision features such as mean, variance, asymmetry, kurtosis, energy, and entropy are used for training the k-NN. To assess the robustness of the proposed approach, a statistical analysis and a comparative assessment with well-consolidated algorithms such as decision tree, ensemble subspace, and support vector machine models are presented. The k-NN showed up to 85.17% accuracy using the k-fold cross-validation method, with an average accuracy higher than 82% for the multi-classification of contamination of insulators, being superior to the compared models.
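The feature set listed in this abstract (mean, variance, asymmetry, kurtosis, energy, and entropy) matches the usual first-order gray-level histogram statistics, so a minimal sketch of computing them and passing them to a k-NN vote might look as follows. The exact feature definitions used in the paper, the toy "images" (flat lists of gray levels), and the class labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def first_order_features(pixels):
    """Histogram statistics of a flat list of gray levels."""
    n = len(pixels)
    probs = {g: c / n for g, c in Counter(pixels).items()}
    mean = sum(g * p for g, p in probs.items())
    var = sum((g - mean) ** 2 * p for g, p in probs.items())
    std = math.sqrt(var)
    # Asymmetry (skewness) and kurtosis are undefined for a flat image;
    # return 0.0 in that case.
    skew = sum((g - mean) ** 3 * p for g, p in probs.items()) / std ** 3 if std > 0 else 0.0
    kurt = sum((g - mean) ** 4 * p for g, p in probs.items()) / std ** 4 if std > 0 else 0.0
    energy = sum(p * p for p in probs.values())
    entropy = -sum(p * math.log2(p) for p in probs.values())
    return [mean, var, skew, kurt, energy, entropy]


def knn_predict(train_features, train_labels, query_features, k=3):
    """Majority vote among the k training samples closest to the query."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    order = sorted(range(len(train_features)),
                   key=lambda i: distance(train_features[i], query_features))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2x2 "images" flattened to gray-level lists.
images = [[200, 200, 205, 210], [198, 202, 204, 208],
          [60, 120, 200, 90], [50, 140, 180, 100]]
labels = ["clean", "clean", "contaminated", "contaminated"]

train = [first_order_features(img) for img in images]
predicted = knn_predict(train, labels, first_order_features([70, 130, 190, 95]))
print(predicted)
```

Because the raw features differ in scale by orders of magnitude (variance versus energy, for example), a real pipeline would normally standardize them before computing distances. The sketch omits that step for brevity.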
|
162
|
Lv Z, Cui F, Zou Q, Zhang L, Xu L. Anticancer peptides prediction with deep representation learning features. Brief Bioinform 2021; 22:bbab008. [PMID: 33529337 DOI: 10.1093/bib/bbab008] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Received: 10/06/2020] [Revised: 12/20/2020] [Accepted: 01/05/2021] [Indexed: 12/13/2022] Open
Abstract
Anticancer peptides constitute one of the most promising therapeutic agents for combating common human cancers. Using wet experiments to verify whether a peptide displays anticancer characteristics is time-consuming and costly. Hence, in this study, we proposed a computational method named identify anticancer peptides via deep representation learning features (iACP-DRLF) using light gradient boosting machine algorithm and deep representation learning features. Two kinds of sequence embedding technologies were used, namely soft symmetric alignment embedding and unified representation (UniRep) embedding, both of which involved deep neural network models based on long short-term memory networks and their derived networks. The results showed that the use of deep representation learning features greatly improved the capability of the models to discriminate anticancer peptides from other peptides. Also, UMAP (uniform manifold approximation and projection for dimension reduction) and SHAP (shapley additive explanations) analysis proved that UniRep have an advantage over other features for anticancer peptide identification. The python script and pretrained models could be downloaded from https://github.com/zhibinlv/iACP-DRLF or from http://public.aibiochem.net/iACP-DRLF/.
Affiliation(s)
- Zhibin Lv
- University of Electronic Science and Technology of China
| | - Feifei Cui
- University of Electronic Science and Technology of China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences at University of Electronic Science and Technology of China
| | - Lichao Zhang
- School of Intelligent Manufacturing and Equipment, Shenzhen Institute of Information Technology, China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, China
| |
|
163
|
Wastewater Plant Reliability Prediction Using the Machine Learning Classification Algorithms. Symmetry (Basel) 2021. [DOI: 10.3390/sym13081518] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022] Open
Abstract
One way to optimize wastewater treatment system infrastructure, its operations, monitoring, maintenance and management is through development of smart forecasting, monitoring and failure prediction systems using machine learning modeling. The aim of this paper was to develop a model that was able to predict a water pump failure based on the asymmetrical type of data obtained from sensors such as water levels, capacity, current and flow values. Several machine learning classification algorithms were used for predicting water pump failure. Using the classification algorithms, it was possible to make predictions of future values with a simple input of current values, as well as predicting probabilities of each sample belonging to each class. In order to build a prediction model, an asymmetrical type dataset containing the aforementioned variables was used.
|
164
|
Arslan M, Zareef M, Tahir HE, Guo Z, Rakha A, Xuetao H, Shi J, Zhihua L, Xiaobo Z, Khan MR. Discrimination of rice varieties using smartphone-based colorimetric sensor arrays and gas chromatography techniques. Food Chem 2021; 368:130783. [PMID: 34399174 DOI: 10.1016/j.foodchem.2021.130783] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/23/2021] [Revised: 07/13/2021] [Accepted: 08/03/2021] [Indexed: 11/04/2022]
Abstract
A smartphone-based colorimetric sensor array system was established for discrimination of rice varieties having different geographical origins. Purposely, aroma profiling of nine rice varieties was performed using solid-phase microextraction gas chromatography-mass spectrometry. Alcohols, aldehydes, alkanes, ketones, heterocyclic compounds, and organic acids represent the abundant compounds. Colorimetric sensor array system produced a characteristic color difference map upon its exposure to volatile compounds of rice. Discrimination of rice varieties was subsequently achieved using principal component analysis, hierarchical clustering analysis, and k-nearest neighbors. Rice varieties from same geographical source were clustered together in the scatter plot of principal component analysis and hierarchical clustering analysis dendrogram. The k-nearest neighbors algorithm delivered optimal results with discrimination rate of 100% for both calibration and prediction sets using sensor array system. The smartphone-based colorimetric sensor array system and gas chromatography technique were able to effectively differentiate rice varieties with the advantage of being simple, rapid, and low-cost.
Affiliation(s)
- Muhammad Arslan
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China
| | - Muhammad Zareef
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China
| | - Haroon Elrasheid Tahir
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China
| | - Ziang Guo
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China
| | - Allah Rakha
- National Institute of Food Science and Technology, University of Agriculture, Faisalabad 38000, Pakistan
| | - Hu Xuetao
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China
| | - Jiyong Shi
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China
| | - Li Zhihua
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China
| | - Zou Xiaobo
- School of Food and Biological Engineering, Jiangsu University, 301 Xuefu Rd., 212013 Zhenjiang, Jiangsu, China.
| | - Moazzam Rafiq Khan
- National Institute of Food Science and Technology, University of Agriculture, Faisalabad 38000, Pakistan
| |
|
165
|
Koumetio Tekouabou SC, Diop EB, Azmi R, Jaligot R, Chenal J. Reviewing the application of machine learning methods to model urban form indicators in planning decision support systems: Potential, issues and challenges. JOURNAL OF KING SAUD UNIVERSITY - COMPUTER AND INFORMATION SCIENCES 2021. [DOI: 10.1016/j.jksuci.2021.08.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/08/2023]
|
166
|
SNOROSALAB: A Method Facilitating the Diagnosis of Sleep Breathing Disorders Before Polysomnography. Ing Rech Biomed 2021. [DOI: 10.1016/j.irbm.2021.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
|
167
|
Performance Improvement of Decision Tree: A Robust Classifier Using Tabu Search Algorithm. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11156728] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/31/2022]
Abstract
Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on the optimization of the decision tree have been proposed; however, the approach is still evolving. This paper presents a novel and robust classifier based on decision tree and tabu search algorithms. To improve performance, our proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. For training the model, we used the clinical data of COVID-19 patients to predict whether a patient is suffering from the disease. The experimental results were obtained using our proposed classifier based on the scikit-learn library in Python. An extensive performance comparison against conventional supervised machine learning algorithms is presented using Big O and statistical analysis. Moreover, a performance comparison with optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, the required execution time of 55.6 ms, and the area under the receiver operating characteristic curve (AUROC) of 0.95 for the proposed method reveal that the proposed classifier algorithm is convenient for large datasets.
|
168
|
A Compressive Sensing Model for Speeding Up Text Classification. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021; 2020:8879795. [PMID: 32831821 PMCID: PMC7428956 DOI: 10.1155/2020/8879795] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/25/2020] [Revised: 07/07/2020] [Accepted: 07/18/2020] [Indexed: 11/18/2022]
Abstract
Text classification plays an important role in various applications of big data by automatically classifying massive text documents. However, high dimensionality and sparsity of text features have presented a challenge to efficient classification. In this paper, we propose a compressive sensing- (CS-) based model to speed up text classification. Using CS to reduce the size of feature space, our model has a low time and space complexity while training a text classifier, and the restricted isometry property (RIP) of CS ensures that pairwise distances between text features can be well preserved in the process of dimensionality reduction. In particular, by structural random matrices (SRMs), CS is free from computation and memory limitations in the construction of random projections. Experimental results demonstrate that CS effectively accelerates the text classification while hardly causing any accuracy loss.
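The distance-preservation property mentioned in this abstract can be illustrated with a small sketch. It is not the authors' model: a dense Gaussian random matrix stands in for the structural random matrices (SRMs) named above, and the function names, dimensions, and sparsity level are illustrative assumptions.

```python
import math
import random


def gaussian_projection(dim_in, dim_out, seed=0):
    """Build a dim_out x dim_in matrix with i.i.d. N(0, 1/dim_out) entries.

    Assumption: a dense Gaussian matrix is used here for simplicity; the
    abstract refers to structural random matrices instead.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(matrix, vector):
    """Multiply the projection matrix by a feature vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_ratio(dim_in=500, dim_out=200, density=0.05, seed=0):
    """Ratio of projected to original distance for two sparse toy vectors."""
    rng = random.Random(seed + 1)
    # Hypothetical sparse "text feature" vectors: mostly zeros.
    a = [rng.random() if rng.random() < density else 0.0 for _ in range(dim_in)]
    b = [rng.random() if rng.random() < density else 0.0 for _ in range(dim_in)]
    matrix = gaussian_projection(dim_in, dim_out, seed)
    return euclidean(project(matrix, a), project(matrix, b)) / euclidean(a, b)


if __name__ == "__main__":
    print("projected / original distance:", distance_ratio())
```

A ratio close to 1 indicates that the pairwise distance survived the reduction from 500 to 200 dimensions, which is the behaviour a distance-based classifier relies on.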
|
169
|
Gan J, Peng Z, Zhu X, Hu R, Ma J, Wu G. Brain functional connectivity analysis based on multi-graph fusion. Med Image Anal 2021; 71:102057. [PMID: 33957559 PMCID: PMC8934107 DOI: 10.1016/j.media.2021.102057] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/25/2020] [Revised: 03/25/2021] [Accepted: 03/27/2021] [Indexed: 12/13/2022]
Abstract
In this paper, we propose a framework for functional connectivity network (FCN) analysis that diagnoses brain disease from resting-state functional magnetic resonance imaging (rs-fMRI) data, aiming to reduce the influence of noise, inter-subject variability, and heterogeneity across subjects. To this end, our proposed framework investigates a multi-graph fusion method to explore both the common and the complementary information between two FCNs, i.e., a fully-connected FCN and a 1-nearest-neighbor (1NN) FCN, whereas previous methods focus only on conducting FCN analysis from a single FCN. Specifically, our framework first conducts the graph fusion to produce a representation of the rs-fMRI data with high discriminative ability, and then employs the L1SVM to jointly conduct brain region selection and disease diagnosis. We further evaluate the effectiveness of the proposed framework on various data sets of neuro-diseases, i.e., Fronto-Temporal Dementia (FTD), Obsessive-Compulsive Disorder (OCD), and Alzheimer's Disease (AD). The experimental results demonstrate that, compared to state-of-the-art FCN analysis methods, the proposed framework achieves the best diagnosis performance by selecting reasonable brain regions for the classification tasks.
Affiliation(s)
- Jiangzhang Gan
- Center for Future Media and School of Computer Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China; School of natural and Computational Science, Massey University Auckland Campus, Auckland 0745, New Zealand
| | - Ziwen Peng
- Center for the Study of Applied Psychology, Guangdong Key Laboratory of Mental Health and Cognitive Science and School of Psychology, South China Normal University, Guangzhou 510631, China
| | - Xiaofeng Zhu
- Center for Future Media and School of Computer Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China; School of natural and Computational Science, Massey University Auckland Campus, Auckland 0745, New Zealand
| | - Rongyao Hu
- School of natural and Computational Science, Massey University Auckland Campus, Auckland 0745, New Zealand
| | - Junbo Ma
- Department of Psychiatry, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Guorong Wu
- Department of Psychiatry, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA
| |
|
170
|
Shen HT, Zhu Y, Zheng W, Zhu X. Half-Quadratic Minimization for Unsupervised Feature Selection on Incomplete Data. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:3122-3135. [PMID: 32730208 DOI: 10.1109/tnnls.2020.3009632] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 06/11/2023]
Abstract
Unsupervised feature selection (UFS) is a popular technique for reducing the dimensions of high-dimensional data. Previous UFS methods were often designed with the assumption that all the information in the data set is observed. However, incomplete data sets that contain unobserved information are often found in real applications, especially in industry. Thus, these existing UFS methods are limited when conducting feature selection on incomplete data. In addition, most existing UFS methods do not consider sample importance for feature selection, i.e., that different samples have different importance. As a result, the constructed UFS models easily suffer from the influence of outliers. This article investigates a new UFS method for incomplete data sets to address the abovementioned issues. Specifically, the proposed method deals with unobserved information by using an indicator matrix to filter it out of the process of feature selection, and reduces the influence of outliers by employing the half-quadratic minimization technique to automatically assign small or even zero weights to outliers and large weights to important samples. This article further designs an alternative optimization strategy to optimize the proposed objective function and proves the convergence of the strategy both theoretically and experimentally. Experimental results on both real and synthetic incomplete data sets verified the effectiveness of the proposed method compared with previous methods, in terms of clustering performance on the low-dimensional space of the high-dimensional data.
|
171
|
Obstructive sleep apnea screening from unprocessed ECG signals using statistical modelling. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102685] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
|
172
|
Hussain L, Huang P, Nguyen T, Lone KJ, Ali A, Khan MS, Li H, Suh DY, Duong TQ. Machine learning classification of texture features of MRI breast tumor and peri-tumor of combined pre- and early treatment predicts pathologic complete response. Biomed Eng Online 2021; 20:63. [PMID: 34183038 PMCID: PMC8240261 DOI: 10.1186/s12938-021-00899-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/09/2021] [Accepted: 06/09/2021] [Indexed: 12/02/2022] Open
Abstract
Purpose: This study used machine learning classification of texture features from MRI of breast tumor and peri-tumor at multiple treatment time points in conjunction with molecular subtypes to predict eventual pathological complete response (PCR) to neoadjuvant chemotherapy. Materials and method: This study employed a subset of patients (N = 166) with PCR data from the I-SPY-1 TRIAL (2002–2006). This cohort consisted of patients with stage 2 or 3 breast cancer that underwent anthracycline–cyclophosphamide and taxane treatment. Magnetic resonance imaging (MRI) was acquired pre-neoadjuvant chemotherapy, early, and mid-treatment. Texture features were extracted from post-contrast-enhanced MRI, pre- and post-contrast subtraction images, and with morphological dilation to include peri-tumoral tissue. Molecular subtypes and Ki67 were also included in the prediction model. Performance of classification models was assessed using receiver operating characteristic curve analysis, including area under the curve (AUC). Statistical analysis was done using unpaired two-tailed t-tests. Results: Molecular subtypes alone yielded moderate prediction performance of PCR (AUC = 0.82, p = 0.07). Pre-, early, and mid-treatment data alone yielded moderate performance (AUC = 0.88, 0.72, and 0.78, p = 0.03, 0.13, 0.44, respectively). The combined pre- and early treatment data markedly improved performance (AUC = 0.96, p = 0.0003). Addition of molecular subtypes improved performance slightly for individual time points but substantially for the combined pre- and early treatment (AUC = 0.98, p = 0.0003). The optimal morphological dilation was 3–5 pixels. Subtraction of post- and pre-contrast MRI further improved performance (AUC = 0.98, p = 0.00003). Finally, among the machine-learning algorithms evaluated, the RUSBoosted Tree machine-learning method yielded the highest performance.
Conclusion: AI-classification of texture features from MRI of breast tumor at multiple treatment time points accurately predicts eventual PCR. Longitudinal changes in texture features and peri-tumoral features further improve PCR prediction performance. Accurate assessment of treatment efficacy early on could minimize unnecessary toxic chemotherapy and enable mid-treatment modification for patients to achieve better clinical outcomes.
Affiliation(s)
- Lal Hussain
- Department of Computer Science & IT, Neelum Campus, The University of Azad Jammu and Kashmir, Muzaffarabad, Azad Kashmir, Pakistan; Department of Computer Science & IT, King Abdullah Campus, The University of Azad Jammu and Kashmir, Muzaffarabad, Azad Kashmir, Pakistan; Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA; Department of Radiology, Albert Einstein College of Medicine and Montefiore Medical Center, 111 East 210th Street, Bronx, NY, 10467, USA
| | - Pauline Huang
- Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA
| | - Tony Nguyen
- Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA
| | - Kashif J Lone
- Department of Computer Science & IT, King Abdullah Campus, The University of Azad Jammu and Kashmir, Muzaffarabad, Azad Kashmir, Pakistan
| | - Amjad Ali
- Department of Computer Science, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan
| | - Muhammad Salman Khan
- Department of Computer Science, COMSATS University Islamabad, Lahore Campus, Lahore, Pakistan
| | - Haifang Li
- Department of Radiology, Renaissance School of Medicine at Stony Brook University, 101 Nicolls Rd, Stony Brook, NY, 11794, USA
| | - Doug Young Suh
- College of Electronics and Convergence Engineering, Kyung Hee University, Seoul, South Korea.
| | - Tim Q Duong
- Department of Radiology, Albert Einstein College of Medicine and Montefiore Medical Center, 111 East 210th Street, Bronx, NY, 10467, USA
| |
|
173
|
Hybrid Contractive Auto-encoder with Restricted Boltzmann Machine For Multiclass Classification. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2021. [DOI: 10.1007/s13369-021-05674-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
|
174
|
Evolving fuzzy k-nearest neighbors using an enhanced sine cosine algorithm: Case study of lupus nephritis. Comput Biol Med 2021; 135:104582. [PMID: 34214940 DOI: 10.1016/j.compbiomed.2021.104582] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/08/2021] [Revised: 06/13/2021] [Accepted: 06/13/2021] [Indexed: 02/05/2023]
Abstract
Because of its simplicity and effectiveness, fuzzy K-nearest neighbors (FKNN) is widely used in the literature. Its parameters have an essential impact on the performance of FKNN and therefore need to be tuned to suit different problems. Choosing more representative features can also enhance the performance of FKNN. This research proposes an improved optimization technique based on the sine cosine algorithm (LSCA), which introduces a linear population size reduction mechanism to enhance the original algorithm's performance. Moreover, we developed an FKNN model based on the LSCA that simultaneously performs feature selection and parameter optimization. Firstly, the search performance of LSCA is verified on the IEEE CEC2017 benchmark test functions in comparison with classical and improved algorithms. Secondly, the validity of the LSCA-FKNN model is verified on three medical datasets. Finally, we used the proposed LSCA-FKNN to predict lupus nephritis classes, and the model showed competitive results. The paper will be supported by an online web service for any question at https://aliasgharheidari.com.
|
175
|
Wang K, Tian J, Zheng C, Yang H, Ren J, Li C, Han Q, Zhang Y. Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning. Risk Manag Healthc Policy 2021; 14:2453-2463. [PMID: 34149290 PMCID: PMC8206455 DOI: 10.2147/rmhp.s310295] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/09/2021] [Accepted: 05/24/2021] [Indexed: 01/14/2023] Open
Abstract
PURPOSE: This study sought to develop models that reliably identify adverse outcomes in patients with heart failure (HF) and to find strong factors that affect prognosis. PATIENTS AND METHODS: A total of 5004 qualifying cases were selected, among which 498 cases had adverse outcomes and 4506 cases were discharged after improvement. The study subjects were hospitalized patients diagnosed with HF from a regional cardiovascular hospital and the cardiology department of a medical university hospital in Shanxi Province of China between January 2014 and June 2019. The synthetic minority oversampling technique combined with edited nearest neighbors (SMOTE+ENN) was used to pre-process the unbalanced data. Traditional logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) were used to build risk identification models, and each model was repeated 100 times. Model discrimination and calibration were estimated using F1-score, the area under the receiver-operating characteristic curve (AUROC), and Brier score. The best performing of the five models was used to identify the risk of adverse outcomes and evaluate the influencing factors. RESULTS: The SME-XGBoost was the best performing model with means of F1-score (0.3673, 95% confidence interval [CI]: 0.3633-0.3712), AUC (0.8010, CI: 0.7974-0.8046), and Brier score (0.1769, CI: 0.1748-0.1789). Age, N-terminal pronatriuretic peptide, pulmonary disease, etc. were the most significant factors of adverse outcomes in patients with HF. CONCLUSION: The combination of SMOTE+ENN and advanced machine learning methods effectively improved the discrimination efficacy of adverse outcomes in HF patients, accurately stratified patients at risk of adverse outcomes, and found the top factors of adverse outcomes. These models and factors emphasize the importance of health status data in determining adverse outcomes in patients with HF.
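Two of the evaluation metrics named in this abstract, the F1-score and the Brier score, are simple enough to sketch directly. The snippet below is an illustration only, written with the standard library; the function names and the toy labels are hypothetical and are not taken from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = adverse outcome, 0 = discharged after improvement.
    labels = [1, 1, 0, 0]
    hard_predictions = [1, 0, 1, 0]
    probabilities = [0.8, 0.4, 0.6, 0.1]
    print("F1:", f1_score(labels, hard_predictions))
    print("Brier:", brier_score(labels, probabilities))
```

With 498 adverse outcomes among 5004 cases (about 10%), a model that always predicts "no adverse outcome" would be right roughly 90% of the time yet score an F1 of 0, which is why class-sensitive metrics and resampling are relevant for data of this kind.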
Affiliation(s)
- Ke Wang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People’s Republic of China
- Department of Epidemiology and Biostatistics, Xuzhou Medical University, Xuzhou, People’s Republic of China
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
| | - Jing Tian
- Department of Cardiology, The First Affiliated Hospital of Shanxi Medical University, Taiyuan, People’s Republic of China
| | - Chu Zheng
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People’s Republic of China
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
| | - Hong Yang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People’s Republic of China
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
| | - Jia Ren
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People’s Republic of China
| | - Chenhao Li
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People’s Republic of China
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
| | - Qinghua Han
- Department of Cardiology, The First Affiliated Hospital of Shanxi Medical University, Taiyuan, People’s Republic of China
| | - Yanbo Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, People’s Republic of China
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Shanxi Medical University, Taiyuan, People's Republic of China
| |
|
176
|
|
177
|
López-Hernández JL, González-Carrasco I, López-Cuadrado JL, Ruiz-Mezcua B. Framework for the Classification of Emotions in People With Visual Disabilities Through Brain Signals. Front Neuroinform 2021; 15:642766. [PMID: 34025381 PMCID: PMC8137841 DOI: 10.3389/fninf.2021.642766] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/16/2020] [Accepted: 03/26/2021] [Indexed: 11/13/2022] Open
Abstract
Nowadays, the recognition of emotions in people with sensory disabilities still represents a challenge due to the difficulty of generalizing and modeling the set of brain signals. In recent years, the technology that has been used to study a person's behavior and emotions based on brain signals is the brain-computer interface (BCI). Although previous works have already proposed the classification of emotions in people with sensory disabilities using machine learning techniques, a model of recognition of emotions in people with visual disabilities has not yet been evaluated. Consequently, in this work, the authors present a twofold framework focused on people with visual disabilities. Firstly, auditory stimuli have been used, and a component of acquisition and extraction of brain signals has been defined. Secondly, analysis techniques for the modeling of emotions have been developed, and machine learning models for the classification of emotions have been defined. Based on the results, the algorithm with the best performance in the validation is random forest (RF), with an accuracy of 85 and 88% in the classification for negative and positive emotions, respectively. According to the results, the framework is able to classify positive and negative emotions, but the experimentation performed also shows that the framework performance depends on the number of features in the dataset and the quality of the Electroencephalogram (EEG) signals is a determining factor.
|
178
|
Chen Y, Hu M, Hua C, Zhai G, Zhang J, Li Q, Yang SX. Face Mask Assistant: Detection of Face Mask Service Stage Based on Mobile Phone. IEEE SENSORS JOURNAL 2021; 21:11084-11093. [PMID: 36820762 PMCID: PMC8768979 DOI: 10.1109/jsen.2021.3061178] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/27/2021] [Revised: 02/18/2021] [Accepted: 02/19/2021] [Indexed: 05/10/2023]
Abstract
Coronavirus Disease 2019 (COVID-19) has spread all over the world since its massive outbreak in December 2019, causing great losses worldwide. The numbers of confirmed cases and deaths have both reached alarming levels. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of COVID-19, can be transmitted by small respiratory droplets. To curb its spread at the source, wearing masks is a convenient and effective measure. In most cases, people use face masks frequently but for short periods. To solve the problem of not knowing which service stage a mask is in, we propose a detection system based on the mobile phone. We first extract four features from the gray level co-occurrence matrices (GLCMs) of micro-photos of the face mask. Next, a three-class detection system is built using the K Nearest Neighbor (KNN) algorithm. The results of validation experiments show that our system can reach an accuracy of 82.87% (measured by macro-measures) on the testing dataset. The precision of Type I 'normal use' and the recall of Type III 'not recommended' reach 92.00% and 92.59%, respectively. In future work, we plan to expand the detection objects to more mask types. This work demonstrates that the proposed mobile microscope system can serve as an assistant for face mask use, which may play a positive role in fighting against COVID-19.
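The GLCM feature-extraction step can be sketched as follows. The abstract does not say which four features were used, so the four computed here (contrast, angular second moment, homogeneity, and entropy) are an assumption based on commonly used texture descriptors; the function names and the tiny test image are hypothetical as well.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset.

    image  -- 2-D list of integer gray levels in range(levels)
    dx, dy -- offset to the neighbouring pixel (default: one pixel to the right)
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def glcm_features(p):
    """Four texture descriptors of a normalized GLCM (assumed feature set)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


if __name__ == "__main__":
    # Hypothetical 2-level image with uniform rows: no horizontal gray-level change.
    features = glcm_features(glcm([[0, 0], [1, 1]], levels=2))
    print(features)
```

Feature vectors of this kind would then be passed to a KNN classifier; that step is omitted here.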
Affiliation(s)
- Yuzhen Chen
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200062, China
| | - Menghan Hu
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200062, China
| | - Chunjun Hua
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200062, China
| | - Guangtao Zhai
- Key Laboratory of Artificial Intelligence, Ministry of Education, Shanghai 200240, China
| | - Jian Zhang
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200062, China
| | - Qingli Li
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200062, China
| | - Simon X. Yang
- Advanced Robotics and Intelligent Systems Laboratory, School of Engineering, University of Guelph, Guelph, ON N1G 2W1, Canada
| |
|
179
|
Wu X, Xu X, Liu J, Wang H, Hu B, Nie F. Supervised Feature Selection With Orthogonal Regression and Feature Weighting. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:1831-1838. [PMID: 32406845 DOI: 10.1109/tnnls.2020.2991336] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 06/11/2023]
Abstract
Effective features can improve the performance of a model and help us understand the characteristics and underlying structure of complex data. Previously proposed feature selection methods usually cannot retain more discriminative information. To address this shortcoming, we propose a novel supervised orthogonal least square regression model with feature weighting for feature selection. The optimization problem of the objective function can be solved by employing generalized power iteration and augmented Lagrangian multiplier methods. Experimental results show that the proposed method can more effectively reduce feature dimensionality and obtain better classification results than traditional feature selection methods. The convergence of our iterative method is also proved. Consequently, the effectiveness and superiority of the proposed method are verified both theoretically and experimentally.
|
180
|
Hybrid Basketball Game Outcome Prediction Model by Integrating Data Mining Methods for the National Basketball Association. ENTROPY 2021; 23:e23040477. [PMID: 33920720 PMCID: PMC8073849 DOI: 10.3390/e23040477] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/15/2021] [Revised: 04/08/2021] [Accepted: 04/14/2021] [Indexed: 12/18/2022]
Abstract
The sports market has grown rapidly over the last several decades. Sports outcomes prediction is an attractive sports analytic challenge as it provides useful information for operations in the sports market. In this study, a hybrid basketball game outcomes prediction scheme is developed for predicting the final score of the National Basketball Association (NBA) games by integrating five data mining techniques, including extreme learning machine, multivariate adaptive regression splines, k-nearest neighbors, eXtreme gradient boosting (XGBoost), and stochastic gradient boosting. Designed features are generated by merging different game-lags information from fundamental basketball statistics and used in the proposed scheme. This study collected data from all the games of the NBA 2018-2019 seasons. There are 30 teams in the NBA and each team plays 82 games per season. A total of 2460 NBA game data points were collected. Empirical results illustrated that the proposed hybrid basketball game prediction scheme achieves high prediction performance and identifies suitable game-lag information and relevant game features (statistics). Our findings suggested that a two-stage XGBoost model using four pieces of game-lags information achieves the best prediction performance among all competing models. Six designed features from four game-lags (averaged defensive rebounds, averaged two-point field goal percentage, averaged free throw percentage, averaged offensive rebounds, averaged assists, and averaged three-point field goal attempts) have a greater effect on the prediction of final scores of NBA games than those from other game-lags. The findings of this study provide relevant insights and guidance for other team or individual sports outcomes prediction research.
|
181
|
Abstract
Expectiles have gained considerable attention in recent years due to wide applications in many areas. In this study, the k-nearest neighbours approach, together with the asymmetric least squares loss function, called ex-kNN, is proposed for computing expectiles. Firstly, the effect of various distance measures on ex-kNN in terms of test error and computational time is evaluated. It is found that Canberra, Lorentzian, and Soergel distance measures lead to minimum test error, whereas Euclidean, Canberra, and Average of (L1,L∞) lead to a low computational cost. Secondly, the performance of ex-kNN is compared with existing packages er-boost and ex-svm for computing expectiles that are based on nine real life examples. Depending on the nature of data, the ex-kNN showed two to 10 times better performance than er-boost and comparable performance with ex-svm regarding test error. Computationally, the ex-kNN is found two to five times faster than ex-svm and much faster than er-boost, particularly, in the case of high dimensional data.
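The idea described in this abstract, a nearest-neighbour search followed by minimization of the asymmetric least squares loss over the neighbours' responses, can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the authors' ex-kNN package: the function names are hypothetical, the expectile is found by iteratively reweighted averaging, and Canberra distance is used only because the abstract lists it among the better-performing measures.

```python
def expectile(values, tau, max_iter=100, tol=1e-12):
    """tau-expectile of a sample.

    Minimizes sum(|tau - 1[v < m]| * (v - m)**2) over m by repeatedly taking
    a weighted mean with weight tau above m and (1 - tau) at or below m.
    """
    m = sum(values) / len(values)
    for _ in range(max_iter):
        weights = [tau if v > m else 1.0 - tau for v in values]
        new_m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


def canberra(a, b):
    """Canberra distance; coordinates where both entries are zero contribute nothing."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def ex_knn_predict(train_x, train_y, query, k, tau, dist=canberra):
    """Predict the conditional tau-expectile at `query` from its k nearest neighbours."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    neighbours = [train_y[i] for i in order[:k]]
    return expectile(neighbours, tau)


if __name__ == "__main__":
    # Hypothetical one-dimensional toy data.
    xs = [[1.0], [2.0], [3.0], [10.0]]
    ys = [1.0, 2.0, 3.0, 10.0]
    print(ex_knn_predict(xs, ys, [2.1], k=3, tau=0.5))
    print(ex_knn_predict(xs, ys, [2.1], k=3, tau=0.9))
```

Setting `tau=0.5` reduces the expectile to the ordinary mean of the neighbours, so the sketch then behaves like standard kNN regression; larger values of `tau` pull the prediction toward the upper tail of the neighbours' responses.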
|
182
|
Abstract
k-nearest neighbor (kNN) is a widely used learning algorithm for supervised learning tasks. In practice, the main challenge when using kNN is its high sensitivity to its hyperparameter setting, including the number of nearest neighbors k, the distance function, and the weighting function. To improve the robustness to hyperparameters, this study presents a novel kNN learning method based on a graph neural network, named kNNGNN. Given training data, the method learns a task-specific kNN rule in an end-to-end fashion by means of a graph neural network that takes the kNN graph of an instance to predict the label of the instance. The distance and weighting functions are implicitly embedded within the graph neural network. For a query instance, the prediction is obtained by performing a kNN search from the training data to create a kNN graph and passing it through the graph neural network. The effectiveness of the proposed method is demonstrated using various benchmark datasets for classification and regression tasks.
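The hyperparameter sensitivity that motivates this work can be seen even in a tiny example of conventional kNN. The sketch below is plain kNN with pluggable distance and weighting functions, not the kNNGNN method; all names and the toy data are hypothetical.

```python
from collections import defaultdict


def manhattan(a, b):
    """L1 distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def uniform(distance):
    """Every neighbour gets the same vote."""
    return 1.0


def inverse_distance(distance):
    """Closer neighbours get larger votes; the small constant avoids division by zero."""
    return 1.0 / (distance + 1e-12)


def knn_classify(train_x, train_y, query, k, dist, weight):
    """Weighted-vote kNN classification with user-supplied distance and weighting."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = defaultdict(float)
    for x, label in ranked[:k]:
        votes[label] += weight(dist(x, query))
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Hypothetical one-dimensional training set with two classes.
    xs = [[0.0], [1.0], [1.2], [1.4]]
    ys = ["a", "b", "b", "b"]
    query = [0.3]
    for k, weight in [(1, uniform), (3, uniform), (3, inverse_distance)]:
        label = knn_classify(xs, ys, query, k, manhattan, weight)
        print(k, weight.__name__, label)
```

On this toy data the predicted label for the same query changes when only `k` or only the weighting function is changed, which is the kind of instability a learned, task-specific kNN rule aims to reduce.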
|
183
|
Cheng S, Li M, Fan J, Shang Z, Wan H. Decoding route selection of pigeon during goal-directed behavior: A joint spike-LFP study. Behav Brain Res 2021; 409:113289. [PMID: 33836168 DOI: 10.1016/j.bbr.2021.113289] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/17/2021] [Revised: 03/31/2021] [Accepted: 04/03/2021] [Indexed: 10/21/2022]
Abstract
How to reach the goal is one of the core problems that animals must solve to complete goal-directed behavior. Studies have demonstrated the important role of the hippocampus (Hp) in spatial navigation and shown that hippocampal neural activities can represent the current location and goal location. However, for the different routes linking these two locations, the neural representation mechanism of route selection in Hp is not clear. Here, we addressed this question using neural recordings of Hp ensembles and decoding analyses in pigeons performing a goal-directed route selection task known to require Hp participation. The hippocampal spike trains and local field potentials (LFPs) of five pigeons performing the task were acquired and analyzed. We found that the neuron firing rates and power spectrum characteristics in Hp could encode the animal's route selection during goal-directed behavior, suggesting that the representation of route selection was coherent for hippocampal spike and LFP signals. Decoding results further indicated that joint spike-LFP features resulted in a significant improvement in the representation accuracy of the route selection. These findings will help to understand the encoding mechanism of route selection in goal-directed behavior.
Affiliation(s)
- Shuguan Cheng
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou, China
- Mengmeng Li
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou, China
- Jiantao Fan
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou, China
- Zhigang Shang
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou, China
- Hong Wan
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou, China
|
184
|
|
185
|
Extreme gradient boosting machine learning method for predicting medical treatment in patients with acute bronchiolitis. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2021.04.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/23/2022]
|
186
|
Ma M, Chen Y, Chong X, Jiang F, Gao J, Shen L, Zhang C. Integrative analysis of genomic, epigenomic and transcriptomic data identified molecular subtypes of esophageal carcinoma. Aging (Albany NY) 2021; 13:6999-7019. [PMID: 33638948 PMCID: PMC7993659 DOI: 10.18632/aging.202556] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/23/2020] [Accepted: 12/29/2020] [Indexed: 12/16/2022]
Abstract
Esophageal cancer (EC) involves many genomic, epigenetic and transcriptomic disorders, which play key roles in the heterogeneous progression of cancer. However, multi-omics studies of EC have not been conducted. This study identified a high consistency between DNA copy number variations and abnormal methylations in EC by analyzing genomics, epigenetics and transcriptomics data, investigating mutual correlations of DNA copy number variation, methylation and gene expression, and stratifying copy number variation genes (CNV-Gs) and methylation genes (MET-Gs). The methylation, CNV and expression profiles of CNV-Gs and MET-Gs were analyzed by consistent clustering using iCluster integration. We determined three subtypes (iC1, iC2, iC3) with different molecular traits, prognostic characteristics and tumor immune microenvironment features. We also identified 4 prognostic genes (CLDN3, FAM221A, GDF15 and YBX2) that were differentially expressed in the three subtypes and could therefore be used as representative biomarkers for the three subtypes of EC. In conclusion, by performing a comprehensive analysis of genomic, epigenetic and transcriptomic regulation, the current study provided new insights into the multilayer molecular and pathological traits of EC and contributed to precision medicine for EC patients.
Affiliation(s)
- Mingyang Ma
- Department of Gastrointestinal Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital and Institute, Beijing 100142, China
- Yang Chen
- Department of Gastrointestinal Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital and Institute, Beijing 100142, China
- Xiaoyi Chong
- Department of Gastrointestinal Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital and Institute, Beijing 100142, China
- Fangli Jiang
- Department of Gastrointestinal Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital and Institute, Beijing 100142, China
- Jing Gao
- National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital and Shenzhen Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Shenzhen 518116, China
- Lin Shen
- Department of Gastrointestinal Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital and Institute, Beijing 100142, China
- Cheng Zhang
- Department of Gastrointestinal Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital and Institute, Beijing 100142, China
|
187
|
Jiang C, Li Y, Tang Y, Guan C. Enhancing EEG-Based Classification of Depression Patients Using Spatial Information. IEEE Trans Neural Syst Rehabil Eng 2021; 29:566-575. [PMID: 33587703 DOI: 10.1109/tnsre.2021.3059429] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 11/07/2022]
Abstract
BACKGROUND Depression has become a leading mental disorder worldwide. Evidence has shown that subjects with depression exhibit different spatial responses in neurophysiological signals from the healthy controls when they are exposed to positive and negative stimuli. METHODS We proposed an effective electroencephalogram-based detection method for depression classification using spatial information. A face-in-the-crowd task, including positive and negative emotional facial expressions, was presented to 30 participants, including 16 depression patients and 14 healthy controls. Differential entropy and the genetic algorithm were used for feature extraction and selection, and a support vector machine was used for classification. A task-related common spatial pattern (TCSP) was proposed to enhance the spatial differences before the feature extraction. RESULTS AND DISCUSSION We achieved a leave-one-subject-out cross-validation classification result of 84% and 85.7% for positive and negative stimuli, respectively, using TCSP, which is statistically significantly higher than 81.7% and 83.2%, respectively, acquired without the TCSP (p < 0.05). We also evaluated the classification performance using individual frequency bands and found that the contribution of the gamma band was predominant. In addition, we evaluated different classifiers, including k-nearest neighbor and logistic regression, which showed similar trends in the improvement of classification by employing TCSP. CONCLUSION The results show that our proposed method, employing spatial information, significantly improves the accuracy of classifying depression patients.
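The leave-one-subject-out cross-validation mentioned in this abstract holds out all samples of one participant per fold, so no subject contributes to both training and testing. A minimal sketch of the fold construction follows; the function name and the subject identifiers in the usage note are hypothetical, and the classifier itself is omitted.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    `subject_ids[i]` is the subject that sample i was recorded from.
    """
    for held_out in sorted(set(subject_ids)):
        # Every sample from the held-out subject goes to the test fold.
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        # All samples from the remaining subjects form the training fold.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test
```

For `subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]` this yields three folds, the first holding out samples 0 and 1. With the 30 participants reported in the abstract, the same scheme would give 30 folds.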
|
188
|
Yan T, Xu W, Lin J, Duan L, Gao P, Zhang C, Lv X. Combining Multi-Dimensional Convolutional Neural Network (CNN) With Visualization Method for Detection of Aphis gossypii Glover Infection in Cotton Leaves Using Hyperspectral Imaging. FRONTIERS IN PLANT SCIENCE 2021; 12:604510. [PMID: 33659014 PMCID: PMC7917247 DOI: 10.3389/fpls.2021.604510] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/09/2020] [Accepted: 01/11/2021] [Indexed: 05/08/2023]
Abstract
Cotton is a significant economic crop. It is vulnerable to aphids (Aphis gossypii Glover) during the growth period. Rapid and early detection has become an important means to deal with aphids in cotton. In this study, a visible/near-infrared (Vis/NIR) hyperspectral imaging system (376-1044 nm) and machine learning methods were used to identify aphid infection in cotton leaves. Both tall and short cotton plants (Lumianyan 24) were inoculated with aphids, and the corresponding plants without aphids were used as controls. The hyperspectral images (HSIs) were acquired five times at an interval of 5 days. The healthy and infected leaves were used to establish the datasets, with each leaf as a sample. The spectra and RGB images of each cotton leaf were extracted from the hyperspectral images for one-dimensional (1D) and two-dimensional (2D) analysis. The hyperspectral images of each leaf were used for three-dimensional (3D) analysis. Convolutional Neural Networks (CNNs) were used for identification and compared with conventional machine learning methods. For the extracted spectra, 1D CNN had a fine classification performance, and the classification accuracy could reach 98%. For RGB images, 2D CNN had a better classification performance. For HSIs, 3D CNN performed moderately and performed better than 2D CNN. On the whole, CNN performed relatively better than conventional machine learning methods. In the process of 1D, 2D, and 3D CNN visualization, the important wavelength ranges were analyzed in 1D and 3D CNN visualization, and the importance of wavelength ranges and spatial regions was analyzed in 2D and 3D CNN visualization. The overall results of this study illustrated the feasibility of using hyperspectral imaging combined with multi-dimensional CNN to detect aphid infection in cotton leaves, providing a new alternative for pest infection detection in plants.
Affiliation(s)
- Tianying Yan
- College of Information Science and Technology, Shihezi University, Shihezi, China
- Key Laboratory of Oasis Ecology Agriculture, Shihezi University, Shihezi, China
- Wei Xu
- College of Agriculture, Shihezi University, Shihezi, China
- Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi, China
- Jiao Lin
- College of Agriculture, Shihezi University, Shihezi, China
- Long Duan
- College of Information Science and Technology, Shihezi University, Shihezi, China
- Key Laboratory of Oasis Ecology Agriculture, Shihezi University, Shihezi, China
- Pan Gao
- College of Information Science and Technology, Shihezi University, Shihezi, China
- Key Laboratory of Oasis Ecology Agriculture, Shihezi University, Shihezi, China
- Chu Zhang
- College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, China
- Key Laboratory of Spectroscopy Sensing, Ministry of Agriculture and Rural Affairs, Hangzhou, China
- School of Information Engineering, Huzhou University, Huzhou, China
- Xin Lv
- Key Laboratory of Oasis Ecology Agriculture, Shihezi University, Shihezi, China
- College of Agriculture, Shihezi University, Shihezi, China
|
189
|
Li K, Wu Z, Yao J, Fan J, Wei Q. DNA methylation patterns-based subtype distinction and identification of soft tissue sarcoma prognosis. Medicine (Baltimore) 2021; 100:e23787. [PMID: 33592836 PMCID: PMC7870194 DOI: 10.1097/md.0000000000023787] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/02/2020] [Accepted: 11/13/2020] [Indexed: 01/05/2023] Open
Abstract
Soft tissue sarcomas (STSs) are clinically heterogeneous, with a variable tendency toward aggressive behavior. In this study, we constructed a specific DNA methylation-based classification to identify distinct prognostic subtypes of STSs based on the DNA methylation spectrum from the TCGA database. Samples were clustered into 4 subgroups whose survival curves were distinct from each other. The samples in each subgroup also differed in several clinical features. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were conducted on the genes corresponding to the promoter regions of the above-described specific methylation sites, revealing that these genes were mainly concentrated in certain cancer-associated biological functions and pathways. In addition, we calculated the differences among clustered methylation sites and analyzed the specific methylation sites with the LASSO algorithm. The selection operator algorithm was employed to derive a risk signature model, and a prognostic signature based on these methylation sites performed well for risk stratification in STS patients. Finally, a nomogram consisting of clinical features and the risk score was developed for survival prediction. This study shows that DNA methylation-based STS subtype classification is highly relevant for the future development of personalized therapy, as it identifies the predictive value for patient prognosis.
Affiliation(s)
- Kai Li
- Department of Orthopedics Trauma and Hand Surgery
- Zhengyuan Wu
- Department of Orthopedics Trauma and Hand Surgery
- Jun Yao
- Department of Bone and Joint Surgery, The First Affiliated Hospital of Guangxi Medical University
- Guangxi Collaborative Innovation Center for Biomedicine, Guangxi Medical University, Nanning, China
- Jingyuan Fan
- Department of Orthopedics Trauma and Hand Surgery
- Qingjun Wei
- Department of Orthopedics Trauma and Hand Surgery
|
190
|
Asadi S, Roshan SE. A bi-objective optimization method to produce a near-optimal number of classifiers and increase diversity in Bagging. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106656] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/22/2022]
|
191
|
Pan Z, Pan Y, Wang Y, Wang W. A new globally adaptive k-nearest neighbor classifier based on local mean optimization. Soft Comput 2021. [DOI: 10.1007/s00500-020-05311-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
|
192
|
A Brief Analysis of Key Machine Learning Methods for Predicting Medicare Payments Related to Physical Therapy Practices in the United States. INFORMATION 2021. [DOI: 10.3390/info12020057] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022] Open
Abstract
Background and objectives: Machine learning approaches using random forest have been effectively used to provide decision support in health and medical informatics. This is especially true when predicting variables associated with Medicare reimbursements. However, more work is needed to analyze and predict data associated with reimbursements through Medicare and Medicaid services for physical therapy practices in the United States. The key objective of this study is to analyze different machine learning models to predict key variables associated with Medicare standardized payments for physical therapy practices in the United States. Materials and Methods: This study employs five methods, namely, multiple linear regression, decision tree regression, random forest regression, K-nearest neighbors, and linear generalized additive model (GAM), to predict key variables associated with Medicare payments for physical therapy practices in the United States. Results: The study described in this article adds to the body of knowledge on the effective use of random forest regression and the linear generalized additive model in predicting Medicare standardized payments. Random forest regression may have an edge over the other methods employed for this purpose. Conclusions: The study provides useful insight into the comparative performance of the aforementioned methods, identifies a few intricate details associated with predicting Medicare costs, and finds the linear generalized additive model and random forest regression to be the most suitable machine learning models for predicting key variables associated with standardized Medicare payments.
|
193
|
A Machine Learning-Based Investigation of Gender-Specific Prognosis of Lung Cancers. Medicina (Kaunas) 2021; 57:medicina57020099. [PMID: 33499377 PMCID: PMC7911834 DOI: 10.3390/medicina57020099] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/30/2020] [Revised: 01/13/2021] [Accepted: 01/15/2021] [Indexed: 01/21/2023]
Abstract
Background and Objective: Primary lung cancer is a lethal and rapidly developing cancer type and is one of the leading causes of cancer deaths. Materials and Methods: Statistical methods such as Cox regression are usually used to detect the prognostic factors of a disease. This study investigated survival prediction using machine learning algorithms. The clinical data of 28,458 patients with primary lung cancers were collected from the Surveillance, Epidemiology, and End Results (SEER) database. Results: This study indicated that the survival rate of women with primary lung cancer was often higher than that of men (p < 0.001). Seven popular machine learning algorithms were utilized to evaluate one-year, three-year, and five-year survival prediction. The two classifiers extreme gradient boosting (XGB) and logistic regression (LR) achieved the best prediction accuracies. The variable importance of the trained XGB models suggested that surgical removal (feature “Surgery”) made the largest contribution to the one-year survival prediction models, while the metastatic status (feature “N” stage) of the regional lymph nodes was the most important contributor to three-year and five-year survival prediction. The female patients’ three-year prognosis model achieved a prediction accuracy of 0.8297 on the independent future samples, while the male model only achieved an accuracy of 0.7329. Conclusions: These data suggested that male patients may have more complicated factors in lung cancer than females, and it is necessary to develop gender-specific diagnosis and prognosis models.
|
194
|
Hu J, Zhao FY, Huang B, Ran J, Chen MY, Liu HL, Deng YS, Zhao X, Han XF. An Eight-CpG-based Methylation Classifier for Preoperative Discriminating Early and Advanced-Late Stage of Colorectal Cancer. Front Genet 2021; 11:614160. [PMID: 33519917 PMCID: PMC7838682 DOI: 10.3389/fgene.2020.614160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2020] [Accepted: 12/14/2020] [Indexed: 11/28/2022] Open
Abstract
Aim: To develop and validate a CpG-based classifier for preoperative discrimination of early and advanced-late stage colorectal cancer (CRC). Methods: We identified an epigenetic signature based on methylation status of multiple CpG sites (CpGs) from 372 subjects in The Cancer Genome Atlas (TCGA) CRC cohort, and an external cohort (GSE48684) with 64 subjects by LASSO regression algorithm. A classifier derived from the methylation signature was used to establish a multivariable logistic regression model to predict the advanced-late stage of CRC. A nomogram was further developed by incorporating the classifier and some independent clinical risk factors, and its performance was evaluated by discrimination and calibration analysis. The prognostic value of the classifier was determined by survival analysis. Furthermore, the diagnostic performance of several CpGs in the methylation signature was evaluated. Results: The eight-CpG-based methylation signature discriminated early stage from advanced-late stage CRC, with a satisfactory AUC of more than 0.700 in both the training and validation sets. This methylation classifier was identified as an independent predictor for CRC staging. The nomogram showed favorable predictive power for preoperative staging, and the C-index reached 0.817 (95% CI: 0.753–0.881) and 0.817 (95% CI: 0.721–0.913) in another training set and validation set respectively, with good calibration. The patients stratified in the high-risk group by the methylation classifier had significantly worse survival outcome than those in the low-risk group. Combination diagnosis utilizing only four of the eight specific CpGs performed well, even in CRC patients with low CEA level or at early stage. Conclusions: Our classifier is a valuable predictive indicator that can supplement established methods for more accurate preoperative staging and also provides prognostic information for CRC patients. Besides, the combination of multiple CpGs has a high value in the diagnosis of CRC.
Affiliation(s)
- Ji Hu
- Department of General Surgery, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
- Fu-Ying Zhao
- Department of Medical Laboratory, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
- Bin Huang
- Department of General Surgery, Daping Hospital, Army Medical University, Chongqing, China
- Jing Ran
- Department of Pathology, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
- Mei-Yuan Chen
- Department of General Surgery, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
- Hai-Lin Liu
- Department of Clinical Pharmacy, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
- You-Song Deng
- Department of General Surgery, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
- Xia Zhao
- Department of Microbiology, Army Medical University, Chongqing, China
- Xiao-Fan Han
- Department of General Surgery, The First People's Hospital of Chongqing Liang Jiang New Area, Chongqing, China
|
195
|
Jing XY, Zhang X, Zhu X, Wu F, You X, Gao Y, Shan S, Yang JY. Multiset Feature Learning for Highly Imbalanced Data Classification. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:139-156. [PMID: 31331881 DOI: 10.1109/tpami.2019.2929166] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 06/10/2023]
Abstract
With the expansion of data, increasingly imbalanced data have emerged. When the imbalance ratio (IR) of data is high, most existing imbalanced learning methods decline seriously in classification performance. In this paper, we systematically investigate the highly imbalanced data classification problem and propose an uncorrelated cost-sensitive multiset learning (UCML) approach for it. Specifically, UCML first constructs multiple balanced subsets through random partition, and then employs multiset feature learning (MFL) to learn discriminant features from the constructed multiset. To enhance the usability of each subset and deal with the non-linearity issue existing in each subset, we further propose a deep metric based UCML (DM-UCML) approach. DM-UCML introduces the generative adversarial network technique into the multiset constructing process, such that each subset can have a distribution similar to that of the original dataset. To cope with the non-linearity issue, DM-UCML integrates deep metric learning with MFL, such that more favorable performance can be achieved. In addition, DM-UCML designs a new discriminant term to enhance the discriminability of learned metrics. Experiments on eight traditional highly class-imbalanced datasets and two large-scale datasets indicate that the proposed approaches outperform state-of-the-art highly imbalanced learning methods and are more robust to high IR.
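One common way to build "multiple balanced subsets through random partition" is to shuffle the majority class, split it into disjoint parts of roughly minority-class size, and pair each part with the full minority class. The sketch below shows only that generic idea. It is an assumption about the construction, not the UCML procedure from the paper, and the function name is hypothetical.

```python
import random


def balanced_subsets(majority, minority, seed=0):
    """Split `majority` into disjoint random parts, each paired with all of `minority`.

    Assumes `minority` is non-empty. Returns a list of (majority_part, minority_copy).
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    # Use as many subsets as needed for each part to be about the minority size.
    n_subsets = max(1, len(shuffled) // len(minority))
    # Deal the shuffled majority samples round-robin into n_subsets disjoint parts.
    parts = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return [(part, list(minority)) for part in parts]
```

With 100 majority and 10 minority samples (an imbalance ratio of 10), this produces 10 subsets of 10 + 10 samples each, and every majority sample appears in exactly one subset.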
|
196
|
Indoor Floor Localization Based on Multi-Intelligent Sensors. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2020. [DOI: 10.3390/ijgi10010006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
Abstract
With the continuous expansion of the indoor localization market, the requirements placed on indoor localization technology are becoming increasingly demanding. Existing indoor floor localization (IFL) systems based on Wi-Fi signal and barometer data are susceptible to external environment changes, resulting in large errors. A method for indoor floor localization using multiple intelligent sensors (MIS-IFL) is proposed to decrease the localization errors. It consists of a fingerprint database construction phase and a floor localization phase. In the fingerprint database construction phase, data acquisition is performed using the magnetometer, accelerometer and gyroscope sensors in the smartphone. In the floor localization phase, an active pattern recognition is performed through the collaborative work of multiple intelligent sensors and machine learning classifiers. Then floor localization is performed using magnetic data mapping, Euclidean closest approximation and the majority principle. Finally, the inter-floor detection link based on machine learning is added to improve the overall localization accuracy of MIS-IFL. The experimental results show that the performance of the proposed method is superior to that of existing IFL systems.
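The combination of "Euclidean closest approximation" and the "majority principle" can be read as matching each sensor reading to its nearest stored fingerprint and then taking a majority vote over the matched floors. The sketch below follows that reading. The fingerprint layout, function names, and the numbers in the usage note are hypothetical and are not taken from the MIS-IFL paper.

```python
import math
from collections import Counter


def nearest_floor(fingerprints, reading):
    """Return the floor whose stored fingerprint vector is closest to `reading`.

    `fingerprints` is a list of (floor, vector) pairs; distance is Euclidean.
    """
    floor, _ = min(fingerprints, key=lambda fp: math.dist(fp[1], reading))
    return floor


def locate_floor(fingerprints, readings):
    """Match every reading to its nearest fingerprint, then take the majority floor."""
    votes = Counter(nearest_floor(fingerprints, r) for r in readings)
    return votes.most_common(1)[0][0]
```

For example, with fingerprints `[(1, (30.0, -5.0, 40.0)), (2, (10.0, 20.0, 35.0)), (3, (-15.0, 8.0, 50.0))]` and readings `[(29, -4, 41), (11, 19, 36), (31, -6, 39)]`, two readings match floor 1 and one matches floor 2, so the vote selects floor 1. Voting over several readings is what makes a single noisy measurement less likely to flip the result.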
|
197
|
Classification of Biodegradable Substances Using Balanced Random Trees and Boosted C5.0 Decision Trees. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:ijerph17249322. [PMID: 33322123 PMCID: PMC7763457 DOI: 10.3390/ijerph17249322] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/18/2020] [Revised: 11/28/2020] [Accepted: 12/11/2020] [Indexed: 12/12/2022]
Abstract
Substances that do not degrade over time have proven to be harmful to the environment and are dangerous to living organisms. Being able to predict the biodegradability of substances without costly experiments is useful. Recently, the quantitative structure-activity relationship (QSAR) models have proposed effective solutions to this problem. However, the molecular descriptor datasets usually suffer from the problems of unbalanced class distribution, which adversely affects the efficiency and generalization of the derived models. Accordingly, this study aims at validating the performances of balanced random trees (RTs) and boosted C5.0 decision trees (DTs) to construct QSAR models to classify the ready biodegradation of substances and their abilities to deal with unbalanced data. The balanced RTs model algorithm builds individual trees using balanced bootstrap samples, while the boosted C5.0 DT is modeled using cost-sensitive learning. We employed the two-dimensional molecular descriptor dataset, which is publicly available through the University of California, Irvine (UCI) machine learning repository. The molecular descriptors were ranked according to their contributions to the balanced RTs classification process. The performance of the proposed models was compared with previously reported results. Based on the statistical measures, the experimental results showed that the proposed models outperform the classification results of the support vector machine (SVM), K-nearest neighbors (KNN), and discrimination analysis (DA). Classification measures were analyzed in terms of accuracy, sensitivity, specificity, precision, false positive rate, false negative rate, F1 score, receiver operating characteristic (ROC) curve, and area under the ROC curve (AUROC).
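Most of the classification measures listed at the end of this abstract follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the counts in the usage note are hypothetical, and no values from the paper are reproduced.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification measures from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes present and predicted).
    """
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)    # positive predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "false_negative_rate": fn / (fn + tp),  # equals 1 - sensitivity
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }
```

On unbalanced data such as the molecular descriptor dataset described above, accuracy alone can look good while sensitivity or specificity is poor, which is why the per-class rates are reported alongside it. For instance, `binary_metrics(tp=40, fp=10, tn=45, fn=5)` gives an accuracy of 0.85 and a precision of 0.8.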
|
198
|
An Improved Intrusion Detection System Based on KNN Hyperparameter Tuning and Cross-Validation. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2020. [DOI: 10.1007/s13369-020-04907-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/23/2022]
|
199
|
Convolutional Sparse Coded Dynamic Brain Functional Connectivity. Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10295-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
200
|
Singh D, Singh B. Investigating the impact of data normalization on classification performance. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2019.105524] [Citation(s) in RCA: 211] [Impact Index Per Article: 42.2] [Indexed: 11/17/2022]
|