101
Lee SK, Shin JH, Ahn J, Lee JY, Jang DE. Identifying the Risk Factors Associated with Nursing Home Residents' Pressure Ulcers Using Machine Learning Methods. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:ijerph18062954. [PMID: 33805798 PMCID: PMC8001016 DOI: 10.3390/ijerph18062954] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/16/2021] [Revised: 03/08/2021] [Accepted: 03/09/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Machine learning (ML) can keep improving predictions and generating automated knowledge via data-driven predictors or decisions. OBJECTIVE The purpose of this study was to compare different ML methods, including random forest, logistic regression, linear support vector machine (SVM), polynomial SVM, radial SVM, and sigmoid SVM, in terms of their accuracy, sensitivity, specificity, negative predictive values, and positive predictive values by validating real datasets to predict factors for pressure ulcers (PUs). METHODS We applied representative ML algorithms (random forest, logistic regression, linear SVM, polynomial SVM, radial SVM, and sigmoid SVM) to develop a prediction model (N = 60). RESULTS The random forest model showed the greatest accuracy (0.814), followed by logistic regression (0.782), polynomial SVM (0.779), radial SVM (0.770), linear SVM (0.767), and sigmoid SVM (0.674). CONCLUSIONS The random forest model showed the greatest accuracy for predicting PUs in nursing homes (NHs). Diverse factors that predict PUs in NHs, including NH characteristics and residents' characteristics, were identified according to diverse ML methods. These factors should be considered to decrease PUs in NH residents.
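The five measures compared in this abstract (accuracy, sensitivity, specificity, positive and negative predictive value) all derive from one binary confusion matrix. The following is a minimal sketch of those definitions, not the authors' code; the `ConfusionCounts` type, the `diagnostic_metrics` function, and the counts in the example are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary confusion matrix (illustrative container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard diagnostic measures.

    Assumes every denominator is non-zero, i.e. each class and each
    predicted label occurs at least once.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = diagnostic_metrics(ConfusionCounts(tp=40, fp=10, tn=45, fn=5))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Reporting all five together matters when classes are imbalanced, since accuracy alone can stay high while sensitivity is poor.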
Affiliation(s)
- Soo-Kyoung Lee
  - College of Nursing, Keimyung University, 1095 Dalgubeol-daero, Dalseo-gu, Daegu 42601, Korea
- Juh Hyun Shin
  - College of Nursing, Ewha Womans University, Science & Ewha Research Institute of Nursing Science, Seoul 120750, Korea
  - Corresponding author
- Jinhyun Ahn
  - Department of Management Information Systems, Jeju National University, Jeju 63243, Korea
- Ji Yeon Lee
  - College of Nursing, Catholic University of Pusan, Busan 46252, Korea
- Dong Eun Jang
  - School of Nursing, University of Texas at Austin, Austin, TX 78712, USA
102
miRNA and mRNA expression profiling reveals potential biomarkers for metastatic cutaneous melanoma. Expert Rev Anticancer Ther 2021; 21:557-567. [PMID: 33504224 DOI: 10.1080/14737140.2021.1882860] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/25/2022]
Abstract
Purpose: This study aims to uncover potential biomarkers associated with cutaneous melanoma (CM) metastasis. Methods: The mRNA and microRNA (miRNA) expression data from the metastatic CM and non-metastatic CM population were obtained from The Cancer Genome Atlas database. Functional analysis, protein-protein interaction (PPI), and survival analysis were performed for differentially expressed mRNAs (DEmRNAs) and miRNAs (DEmiRNAs). The interaction between DEmRNAs and DEmiRNAs was analyzed. The expression of several key DEmRNAs and DEmiRNAs was validated by Gene Expression Omnibus datasets. Results: Overall, 1172 DEmRNAs and 26 DEmiRNAs were identified from metastatic and non-metastatic CM. Cytokine-cytokine receptor interaction and chemokine signaling pathway were key pathways. CXCR1, CXCR2, CXCR4, CCR1, CCR2, and CCR5 were hub genes in the PPI network. Among the DEmiRNAs, miR-29c-3p, miR-100-5p, miR-150-5p, and miR-150-3p were not only diagnostic biomarkers but also related to survival time. miR-203a-3p interacted with CCR5 and LIFR, while miR-224-5p was strongly associated with CXCR4. LIFR, CXCR1, CXCR2, CXCR4, CCR1, CCR2, and CCR5 were enriched in the cytokine-cytokine receptor interaction pathway. The levels of seven DEmRNAs (CXCR1, CXCR2, CXCR4, CCR1, CCR2, CCR5, and LIFR) and two DEmiRNAs (miR-203a-3p and miR-224-5p) were validated using the GSE65568 and GSE109244 datasets, respectively. Conclusion: Our findings may provide novel biomarkers for CM metastasis.
103
Jayanthi S, Rene Robin CR. Analysis of Microarray Data by Empirical Wavelet Transform for Cancer Classification Using Block by Block Method. JOURNAL OF MEDICAL IMAGING AND HEALTH INFORMATICS 2021. [DOI: 10.1166/jmihi.2021.3318] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
Abstract
In this study, DNA microarray data are analyzed from a signal processing perspective for cancer classification. An adaptive wavelet transform, the Empirical Wavelet Transform (EWT), is applied in a block-by-block procedure to characterize microarray data. The EWT wavelet basis depends on the input data rather than being predetermined as in conventional wavelets. Thus, EWT gives sparser representations than conventional wavelets. The microarray data are characterized block by block with predefined block sizes in powers of 2, from 128 to 2048. After characterization, a statistical hypothesis test is employed to select the informative EWT coefficients. Only the selected coefficients are used for Microarray Data Classification (MDC) by the Support Vector Machine (SVM). Computational experiments are conducted on five microarray datasets (colon, breast, leukemia, CNS, and ovarian) to test the developed cancer classification system. The results demonstrate that EWT coefficients with SVM are an effective approach, with no misclassification for the MDC system.
Affiliation(s)
- S. Jayanthi
  - Research Scholar, Anna University, 600025, Tamilnadu, India; Department of Computer Science and Engineering, Agni College of Technology, 600130, Tamilnadu, India
- C. R. Rene Robin
  - Department of Computer Science and Engineering, Jerusalem College of Engineering, 600100, Tamilnadu, India
104
Yao Y, Zhao X, Ning Q, Zhou J. ABC-Gly: Identifying Protein Lysine Glycation Sites with Artificial Bee Colony Algorithm. CURR PROTEOMICS 2021. [DOI: 10.2174/1570164617666191227120136] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
Background: Glycation is a nonenzymatic post-translational modification in which a sugar molecule attaches to a protein or lipid molecule. It may impair the function and change the characteristics of proteins, which may lead to some metabolic diseases. To understand the underlying molecular mechanisms of glycation, computational prediction methods have been developed because of their convenience and high speed. However, developing a more effective computational tool remains a challenging task in computational biology.
Methods: In this study, we present an accurate identification tool named ABC-Gly for predicting lysine glycation sites. First, we used three informative features, namely position-specific amino acid propensity, secondary structure, and the composition of k-spaced amino acid pairs, to encode the peptides. Moreover, to exploit discriminative features sufficiently and thereby improve the prediction and generalization ability of the model, we developed a two-step feature selection that combines the Fisher score with an improved binary artificial bee colony algorithm based on the support vector machine. Finally, based on the optimal feature subset, we constructed a model using the Support Vector Machine on the training dataset.
Results: The proposed predictor ABC-Gly achieved a sensitivity of 76.43%, a specificity of 91.10%, a balanced accuracy of 83.76%, an Area Under the receiver operating characteristic Curve (AUC) of 0.9313, and a Matthews Correlation Coefficient (MCC) of 0.6861 by 10-fold cross-validation on the training dataset, and a balanced accuracy of 59.05% on the independent dataset. Compared with the state-of-the-art predictors on the training dataset, the proposed predictor achieved significant improvements of 0.156 in AUC and 0.336 in MCC.
Conclusion: The detailed analysis results indicate that our predictor may serve as a powerful complementary tool to other existing methods for predicting protein lysine glycation. The source code and datasets of ABC-Gly are provided in Supplementary File 1.
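The first step of the two-step feature selection described above is a Fisher score filter. The sketch below uses one common definition of the Fisher score (between-class scatter divided by within-class scatter, computed per feature); the abstract does not state which variant ABC-Gly uses, so this form, the `fisher_score` function name, and the toy data are all assumptions for illustration.

```python
from statistics import fmean


def fisher_score(values, labels):
    """Fisher score of one feature: between-class over within-class scatter.

    `values` holds the feature for each sample and `labels` the class of
    each sample. Population (biased) variance is used within each class.
    """
    overall_mean = fmean(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        mu = fmean(group)
        var = fmean([(v - mu) ** 2 for v in group])
        between += len(group) * (mu - overall_mean) ** 2
        within += len(group) * var
    # A feature that is constant within every class separates perfectly.
    return between / within if within > 0 else float("inf")


# Toy data (hypothetical): the first feature separates the two classes,
# the second has the same mean in both classes.
labels = [0, 0, 0, 1, 1, 1]
informative = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
uninformative = [5.0, 1.0, 9.0, 5.0, 1.0, 9.0]

score_informative = fisher_score(informative, labels)
score_uninformative = fisher_score(uninformative, labels)
print(score_informative, score_uninformative)
```

In a filter step of this kind, features would be ranked by score and only the top-ranked ones passed on to the second, wrapper-based stage.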
Affiliation(s)
- Yanqiu Yao
  - College of Computer Science and Technology, Changchun Normal University, Changchun, 130032, China
- Xiaosa Zhao
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, China
- Qiao Ning
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, China
- Junping Zhou
  - School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, China
105
Machicao J, Craighero F, Maspero D, Angaroni F, Damiani C, Graudenzi A, Antoniotti M, Bruno OM. On the Use of Topological Features of Metabolic Networks for the Classification of Cancer Samples. Curr Genomics 2021; 22:88-97. [PMID: 34220296 PMCID: PMC8188584 DOI: 10.2174/1389202922666210301084151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2020] [Revised: 12/16/2020] [Accepted: 12/18/2020] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND The increasing availability of omics data collected from patients affected by severe pathologies, such as cancer, is fostering the development of data science methods for their analysis. INTRODUCTION The combination of data integration and machine learning approaches can provide new powerful instruments to tackle the complexity of cancer development and deliver effective diagnostic and prognostic strategies. METHODS We explore the possibility of exploiting the topological properties of sample-specific metabolic networks as features in a supervised classification task. Such networks are obtained by projecting transcriptomic data from RNA-seq experiments on genome-wide metabolic models to define weighted networks modeling the overall metabolic activity of a given sample. RESULTS We show the classification results on a labeled breast cancer dataset from the TCGA database, including 210 samples (cancer vs. normal). In particular, we investigate how the performance is affected by a threshold-based pruning of the networks by comparing Artificial Neural Networks, Support Vector Machines and Random Forests. Interestingly, the best classification performance is achieved within a small threshold range for all methods, suggesting that it might represent an effective choice to recover useful information while filtering out noise from data. Overall, the best accuracy is achieved with SVMs, which exhibit performances similar to those obtained when gene expression profiles are used as features. CONCLUSION These findings demonstrate that the topological properties of sample-specific metabolic networks are effective in classifying cancer and normal samples, suggesting that useful information can be extracted from a relatively limited number of features.
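The threshold-based pruning mentioned in this abstract can be illustrated with a small weighted graph. This is only a sketch under stated assumptions: the abstract does not give the pruning rule or the topological features the authors used, so keeping edges with weight at or above a threshold, and using node degree and node strength as features, are illustrative choices. The function names and the toy network are hypothetical.

```python
def prune_edges(edges, threshold):
    """Keep only the edges whose weight is at least `threshold`."""
    return {pair: w for pair, w in edges.items() if w >= threshold}


def node_features(edges, nodes):
    """Return (degree, strength) dictionaries for an undirected weighted graph.

    Degree counts incident edges; strength sums their weights.
    """
    degree = {n: 0 for n in nodes}
    strength = {n: 0.0 for n in nodes}
    for (u, v), w in edges.items():
        for n in (u, v):
            degree[n] += 1
            strength[n] += w
    return degree, strength


# Hypothetical sample-specific network: nodes stand for metabolites,
# weights for an activity level derived from expression data.
nodes = ["A", "B", "C", "D"]
edges = {("A", "B"): 0.9, ("B", "C"): 0.2, ("C", "D"): 0.6, ("A", "D"): 0.05}

pruned = prune_edges(edges, threshold=0.5)
degree, strength = node_features(pruned, nodes)
print(sorted(pruned))  # edges that survive the threshold
print(degree, strength)
```

Concatenating such per-node values across a fixed node ordering would give one feature vector per sample, which is the kind of input a supervised classifier expects.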
Affiliation(s)
- Jeaneth Machicao
  - Address correspondence to these authors at the São Carlos Institute of Physics, University of São Paulo, São Carlos, Brazil; Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche (IBFM-CNR), Segrate, Milan, Italy
- Alex Graudenzi
  - Address correspondence to these authors at the São Carlos Institute of Physics, University of São Paulo, São Carlos, Brazil; Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche (IBFM-CNR), Segrate, Milan, Italy
- Odemir M. Bruno
  - Address correspondence to these authors at the São Carlos Institute of Physics, University of São Paulo, São Carlos, Brazil; Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche (IBFM-CNR), Segrate, Milan, Italy
106
Lambrou GI, Adamaki M, Hatziagapiou K, Vlahopoulos S. Gene Expression and Resistance to Glucocorticoid-Induced Apoptosis in Acute Lymphoblastic Leukemia: A Brief Review and Update. Curr Drug Res Rev 2021; 12:131-149. [PMID: 32077838 DOI: 10.2174/2589977512666200220122650] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/02/2019] [Revised: 12/29/2019] [Accepted: 01/23/2020] [Indexed: 01/18/2023]
Abstract
BACKGROUND Resistance to glucocorticoid (GC)-induced apoptosis in Acute Lymphoblastic Leukemia (ALL) is considered one of the major prognostic factors for the disease. Prednisolone is a corticosteroid and one of the most important agents in the treatment of acute lymphoblastic leukemia. The mechanics of GC resistance are largely unknown and intense ongoing research focuses on this topic. AIM The aim of the present study is to review some aspects of GC resistance in ALL, and in particular of Prednisolone, with emphasis on previous and present knowledge on gene expression and signaling pathways playing a role in the phenomenon. METHODS An electronic literature search covering 1994 to June 2019 was conducted by the authors. Original articles and systematic reviews were selected, the titles and abstracts of papers were screened to determine whether they met the eligibility criteria, and full texts of the selected articles were retrieved. RESULTS Identification of gene targets responsible for glucocorticoid resistance may allow discovery of drugs which, in combination with glucocorticoids, may increase the effectiveness of anti-leukemia therapies. The inherent plasticity of clinically evolving cancer justifies approaches to characterize and prevent undesirable activation of early oncogenic pathways. CONCLUSION Study of the pattern of intracellular signal pathway activation by anticancer drugs can lead to development of efficient treatment strategies by reducing detrimental secondary effects.
Affiliation(s)
- George I Lambrou
  - First Department of Pediatrics, National and Kapodistrian University of Athens, Choremeio Research Laboratory, Athens, Greece
- Maria Adamaki
  - First Department of Pediatrics, National and Kapodistrian University of Athens, Choremeio Research Laboratory, Athens, Greece
- Kyriaki Hatziagapiou
  - First Department of Pediatrics, National and Kapodistrian University of Athens, Choremeio Research Laboratory, Athens, Greece
- Spiros Vlahopoulos
  - First Department of Pediatrics, National and Kapodistrian University of Athens, Choremeio Research Laboratory, Athens, Greece
107
Kjellman M, Knigge U, Welin S, Thiis-Evensen E, Gronbaek H, Schalin-Jäntti C, Sorbye H, Joergensen MT, Johanson V, Metso S, Waldum H, Søreide JA, Ebeling T, Lindberg F, Landerholm K, Wallin G, Salem F, Schneider MDP, Belusa R. A Plasma Protein Biomarker Strategy for Detection of Small Intestinal Neuroendocrine Tumors. Neuroendocrinology 2021; 111:840-849. [PMID: 32721955 PMCID: PMC8686712 DOI: 10.1159/000510483] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/27/2020] [Accepted: 07/27/2020] [Indexed: 12/27/2022]
Abstract
BACKGROUND Small intestinal neuroendocrine tumors (SI-NETs) are difficult to diagnose in the early stage of disease. Current blood biomarkers such as chromogranin A (CgA) and 5-hydroxyindoleacetic acid have low sensitivity (SEN) and specificity (SPE). This is a first preplanned interim analysis (Nordic non-interventional, prospective, exploratory, EXPLAIN study [NCT02630654]). Its objective is to investigate if a plasma protein multi-biomarker strategy can improve diagnostic accuracy (ACC) in SI-NETs. METHODS At the time of diagnosis, before any disease-specific treatment was initiated, blood was collected from patients with advanced SI-NETs. Ninety-two putative cancer-related plasma proteins from 135 patients were analyzed and compared with the results of age- and sex-matched controls (n = 143), using multiplex proximity extension assay and machine learning techniques. RESULTS Using a random forest model including 12 top ranked plasma proteins in patients with SI-NETs, the multi-biomarker strategy showed SEN and SPE of 89 and 91%, respectively, with negative predictive value (NPV) and positive predictive value (PPV) of 90 and 91%, respectively, to identify patients with regional or metastatic disease with an area under the receiver operator characteristic curve (AUROC) of 99%. In 30 patients with normal CgA concentrations, the model provided a diagnostic SPE of 98%, SEN of 56%, NPV of 90%, PPV of 90%, and AUROC of 97%, regardless of proton pump inhibitor intake. CONCLUSION This interim analysis demonstrates that a multi-biomarker/machine learning strategy improves diagnostic ACC of patients with SI-NET at the time of diagnosis, especially in patients with normal CgA levels. The results indicate that this multi-biomarker strategy can be useful for early detection of SI-NETs at presentation and conceivably detect recurrence after radical primary resection.
Affiliation(s)
- Magnus Kjellman
  - Endocrine Surgery Unit, Karolinska Hospital, Stockholm, Sweden
- Ulrich Knigge
  - Department of Endocrinology and Gastrointestinal Surgery, ENETS Neuroendocrine Tumor Centre of Excellence, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Staffan Welin
  - Department of Endocrine Oncology, ENETS Neuroendocrine Tumor Centre of Excellence, Uppsala University Hospital, Uppsala, Sweden
- Espen Thiis-Evensen
  - Department of Gastroenterology, ENETS Neuroendocrine Tumor Centre of Excellence, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Henning Gronbaek
  - Department of Hepatology and Gastroenterology, ENETS Neuroendocrine Tumor Centre of Excellence, Aarhus University Hospital, Aarhus, Denmark
- Camilla Schalin-Jäntti
  - Department of Endocrinology, Abdominal Center, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Halfdan Sorbye
  - Department of Oncology and Department of Clinical Science, Haukeland University Hospital, Bergen, Norway
- Viktor Johanson
  - Department of Surgery, Institute of Clinical Sciences at the Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Saara Metso
  - Unit of Endocrinology, Department of Internal Medicine, Tampere University Hospital, Teiskontie Tampere, Tampere, Finland
- Jon Arne Søreide
  - Department of Gastrointestinal Surgery, Stavanger University Hospital, Stavanger, Norway
- Tapani Ebeling
  - Faculty of Medicine, University of Oulu, Finland and Division of Endocrinology, Oulu University Hospital, Oulu, Finland
- Fredrik Lindberg
  - Department of Surgery, Norrland University Hospital, Umeå, Sweden
- Kalle Landerholm
  - Department of Clinical and Experimental Medicine, Linköping University and Department of Surgery, Ryhov County Hospital, Jönköping, Sweden
- Goran Wallin
  - Faculty of Medicine and Health, Örebro University Hospital, Örebro, Sweden
- Farhad Salem
  - Skånes University Hospital, Unit for Endocrine and Sarcoma Surgery, Lund, Sweden
108
Wang Y, Ding Y, Tang J, Dai Y, Guo F. CrystalM: A Multi-View Fusion Approach for Protein Crystallization Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:325-335. [PMID: 31027046 DOI: 10.1109/tcbb.2019.2912173] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 06/09/2023]
Abstract
Improving the accuracy of protein crystallization prediction is very important for protein crystallization projects, because crystallization is a critical step in determining protein structure by X-ray crystallography. At present, many machine learning methods are used to predict protein crystallization. Here, we use a novel feature combination to construct an SVM model for the prediction of protein crystallization, called CrystalM. In this work, we extract six features to represent protein sequences, namely Average Block-Position specific scoring matrix (AVBlock-PSSM), Average Block-Secondary Structure (AVBlock-SS), Global Encoding (GE), Pseudo-Position specific scoring matrix (PsePSSM), Protscale, and Discrete Wavelet Transform-Position specific scoring matrix (DWT-PSSM). Moreover, we employ two training datasets (TRAIN3587 and TRAIN1500) and their corresponding independent test datasets (TEST3585 and TEST500) to evaluate CrystalM by feeding multi-view features into a Support Vector Machine (SVM) classifier. The two training datasets are used for five-fold cross validation, and the two test datasets are used separately for independent testing of the corresponding models. Finally, we compare the performance of CrystalM with other existing methods. For the datasets TRAIN3587 and TEST3585, CrystalM achieves the best Accuracy (ACC), the best Specificity (SP), and the same Matthews correlation coefficient (MCC) as the previous best-performing methods in the five-fold cross validation. In particular, ACC, SP, and MCC surpass the existing methods in the independent test, which supports the effectiveness of CrystalM. Meanwhile, ACC, SP, and MCC are higher than those of existing methods in the five-fold cross validation for TRAIN1500. Although the performance in the independent test on TEST500 is not the best, CrystalM still shows a certain predictive ability for protein crystallization.
In addition, we find that choosing only the first four features can improve the prediction performance for TRAIN1500 and TEST500, not only in independent tests but also in five-fold cross validation. This phenomenon indicates that the latter two features cannot effectively represent the proteins of TRAIN1500 and TEST500. CrystalM is a sequence-based protein crystallization prediction method. The good performance on the datasets supports the effectiveness of CrystalM, and the better performance on large datasets further demonstrates the stability and superiority of CrystalM.
109
Making the Third Dimension (3D) Explicit in Hedonic Price Modelling: A Case Study of Xi’an, China. LAND 2020. [DOI: 10.3390/land10010024] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
Recent rapid population growth and increasing urbanisation have led to fast vertical developments in urban areas. Therefore, in the context of the dynamic property market, factors related to the third dimension (3D) need to be considered. Current hedonic price modelling (HPM) studies have little explicit consideration for the third dimension, which may have a significant influence on modelling property values in complex urban environments. Therefore, our research aims to narrow the cognitive gap of the missing third dimension by assessing both 2D and 3D HPM and identifying important 3D factors for spatial analysis and visualisation in the selected study area, Xi’an, China. The statistical methods we used for 2D HPM are ordinary least squares (OLS) and geographically weighted regression (GWR). In 2D HPM, they both have very low R² (0.111 in OLS and 0.217 in GWR), showing a very limited generalisation potential. However, a significant improvement is observed when adding 3D factors, namely view quality, sky view factor (SVF), sunlight and property orientation. The higher R² obtained (0.414) shows the importance of the third dimension, or 3D factors, for HPM. Our findings demonstrate the necessity to include such factors in HPM and to develop 3D models with a higher level of detail (LoD) to serve more purposes such as fair property taxation.
110
Rehman AU, Qureshi SA. A review of the medical hyperspectral imaging systems and unmixing algorithms' in biological tissues. Photodiagnosis Photodyn Ther 2020; 33:102165. [PMID: 33383204 DOI: 10.1016/j.pdpdt.2020.102165] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 09/28/2020] [Revised: 12/18/2020] [Accepted: 12/21/2020] [Indexed: 01/27/2023]
Abstract
Hyperspectral fluorescence imaging (HFI) is a well-known technique in the medical research field and is considered a non-invasive tool for tissue diagnosis. This review article gives a brief introduction to acquisition methods, including the image preprocessing methods, feature selection and extraction methods, data classification techniques and medical image analysis along with recent relevant references. The process of fusion of unsupervised unmixing techniques with other classification methods, like the combination of support vector machine with an artificial neural network, the latest snapshot Hyperspectral imaging (HSI) and vortex analysis techniques are also outlined. Finally, the recent applications of hyperspectral images in cellular differentiation of various types of cancer are discussed.
Affiliation(s)
- Aziz Ul Rehman
  - Agri & Biophotonics Division, National Institute of Lasers and Optronics College, PIEAS, 45650, Islamabad, Pakistan; Department of Physics and Astronomy Macquarie University, Sydney, 2109, New South Wales, Australia
- Shahzad Ahmad Qureshi
  - Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, 45650, Pakistan
111
Jin Q, Meng Z, Sun C, Cui H, Su R. RA-UNet: A Hybrid Deep Attention-Aware Network to Extract Liver and Tumor in CT Scans. Front Bioeng Biotechnol 2020; 8:605132. [PMID: 33425871 PMCID: PMC7785874 DOI: 10.3389/fbioe.2020.605132] [Citation(s) in RCA: 116] [Impact Index Per Article: 23.2] [Received: 09/11/2020] [Accepted: 12/01/2020] [Indexed: 02/01/2023] Open
Abstract
Automatic extraction of liver and tumor from CT volumes is a challenging task due to their heterogeneous and diffusive shapes. Recently, 2D deep convolutional neural networks have become popular in medical image segmentation tasks because of the utilization of large labeled datasets to learn hierarchical features. However, few studies investigate 3D networks for liver tumor segmentation. In this paper, we propose a 3D hybrid residual attention-aware segmentation method, i.e., RA-UNet, to precisely extract the liver region and segment tumors from the liver. The proposed network uses U-Net as its basic architecture, which extracts contextual information by combining low-level feature maps with high-level ones. Attention residual modules are integrated so that the attention-aware features change adaptively. This is the first work in which an attention residual mechanism is used to segment tumors from 3D medical volumetric images. We evaluated our framework on the public MICCAI 2017 Liver Tumor Segmentation dataset and tested the generalization on the 3DIRCADb dataset. The experiments show that our architecture obtains competitive results.
Affiliation(s)
- Qiangguo Jin
  - School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - CSIRO Data61, Sydney, NSW, Australia
- Zhaopeng Meng
  - School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - Tianjin University of Traditional Chinese Medicine, Tianjin, China
- Hui Cui
  - Department of Computer Science and Information Technology, La Trobe University, Melbourne, VIC, Australia
- Ran Su
  - School of Computer Software, College of Intelligence and Computing, Tianjin University, Tianjin, China
112
Chen WF, Ou HY, Liu KH, Li ZY, Liao CC, Wang SY, Huang W, Cheng YF, Pan CT. In-Series U-Net Network to 3D Tumor Image Reconstruction for Liver Hepatocellular Carcinoma Recognition. Diagnostics (Basel) 2020; 11:11. [PMID: 33374672 PMCID: PMC7822491 DOI: 10.3390/diagnostics11010011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/10/2020] [Revised: 12/16/2020] [Accepted: 12/20/2020] [Indexed: 12/27/2022] Open
Abstract
Cancer is a common disease. Quantitative biomarkers extracted from standard-of-care computed tomography (CT) scans can create a robust clinical decision tool for the diagnosis of hepatocellular carcinoma (HCC). Current clinical methods usually require a high expenditure of time and resources. To improve the current clinical diagnosis and therapeutic procedure, this paper proposes a deep learning-based approach, called Successive Encoder-Decoder (SED), to assist in the automatic interpretation of liver lesion/tumor segmentation through CT images. The SED framework consists of two different encoder-decoder networks connected in series. The first network aims to remove unwanted voxels and organs and to extract liver locations from CT images. The second network uses the results of the first network to further segment the lesions. For practical purposes, the predicted lesions on individual CTs were extracted and reconstructed on 3D images. The experiments conducted on 4300 CT images and the LiTS dataset demonstrate that the proposed SED method achieved Dice scores of 0.92 for liver segmentation and 0.75 for tumor prediction.
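The Dice score reported above measures the overlap between a predicted mask and a reference mask: twice the size of the intersection divided by the sum of the two mask sizes. A minimal sketch on flattened binary masks follows; the `dice_score` name and the toy masks are hypothetical, and returning 1.0 when both masks are empty is one common convention rather than something stated in the abstract.

```python
def dice_score(pred, truth):
    """Dice coefficient between two equal-length binary masks.

    Masks are flat sequences of 0/1 (or False/True) voxel labels.
    Returns 1.0 when both masks are empty (a convention, assumed here).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


# Toy masks (hypothetical): 3 predicted voxels, 3 reference voxels, 2 shared.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
score = dice_score(predicted, reference)
print(f"Dice = {score:.3f}")
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no voxels, which is why a smaller, more irregular structure such as a tumor typically scores lower than the whole liver.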
Collapse
Affiliation(s)
- Wen-Fan Chen
- Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung 80424, Taiwan;
| | - Hsin-You Ou
- Liver Transplantation Program and Departments of Diagnostic Radiology, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 833401, Taiwan; (H.-Y.O.); (C.-C.L.)
| | - Keng-Hao Liu
- Department of Mechanical and Electro-Mechanical Engineering, National SunYat-sen University, Kaohsiung 80424, Taiwan; (K.-H.L.); (Z.-Y.L.); (S.-Y.W.); (W.H.)
| | - Zhi-Yun Li
- Department of Mechanical and Electro-Mechanical Engineering, National SunYat-sen University, Kaohsiung 80424, Taiwan; (K.-H.L.); (Z.-Y.L.); (S.-Y.W.); (W.H.)
| | - Chien-Chang Liao
- Liver Transplantation Program and Departments of Diagnostic Radiology, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 833401, Taiwan; (H.-Y.O.); (C.-C.L.)
| | - Shao-Yu Wang
- Department of Mechanical and Electro-Mechanical Engineering, National SunYat-sen University, Kaohsiung 80424, Taiwan; (K.-H.L.); (Z.-Y.L.); (S.-Y.W.); (W.H.)
| | - Wen Huang
- Department of Mechanical and Electro-Mechanical Engineering, National SunYat-sen University, Kaohsiung 80424, Taiwan; (K.-H.L.); (Z.-Y.L.); (S.-Y.W.); (W.H.)
| | - Yu-Fan Cheng
- Liver Transplantation Program and Departments of Diagnostic Radiology, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 833401, Taiwan; (H.-Y.O.); (C.-C.L.)
| | - Cheng-Tang Pan
- Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung 80424, Taiwan;
- Department of Mechanical and Electro-Mechanical Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan; (K.-H.L.); (Z.-Y.L.); (S.-Y.W.); (W.H.)
| |
|
113
|
LogSum + L2 penalized logistic regression model for biomarker selection and cancer classification. Sci Rep 2020; 10:22125. [PMID: 33335163 PMCID: PMC7747646 DOI: 10.1038/s41598-020-79028-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/21/2020] [Accepted: 11/25/2020] [Indexed: 12/11/2022] Open
Abstract
Biomarker selection and cancer classification play an important role in knowledge discovery using genomic data. Successful identification of gene biomarkers and biological pathways can significantly improve the accuracy of diagnosis and help machine learning models perform better when classifying different types of cancer. In this paper, we propose a LogSum + L2 penalized logistic regression model and use a coordinate descent algorithm to solve it. The results of simulations and real experiments indicate that the proposed method is highly competitive with several state-of-the-art methods. Our proposed model achieves excellent performance in group feature selection and classification problems.
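To make the idea of a combined penalty concrete, the sketch below evaluates a logistic loss with a log-sum term and an L2 term added. The abstract does not state the exact objective, so the penalty form `lam1 * sum(log(|b| + eps)) + lam2 * sum(b^2)`, the function name `penalized_loss`, and all parameter values are assumptions, and the paper's formulation may differ. The sketch only evaluates the objective; it does not implement the coordinate descent solver.

```python
import math


def penalized_loss(beta, X, y, lam1=0.1, lam2=0.1, eps=1e-2):
    """Mean logistic loss plus an assumed log-sum penalty and an L2 penalty."""
    nll = 0.0
    for row, label in zip(X, y):
        z = sum(b * x for b, x in zip(beta, row))
        # log(1 + exp(z)) - label * z, written in a numerically stable form.
        nll += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - label * z
    nll /= len(y)
    # Assumed log-sum term: rewards coefficients that are exactly or nearly zero.
    log_sum = sum(math.log(abs(b) + eps) for b in beta)
    # L2 (ridge) term: discourages large coefficients.
    ridge = sum(b * b for b in beta)
    return nll + lam1 * log_sum + lam2 * ridge


# Hypothetical toy data: three samples, two features.
X = [[1.0, 0.5], [0.2, 1.5], [1.2, 0.1]]
y = [1, 0, 1]
print(penalized_loss([0.8, -0.6], X, y))
```

In a sketch like this, the log-sum term grows slowly for large coefficients and drops sharply near zero, which is the usual motivation for using it as a sparsity-inducing alternative to the L1 penalty.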
|
114
|
Wu S, Tseng IC, Huang WC, Su CW, Lai YH, Lin C, Lee AYL, Kuo CY, Su LY, Lee MC, Hsu TC, Yu CH. Establishment of an Immunocompetent Metastasis Rat Model with Hepatocyte Cancer Stem Cells. Cancers (Basel) 2020; 12:cancers12123721. [PMID: 33322441 PMCID: PMC7764036 DOI: 10.3390/cancers12123721] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/22/2020] [Revised: 11/30/2020] [Accepted: 12/09/2020] [Indexed: 12/12/2022] Open
Abstract
Hepatocellular carcinoma (HCC) is one of the leading causes of cancer mortality. Cancer stem cells (CSCs) are responsible for the maintenance, metastasis, and relapse of various tumors. However, the effects of CSCs on the tumorigenesis of HCC are still not fully understood. We have recently established two new rat HCC cell lines, HTC and TW-1, which we isolated from diethylnitrosamine-induced rat liver cancer. Results showed that TW-1 expressed the genetic markers of CSCs, including CD133, GSTP1, CD44, CD90, and EpCAM. Moreover, TW-1 showed higher tolerance to sorafenib than HTC did. In addition, tumorigenesis and metastasis were observed in nude mice and wild-type rats with TW-1 xenografts. Finally, we combined highly expressed genes in TW-1/HTC with well-known biomarkers from recent HCC studies to predict HCC-related biomarkers and were able to identify HCC with AUCs > 0.9 using machine learning. These results indicate that TW-1 is a novel rat CSC line and that the mouse and rat models we established with TW-1 have great potential for future HCC studies.
Affiliation(s)
- Semon Wu
- Department of Life Science, Chinese Culture University, Taipei 11114, Taiwan;
- Correspondence: (S.W.); (C.-H.Y.); Tel.: +886-2-2861-0511(ext. 26234) (S.W.); +886-2-66289779 (C.-H.Y.); Fax: +886-2-2862-3724 (S.W.); +886-2-66289009 (C.-H.Y.)
| | - I-Chieh Tseng
- Department of Life Science, Chinese Culture University, Taipei 11114, Taiwan;
| | - Wen-Cheng Huang
- License Biotech, Co., Ltd., Taipei 10690, Taiwan; (W.-C.H.); (C.-W.S.)
| | - Cheng-Wen Su
- License Biotech, Co., Ltd., Taipei 10690, Taiwan; (W.-C.H.); (C.-W.S.)
| | - Yu-Heng Lai
- Department of Chemistry, Chinese Culture University, Taipei 11114, Taiwan;
| | - Che Lin
- Department of Electrical Engineering and Graduate Institute of Communication Engineering, National Taiwan University, Taipei 10617, Taiwan;
| | - Alan Yueh-Luen Lee
- National Institute of Cancer Research, National Health Research Institutes, Zhunan, Miaoli 35053, Taiwan;
| | - Chan-Yen Kuo
- Department of Research, Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taipei 23142, Taiwan; (C.-Y.K.); (L.-Y.S.); (M.-C.L.)
| | - Li-Yu Su
- Department of Research, Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taipei 23142, Taiwan; (C.-Y.K.); (L.-Y.S.); (M.-C.L.)
| | - Ming-Cheng Lee
- Department of Research, Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taipei 23142, Taiwan; (C.-Y.K.); (L.-Y.S.); (M.-C.L.)
| | - Te-Cheng Hsu
- Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taipei 30013, Taiwan;
| | - Chun-Hsien Yu
- Department of Pediatrics, Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taipei 23142, Taiwan
- Department of Pediatrics, School of Medicine, Tzu Chi University, Hualien 97071, Taiwan
- Correspondence: (S.W.); (C.-H.Y.); Tel.: +886-2-2861-0511(ext. 26234) (S.W.); +886-2-66289779 (C.-H.Y.); Fax: +886-2-2862-3724 (S.W.); +886-2-66289009 (C.-H.Y.)
| |
|
115
|
Qayyum A, Lalande A, Meriaudeau F. Automatic segmentation of tumors and affected organs in the abdomen using a 3D hybrid model for computed tomography imaging. Comput Biol Med 2020; 127:104097. [DOI: 10.1016/j.compbiomed.2020.104097] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/20/2020] [Revised: 10/25/2020] [Accepted: 10/25/2020] [Indexed: 11/28/2022]
|
116
|
Yuan B, Yang D, Rothberg BEG, Chang H, Xu T. Unsupervised and supervised learning with neural network for human transcriptome analysis and cancer diagnosis. Sci Rep 2020; 10:19106. [PMID: 33154423 PMCID: PMC7644700 DOI: 10.1038/s41598-020-75715-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/23/2019] [Accepted: 10/15/2020] [Indexed: 11/09/2022] Open
Abstract
Deep learning analysis of images and text unfolds new horizons in medicine. However, analysis of transcriptomic data, the cause of biological and pathological changes, is hampered by structural complexity distinctive from images and text. Here we conduct unsupervised training on more than 20,000 human normal and tumor transcriptomic data and show that the resulting Deep-Autoencoder, DeepT2Vec, has successfully extracted informative features and embedded transcriptomes into 30-dimensional Transcriptomic Feature Vectors (TFVs). We demonstrate that the TFVs could recapitulate expression patterns and be used to track tissue origins. Trained on these extracted features only, a supervised classifier, DeepC, can effectively distinguish tumors from normal samples with an accuracy of 90% for Pan-Cancer and reach an average 94% for specific cancers. Training on a connected network, the accuracy is further increased to 96% for Pan-Cancer. Together, our study shows that deep learning with autoencoder is suitable for transcriptomic analysis, and DeepT2Vec could be successfully applied to distinguish cancers, normal tissues, and other potential traits with limited samples.
Affiliation(s)
- Bo Yuan
- Department of Genetics, Yale Cancer Center, Howard Hughes Medical Institute, Yale University School of Medicine, 295 Congress Avenue, New Haven, CT, 06510, USA.,Zhiyuan College, Shanghai Jiao Tong University, Shanghai, China.,Department of Cell Biology, Harvard Medical School, Boston, MA, 02138, USA
| | - Dong Yang
- Westlake Institute for Advanced Study, Westlake University, Hangzhou, China.,Department of Genetics, Yale Cancer Center, Howard Hughes Medical Institute, Yale University School of Medicine, 295 Congress Avenue, New Haven, CT, 06510, USA
| | - Bonnie E G Rothberg
- Medical Oncology, Department of Internal Medicine, Yale Cancer Center, Yale University School of Medicine, New Haven, USA
| | - Hao Chang
- Department of Genetics, Yale Cancer Center, Howard Hughes Medical Institute, Yale University School of Medicine, 295 Congress Avenue, New Haven, CT, 06510, USA
| | - Tian Xu
- Westlake Institute for Advanced Study, Westlake University, Hangzhou, China.,Department of Genetics, Yale Cancer Center, Howard Hughes Medical Institute, Yale University School of Medicine, 295 Congress Avenue, New Haven, CT, 06510, USA
| |
|
117
|
Gan H, Zhang J, Towsey M, Truskinger A, Stark D, van Rensburg BJ, Li Y, Roe P. Data selection in frog chorusing recognition with acoustic indices. ECOL INFORM 2020. [DOI: 10.1016/j.ecoinf.2020.101160] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
|
118
|
Hung M, Hon ES, Lauren E, Xu J, Judd G, Su W. Machine Learning Approach to Predict Risk of 90-Day Hospital Readmissions in Patients With Atrial Fibrillation: Implications for Quality Improvement in Healthcare. Health Serv Res Manag Epidemiol 2020; 7:2333392820961887. [PMID: 33088848 PMCID: PMC7545784 DOI: 10.1177/2333392820961887] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/04/2022] Open
Abstract
Background: Atrial fibrillation (AF) in the elderly population is projected to increase over the next several decades. Catheter ablation shows promise as a treatment option and is becoming increasingly available. We examined 90-day hospital readmission for AF patients undergoing catheter ablation and utilized machine learning methods to explore the risk factors associated with these readmission trends. Methods: Data from the 2013 Nationwide Readmissions Database on AF cases were used to predict 90-day readmissions for AF with catheter ablation. Multiple machine learning methods such as k-Nearest Neighbors, Decision Tree, and Support Vector Machine were employed to determine variable importance and build risk prediction models. Accuracy, precision, sensitivity, specificity, and area under the curve were compared for each model. Results: The 90-day hospital readmission rate was 17.6%; the average age of the patients was 64.9 years; 62.9% of patients were male. Important variables in predicting 90-day hospital readmissions in patients with AF undergoing catheter ablation included the age of the patient, number of diagnoses on the patient’s record, and the total number of discharges from a hospital. The k-Nearest Neighbor had the best performance with a prediction accuracy of 85%. This was closely followed by Decision Tree, but Support Vector Machine was less ideal. Conclusions: Machine learning methods can produce accurate models in predicting hospital readmissions for patients with AF. The likelihood of readmission to the hospital increases as the patient age, total number of hospital discharges, and total number of patient diagnoses increase. Findings from this study can inform quality improvement in healthcare and in achieving patient-centered care.
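The accuracy, precision, sensitivity, and specificity compared in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows those standard definitions; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study's data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity and specificity for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),      # of predicted readmissions, how many were real
        "sensitivity": ratio(tp, tp + fn),    # of real readmissions, how many were caught
        "specificity": ratio(tn, tn + fp),    # of non-readmissions, how many were cleared
    }


# Hypothetical labels: 1 = readmitted within the window, 0 = not readmitted.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

With an imbalanced outcome such as a 17.6% readmission rate, accuracy alone can look high for a model that rarely predicts readmission, which is why the other three measures are usually reported alongside it.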
Affiliation(s)
- Man Hung
- Roseman University of Health Sciences College of Dental Medicine, South Jordan, UT, USA.,University of Utah School of Medicine, Salt Lake City, UT, USA.,Utah Center for Clinical and Translational Sciences, Salt Lake City, UT, USA.,Huntsman Cancer Institute, Salt Lake City, UT, USA
| | - Eric S Hon
- University of Chicago Department of Economics, Chicago, IL, USA
| | - Evelyn Lauren
- University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Julie Xu
- University of Utah College of Nursing, Salt Lake City, UT, USA
| | - Gary Judd
- Roseman University of Health Sciences College of Dental Medicine, South Jordan, UT, USA
| | - Weicong Su
- University of Utah Department of Mathematics, Salt Lake City, UT, USA
| |
|
119
|
Byeon H. Development of a depression in Parkinson's disease prediction model using machine learning. World J Psychiatry 2020; 10:234-244. [PMID: 33134114 PMCID: PMC7582129 DOI: 10.5498/wjp.v10.i10.234] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/29/2020] [Revised: 09/01/2020] [Accepted: 09/22/2020] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND It is important to diagnose depression in Parkinson's disease (DPD) as soon as possible and identify the predictors of depression to improve quality of life in Parkinson's disease (PD) patients. AIM To develop a model for predicting DPD based on the support vector machine, while considering sociodemographic factors, health habits, Parkinson's symptoms, sleep behavior disorders, and neuropsychiatric indicators as predictors and provide baseline data for identifying DPD. METHODS This study analyzed 223 of 335 patients who were 60 years or older with PD. Depression was measured using the 30 items of the Geriatric Depression Scale, and the explanatory variables included PD-related motor signs, rapid eye movement sleep behavior disorders, and neuropsychological tests. The support vector machine was used to develop a DPD prediction model. RESULTS When the effects of PD motor symptoms were compared using "functional weight", late motor complications (occurrence of levodopa-induced dyskinesia) were the most influential risk factors for Parkinson's symptoms. CONCLUSION It is necessary to develop customized screening tests that can detect DPD in the early stage and continuously monitor high-risk groups based on the factors related to DPD derived from this predictive model in order to maintain the emotional health of PD patients.
Affiliation(s)
- Haewon Byeon
- Major in Medical Big Data, College of AI Convergence, Inje University, Gimhae 50834, Gyeonsangnamdo, South Korea
| |
|
120
|
Poppenberg KE, Tutino VM, Li L, Waqas M, June A, Chaves L, Jiang K, Jarvis JN, Sun Y, Snyder KV, Levy EI, Siddiqui AH, Kolega J, Meng H. Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm. J Transl Med 2020; 18:392. [PMID: 33059716 PMCID: PMC7565814 DOI: 10.1186/s12967-020-02550-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/09/2020] [Accepted: 09/27/2020] [Indexed: 12/14/2022] Open
Abstract
Background Intracranial aneurysms (IAs) are dangerous because of their potential to rupture. We previously found significant RNA expression differences in circulating neutrophils between patients with and without unruptured IAs and trained machine learning models to predict presence of IA using 40 neutrophil transcriptomes. Here, we aim to develop a predictive model for unruptured IA using neutrophil transcriptomes from a larger population and more robust machine learning methods. Methods Neutrophil RNA extracted from the blood of 134 patients (55 with IA, 79 IA-free controls) was subjected to next-generation RNA sequencing. In a randomly-selected training cohort (n = 94), the Least Absolute Shrinkage and Selection Operator (LASSO) selected transcripts, from which we constructed prediction models via 4 well-established supervised machine-learning algorithms (K-Nearest Neighbors, Random Forest, and Support Vector Machines with Gaussian and cubic kernels). We tested the models in the remaining samples (n = 40) and assessed model performance by receiver-operating-characteristic (ROC) curves. Real-time quantitative polymerase chain reaction (RT-qPCR) of 9 IA-associated genes was used to verify gene expression in a subset of 49 neutrophil RNA samples. We also examined the potential influence of demographics and comorbidities on model prediction. Results Feature selection using LASSO in the training cohort identified 37 IA-associated transcripts. Models trained using these transcripts had a maximum accuracy of 90% in the testing cohort. The testing performance across all methods had an average area under ROC curve (AUC) = 0.97, an improvement over our previous models. The Random Forest model performed best across both training and testing cohorts. RT-qPCR confirmed expression differences in 7 of 9 genes tested. Gene ontology and IPA network analyses performed on the 37 model genes reflected dysregulated inflammation, cell signaling, and apoptosis processes. 
In our data, demographics and comorbidities did not affect model performance. Conclusions We improved upon our previous IA prediction models based on circulating neutrophil transcriptomes by increasing sample size and by implementing LASSO and more robust machine learning methods. Future studies are needed to validate these models in larger cohorts and further investigate effect of covariates.
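The area under the ROC curve used to assess the models above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes AUC through that pairwise definition; the function name `roc_auc` and the toy scores are assumptions for illustration and do not reproduce the study's pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical held-out predictions: 1 = aneurysm present, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(roc_auc(labels, scores))
```

The double loop is quadratic in the number of samples, which is acceptable for a test cohort of a few dozen; larger cohorts would normally use a rank-based computation instead.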
Affiliation(s)
- Kerry E Poppenberg
- Canon Stroke and Vascular Research Center, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA.,Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Vincent M Tutino
- Canon Stroke and Vascular Research Center, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA.,Department of Biomedical Engineering, University of Buffalo, Buffalo, USA.,Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Pathology and Anatomical Sciences, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Lu Li
- Department of Computer Science and Engineering, University of Buffalo, Buffalo, USA
| | - Muhammad Waqas
- Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Neurology, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Armond June
- Department of Pathology and Anatomical Sciences, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Lee Chaves
- Department of Internal Medicine, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Kaiyu Jiang
- Genetics, Genomics, and Bioinformatics Program, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - James N Jarvis
- Genetics, Genomics, and Bioinformatics Program, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Pediatrics, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Yijun Sun
- Genetics, Genomics, and Bioinformatics Program, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Microbiology and Immunology, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Kenneth V Snyder
- Canon Stroke and Vascular Research Center, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA.,Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Radiology, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Neurology, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Elad I Levy
- Canon Stroke and Vascular Research Center, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA.,Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Radiology, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Adnan H Siddiqui
- Canon Stroke and Vascular Research Center, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA.,Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Radiology, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - John Kolega
- Canon Stroke and Vascular Research Center, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA.,Department of Pathology and Anatomical Sciences, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA
| | - Hui Meng
- Canon Stroke and Vascular Research Center, Clinical and Translational Research Center, 875 Ellicott Street, Buffalo, NY, 14214, USA.,Department of Biomedical Engineering, University of Buffalo, Buffalo, USA.,Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences, Buffalo, USA.,Department of Mechanical & Aerospace Engineering, University At Buffalo, Buffalo, NY, USA
| |
|
121
|
Machine learning approach to integrated endometrial transcriptomic datasets reveals biomarkers predicting uterine receptivity in cattle at seven days after estrous. Sci Rep 2020; 10:16981. [PMID: 33046742 PMCID: PMC7550564 DOI: 10.1038/s41598-020-72988-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/17/2020] [Accepted: 09/07/2020] [Indexed: 12/12/2022] Open
Abstract
The main goal was to apply machine learning (ML) methods to integrated multi-transcriptomic data to identify endometrial genes capable of predicting uterine receptivity in the cow according to their expression patterns. Public data from five studies were re-analyzed. In all of them, endometrial samples were obtained at day 6–7 of the estrous cycle from cows or heifers of four different European breeds, classified as pregnant (n = 26) or not (n = 26). First, gene selection was performed through supervised and unsupervised ML algorithms. Then, the predictive ability of potential key genes was evaluated with a support vector machine classifier, using the expression levels of the samples from all breeds but one to train the model and the samples from the remaining breed to test it. Finally, the biological meaning of the key genes was explored. Fifty genes were identified, and they could predict uterine receptivity with an overall accuracy of 96.1%, regardless of the animal’s breed and category. Genes with higher expression in the pregnant cows were related to circadian rhythm, Wnt receptor signaling pathway, and embryonic development. This novel and robust combination of computational tools allowed the identification of a group of biologically relevant endometrial genes that could support pregnancy in cattle.
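The evaluation scheme described here, training on all breeds but one and testing on the held-out breed, is a leave-one-group-out split. The sketch below generates such splits from a list of group labels; the function name `leave_one_group_out` and the breed labels "A" to "D" are hypothetical placeholders, and the classifier itself is omitted.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Hypothetical breed label for each of six samples.
breeds = ["A", "A", "B", "C", "C", "D"]
splits = list(leave_one_group_out(breeds))
for held_out, train, test in splits:
    print(held_out, train, test)
```

Because no sample from the test breed is seen during training, this kind of split estimates how well a gene signature transfers to an unseen breed, which a random split across all samples would not show.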
|
122
|
SoK: Machine vs. machine – A systematic classification of automated machine learning-based CAPTCHA solvers. Comput Secur 2020. [DOI: 10.1016/j.cose.2020.101947] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
|
123
|
Lee SK, Ahn J, Shin JH, Lee JY. Application of Machine Learning Methods in Nursing Home Research. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:E6234. [PMID: 32867250 PMCID: PMC7503291 DOI: 10.3390/ijerph17176234] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/05/2020] [Revised: 08/23/2020] [Accepted: 08/24/2020] [Indexed: 12/13/2022]
Abstract
Background: A machine learning (ML) system is able to construct algorithms that continue improving predictions and generate automated knowledge through data-driven predictors or decisions. Objective: The purpose of this study was to compare six ML methods (random forest (RF), logistic regression, linear support vector machine (SVM), polynomial SVM, radial SVM, and sigmoid SVM) for predicting falls in nursing homes (NHs). Methods: We applied six representative ML algorithms to the preprocessed dataset to develop a prediction model (N = 60). We used an accuracy measure to evaluate prediction models. Results: RF was the most accurate model (0.883), followed by the logistic regression model, linear SVM, and polynomial SVM (0.867). Conclusions: RF was a powerful algorithm to discern predictors of falls in NHs. For effective fall management, researchers should consider organizational characteristics as well as personal factors. Recommendations for Future Research: To confirm the superiority of ML in NH research, future studies are required to discern additional potential factors using newly introduced ML methods.
Affiliation(s)
- Soo-Kyoung Lee
- College of Nursing, Keimyung University, 1095, Dalgubeol-daero, Dalseo-gu, Daegu 42601, Korea;
| | - Jinhyun Ahn
- Department of Management Information Systems, Jeju National University, Jeju-do 63243, Korea;
| | - Juh Hyun Shin
- College of Nursing, Ewha Womans University, Seoul 03760, Korea;
| | - Ji Yeon Lee
- College of Nursing, Ewha Womans University, Seoul 03760, Korea;
| |
|
124
|
Using Machine Learning to Predict 30-Day Hospital Readmissions in Patients with Atrial Fibrillation Undergoing Catheter Ablation. J Pers Med 2020; 10:jpm10030082. [PMID: 32784873 PMCID: PMC7564438 DOI: 10.3390/jpm10030082] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/16/2020] [Revised: 08/02/2020] [Accepted: 08/06/2020] [Indexed: 12/24/2022] Open
Abstract
Atrial fibrillation (AF) cases are expected to increase over the next several decades, due to the rise in the elderly population. One promising treatment option for AF is catheter ablation, which is increasing in use. We investigated the hospital readmissions data for AF patients undergoing catheter ablation, and used machine learning models to explore the risk factors behind these readmissions. We analyzed data from the 2013 Nationwide Readmissions Database on cases with AF, and determined the relative importance of factors in predicting 30-day readmissions for AF with catheter ablation. Various machine learning methods, such as k-nearest neighbors, decision tree, and support vector machine were utilized to develop predictive models with their accuracy, precision, sensitivity, specificity, and area under the curve computed and compared. We found that the most important variables in predicting 30-day hospital readmissions in patients with AF undergoing catheter ablation were the age of the patient, the total number of discharges from a hospital, and the number of diagnoses on the patient’s record, among others. Out of the methods used, k-nearest neighbor had the highest prediction accuracy of 85%, closely followed by decision tree, while support vector machine was less desirable for these data. Hospital readmissions for AF with catheter ablation can be predicted with relatively high accuracy, utilizing machine learning methods. As patient age, the total number of hospital discharges, and the total number of patient diagnoses increase, the risk of hospital readmissions increases.
|
125
|
Broschard MB, Kim J, Love BC, Freeman JH. Category learning in rodents using touchscreen‐based tasks. GENES BRAIN AND BEHAVIOR 2020; 20:e12665. [DOI: 10.1111/gbb.12665] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/16/2020] [Revised: 05/01/2020] [Accepted: 05/04/2020] [Indexed: 01/29/2023]
Affiliation(s)
- Matthew B. Broschard
- Department of Psychological and Brain Sciences University of Iowa Iowa City Iowa USA
| | - Jangjin Kim
- Department of Psychological and Brain Sciences University of Iowa Iowa City Iowa USA
| | - Bradley C. Love
- Department of Experimental Psychology and The Alan Turing Institute University College London London UK
| | - John H. Freeman
- Department of Psychological and Brain Sciences University of Iowa Iowa City Iowa USA
| |
|
126
|
Mahmud MS, Ahmed F, Al-Fahad R, Moinuddin KA, Yeasin M, Alain C, Bidelman GM. Decoding Hearing-Related Changes in Older Adults' Spatiotemporal Neural Processing of Speech Using Machine Learning. Front Neurosci 2020; 14:748. [PMID: 32765215 PMCID: PMC7378401 DOI: 10.3389/fnins.2020.00748] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/18/2019] [Accepted: 06/25/2020] [Indexed: 12/25/2022] Open
Abstract
Speech perception in noisy environments depends on complex interactions between sensory and cognitive systems. In older adults, such interactions may be affected, especially in those individuals who have more severe age-related hearing loss. Using a data-driven approach, we assessed the temporal (when in time) and spatial (where in the brain) characteristics of cortical speech-evoked responses that distinguish older adults with or without mild hearing loss. We performed source analyses to estimate cortical surface signals from the EEG recordings during a phoneme discrimination task conducted under clear and noise-degraded conditions. We computed source-level ERPs (i.e., mean activation within each ROI) from each of the 68 ROIs of the Desikan-Killiany (DK) atlas, averaged over 100 trials randomly chosen without replacement to form feature vectors. We adopted a multivariate feature selection method called stability selection and control to choose features that are consistent over a range of model parameters. We used a parameter-optimized support vector machine (SVM) as the classifier to investigate the time course and brain regions that segregate groups and speech clarity. For clear speech perception, whole-brain data revealed a classification accuracy of 81.50% [area under the curve (AUC) 80.73%; F1-score 82.00%], distinguishing groups within ∼60 ms after speech onset (i.e., as early as the P1 wave). We observed lower accuracy of 78.12% [AUC 77.64%; F1-score 78.00%] and delayed classification performance when speech was embedded in noise, with group segregation at 80 ms. Separate analysis using left (LH) and right hemisphere (RH) regions showed that LH speech activity was better at distinguishing hearing groups than activity measured in the RH. 
Moreover, stability selection analysis identified 12 brain regions (among 1428 total spatiotemporal features from 68 regions) where source activity segregated groups with >80% accuracy (clear speech); whereas 16 regions were critical for noise-degraded speech to achieve a comparable level of group segregation (78.7% accuracy). Our results identify critical time-courses and brain regions that distinguish mild hearing loss from normal hearing in older adults and confirm a larger number of active areas, particularly in RH, when processing noise-degraded speech information.
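The feature construction mentioned in this abstract, averaging a random subset of trials drawn without replacement, can be sketched as follows. This is an illustrative reduction rather than the authors' code: the function name `average_trials`, the two-feature toy trials, and the subset size of 4 are assumptions, whereas the study averaged 100 trials per ROI.

```python
import random


def average_trials(trials, n_select, rng):
    """Average n_select trials drawn without replacement into one feature vector."""
    # random.Random.sample never picks the same trial twice.
    chosen = rng.sample(trials, n_select)
    n_features = len(chosen[0])
    return [sum(trial[j] for trial in chosen) / n_select for j in range(n_features)]


# Hypothetical data: 10 trials, each with 2 feature values (for example, 2 ROIs).
trials = [[float(i), float(2 * i)] for i in range(10)]
rng = random.Random(0)  # fixed seed so the draw is repeatable
feature = average_trials(trials, 4, rng)
print(feature)
```

Averaging over a subset of trials raises the signal-to-noise ratio of each feature vector, and repeating the draw with different subsets yields multiple vectors per participant for training a classifier.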
Affiliation(s)
- Md Sultan Mahmud
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
| | - Faruk Ahmed
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
| | - Rakib Al-Fahad
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
| | - Kazi Ashraf Moinuddin
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
| | - Mohammed Yeasin
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
| | - Claude Alain
- Rotman Research Institute-Baycrest Centre for Geriatric Care, Toronto, ON, Canada.,Department of Psychology, University of Toronto, Toronto, ON, Canada.,Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
| | - Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States.,School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States.,Department of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, TN, United States
| |
|
127
|
Cheng R, Zhang J, Hu P. Document-level emotion detection using graph-based margin regularization. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
128
|
Yang X, Tian L, Chen Y, Yang L, Xu S, Wu W. Inverse Projection Representation and Category Contribution Rate for Robust Tumor Recognition. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1262-1275. [PMID: 30575544 DOI: 10.1109/tcbb.2018.2886334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Sparse representation based classification (SRC) methods have achieved remarkable results. SRC, however, still suffers from the need for sufficient training samples, insufficient use of test samples, and unstable representations. In this paper, a stable inverse projection representation based classification (IPRC) method is presented to tackle these problems by effectively using test samples. An inverse projection representation (IPR) is first proposed, and its feasibility and stability are analyzed. A classification criterion named category contribution rate is constructed to match the IPR and complete classification. Moreover, a statistical measure is introduced to quantify the stability of representation-based classification methods. Based on the IPRC technique, a robust tumor recognition framework is presented for interpreting microarray gene expression data, in which a two-stage hybrid gene selection method is introduced to select informative genes. Finally, a functional analysis of the candidate pathogenicity-related genes is given. Extensive experiments on six public tumor microarray gene expression datasets demonstrate that the proposed technique is competitive with state-of-the-art methods.
|
129
|
Wu G, Zhang M. A novel risk score model based on eight genes and a nomogram for predicting overall survival of patients with osteosarcoma. BMC Cancer 2020; 20:456. [PMID: 32448271 PMCID: PMC7245838 DOI: 10.1186/s12885-020-06741-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 11/18/2019] [Accepted: 03/12/2020] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND This study aims to identify a model that predicts survival outcomes of osteosarcoma (OS) patients. METHODS An RNA sequencing dataset (the training set) and a microarray dataset (the validation set) were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases, respectively. Differentially expressed genes (DEGs) between metastatic and non-metastatic OS samples were identified in the training set. Prognosis-related DEGs were screened and optimized by support vector machine (SVM) recursive feature elimination. An SVM classifier was built to classify metastatic and non-metastatic OS samples. Independent prognostic genes were extracted by multivariate regression analysis to build a risk score model, followed by performance evaluation in the two datasets by Kaplan-Meier (KM) analysis. Independent clinical prognostic indicators were identified, followed by nomogram analysis. Finally, functional analyses of survival-related genes were conducted. RESULTS In total, 345 DEGs and 45 prognosis-related genes were screened. An SVM classifier could distinguish metastatic and non-metastatic OS samples. An eight-gene signature was an independent prognostic marker and was used to construct a risk score model. The risk score model could separate OS samples into high- and low-risk groups in the two datasets (training set: log-rank p < 0.01, C-index = 0.805; validation set: log-rank p < 0.01, C-index = 0.797). Tumor metastasis and risk score (RS) model status were independent prognostic factors, and the nomogram model exhibited accurate survival prediction for OS. Additionally, functional analyses of survival-related genes indicated that they were closely associated with immune responses and the cytokine-cytokine receptor interaction pathway. CONCLUSION An eight-gene predictive model and nomogram were developed to predict OS prognosis.
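The C-index values reported in this abstract can be made concrete with a minimal sketch. It assumes the risk score is a linear combination of gene expression values (the abstract states only that multivariate regression was used) and computes Harrell's concordance index on toy data. The gene names, coefficients, and patient values are hypothetical and are not taken from the study.

```python
def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs in which the higher-risk patient fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half-concordant
    return concordant / comparable


# Hypothetical two-gene signature and four toy patients (illustration only).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
patients = [
    {"GENE_A": 2.0, "GENE_B": 0.5},
    {"GENE_A": 1.0, "GENE_B": 0.2},
    {"GENE_A": 1.2, "GENE_B": 1.0},
    {"GENE_A": 0.5, "GENE_B": 1.0},
]
times = [5, 10, 15, 20]   # follow-up time per patient
events = [1, 1, 0, 1]     # 1 = death observed, 0 = censored
scores = [risk_score(p, coefficients) for p in patients]
c_index = concordance_index(times, events, scores)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.805 and 0.797 should be read.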
Affiliation(s)
- Guangzhi Wu
- Departments of Hand Surgery, The Third Hospital of Jilin University, Changchun, Jilin Province China
| | - Minglei Zhang
- Departments of Orthopedics, The Third Hospital of Jilin University, Changchun, Jilin Province China
| |
|
130
|
He YL, Ma Y, Xu Y, Zhu QX. Fault Diagnosis Using Novel Class-Specific Distributed Monitoring Weighted Naïve Bayes: Applications to Process Industry. Ind Eng Chem Res 2020. [DOI: 10.1021/acs.iecr.0c01071] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Affiliation(s)
- Yan-Lin He
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China
| | - Yongchao Ma
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China
| | - Yuan Xu
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China
| | - Qun-Xiong Zhu
- College of Information Science & Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Engineering Research Center of Intelligent PSE, Ministry of Education of China, Beijing 100029, China
| |
|
131
|
Liu Z, Cao Y, Li Y, Xiao X, Qiu Q, Yang M, Zhao Y, Cui L. Automatic diagnosis of fungal keratitis using data augmentation and image fusion with deep convolutional neural network. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2020; 187:105019. [PMID: 31421868 DOI: 10.1016/j.cmpb.2019.105019] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 06/10/2019] [Revised: 08/01/2019] [Accepted: 08/06/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND AND OBJECTIVES Fungal keratitis is an inflammation of the cornea that results from infection by fungal organisms. The lack of an early, effective diagnosis often results in serious complications, even blindness. Confocal microscopy is one of the most effective methods for diagnosing fungal keratitis, but the diagnosis depends on the subjective judgment of medical experts. METHODS To address this problem, this paper proposes a novel convolutional neural network framework for the automatic diagnosis of fungal keratitis using data augmentation and image fusion. Firstly, normal images are augmented by flipping to address the limited and imbalanced database. Secondly, a sub-area contrast stretching algorithm is proposed for image preprocessing to highlight the key structures in the images and to filter out irrelevant information. Thirdly, the histogram matching fusion algorithm is implemented, and the preprocessed image is fused with the original image to form a new algorithm framework and a new database. Finally, a traditional convolutional neural network is integrated into the novel algorithm framework to perform the experiments. RESULTS Experiments show that the accuracy of traditional AlexNet and VGGNet is 99.35% and 99.14%, that of AlexNet and VGGNet based on MF fusion is 99.80% and 99.83%, and that of AlexNet and VGGNet based on histogram matching fusion (HMF) is 99.95% and 99.89%. The experimental results show that the AlexNet framework using data augmentation and image fusion achieves a perfect trade-off between the diagnostic performance and the computational complexity, with a diagnostic accuracy of 99.95%. CONCLUSIONS These experimental results demonstrate that the novel convolutional neural network framework perfectly balances diagnostic performance and computational complexity, and can improve the effectiveness and real-time performance of fungal keratitis diagnosis.
Affiliation(s)
- Zhi Liu
- Research Center of Intelligent Medical Information Processing, School of Information Science and Engineering, Shandong University, Qingdao 266237, China
| | - Yankun Cao
- Research Center of Intelligent Medical Information Processing, School of Information Science and Engineering, Shandong University, Qingdao 266237, China
| | - Yujun Li
- Research Center of Intelligent Medical Information Processing, School of Information Science and Engineering, Shandong University, Qingdao 266237, China.
| | - Xiaoyan Xiao
- Department of Nephrology, Qilu Hospital, Shandong University, Jinan, 250012, China
| | - Qingchen Qiu
- Research Center of Intelligent Medical Information Processing, School of Information Science and Engineering, Shandong University, Qingdao 266237, China
| | - Meijun Yang
- Research Center of Intelligent Medical Information Processing, School of Information Science and Engineering, Shandong University, Qingdao 266237, China
| | - Yuefeng Zhao
- School of Physics and Electronics, Shandong Normal University, Jinan, 250014, China
| | - Lizhen Cui
- School of Software, Shandong University, Jinan, 250101, China
| |
|
132
|
Overall survival prediction of non-small cell lung cancer by integrating microarray and clinical data with deep learning. Sci Rep 2020; 10:4679. [PMID: 32170141 PMCID: PMC7069964 DOI: 10.1038/s41598-020-61588-w] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Received: 06/06/2019] [Accepted: 02/24/2020] [Indexed: 02/07/2023] Open
Abstract
Non-small cell lung cancer (NSCLC) is one of the most common lung cancers worldwide. Accurate prognostic stratification of NSCLC can become an important clinical reference when designing therapeutic strategies for cancer patients. With this clinical application in mind, we developed a deep neural network (DNN) combining heterogeneous data sources of gene expression and clinical data to accurately predict the overall survival of NSCLC patients. Based on microarray data from a cohort set (614 patients), seven well-known NSCLC biomarkers were used to group patients into biomarker- and biomarker+ subgroups. Then, by using a systems biology approach, prognosis relevance values (PRV) were calculated to select eight additional novel prognostic gene biomarkers. Finally, the combined 15 biomarkers along with clinical data were used to develop an integrative DNN via bimodal learning to predict the 5-year survival status of NSCLC patients with tremendously high accuracy (AUC: 0.8163, accuracy: 75.44%). Using the capability of deep learning, we believe that our prediction can be a promising index that helps oncologists and physicians develop personalized therapy and build the foundation of precision medicine in the future.
|
133
|
Yazdani H, Cheng LL, Christiani DC, Yazdani A. Bounded Fuzzy Possibilistic Method Reveals Information about Lung Cancer through Analysis of Metabolomics. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:526-535. [PMID: 30222581 PMCID: PMC10350680 DOI: 10.1109/tcbb.2018.2869757] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
Learning methods, such as conventional clustering and classification, have been applied in diagnosing diseases to categorize samples based on their features. Going beyond clustering samples, membership degrees represent the degree to which each sample belongs to a cluster. Variation of membership degrees in each cluster provides information about the cluster as a whole and about each sample individually, which offers insights toward precision medicine. Membership degrees are measured more accurately by removing restrictions from clustering samples. The Bounded Fuzzy Possibilistic Method (BFPM) introduces a membership function that keeps the search space flexible in order to cluster samples with higher accuracy. The method evaluates samples for their movement from one cluster to another. This technique allows us to find critical samples in advance, namely those with the potential to belong to other clusters in the near future. BFPM was applied to metabolomics data from individuals in a lung cancer case-control study. Metabolomics, as proximal molecular signals of the actual disease processes, may serve as strong biomarkers of the current disease process. The goal is to know whether the serum metabolites of healthy individuals can be differentiated from those of individuals with lung cancer. Using BFPM, some differences were observed, the pathology data were evaluated, and critical samples were recognized.
|
134
|
Dalboni da Rocha JL, Bramati I, Coutinho G, Tovar Moll F, Sitaram R. Fractional Anisotropy changes in Parahippocampal Cingulum due to Alzheimer's Disease. Sci Rep 2020; 10:2660. [PMID: 32060334 PMCID: PMC7021702 DOI: 10.1038/s41598-020-59327-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/19/2019] [Accepted: 01/16/2020] [Indexed: 11/10/2022] Open
Abstract
Current treatments for Alzheimer's disease are only symptomatic and limited to reducing the progression rate of the mental deterioration. Mild Cognitive Impairment, a transitional stage in which the patient is not cognitively normal but does not meet the criteria for a specific dementia, is associated with a high risk of developing Alzheimer's disease. Thus, non-invasive techniques to predict an individual's risk of developing Alzheimer's disease can be very helpful, considering the possibility of early treatment. Diffusion Tensor Imaging, as an indicator of cerebral white matter integrity, may detect and track earlier evidence of white matter abnormalities in patients developing Alzheimer's disease. Here we performed a voxel-based analysis of fractional anisotropy in three classes of subjects: Alzheimer's disease patients, Mild Cognitive Impairment patients, and healthy controls. We performed Support Vector Machine classification between the three groups, using Fisher Score feature selection and leave-one-out cross-validation. The bilateral intersection of the hippocampal cingulum and parahippocampal gyrus (referred to as the parahippocampal cingulum) is the region whose fractional anisotropy values best discriminate Alzheimer's disease, resulting in an accuracy of 93% for discriminating between Alzheimer's disease and controls, and 90% between Alzheimer's disease and Mild Cognitive Impairment. These results suggest that pattern classification of Diffusion Tensor Imaging can aid the diagnosis of Alzheimer's disease, especially when focusing on the parahippocampal cingulum.
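The combination of Fisher Score feature selection and leave-one-out cross-validation named in this abstract can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline. To stay dependency-free it substitutes a nearest-centroid rule for the Support Vector Machine, and the toy feature values are invented. The point it illustrates is that features are re-selected inside every fold, so the held-out sample never influences which features are chosen.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter divided by within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = within = 0.0
        for c in classes:
            values = [row[j] for row, label in zip(X, y) if label == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def nearest_centroid_predict(X_train, y_train, x, features):
    """Stand-in classifier (NOT an SVM): choose the class with the closest centroid."""
    best_label, best_dist = None, float("inf")
    for c in sorted(set(y_train)):
        rows = [r for r, label in zip(X_train, y_train) if label == c]
        dist = sum((x[j] - mean([r[j] for r in rows])) ** 2 for j in features)
        if dist < best_dist:
            best_label, best_dist = c, dist
    return best_label


def loo_accuracy(X, y, k):
    """Leave-one-out CV with the top-k Fisher features re-selected in every fold."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        scores = fisher_scores(X_train, y_train)
        top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
        hits += nearest_centroid_predict(X_train, y_train, X[i], top) == y[i]
    return hits / len(X)


# Toy data (invented): feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.0, 3.0], [0.9, 4.0], [1.0, 2.0], [1.1, 0.5]]
y = [0, 0, 0, 1, 1, 1]
accuracy = loo_accuracy(X, y, k=1)
```

Selecting features once on the full dataset and then cross-validating would leak information from the held-out sample and tend to inflate the reported accuracy.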
Affiliation(s)
| | - Ivanei Bramati
- D'Or Institute for Research and Education, Rio de Janeiro, Brazil
| | - Gabriel Coutinho
- D'Or Institute for Research and Education, Rio de Janeiro, Brazil
| | - Fernanda Tovar Moll
- D'Or Institute for Research and Education, Rio de Janeiro, Brazil
- Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Ranganatha Sitaram
- Institute for Biological and Medical Engineering, Department of Psychiatry, and Section of Neuroscience, Pontificia Universidad Católica de Chile, Santiago, Chile.
| |
|
135
|
Mehrian M, Lambrechts T, Marechal M, Luyten FP, Papantoniou I, Geris L. Predicting in vitro human mesenchymal stromal cell expansion based on individual donor characteristics using machine learning. Cytotherapy 2020; 22:82-90. [PMID: 31987754 DOI: 10.1016/j.jcyt.2019.12.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/16/2019] [Revised: 11/20/2019] [Accepted: 12/08/2019] [Indexed: 12/21/2022]
Abstract
BACKGROUND Human mesenchymal stromal cells (hMSCs) have become attractive candidates for advanced medical cell-based therapies. An in vitro expansion step is routinely used to reach the required clinical quantities. However, this is influenced by many variables including donor characteristics, such as age and gender, and culture conditions, such as cell seeding density and available culture surface area. Computational modeling in general and machine learning in particular could play a significant role in deciphering the relationship between the individual donor characteristics and their growth dynamics. METHODS In this study, hMSCs obtained from 174 male and female donors, between 3 and 64 years of age with passage numbers ranging from 2 to 27, were studied. We applied a Random Forests (RF) technique to model the cell expansion procedure by predicting the population doubling time (PDT) for each passage, taking into account individual donor-related characteristics. RESULTS Using the RF model, the mean absolute error between model predictions and experimental results for the PDT in passages 1 to 4 is significantly lower than the errors obtained with theoretical estimates or historical data. Moreover, statistical analyses indicate that the PD and PDT in different age categories are significantly different, especially in the youngest group (younger than 10 years of age) compared with the other age groups. DISCUSSION In summary, we introduce a predictive computational model describing in vitro cell expansion dynamics based on individual donor characteristics, an approach that could greatly assist the automation of a cell expansion culture process.
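For readers unfamiliar with the two quantities in this abstract, the sketch below computes a population doubling time from seeded and harvested cell counts using the standard exponential-growth definition, and a mean absolute error between predicted and observed values. The abstract does not state the exact formula the authors used, so the definition is an assumption, and the function names and numbers are hypothetical.

```python
import math


def population_doubling_time(hours, n_seeded, n_harvested):
    """PDT = culture time / number of doublings, assuming exponential growth."""
    if n_harvested <= n_seeded:
        raise ValueError("cell count must increase for a doubling time to be defined")
    doublings = math.log2(n_harvested / n_seeded)
    return hours / doublings


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predictions and measurements."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Toy example: an 8-fold expansion over 72 hours is 3 doublings.
pdt_hours = population_doubling_time(72, 1e5, 8e5)

# Toy comparison of predicted and measured PDTs (hours) for two passages.
mae_hours = mean_absolute_error([24.0, 30.0], [26.0, 27.0])
```

A lower mean absolute error for the model than for theoretical or historical estimates is the comparison the RESULTS sentence describes.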
Affiliation(s)
- Mohammad Mehrian
- Biomechanics Research Unit, GIGA In Silico Medicine, University of Liege, CHU - BAT 34, Quartier Hopital, Liege, Belgium; Prometheus, the Division of Skeletal Tissue Engineering, KU Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium
| | - Toon Lambrechts
- Prometheus, the Division of Skeletal Tissue Engineering, KU Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium; M3-BIORES, KU Leuven, Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium
| | - Marina Marechal
- Prometheus, the Division of Skeletal Tissue Engineering, KU Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium; Skeletal Biology and Engineering Research Center, KU Leuven, Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium
| | - Frank P Luyten
- Prometheus, the Division of Skeletal Tissue Engineering, KU Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium; Skeletal Biology and Engineering Research Center, KU Leuven, Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium
| | - Ioannis Papantoniou
- Prometheus, the Division of Skeletal Tissue Engineering, KU Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium; Skeletal Biology and Engineering Research Center, KU Leuven, Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium; Institute of Chemical Engineering Science, Foundation of Research and Technology - Hellas (FORTH)
| | - Liesbet Geris
- Biomechanics Research Unit, GIGA In Silico Medicine, University of Liege, CHU - BAT 34, Quartier Hopital, Liege, Belgium; Prometheus, the Division of Skeletal Tissue Engineering, KU Leuven, Onderwijs en Navorsing 1 (+8), Leuven, Belgium; Biomechanics Section, KU Leuven, Leuven, Belgium.
| |
|
136
|
Touati R, Messaoudi I, Oueslati AE, Lachiri Z, Kharrat M. Classification of intra-genomic helitrons based on features extracted from different orders of FCGS. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2019.100271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/16/2022] Open
|
137
|
Van Messem A. Support vector machines: A robust prediction method with applications in bioinformatics. HANDBOOK OF STATISTICS 2020. [DOI: 10.1016/bs.host.2019.08.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/02/2023]
|
138
|
Xu Y, Cao L, Zhao X, Yao Y, Liu Q, Zhang B, Wang Y, Mao Y, Ma Y, Ma JZ, Payne TJ, Li MD, Li L. Prediction of Smoking Behavior From Single Nucleotide Polymorphisms With Machine Learning Approaches. Front Psychiatry 2020; 11:416. [PMID: 32477189 PMCID: PMC7241440 DOI: 10.3389/fpsyt.2020.00416] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/24/2019] [Accepted: 04/23/2020] [Indexed: 12/22/2022] Open
Abstract
Smoking is a complex behavior with a heritability as high as 50%. Given such a large genetic contribution, it provides an opportunity to prevent those individuals who are susceptible to smoking dependence from ever starting to smoke by predicting their inherited predisposition with their genomic profiles. Although previous studies have identified many susceptibility variants for smoking, they have limited power to predict smoking behavior. We applied the support vector machine (SVM) and random forest (RF) methods to build prediction models for smoking behavior. We first used 1,431 smokers and 1,503 non-smokers of African origin for model building with a 10-fold cross-validation and then tested the prediction models on an independent dataset consisting of 213 smokers and 224 non-smokers. The SVM model with 500 top single nucleotide polymorphisms (SNPs) selected using logistic regression (p<0.01) as the feature selection method achieved an area under the curve (AUC) of 0.691, 0.721, and 0.720 for the training, test, and independent test samples, respectively. The RF model with 500 top SNPs selected using logistic regression (p<0.01) achieved AUCs of 0.671, 0.665, and 0.667 for the training, test, and independent test samples, respectively. Finally, we used the combined logistic (p<0.01) and LASSO (λ=10⁻³) regression to select features and the SVM algorithm for model building. The SVM model with 500 top SNPs achieved AUCs of 0.756, 0.776, and 0.897 for the training, test, and independent test samples, respectively. We conclude that machine learning methods are promising means to build predictive models for smoking.
Affiliation(s)
- Yi Xu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Liyu Cao
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Xinyi Zhao
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yinghao Yao
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Qiang Liu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Bin Zhang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yan Wang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Ying Mao
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yunlong Ma
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Jennie Z Ma
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, United States
| | - Thomas J Payne
- Department of Otolaryngology and Communicative Sciences, University of Mississippi Medical Center, Jackson, MS, United States
| | - Ming D Li
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.,Research Center for Air Pollution and Health, Zhejiang University, Hangzhou, China
| | - Lanjuan Li
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| |
|
139
|
Wang R, He Y, Yao C, Wang S, Xue Y, Zhang Z, Wang J, Liu X. Classification and Segmentation of Hyperspectral Data of Hepatocellular Carcinoma Samples Using 1-D Convolutional Neural Network. Cytometry A 2020; 97:31-38. [PMID: 31403260 DOI: 10.1002/cyto.a.23871] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/08/2019] [Revised: 07/16/2019] [Accepted: 07/19/2019] [Indexed: 12/24/2022]
Abstract
Pathological diagnosis plays an important role in the diagnosis and treatment of hepatocellular carcinoma (HCC). The traditional method of pathological diagnosis of most cancers requires freezing, slicing, hematoxylin and eosin staining, and manual analysis, limiting the speed of the diagnosis process. In this study, we designed a one-dimensional convolutional neural network to classify the hyperspectral data of HCC sample slices acquired by our hyperspectral imaging system. A weighted loss function was employed to improve the performance of the model. The proposed method allows us to accelerate the diagnostic process of identifying tumor tissues. Our deep learning model achieved good performance on our data set, with sensitivity, specificity, and area under the receiver operating characteristic curve of 0.871, 0.888, and 0.950, respectively. Moreover, our deep learning model outperformed the other machine learning methods assessed on our data set. The proposed method is a potential tool for label-free and real-time pathologic diagnosis. © 2019 International Society for Advancement of Cytometry.
Affiliation(s)
- Rendong Wang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
| | - Yida He
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
| | - Cuiping Yao
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
| | - Sijia Wang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
| | - Yuan Xue
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
| | - Zhenxi Zhang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
| | - Jing Wang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
| | - Xiaolong Liu
- The United Innovation of Mengchao Hepatobiliary Technology Key Laboratory of Fujian Province, Mengchao Hepatobiliary Hospital of Fujian Medical University, Fuzhou, 350025, People's Republic of China
| |
|
140
|
Bi Q, Goodman KE, Kaminsky J, Lessler J. What is Machine Learning? A Primer for the Epidemiologist. Am J Epidemiol 2019; 188:2222-2239. [PMID: 31509183 DOI: 10.1093/aje/kwz189] [Citation(s) in RCA: 105] [Impact Index Per Article: 17.5] [Received: 11/16/2018] [Revised: 07/29/2019] [Accepted: 08/14/2019] [Indexed: 12/22/2022] Open
Abstract
Machine learning is a branch of computer science that has the potential to transform epidemiologic sciences. Amid a growing focus on "Big Data," it offers epidemiologists new tools to tackle problems for which classical methods are not well-suited. In order to critically evaluate the value of integrating machine learning algorithms and existing methods, however, it is essential to address language and technical barriers between the two fields that can make it difficult for epidemiologists to read and assess machine learning studies. Here, we provide an overview of the concepts and terminology used in machine learning literature, which encompasses a diverse set of tools with goals ranging from prediction to classification to clustering. We provide a brief introduction to 5 common machine learning algorithms and 4 ensemble-based approaches. We then summarize epidemiologic applications of machine learning techniques in the published literature. We recommend approaches to incorporate machine learning in epidemiologic research and discuss opportunities and challenges for integrating machine learning and existing epidemiologic research methods.
Affiliation(s)
- Qifang Bi
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
| | - Katherine E Goodman
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
| | - Joshua Kaminsky
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
| | - Justin Lessler
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
|
141
|
Song L, Yin Q, Kang M, Ma N, Li X, Yang Z, Jin H, Lin M, Zhuang P, Zhang Y. Untargeted metabolomics reveals novel serum biomarker of renal damage in rheumatoid arthritis. J Pharm Biomed Anal 2019; 180:113068. [PMID: 31884392 DOI: 10.1016/j.jpba.2019.113068] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/16/2019] [Revised: 12/07/2019] [Accepted: 12/21/2019] [Indexed: 02/06/2023]
Abstract
Rheumatoid arthritis (RA) is a chronic progressive disease that often involves the kidneys, lungs, heart, and other systems. Renal damage is quite common in RA. Exploring biomarkers of renal damage during RA progression is of significant importance for disease diagnosis and treatment. We used a type II collagen-induced arthritis (CIA) rat model. Serum samples were collected at the 4th, 6th, 8th, and 10th weeks after the first immunization. An untargeted metabonomic strategy based on UPLC-Q/TOF/MS with a support vector machine (SVM) was developed to discover biomarkers in the rats' serum samples that distinguish the RA stage (4-6 weeks in the RA model, when the kidneys are not affected) from the stage of renal damage in RA (8-10 weeks in the RA model, when the kidneys are affected). Principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA) were used to analyze the metabolic profiles of rat serum. The SVM method was used to screen for specific markers of renal damage in RA. Following multivariate statistical and integration analysis, 5 specific markers of renal damage in RA were identified. Analysis of these metabolites indicated that pentose and glucuronate interconversions are closely related to the pathogenesis of RA renal damage. The present study is the first to use untargeted metabonomics combined with the pathological features of the different phases in CIA model rats. This will provide a basis for the choice of treatment drugs for patients with RA who may be complicated by renal damage.
Affiliation(s)
- Lili Song
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China
| | - Qingsheng Yin
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China
| | - Mingqin Kang
- Jilin Entry-exit Inspection and Quarantine Bureau, Changchun, People's Republic of China
| | - Ningning Ma
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China
| | - Xin Li
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China
| | - Zhen Yang
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China
| | - Hua Jin
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China
| | - Mengya Lin
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China
| | - Pengwei Zhuang
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China.
| | - Yanjun Zhang
- School of Traditional Chinese Materia Medica, Tianjin University of Traditional Chinese Medicine, Jian Kang Chan Ye Yuan, Jinghai Dist., Tianjin 301617, People's Republic of China.
|
142
|
Breitbach ME, Greenspan S, Resnick NM, Perera S, Gurkar AU, Absher D, Levine AS. Exonic Variants in Aging-Related Genes Are Predictive of Phenotypic Aging Status. Front Genet 2019; 10:1277. [PMID: 31921313 PMCID: PMC6931058 DOI: 10.3389/fgene.2019.01277] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/21/2019] [Accepted: 11/19/2019] [Indexed: 01/31/2023] Open
Abstract
Background: Recent studies investigating longevity have revealed very few convincing genetic associations with increased lifespan. This is, in part, due to the complexity of biological aging, as well as the limited power of genome-wide association studies, which assay common single nucleotide polymorphisms (SNPs) and require several thousand subjects to achieve statistical significance. To overcome such barriers, we performed comprehensive DNA sequencing of a panel of 20 genes previously associated with phenotypic aging in a cohort of 200 individuals, half of whom were clinically defined by an "early aging" phenotype, and half of whom were clinically defined by a "late aging" phenotype based on age (65-75 years) and the ability to walk up a flight of stairs or walk for 15 min without resting. A validation cohort of 511 late agers was used to verify our results. Results: We found early agers were not enriched for more total variants in these 20 aging-related genes than late agers. Using machine learning methods, we identified the most predictive model of aging status, both in our discovery and validation cohorts, to be a random forest model incorporating damaging exon variants [Combined Annotation-Dependent Depletion (CADD) > 15]. The most heavily weighted variants in the model were within poly(ADP-ribose) polymerase 1 (PARP1) and excision repair cross complementation group 5 (ERCC5), both of which are involved in a canonical aging pathway, DNA damage repair. Conclusion: Overall, this study implemented a framework to apply machine learning to identify sequencing variants associated with complex phenotypes such as aging. While the small sample size making up our cohort inhibits our ability to make definitive conclusions about the ability of these genes to accurately predict aging, this study offers a unique method for exploring polygenic associations with complex phenotypes.
Affiliation(s)
- Megan E. Breitbach
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, United States
- Department of Biotechnology Science and Engineering, University of Alabama in Huntsville, Huntsville, AL, United States
| | - Susan Greenspan
- Division of Geriatric Medicine, Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
| | - Neil M. Resnick
- Division of Geriatric Medicine, Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Institute on Aging of UPMC, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
| | - Subashan Perera
- Division of Geriatric Medicine, Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Department of Biostatistics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA, United States
| | - Aditi U. Gurkar
- Division of Geriatric Medicine, Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- Institute on Aging of UPMC, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
| | - Devin Absher
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, United States
| | - Arthur S. Levine
- Department of Microbiology and Molecular Genetics, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States
- UPMC Hillman Cancer Center, Pittsburgh, PA, United States
|
143
|
Particle Swarm Optimized Hybrid Kernel-Based Multiclass Support Vector Machine for Microarray Cancer Data Analysis. BIOMED RESEARCH INTERNATIONAL 2019; 2019:4085725. [PMID: 31998772 PMCID: PMC6973196 DOI: 10.1155/2019/4085725] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/26/2019] [Revised: 10/26/2019] [Accepted: 11/21/2019] [Indexed: 11/17/2022]
Abstract
Determining an optimal decision model is an important but difficult combinatorial task in imbalanced microarray-based cancer classification. Though the multiclass support vector machine (MCSVM) has already made an important contribution in this field, its performance depends on three aspects: the penalty factor C, the type of kernel, and the kernel parameters. To improve the performance of this classifier in microarray-based cancer analysis, this paper proposes a PSO-PCA-LGP-MCSVM model based on particle swarm optimization (PSO), principal component analysis (PCA), and the multiclass support vector machine (MCSVM). The MCSVM is based on a hybrid kernel, i.e., linear-Gaussian-polynomial (LGP), that combines the advantages of three standard kernels (linear, Gaussian, and polynomial) in a novel manner, where the linear kernel is linearly combined with the Gaussian kernel embedding the polynomial kernel. Further, this paper proves that the LGP kernel satisfies the properties of a valid kernel. To demonstrate the effectiveness of our model, several experiments were conducted and the results were compared between our model and three single-kernel models, namely, PSO-PCA-L-MCSVM (utilizing a linear kernel), PSO-PCA-G-MCSVM (utilizing a Gaussian kernel), and PSO-PCA-P-MCSVM (utilizing a polynomial kernel). For the comparison, two dual and two multiclass imbalanced standard microarray datasets were used. Experimental results in terms of three extended assessment metrics (F-score, G-mean, and Accuracy) reveal the superior global feature extraction, prediction, and learning abilities of this model compared with the three single-kernel models.
|
144
|
Wnt/β-Catenin, Carbohydrate Metabolism, and PI3K-Akt Signaling Pathway-Related Genes as Potential Cancer Predictors. JOURNAL OF HEALTHCARE ENGINEERING 2019; 2019:9724589. [PMID: 31781361 PMCID: PMC6855054 DOI: 10.1155/2019/9724589] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/17/2019] [Accepted: 09/17/2019] [Indexed: 01/07/2023]
Abstract
Predicting the outcome after a cancer diagnosis is critical. Advances in high-throughput sequencing technologies provide physicians with vast amounts of data, yet prognostication remains challenging because the data are high-dimensional and complex. We evaluated Wnt/β-catenin, carbohydrate metabolism, and PI3K-Akt signaling pathway-related genes as predictive features for classifying tumors and normal samples. Using differentially expressed genes as controls, these pathway-related genes were assessed for accuracy using support-vector machines and three other recommended machine learning models, namely, the random forest, decision tree, and k-nearest neighbor algorithms. The first two outperformed the others. All candidate pathway-related genes yielded areas under the curve exceeding 95.00% for cancer outcomes, and they were most accurate in predicting colorectal cancer. These results suggest that these pathway-related genes are useful and accurate biomarkers for understanding the mechanisms behind cancer development.
|
145
|
Classification of Hepatitis Viruses from Sequencing Chromatograms Using Multiscale Permutation Entropy and Support Vector Machines. ENTROPY 2019. [PMCID: PMC7514494 DOI: 10.3390/e21121149] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/29/2022]
Abstract
Classifying nucleic acid trace files is an important issue in molecular biology research. To obtain better classification performance, the choice of features and of the classifier that best represent the properties of nucleic acid trace files plays a vital role. In this study, different feature extraction methods based on statistical and entropy theory are used to discriminate deoxyribonucleic acid chromatograms, whose signals are almost impossible to distinguish visually. Extracted features are used as the input feature set for Support Vector Machine (SVM) classifiers with different kernel functions. The proposed framework is applied to a total of 200 hepatitis nucleic acid trace files, which consist of Hepatitis B Virus (HBV) and Hepatitis C Virus (HCV). While statistical feature extraction methods represent the properties of hepatitis nucleic acid trace files with descriptive measures such as mean, median, and standard deviation, entropy-based feature extraction methods, including permutation entropy and multiscale permutation entropy, quantify the complexity of these files. The results indicate that using statistical and entropy-based features produces exceptionally high performance in terms of accuracy (reaching nearly 99%) in classifying HBV and HCV.
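As a rough illustration of the entropy-based features mentioned in this abstract, the sketch below computes permutation entropy and a simple multiscale variant in plain Python. It follows the standard textbook definitions (ordinal patterns of `order` consecutive samples, and coarse-graining by non-overlapping averages); the function names, default parameters, and tie-breaking rule are assumptions for illustration and are not taken from the cited paper.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy (base 2) of the ordinal patterns found in `series`."""
    n = len(series) - (order - 1) * delay
    if n <= 0:
        raise ValueError("series too short for the chosen order and delay")
    patterns = Counter()
    for i in range(n):
        window = [series[i + j * delay] for j in range(order)]
        # Ordinal pattern: indices that sort the window (ties keep their order).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1
    entropy = sum((c / n) * math.log2(n / c) for c in patterns.values())
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy


def coarse_grain(series, scale):
    """Average non-overlapping blocks of length `scale` (assumed scheme)."""
    return [
        sum(series[i:i + scale]) / scale
        for i in range(0, len(series) - scale + 1, scale)
    ]


def multiscale_permutation_entropy(series, order=3, max_scale=3):
    """Permutation entropy of the coarse-grained series at scales 1..max_scale."""
    return [
        permutation_entropy(coarse_grain(series, s), order=order)
        for s in range(1, max_scale + 1)
    ]
```

A feature vector for a classifier could then be assembled from the values returned by `multiscale_permutation_entropy` together with descriptive statistics such as the mean and standard deviation of the signal.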
|
146
|
Wan H, Li JM, Ding H, Lin SX, Tu SQ, Tian XH, Hu JP, Chang S. An Overview of Computational Tools of Nucleic Acid Binding Site Prediction for Site-specific Proteins and Nucleases. Protein Pept Lett 2019; 27:370-384. [PMID: 31746287 DOI: 10.2174/0929866526666191028162302] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/31/2019] [Revised: 05/24/2019] [Accepted: 09/24/2019] [Indexed: 12/26/2022]
Abstract
Understanding the interaction mechanism of proteins and nucleic acids is one of the most fundamental problems for genome editing with engineered nucleases. Because of the limitations of experimental investigations, computational methods have played an important role in obtaining knowledge of protein-nucleic acid interactions. Over the past few years, dozens of computational tools have been used to identify nucleic acid binding sites for site-specific proteins and to design site-specific nucleases, because of their significant advantages in genome editing. Here, we review existing widely used computational tools for target prediction of site-specific proteins as well as off-target prediction of site-specific nucleases. This article provides a list of online prediction tools according to their features, followed by a description of the computational methods used by these tools, which range from various sequence mapping algorithms (like Bowtie, FetchGWI and BLAST) to different machine learning methods (such as Support Vector Machine, hidden Markov models, Random Forest, elastic network and deep neural networks). We also make suggestions for further development to improve the accuracy of prediction methods. This survey will provide a reference guide for computational biologists working in the field of genome editing.
Affiliation(s)
- Hua Wan
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Jian-Ming Li
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Huang Ding
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Shuo-Xin Lin
- Department of Electrical and Computer Engineering, James Clark School of Engineering, University of Maryland, College Park, MD 20742, United States
| | - Shu-Qin Tu
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Xu-Hong Tian
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Jian-Ping Hu
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-Evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China
| | - Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
|
147
|
Jang KW, Choi JH, Jeon JH, Kim HS. Combustible Gas Classification Modeling using Support Vector Machine and Pairing Plot Scheme. SENSORS 2019; 19:s19225018. [PMID: 31744238 PMCID: PMC6891470 DOI: 10.3390/s19225018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/17/2019] [Revised: 11/14/2019] [Accepted: 11/15/2019] [Indexed: 11/16/2022]
Abstract
Combustible gases, such as CH4 and CO, directly or indirectly affect the human body. Thus, leakage detection of combustible gases is essential for various industrial sites and daily life. Many types of gas sensors are used to identify these combustible gases, but since gas sensors generally have low selectivity among gases, coupling issues often arise which adversely affect gas detection accuracy. To solve this problem, we built a decoupling algorithm with different gas sensors using a machine learning algorithm. Commercially available semiconductor sensors were employed to detect CH4 and CO, and a support vector machine (SVM) was then applied as a supervised learning algorithm for gas classification. We also introduced a pairing plot scheme to classify gas type more effectively. The proposed model classified CH4 and CO gases 100% correctly at all levels above the minimum concentration the gas sensors could detect. Consequently, SVM with pairing plot is a memory-efficient and promising method for more accurate gas classification.
|
148
|
Byeon H. Predicting the Swallow-Related Quality of Life of the Elderly Living in a Local Community Using Support Vector Machine. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2019; 16:4269. [PMID: 31684165 PMCID: PMC6862249 DOI: 10.3390/ijerph16214269] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/06/2019] [Revised: 10/24/2019] [Accepted: 10/31/2019] [Indexed: 02/08/2023]
Abstract
Background and Objectives: This study developed a support vector machine (SVM) algorithm-based prediction model that uses factors associated with swallowing quality-of-life as the predictor variables, and provided baseline information for enhancing the swallowing quality of elderly people's lives in the future. Methods and Material: This study sampled 142 elderly people aged 65 years or older who were using a senior welfare center. Swallowing-related quality of life was measured with the swallowing quality-of-life (SWAL-QOL) instrument. To verify the predictive power of the model, this study compared the predictive power of the Gaussian function with that of a linear algorithm, a polynomial algorithm, and a sigmoid algorithm. Results: A total of 33.9% of the subjects had decreased swallowing quality-of-life. The SVM-based swallowing quality-of-life prediction model for the elderly identified both preventive factors and risk factors. Risk factors were denture use, experience of using aspiration in the past one month, being economically inactive, having a mean monthly household income <2 million KRW, being an elementary school graduate or below, being female, being 75 years old or older, living alone, taking on average ≤15 min or ≥40 min to finish one meal, and having depression, stress, or cognitive impairment. Conclusions: It is necessary to monitor the high-risk group constantly in order to maintain swallowing quality-of-life in the elderly, based on the preventive and risk factors derived from this prediction model.
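For readers unfamiliar with the four kernel types compared in this abstract (Gaussian, linear, polynomial, and sigmoid), the following sketch writes out their standard textbook forms in plain Python. The parameter names `gamma`, `coef0`, and `degree` and their default values are illustrative assumptions; the abstract does not state which settings the study used.

```python
import math


def dot(x, y):
    """Inner product of two equal-length numeric sequences."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = <x, y>
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    # K(x, y) = (gamma * <x, y> + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree


def gaussian_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2), also called the RBF kernel
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)
```

Comparing kernels in practice usually means training one SVM per kernel on the same training split and comparing a held-out metric such as accuracy.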
Affiliation(s)
- Haewon Byeon
- Department of Speech Language Pathology, School of Public Health, Honam University, 417, Eodeung-daero, Gwangsan-gu, Gwangju 62399, Korea.
|
149
|
Alizadehsani R, Roshanzamir M, Abdar M, Beykikhoshk A, Khosravi A, Panahiazar M, Koohestani A, Khozeimeh F, Nahavandi S, Sarrafzadegan N. A database for using machine learning and data mining techniques for coronary artery disease diagnosis. Sci Data 2019; 6:227. [PMID: 31645559 PMCID: PMC6811630 DOI: 10.1038/s41597-019-0206-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/21/2019] [Accepted: 08/16/2019] [Indexed: 12/28/2022] Open
Abstract
We present the coronary artery disease (CAD) database, a comprehensive resource comprising 126 papers and 68 datasets relevant to CAD diagnosis, extracted from the scientific literature published between 1992 and 2018. These data were collected to help advance research on CAD-related machine learning and data mining algorithms, and ultimately to advance clinical diagnosis and early treatment. To aid users, we have also built a web application that presents the database through various reports.
Affiliation(s)
- R Alizadehsani
- Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, VIC 3216, Australia
| | - M Roshanzamir
- Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, 84156-83111, Iran
| | - M Abdar
- Département d'informatique, Université du Québec à Montréal, Montréal, Québec, Canada
| | - A Beykikhoshk
- Applied Artificial Intelligence Institute, Deakin University, Geelong, Australia
| | - A Khosravi
- Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, VIC 3216, Australia
| | - M Panahiazar
- University of California San Francisco, San Francisco, CA, USA.
| | - A Koohestani
- Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, VIC 3216, Australia
| | - F Khozeimeh
- Mashhad University of Medical Science, Mashhad, Iran
| | - S Nahavandi
- Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, VIC 3216, Australia
| | - N Sarrafzadegan
- Isfahan Cardiovascular Research Center, Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- School of Population and Public Health, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
|
150
|
Chang D, Liu M, Lee Y. Accident diagnosis of a PWR fuel pin during unprotected loss of flow accident with support vector machine. NUCLEAR ENGINEERING AND DESIGN 2019. [DOI: 10.1016/j.nucengdes.2019.110184] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
|