1
Ma W, Li M, Chu Z, Chen H. Smart Biosensor for Breast Cancer Survival Prediction Based on Multi-View Multi-Way Graph Learning. Sensors (Basel, Switzerland) 2024; 24:3289. [PMID: 38894082 PMCID: PMC11174864 DOI: 10.3390/s24113289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 05/17/2024] [Accepted: 05/19/2024] [Indexed: 06/21/2024]
Abstract
Biosensors play a crucial role in detecting cancer signals by orchestrating a series of intricate biological and physical transduction processes. Among various cancers, breast cancer stands out due to its genetic underpinnings, which trigger uncontrolled cell proliferation, predominantly impacting women, and resulting in significant mortality rates. The utilization of biosensors in predicting survival time becomes paramount in formulating an optimal treatment strategy. However, conventional biosensors employing traditional machine learning methods encounter challenges in preprocessing features for the learning task. Despite the potential of deep learning techniques to automatically extract useful features, they often struggle to effectively leverage the intricate relationships between features and instances. To address this challenge, our study proposes a novel smart biosensor architecture that integrates a multi-view multi-way graph learning (MVMWGL) approach for predicting breast cancer survival time. This innovative approach enables the assimilation of insights from gene interactions and biosensor similarities. By leveraging real-world data, we conducted comprehensive evaluations, and our experimental results unequivocally demonstrate the superiority of the MVMWGL approach over existing methods.
Affiliation(s)
- Wenming Ma
- School of Computer and Control Engineering, Yantai University, Yantai 264005, China; (M.L.); (Z.C.); (H.C.)
2
Cai H, Liao Y, Zhu L, Wang Z, Song J. Improving Cancer Survival Prediction via Graph Convolutional Neural Network Learning on Protein-Protein Interaction Networks. IEEE J Biomed Health Inform 2024; 28:1134-1143. [PMID: 37963003 DOI: 10.1109/jbhi.2023.3332640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2023]
Abstract
Cancer is one of the most challenging health problems worldwide. Accurate cancer survival prediction is vital for clinical decision making. Many deep learning methods have been proposed to understand the association between patients' genomic features and survival time. In most cases, the gene expression matrix is fed directly to the deep learning model. However, this approach completely ignores the interactions between biomolecules, and the resulting models can only learn the expression levels of genes to predict patient survival. In essence, the interaction between biomolecules is the key to determining the direction and function of biological processes. Proteins are the building blocks and principal undertakings of life activities, and as such, their complex interaction network is potentially informative for deep learning methods. Therefore, a more reliable approach is to have the neural network learn both gene expression data and protein interaction networks. We propose a new computational approach, termed CRESCENT, which is a protein-protein interaction (PPI) prior knowledge graph-based convolutional neural network (GCN) to improve cancer survival prediction. CRESCENT relies on the gene expression networks rather than gene expression levels to predict patient survival. The performance of CRESCENT is evaluated on a large-scale pan-cancer dataset consisting of 5991 patients from 16 different types of cancers. Extensive benchmarking experiments demonstrate that our proposed method is competitive in terms of the evaluation metric of the time-dependent concordance index (Ctd) when compared with several existing state-of-the-art approaches. Experiments also show that incorporating the network structure between genomic features effectively improves cancer survival prediction.
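The abstract above reports results as a concordance index. As a rough illustration of what that kind of metric measures, the sketch below computes the simpler, time-independent Harrell-style concordance index on right-censored data. It is not the time-dependent Ctd used in the paper, and the function name and argument layout are assumptions made for this example only.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index for right-censored survival data.

    times:       observed follow-up time per patient
    events:      1 if the event (death) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only usable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 to a fully inverted ranking.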
3
Lu YT, Plets M, Morrison G, Cunha AT, Cen SY, Rhie SK, Siegmund KD, Daneshmand S, Quinn DI, Meeks JJ, Lerner SP, Petrylak DP, McConkey D, Flaig TW, Thompson IM, Goldkorn A. Cell-free DNA Methylation as a Predictive Biomarker of Response to Neoadjuvant Chemotherapy for Patients with Muscle-invasive Bladder Cancer in SWOG S1314. Eur Urol Oncol 2023; 6:516-524. [PMID: 37087309 PMCID: PMC10587361 DOI: 10.1016/j.euo.2023.03.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 10/28/2022] [Revised: 03/09/2023] [Accepted: 03/27/2023] [Indexed: 04/24/2023]
Abstract
BACKGROUND Neoadjuvant chemotherapy (NAC) is the standard of care in muscle-invasive bladder cancer (MIBC). However, treatment is intense, and the overall benefit is small, necessitating effective biomarkers to identify patients who will benefit most. OBJECTIVE To characterize cell-free DNA (cfDNA) methylation in patients receiving NAC in SWOG S1314, a prospective cooperative group trial, and to correlate the methylation signatures with pathologic response at radical cystectomy. DESIGN, SETTING, AND PARTICIPANTS SWOG S1314 is a prospective cooperative group trial for patients with MIBC (cT2-T4aN0M0, ≥5 mm of viable tumor), with a primary objective of evaluating the coexpression extrapolation (COXEN) gene expression signature as a predictor of NAC response, defined as achieving pT0N0 or ≤pT1N0 at radical cystectomy. For the current exploratory analysis, blood samples were collected prospectively from 72 patients in S1314 before and during NAC, and plasma cfDNA methylation was measured using the Infinium MethylationEPIC BeadChip array. INTERVENTION No additional interventions besides plasma collection. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS Differential methylation between pathologic responders (≤pT1N0) and nonresponders was analyzed, and a classifier predictive of treatment response was generated using the Random Forest machine learning algorithm. RESULTS AND LIMITATIONS Using prechemotherapy plasma cfDNA, we developed a methylation-based response score (mR-score) predictive of pathologic response. Plasma samples collected after the first cycle of NAC yielded mR-scores with similar predictive ability. Furthermore, we used cfDNA methylation data to calculate the circulating bladder DNA fraction, which had a modest but independent predictive ability for treatment response. 
In a model combining mR-score and circulating bladder DNA fraction, we correctly predicted pathologic response in 79% of patients based on their plasma collected at baseline and after one cycle of chemotherapy. Limitations of this study included a limited sample size and relatively low circulating bladder DNA levels. CONCLUSIONS Our study provides the proof of concept that cfDNA methylation can be used to generate classifiers of NAC response in bladder cancer patients. PATIENT SUMMARY In this exploratory analysis of S1314, we demonstrated that cell-free DNA methylation can be profiled to generate biomarker signatures associated with neoadjuvant chemotherapy response. With validation in additional cohorts, this minimally invasive approach may be used to predict chemotherapy response in locally advanced bladder cancer and perhaps also in metastatic disease.
Affiliation(s)
- Yi-Tsung Lu
- Division of Medical Oncology, Department of Medicine and Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Melissa Plets
- SWOG Statistics and Data Management Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Gareth Morrison
- Division of Medical Oncology, Department of Medicine and Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Alexander T Cunha
- Division of Medical Oncology, Department of Medicine and Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Steven Y Cen
- Department of Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Suhn K Rhie
- Department of Biochemistry and Molecular Medicine and Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Kimberly D Siegmund
- Department of Population and Public Health Science, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Siamak Daneshmand
- Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - David I Quinn
- Division of Medical Oncology, Department of Medicine and Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Joshua J Meeks
- Departments of Urology, Biochemistry, and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Seth P Lerner
- Scott Department of Urology, Dan L Duncan Cancer Center, Baylor College of Medicine, Houston, TX, USA
| | | | | | - Thomas W Flaig
- University of Colorado, School of Medicine, Aurora, CO, USA
| | - Ian M Thompson
- CHRISTUS Medical Center Hospital, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA
| | - Amir Goldkorn
- Division of Medical Oncology, Department of Medicine and Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
4
Arya N, Saha S, Mathur A, Saha S. Improving the robustness and stability of a machine learning model for breast cancer prognosis through the use of multi-modal classifiers. Sci Rep 2023; 13:4079. [PMID: 36906618 PMCID: PMC10008603 DOI: 10.1038/s41598-023-30143-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/19/2022] [Accepted: 02/16/2023] [Indexed: 03/13/2023]
Abstract
Breast cancer is a deadly disease with a high mortality rate among PAN cancers. The advancements in biomedical information retrieval techniques have been beneficial in developing early prognosis and diagnosis systems for cancer patients. These systems provide the oncologist with plenty of information from several modalities to make the correct and feasible treatment plan for breast cancer patients and protect them from unnecessary therapies and their toxic side effects. The cancer patient's related information can be collected using various modalities like clinical, copy number variation, DNA-methylation, microRNA sequencing, gene expression, and histopathological whole slide images. High dimensionality and heterogeneity in these modalities demand the development of some intelligent systems to understand related features to the prognosis and diagnosis of diseases and make correct predictions. In this work, we have studied some end-to-end systems having two main components: (a) dimensionality reduction techniques applied to original features from different modalities and (b) classification techniques applied to the fusion of reduced feature vectors from different modalities for automatic predictions of breast cancer patients into two categories: short-time and long-time survivors. Principal component analysis (PCA) and variational auto-encoders (VAEs) are used as the dimensionality reduction techniques, followed by support vector machines (SVM) or random forest as the machine learning classifiers. The study utilizes raw, PCA, and VAE extracted features of the TCGA-BRCA dataset from six different modalities as input to the machine learning classifiers. We conclude this study by suggesting that adding more modalities to the classifiers provides complementary information to the classifier and increases the stability and robustness of the classifiers. In this study, the multimodal classifiers have not been validated on primary data prospectively.
Affiliation(s)
- Nikhilanand Arya
- Department of Computer Science & Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India.
| | - Sriparna Saha
- Department of Computer Science & Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India
| | - Archana Mathur
- Department of Information Science & Engineering, Nitte Meenakshi Institute of Technology, Bangalore, 560064, India
| | - Snehanshu Saha
- APPCAIR & CSIS, Birla Institute of Technology and Science, Pilani-Goa Campus, Pilani, Goa, 403726, India
5
Li J, Qi C, Li Q, Liu F. Construction and validation of an aging-related gene signature for prognosis prediction of patients with breast cancer. Cancer Rep (Hoboken) 2023; 6:e1741. [PMID: 36323529 PMCID: PMC10026283 DOI: 10.1002/cnr2.1741] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/01/2022] [Revised: 09/21/2022] [Accepted: 10/08/2022] [Indexed: 11/06/2022]
Abstract
BACKGROUND Breast cancer (BC) is an aging-related disease. Aging-related genes (ARGs) participate in the initiation and development of lung and colon cancer, but the prognosis signature of ARGs in BC has not been clearly studied. AIMS This study aimed to construct an ARGs signature to predict the prognosis of patients with breast cancer. METHOD Firstly, the expression data of ARGs from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) were collected. Then COX and least absolute shrinkage and selection operator (LASSO) were performed to construct the ARGs prognostic signature. The correlation between the signature and immune cell infiltration, immunotherapeutic response and drug sensitivity were subsequently analysed. The TCGA nomogram was constructed by combining the signature with other clinical features, and was validated by using GEO database. RESULTS After LASSO and COX regression analyses, a prognostic signature based on nine ARGs, namely, HSP90AA1, NFKB2, PLAU, PTK2, RECQL4, CLU, JAK2, MAP3K5, and S100B, was built by using the TCGA dataset. Moreover, this risk signature is closely related to immune cell infiltration, immunotherapeutic response, and responses to chemotherapy and targeted therapy. Subsequently, the calibration curve demonstrates that the nomogram agrees well with practical prediction results. The receiver operating characteristic curve and decision-making curve analysis demonstrate that ARG signature has the better prognosis diagnosis ability and clinical net benefits. CONCLUSIONS Therefore, the proposed ARG prognosis signature is a new prognosis molecular marker of patients with BC, and it can provide good references to individual clinical therapy.
Affiliation(s)
- Jian Li
- Department of Breast Surgery, The Affiliated Taian City Central Hospital of Qingdao University, Tai'an City, China
- Postdoctoral Workstation, Liaocheng People's Hospital, Liaocheng City, China
| | - Chunling Qi
- Department of Laboratory, The Affiliated Taian City Central Hospital of Qingdao University, Tai'an City, China
| | - Qing Li
- Department of Pharmacy, The Affiliated Taian City Central Hospital of Qingdao University, Tai'an City, China
| | - Fei Liu
- Department of Breast Surgery, The Affiliated Taian City Central Hospital of Qingdao University, Tai'an City, China
6
Bueno-Fortes S, Berral-Gonzalez A, Sánchez-Santos JM, Martin-Merino M, De Las Rivas J. Identification of a gene expression signature associated with breast cancer survival and risk that improves clinical genomic platforms. Bioinformatics Advances 2023; 3:vbad037. [PMID: 37096121 PMCID: PMC10122606 DOI: 10.1093/bioadv/vbad037] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/28/2022] [Accepted: 03/21/2023] [Indexed: 04/26/2023]
Abstract
Motivation Modern genomic technologies allow us to perform genome-wide analysis to find gene markers associated with the risk and survival in cancer patients. Accurate risk prediction and patient stratification based on robust gene signatures is a key path forward in personalized treatment and precision medicine. Several authors have proposed the identification of gene signatures to assign risk in patients with breast cancer (BRCA), and some of these signatures have been implemented within commercial platforms in the clinic, such as Oncotype and Prosigna. However, these platforms are black boxes in which the influence of selected genes as survival markers is unclear and where the risk scores provided cannot be clearly related to the standard clinicopathological tumor markers obtained by immunohistochemistry (IHC), which guide clinical and therapeutic decisions in breast cancer. Results Here, we present a framework to discover a robust list of gene expression markers associated with survival that can be biologically interpreted in terms of the three main biomolecular factors (IHC clinical markers: ER, PR and HER2) that define clinical outcome in BRCA. To test and ensure the reproducibility of the results, we compiled and analyzed two independent datasets with a large number of tumor samples (1024 and 879) that include full genome-wide expression profiles and survival data. Using these two cohorts, we obtained a robust subset of gene survival markers that correlate well with the major IHC clinical markers used in breast cancer. The geneset of survival markers that we identify (which includes 34 genes) significantly improves the risk prediction provided by the genesets included in the commercial platforms: Oncotype (16 genes) and Prosigna (50 genes, i.e. PAM50). 
Furthermore, some of the genes identified have recently been proposed in the literature as new prognostic markers and may deserve more attention in current clinical trials to improve breast cancer risk prediction. Availability and implementation All data integrated and analyzed in this research will be available on GitHub (https://github.com/jdelasrivas-lab/breastcancersurvsign), including the R scripts and protocols used for the analyses. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Santiago Bueno-Fortes
- Cancer Research Center (CiC-IMBCC, CSIC/USAL and IBSAL), Consejo Superior de Investigaciones Científicas (CSIC) and University of Salamanca (USAL), Salamanca 37007, Spain
| | - Alberto Berral-Gonzalez
- Cancer Research Center (CiC-IMBCC, CSIC/USAL and IBSAL), Consejo Superior de Investigaciones Científicas (CSIC) and University of Salamanca (USAL), Salamanca 37007, Spain
7
Hao Y, Jing XY, Sun Q. Joint learning sample similarity and correlation representation for cancer survival prediction. BMC Bioinformatics 2022; 23:553. [PMID: 36536289 PMCID: PMC9761951 DOI: 10.1186/s12859-022-05110-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022]
Abstract
BACKGROUND As a highly aggressive disease, cancer has become the leading cause of death around the world. Accurate prediction of the survival expectancy for cancer patients is significant, as it can help clinicians make appropriate therapeutic schemes. With high-throughput sequencing technology becoming more and more cost-effective, integrating multi-type genome-wide data has been a promising method in cancer survival prediction. Based on these genomic data, some data-integration methods for cancer survival prediction have been proposed. However, existing methods fail to simultaneously utilize feature information and structure information of multi-type genome-wide data. RESULTS We propose a Multi-type Data Joint Learning (MDJL) approach based on multi-type genome-wide data, which comprehensively exploits feature information and structure information. Specifically, MDJL exploits correlation representations between any two data types by cross-correlation calculation for learning discriminant features. Moreover, based on the learned multiple correlation representations, MDJL constructs sample similarity matrices for capturing global and local structures across different data types. With the learned discriminant representation matrix and fused similarity matrix, MDJL constructs a graph convolutional network with Cox loss for survival prediction. CONCLUSIONS Experimental results demonstrate that our approach substantially outperforms established integrative methods and is effective for cancer survival prediction.
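The abstract above mentions training a network with a Cox loss. The abstract does not give MDJL's implementation, so the following is only a minimal sketch of the standard negative Cox partial log-likelihood that such a loss usually refers to, written in plain Python. The function name is hypothetical, tied event times get no special handling (Breslow-style), and averaging over observed events is an assumption.

```python
import math

def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk_scores: model outputs (log hazard ratios), one per patient
    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if censored
    """
    n = len(risk_scores)
    loss = 0.0
    n_events = 0
    for i in range(n):
        if not events[i]:
            continue  # censored patients only contribute through risk sets
        # Risk set: every patient still under observation at time times[i]
        at_risk = [risk_scores[j] for j in range(n) if times[j] >= times[i]]
        # Numerically stable log-sum-exp over the risk set
        m = max(at_risk)
        log_sum = m + math.log(sum(math.exp(s - m) for s in at_risk))
        loss -= risk_scores[i] - log_sum
        n_events += 1
    return loss / n_events if n_events else 0.0
```

The loss is small when patients who experience the event early receive higher scores than those still at risk, which is what drives a network trained on it to rank patients by hazard.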
Affiliation(s)
- Yaru Hao
- School of Computer Science, Wuhan University, Wuhan, China
| | - Xiao-Yuan Jing
- School of Computer Science, Wuhan University, Wuhan, China
- Guangdong Provincial Key Laboratory of Petrochemical Equipment Fault Diagnosis and School of Computer, Guangdong University of Petrochemical Technology, Maoming, China
- State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
| | - Qixing Sun
- School of Computer Science, Wuhan University, Wuhan, China
8
Mueller AN, Morrisey S, Miller HA, Hu X, Kumar R, Ngo PT, Yan J, Frieboes HB. Prediction of lung cancer immunotherapy response via machine learning analysis of immune cell lineage and surface markers. Cancer Biomark 2022; 34:681-692. [DOI: 10.3233/cbm-210529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
BACKGROUND: Although advances have been made in cancer immunotherapy, patient benefits remain elusive. For non-small cell lung cancer (NSCLC), monoclonal antibodies targeting programmed death-1 (PD-1) and programmed death ligand-1 (PD-L1) have shown survival benefit compared to chemotherapy. Personalization of treatment would be facilitated by a priori identification of patients likely to benefit. OBJECTIVE: This pilot study applied a suite of machine learning methods to analyze mass cytometry data of immune cell lineage and surface markers from blood samples of a small cohort (n = 13) treated with Pembrolizumab, Atezolizumab, Durvalumab, or Nivolumab as monotherapy. METHODS: Four different comparisons were evaluated between data collected at an initial visit (baseline), after 12 weeks of immunotherapy, and from healthy (control) samples: Healthy vs patients at baseline, Responders vs Non-Responders at baseline, Healthy vs 12-week Responders, and Responders vs Non-Responders at 12 weeks. The algorithms Random Forest, Partial Least Squares Discriminant Analysis, Multi-Layer Perceptron, and Elastic Net were applied to find features differentiating between these groups and provide for the capability to predict outcomes. RESULTS: Particular combinations and proportions of immune cell lineage and surface markers were sufficient to accurately discriminate between the groups without overfitting the data. In particular, markers associated with the B-cell phenotype were identified as key features. CONCLUSIONS: This study illustrates a comprehensive machine learning analysis of circulating immune cell characteristics of NSCLC patients with the potential to predict response to immunotherapy. Upon further evaluation in a larger cohort, the proposed methodology could help guide personalized treatment selection in clinical practice.
Affiliation(s)
- Alex N. Mueller
- School of Medicine, University of Louisville, Louisville, KY, USA
| | - Samantha Morrisey
- Division of Immunotherapy, Department of Surgery, University of Louisville, Louisville, KY, USA
| | - Hunter A. Miller
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, USA
| | - Xiaoling Hu
- Division of Immunotherapy, Department of Surgery, University of Louisville, Louisville, KY, USA
| | - Rohit Kumar
- School of Medicine, University of Louisville, Louisville, KY, USA
- UofL Health – Brown Cancer Center, University of Louisville, Louisville, KY, USA
| | - Phuong T. Ngo
- School of Medicine, University of Louisville, Louisville, KY, USA
- UofL Health – Brown Cancer Center, University of Louisville, Louisville, KY, USA
| | - Jun Yan
- Division of Immunotherapy, Department of Surgery, University of Louisville, Louisville, KY, USA
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, USA
- UofL Health – Brown Cancer Center, University of Louisville, Louisville, KY, USA
- Department of Surgery, University of Louisville, Louisville, KY, USA
| | - Hermann B. Frieboes
- Department of Pharmacology and Toxicology, University of Louisville, Louisville, KY, USA
- UofL Health – Brown Cancer Center, University of Louisville, Louisville, KY, USA
- Center for Predictive Medicine, University of Louisville, Louisville, KY, USA
- Department of Bioengineering, University of Louisville, Louisville, KY, USA
9
Ningappa M, Rahman SA, Higgs BW, Ashokkumar CS, Sahni N, Sindhi R, Das J. A network-based approach to identify expression modules underlying rejection in pediatric liver transplantation. Cell Rep Med 2022; 3:100605. [PMID: 35492246 PMCID: PMC9044102 DOI: 10.1016/j.xcrm.2022.100605] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/15/2021] [Revised: 12/19/2021] [Accepted: 03/23/2022] [Indexed: 10/27/2022]
Abstract
Selecting the right immunosuppressant to ensure rejection-free outcomes poses unique challenges in pediatric liver transplant (LT) recipients. A molecular predictor can comprehensively address these challenges. Currently, there are no well-validated blood-based biomarkers for pediatric LT recipients before or after LT. Here, we discover and validate separate pre- and post-LT transcriptomic signatures of rejection. Using an integrative machine learning approach, we combine transcriptomics data with the reference high-quality human protein interactome to identify network module signatures, which underlie rejection. Unlike gene signatures, our approach is inherently multivariate and more robust to replication and captures the structure of the underlying network, encapsulating additive effects. We also identify, in an individual-specific manner, signatures that can be targeted by current anti-rejection drugs and other drugs that can be repurposed. Our approach can enable personalized adjustment of drug regimens for the dominant targetable pathways before and after LT in children.
Affiliation(s)
- Mylarappa Ningappa
- Department of Surgery and Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Syed A Rahman
- Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Brandon W Higgs
- Department of Surgery and Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Chethan S Ashokkumar
- Department of Surgery and Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Nidhi Sahni
- Department of Epigenetics, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA
- Department of Molecular Carcinogenesis and Bioinformatics, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA
- Department of Computational Biology, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA
| | - Rakesh Sindhi
- Department of Surgery and Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Jishnu Das
- Center for Systems Immunology, Departments of Immunology and Computational & Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
10
VEGF-A-related genetic variants protect against Alzheimer's disease. Aging (Albany NY) 2022; 14:2524-2536. [PMID: 35347084 PMCID: PMC9004571 DOI: 10.18632/aging.203984] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/31/2021] [Accepted: 03/14/2022] [Indexed: 11/25/2022]
Abstract
The Apolipoprotein E (APOE) genotype has been shown to be the strongest genetic risk factor for Alzheimer’s disease (AD). Moreover, both the lipolysis-stimulated lipoprotein receptor (LSR) and the vascular endothelial growth factor A (VEGF-A) are involved in the development of AD. The aim of the study was to develop a prediction model for AD including single nucleotide polymorphisms (SNP) of APOE, LSR and VEGF-A-related variants. The population consisted of 323 individuals (143 AD cases and 180 controls). Genotyping was performed for: the APOE common polymorphism (rs429358 and rs7412), two LSR variants (rs34259399 and rs916147) and 10 VEGF-A-related SNPs (rs6921438, rs7043199, rs6993770, rs2375981, rs34528081, rs4782371, rs2639990, rs10761741, rs114694170, rs1740073), previously identified as genetic determinants of VEGF-A levels in GWAS studies. The prediction model included direct and epistatic interaction effects, age and sex and was developed using the elastic net machine learning methodology. An optimal model including the direct effect of the APOE ε4 allele, age and eight epistatic interactions between APOE and LSR, APOE and VEGF-A-related variants was developed with an accuracy of 72%. Two epistatic interactions (rs7043199*rs6993770 and rs2375981*rs34528081) were the strongest protective factors against AD together with the absence of the APOE ε4 allele. Based on pathway analysis, the involved variants and related genes are implicated in neurological diseases. In conclusion, this study demonstrated links between APOE, LSR and VEGF-A-related variants and the development of AD and proposed a model of nine genetic variants which appears to strongly influence the risk for AD.
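The abstract above describes an elastic net model with direct and epistatic (SNP-by-SNP) effects. The sketch below illustrates one common way such a design can be set up: epistatic terms as products of two genotype columns, and the elastic net penalty as a mix of L1 and L2 terms. The genotype coding (0/1/2 allele counts), the product form of the interaction, and both function names are assumptions for illustration; the abstract does not specify them. The penalty uses the same parameterization as scikit-learn's `ElasticNet` (`alpha`, `l1_ratio`).

```python
def interaction_features(genotypes, pairs):
    """Append one product term per chosen SNP pair to each sample's features.

    genotypes: list of per-sample lists, e.g. allele counts coded 0/1/2 (assumed coding)
    pairs:     list of (column_a, column_b) index pairs to multiply
    """
    return [row + [row[a] * row[b] for a, b in pairs] for row in genotypes]


def elastic_net_penalty(weights, alpha, l1_ratio):
    """alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)."""
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


# Hypothetical toy data: two samples, three SNP columns, two interaction pairs
X = interaction_features([[0, 1, 2], [1, 1, 0]], [(0, 1), (1, 2)])
```

The L1 part of the penalty drives most coefficients to exactly zero, which is how a model of this kind can end up retaining only a handful of direct and interaction terms.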
11
Yoo S, Sinha A, Yang D, Altorki NK, Tandon R, Wang W, Chavez D, Lee E, Patel AS, Sato T, Kong R, Ding B, Schadt EE, Watanabe H, Massion PP, Borczuk AC, Zhu J, Powell CA. Integrative network analysis of early-stage lung adenocarcinoma identifies aurora kinase inhibition as interceptor of invasion and progression. Nat Commun 2022; 13:1592. [PMID: 35332150 PMCID: PMC8948234 DOI: 10.1038/s41467-022-29230-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/31/2021] [Accepted: 03/01/2022] [Indexed: 12/15/2022]
Abstract
Here we focus on the molecular characterization of clinically significant histological subtypes of early-stage lung adenocarcinoma (esLUAD), which is the most common histological subtype of lung cancer. Within lung adenocarcinoma, histology is heterogeneous and associated with tumor invasion and diverse clinical outcomes. We present a gene signature distinguishing invasive and non-invasive tumors among esLUAD. Using the gene signatures, we estimate an Invasiveness Score that is strongly associated with survival of esLUAD patients in multiple independent cohorts and with the invasiveness phenotype in lung cancer cell lines. Regulatory network analysis identifies aurora kinase as one of the master regulators of the gene signature, and the perturbation of aurora kinases in vitro and in a murine model of invasive lung adenocarcinoma reduces tumor invasion. Our study reveals aurora kinases as a therapeutic target for treatment of early-stage invasive lung adenocarcinoma.
Collapse
Affiliation(s)
- Seungyeul Yoo
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Sema4, Stamford, CT, USA
| | - Abhilasha Sinha
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Dawei Yang
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Pulmonary and Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Nasser K Altorki
- Department of Cardiothoracic Surgery, Weill Cornell Medicine-New York Presbyterian Hospital, New York, NY, USA
| | - Radhika Tandon
- School of Medicine, St. George's University, West Indies, Grenada
| | - Wenhui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
| | - Deebly Chavez
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Eunjee Lee
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Sema4, Stamford, CT, USA
| | - Ayushi S Patel
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Vilcek Institute of Graduate Biomedical Sciences, New York University School of Medicine, New York, NY, USA
| | - Takashi Sato
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Division of Pulmonary Medicine, Department of Medicine, Keio University School of Medicine, Tokyo, Japan
- Department of Respiratory Medicine, Kitasato University School of Medicine, Sagamihara, Japan
| | - Ranran Kong
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Thoracic Surgery, The Second Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Bisen Ding
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Key Laboratory of Birth Defects and Related Diseases of Women And Children of MOE, State Key Laboratory of Biotherapy, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Eric E Schadt
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Sema4, Stamford, CT, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Hideo Watanabe
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Pierre P Massion
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Alain C Borczuk
- Department of Pathology, Weill Cornell Medicine, New York, NY, USA
| | - Jun Zhu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Institute for Data Science and Genomic Technology, New York, NY, USA.
- Sema4, Stamford, CT, USA.
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| | - Charles A Powell
- Division of Pulmonary, Critical Care and Sleep Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
|
12
|
Stacking Machine Learning Algorithms for Biomarker-Based Preoperative Diagnosis of a Pelvic Mass. Cancers (Basel) 2022; 14:cancers14051291. [PMID: 35267599 PMCID: PMC8909341 DOI: 10.3390/cancers14051291] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/31/2021] [Revised: 02/23/2022] [Accepted: 02/27/2022] [Indexed: 12/28/2022]
Abstract
Simple Summary: It is critical for women who are diagnosed with a pelvic mass or an ovarian cyst to be accurately assessed for their risk of having an ovarian malignancy. Accurate risk stratification for these women will allow for appropriate triage and referral to centers best equipped to treat women diagnosed with ovarian cancer. In this study, machine learning (ML) algorithms were used to determine the optimal combination of biomarkers for prediction of malignancy in women presenting with a pelvic mass. Nine unique ML algorithms were employed to evaluate age, menopausal status, race, and levels of 67 biomarkers from serum, urine, and plasma samples prospectively collected in a cohort of 140 women with a variety of pelvic mass diagnoses, both benign and malignant. A complex statistical algorithm using serum levels of CA125, HE4 and transferrin provided greater than 93% sensitivity and specificity for the preoperative prediction of malignancy in women presenting with a pelvic mass.
Abstract: Objective: To identify the most predictive parameters of ovarian malignancy and develop a machine learning (ML) based algorithm to preoperatively distinguish between a benign and malignant pelvic mass. Methods: Retrospective study of 70 predictive parameters collected from 140 women with a pelvic mass. The women were split into a 3:1 “training” to “testing” dataset. Feature selection was performed using Gini impurity through an embedded random forest model and principal component analysis. Nine unique ML classifiers were assessed across a variety of model-specific hyperparameters using 25 bootstrap resamples of the training data. Model predictions were then combined into an ensemble stack by LASSO regression. The final ensemble stack and individual classifiers were then applied to the testing dataset to assess model performance. Results: Feature selection identified HE4, CA125, and transferrin as three predictive parameters of malignancy. On the testing dataset, the ensemble stack outperformed all individual ML classifiers in predicting malignancy. The ensemble stack demonstrated an accuracy of 97.1%, a receiver operating characteristic (ROC) area under the curve (AUC) of 0.951, and a sensitivity of 93.3% with a specificity of 100%. Conclusions: Combining the measurement of three distinct biomarkers with the stacking of multiple ML classifiers into an ensemble can provide valuable preoperative diagnostic predictions for patients with a pelvic mass.
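As a quick arithmetic check on the reported test-set metrics, the sketch below recomputes accuracy, sensitivity, and specificity from a confusion matrix. The counts are an assumption, not figures given in the abstract: a 3:1 split of 140 women leaves 35 in the testing dataset, and 14 true positives, 1 false negative, 20 true negatives, and 0 false positives is one set of counts that reproduces the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # share of all cases classified correctly
    sensitivity = tp / (tp + fn)      # share of malignant cases detected
    specificity = tn / (tn + fp)      # share of benign cases correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts chosen only to be consistent with the reported
# percentages; the abstract does not report the confusion matrix.
acc, sens, spec = classification_metrics(tp=14, fn=1, tn=20, fp=0)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
```

With a testing dataset this small, a single misclassified case shifts accuracy by almost three percentage points, which is worth keeping in mind when reading the headline figures.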
|
13
|
Li C, Gao Z, Su B, Xu G, Lin X. Data analysis methods for defining biomarkers from omics data. Anal Bioanal Chem 2021; 414:235-250. [PMID: 34951658 DOI: 10.1007/s00216-021-03813-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/15/2021] [Revised: 11/26/2021] [Accepted: 11/29/2021] [Indexed: 02/01/2023]
Abstract
Omics mainly includes genomics, epigenomics, transcriptomics, proteomics and metabolomics. The rapid development of omics technology has opened up new ways to study disease diagnosis and prognosis and to define prospective information of complex diseases. Since omics data are usually large and complex, the methods used to analyze the data and to define important information are crucial in omics studies. In this review, we focus on advances in biomarker discovery methods based on omics data in the last decade, and categorize them as individual feature analysis, combinatorial feature analysis and network analysis. We also discuss the challenges and perspectives in this field.
Affiliation(s)
- Chao Li
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, Liaoning, China
| | - Zhenbo Gao
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China
| | - Benzhe Su
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China
| | - Guowang Xu
- CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, Liaoning, China
| | - Xiaohui Lin
- School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, Liaoning, China.
|
14
|
Van Goethem N, Robert A, Bossuyt N, Van Poelvoorde LAE, Quoilin S, De Keersmaecker SCJ, Devleesschauwer B, Thomas I, Vanneste K, Roosens NHC, Van Oyen H. Evaluation of the added value of viral genomic information for predicting severity of influenza infection. BMC Infect Dis 2021; 21:785. [PMID: 34376182 PMCID: PMC8353062 DOI: 10.1186/s12879-021-06510-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/26/2021] [Accepted: 07/18/2021] [Indexed: 11/29/2022]
Abstract
BACKGROUND: The severity of an influenza infection is influenced by both host and viral characteristics. This study aims to assess the relevance of viral genomic data for the prediction of severe influenza A(H3N2) infections among patients hospitalized for severe acute respiratory infection (SARI), in view of risk assessment and patient management. METHODS: 160 A(H3N2) influenza positive samples from the 2016-2017 season originating from the Belgian SARI surveillance were selected for whole genome sequencing. Predictor variables for severity were selected using a penalized elastic net logistic regression model from a combined host and genomic dataset, including patient information and nucleotide mutations identified in the viral genome. The goodness-of-fit of the model combining host and genomic data was compared with that of the model including host data only, using a likelihood-ratio test. Internal validation of model discrimination was conducted by calculating the optimism-adjusted area under the Receiver Operating Characteristic curve (AUC) for both models. RESULTS: The model including viral mutations in addition to the host characteristics had an improved fit (χ² = 12.03, df = 3, p = 0.007). The optimism-adjusted AUC increased from 0.671 to 0.732. CONCLUSIONS: Adding genomic data (selected season-specific mutations in the viral genome) to the model containing host characteristics improved the prediction of severe influenza infection among hospitalized SARI patients, thereby offering the potential for translation into a prospective strategy to perform early season risk assessment or to guide individual patient management.
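The reported likelihood-ratio result can be sanity-checked by hand. The sketch below assumes the usual asymptotic result that the likelihood-ratio statistic follows a chi-square distribution with degrees of freedom equal to the number of added parameters, and uses the closed-form upper-tail probability that exists for 3 degrees of freedom. The function name `chi2_sf_df3` is illustrative and not taken from the paper.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    For 3 degrees of freedom the survival function has the closed form
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Statistic and degrees of freedom as reported in the abstract (12.03, df = 3).
p_value = chi2_sf_df3(12.03)
print(f"p = {p_value:.4f}")
```

The result rounds to the p = 0.007 given in the abstract, which is consistent with the test statistic being a chi-square value.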
Affiliation(s)
- Nina Van Goethem
- Scientific Directorate of Epidemiology and Public Health, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium.
- Department of Epidemiology and Biostatistics, Institut de Recherche Expérimentale et Clinique, Faculty of Public Health, Université Catholique de Louvain, Clos Chapelle-aux-champs 30, 1200, Woluwe-Saint-Lambert, Belgium.
| | - Annie Robert
- Department of Epidemiology and Biostatistics, Institut de Recherche Expérimentale et Clinique, Faculty of Public Health, Université Catholique de Louvain, Clos Chapelle-aux-champs 30, 1200, Woluwe-Saint-Lambert, Belgium
| | - Nathalie Bossuyt
- Scientific Directorate of Epidemiology and Public Health, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
| | - Laura A E Van Poelvoorde
- Transversal Activities in Applied Genomics, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
| | - Sophie Quoilin
- Scientific Directorate of Epidemiology and Public Health, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
| | | | - Brecht Devleesschauwer
- Scientific Directorate of Epidemiology and Public Health, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
- Department of Veterinary Public Health and Food Safety, Ghent University, Salisburylaan 133, 9820, Merelbeke, Belgium
| | - Isabelle Thomas
- National Reference Center Influenza, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
| | - Kevin Vanneste
- Transversal Activities in Applied Genomics, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
| | - Nancy H C Roosens
- Transversal Activities in Applied Genomics, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
| | - Herman Van Oyen
- Scientific Directorate of Epidemiology and Public Health, Sciensano, J. Wytsmanstraat 14, 1050, Brussels, Belgium
- Department of Public Health and Primary Care, Ghent University, De Pintelaan 185, 9000, Ghent, Belgium
|
15
|
Hao S, You J, Chen L, Zhao H, Huang Y, Zheng L, Tian L, Maric I, Liu X, Li T, Bianco YK, Winn VD, Aghaeepour N, Gaudilliere B, Angst MS, Zhou X, Li YM, Mo L, Wong RJ, Shaw GM, Stevenson DK, Cohen HJ, Mcelhinney DB, Sylvester KG, Ling XB. Changes in pregnancy-related serum biomarkers early in gestation are associated with later development of preeclampsia. PLoS One 2020; 15:e0230000. [PMID: 32126118 PMCID: PMC7053753 DOI: 10.1371/journal.pone.0230000] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/02/2019] [Accepted: 02/19/2020] [Indexed: 12/19/2022]
Abstract
Background: Placental protein expression plays a crucial role during pregnancy. We hypothesized that: (1) circulating levels of pregnancy-associated, placenta-related proteins throughout gestation reflect the temporal progression of the uncomplicated, full-term pregnancy and can effectively estimate gestational ages (GAs); and (2) preeclampsia (PE) is associated with disruptions in these protein levels early in gestation, and these disruptions can identify impending PE. We also compared gestational profiles of proteins in the human and mouse, using pregnant heme oxygenase-1 (HO-1) heterozygote (Het) mice, a mouse model reflecting PE-like symptoms. Methods: Serum levels of six placenta-related proteins, namely leptin (LEP), chorionic somatomammotropin hormone like 1 (CSHL1), elabela (ELA), activin A, soluble fms-like tyrosine kinase 1 (sFlt-1), and placental growth factor (PlGF), were quantified by ELISA in blood serially collected throughout human pregnancies (20 normal subjects with 66 samples, and 20 subjects who developed PE with 61 samples). Multivariate analysis was performed to estimate the GA in normal pregnancy. Mean-squared errors of GA estimations were used to identify impending PE. The human protein profiles were then compared with those in the pregnant HO-1 Het mice. Results: An elastic net-based gestational dating model was developed (R² = 0.76) and validated (R² = 0.61) using serum levels of the 6 proteins measured at various GAs from women with normal uncomplicated pregnancies. In women who developed PE, the model estimates were not associated with GA (R² = -0.17). Deviations from the model estimations were observed in women who developed PE (P = 0.01). The model developed with 5 proteins (ELA excluded) performed similarly on sera from normal human (R² = 0.68) and wild-type (WT) mouse (R² = 0.85) pregnancies. Disruptions of this model were observed in both human PE-associated (R² = 0.27) and mouse HO-1 Het (R² = 0.30) pregnancies. LEP outperformed sFlt-1 and PlGF in differentiating impending PE at early human and late mouse GAs. Conclusions: Serum placenta-related protein profiles are temporally regulated throughout normal pregnancies and significantly disrupted in women who develop PE. LEP changes earlier than the well-established biomarkers (sFlt-1 and PlGF). There may be evidence of a causative action of HO-1 deficiency in LEP upregulation in a PE-like murine model.
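The negative R² reported for the PE group can look surprising. A minimal sketch, assuming R² is computed as the coefficient of determination (1 minus the ratio of residual to total sum of squares), shows how a model that predicts worse than the plain mean of the observations yields a value below zero. The gestational ages below are made-up illustration values, not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical gestational ages in weeks (illustration only).
observed = [10.0, 20.0, 30.0]
close_estimates = [11.0, 19.0, 31.0]     # tracks the observations
reversed_estimates = [30.0, 20.0, 10.0]  # worse than predicting the mean

print(r_squared(observed, close_estimates))
print(r_squared(observed, reversed_estimates))
```

Under this definition, a negative value means the dating model fitted on normal pregnancies explains GA in that group less well than a constant estimate would.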
Affiliation(s)
- Shiying Hao
- Department of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children’s Hospital, Palo Alto, CA, United States of America
| | - Jin You
- Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Lin Chen
- Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Hui Zhao
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Yujuan Huang
- Department of Emergency, Shanghai Children’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Le Zheng
- Department of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children’s Hospital, Palo Alto, CA, United States of America
| | - Lu Tian
- Department of Health Research and Policy, Stanford University, Stanford, CA, United States of America
| | - Ivana Maric
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Xin Liu
- Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Tian Li
- Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Ylayaly K. Bianco
- Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Virginia D. Winn
- Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Nima Aghaeepour
- Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Brice Gaudilliere
- Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Martin S. Angst
- Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Xin Zhou
- Tianjin Key Laboratory of Cardiovascular Remodeling and Target Organ Injury, Pingjin Hospital Heart Center, Tianjin, China
| | - Yu-Ming Li
- Tianjin Key Laboratory of Cardiovascular Remodeling and Target Organ Injury, Pingjin Hospital Heart Center, Tianjin, China
| | - Lihong Mo
- Department of Obstetrics and Gynecology, University of California San Francisco-Fresno, Fresno, CA, United States of America
| | - Ronald J. Wong
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Gary M. Shaw
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, United States of America
| | - David K. Stevenson
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Harvey J. Cohen
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Doff B. Mcelhinney
- Department of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children’s Hospital, Palo Alto, CA, United States of America
| | - Karl G. Sylvester
- Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
| | - Xuefeng B. Ling
- Clinical and Translational Research Program, Betty Irene Moore Children's Heart Center, Lucile Packard Children’s Hospital, Palo Alto, CA, United States of America
- Department of Surgery, Stanford University School of Medicine, Stanford, CA, United States of America
|
16
|
Natarajan N, Vujic A, Das J, Wang AC, Phu KK, Kiehm SH, Ricci-Blair EM, Zhu AY, Vaughan KL, Colman RJ, Mattison JA, Lee RT. Effect of dietary fat and sucrose consumption on cardiac fibrosis in mice and rhesus monkeys. JCI Insight 2019; 4:128685. [PMID: 31415241 PMCID: PMC6795382 DOI: 10.1172/jci.insight.128685] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/07/2019] [Accepted: 08/12/2019] [Indexed: 01/11/2023]
Abstract
Calorie restriction (CR) improved health span in 2 longitudinal studies in nonhuman primates (NHPs), yet only the University of Wisconsin (UW) study demonstrated an increase in survival in CR monkeys relative to controls; the National Institute on Aging (NIA) study did not. Here, analysis of left ventricle samples showed that CR did not reduce cardiac fibrosis relative to controls. However, there was a 5.9-fold increase of total fibrosis in UW hearts, compared with NIA hearts. Diet composition was a prominent difference between the studies; therefore, we used the NHP diets to characterize diet-associated molecular and functional changes in the hearts of mice. Consistent with the findings from the NHP samples, mice fed a UW or a modified NIA diet with increased sucrose and fat developed greater cardiac fibrosis compared with mice fed the NIA diet, and transcriptomics analysis revealed diet-induced activation of myocardial oxidative phosphorylation and cardiac muscle contraction pathways.
Affiliation(s)
- Niranjana Natarajan
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
| | - Ana Vujic
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
| | - Jishnu Das
- Ragon Institute of MGH, MIT and Harvard, Cambridge, Massachusetts, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - Annie C. Wang
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
| | - Krystal K. Phu
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
| | - Spencer H. Kiehm
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
| | - Elisabeth M. Ricci-Blair
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
| | - Anthony Y. Zhu
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
| | - Kelli L. Vaughan
- National Institute on Aging, NIH, Baltimore, Maryland, USA
- SoBran BioSciences, SoBran Inc., Burtonsville, Maryland, USA
| | - Ricki J. Colman
- Department of Cell and Regenerative Biology and Wisconsin National Primate Research Center, University of Wisconsin–Madison, Madison, Wisconsin, USA
| | | | - Richard T. Lee
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, Massachusetts, USA
- Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
|
17
|
Wang C, Guo J, Zhao N, Liu Y, Liu X, Liu G, Guo M. A Cancer Survival Prediction Method Based on Graph Convolutional Network. IEEE Trans Nanobioscience 2019; 19:117-126. [PMID: 31443039 DOI: 10.1109/tnb.2019.2936398] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 02/01/2023]
Abstract
BACKGROUND AND OBJECTIVE: Cancer, one of the most challenging diseases in human history, has always been one of the main threats to human life and health. Its high mortality is largely due to the complexity of the disease and the significant differences in clinical outcomes. Improving the accuracy of cancer survival prediction is therefore important and has become one of the main fields of cancer research. Many computational models for cancer survival prediction have been proposed, but most of them build prediction models from a single type of genomic data or from clinical data alone. Multiple genomic data and clinical data have not yet been integrated to give a comprehensive view of cancers and predict survival. METHOD: To effectively integrate multiple genomic data (including gene expression, copy number alteration, DNA methylation and exon expression) and clinical data for cancer survival prediction, this paper uses the similarity network fusion (SNF) algorithm to integrate the genomic and clinical data and generate a sample similarity matrix, and the min-redundancy max-relevance (mRMR) algorithm to select features from the genomic and clinical data of cancer samples and generate a sample feature matrix. The two matrices are then used for semi-supervised training of a graph convolutional network (GCN), yielding GCGCN, a GCN-based cancer survival prediction method that integrates multiple genomic data and clinical data. RESULT: Performance indexes of the GCGCN model indicate that both multiple genomic data and clinical data play significant roles in accurately predicting the survival time of cancer patients. Compared with existing survival prediction methods, GCGCN, which integrates multiple genomic data and clinical data, achieves clearly better prediction performance. CONCLUSION: The results of this study verify the effectiveness and superiority of GCGCN for cancer survival prediction.
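To make the role of the two matrices concrete, the sketch below shows one graph-convolution step that combines a sample similarity matrix with a sample feature matrix. It assumes the widely used symmetric-normalized propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 X W); the abstract does not state which GCN variant GCGCN uses, and the function names and toy values here are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric normalization D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(similarity, features, weights):
    """One propagation step: mix each sample's features with its neighbours', then apply ReLU."""
    propagated = matmul(normalize_adjacency(similarity), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Toy example: two samples linked with similarity 1, one feature each (illustrative values).
similarity = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
hidden = gcn_layer(similarity, features, weights)
print(hidden)
```

In this toy case each sample ends up with the average of its own and its neighbour's feature, which is how label information from the labelled samples can spread to similar unlabelled ones during semi-supervised training.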
|
18
|
Gilvary C, Madhukar N, Elkhader J, Elemento O. The Missing Pieces of Artificial Intelligence in Medicine. Trends Pharmacol Sci 2019; 40:555-564. [DOI: 10.1016/j.tips.2019.06.001] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 02/13/2019] [Revised: 06/03/2019] [Accepted: 06/04/2019] [Indexed: 12/22/2022]
|
19
|
Bing X, Bunea F, Royer M, Das J. Latent Model-Based Clustering for Biological Discovery. iScience 2019; 14:125-135. [PMID: 30954780 PMCID: PMC6449745 DOI: 10.1016/j.isci.2019.03.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/27/2018] [Revised: 11/26/2018] [Accepted: 03/18/2019] [Indexed: 11/27/2022]
Abstract
LOVE, a robust, scalable latent model-based clustering method for biological discovery, can be used across a range of datasets to generate both overlapping and non-overlapping clusters. In our formulation, a cluster comprises variables associated with the same latent factor and is determined from an allocation matrix that indexes our latent model. We prove that the allocation matrix and corresponding clusters are uniquely defined. We apply LOVE to biological datasets (gene expression, serological responses measured from HIV controllers and chronic progressors, vaccine-induced humoral immune responses) resulting in meaningful biological output. For all three datasets, the clusters generated by LOVE remain stable across tuning parameters. Finally, we compared LOVE's performance to that of 13 state-of-the-art methods using previously established benchmarks and found that LOVE outperformed these methods across datasets. Our results demonstrate that LOVE can be broadly used across large-scale biological datasets to generate accurate and meaningful overlapping and non-overlapping clusters.
Highlights:
- LOVE is a robust, scalable, and versatile latent model-based clustering method
- Has theoretical guarantees, and can generate overlapping and non-overlapping clusters
- Generates meaningful clusters from datasets spanning a range of biological domains
- Using established benchmarks, outperforms 13 state-of-the-art methods across datasets
|
20
|
Wang Y, Cho DY, Lee H, Fear J, Oliver B, Przytycka TM. Reprogramming of regulatory network using expression uncovers sex-specific gene regulation in Drosophila. Nat Commun 2018; 9:4061. [PMID: 30283019 PMCID: PMC6170494 DOI: 10.1038/s41467-018-06382-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 01/24/2018] [Accepted: 08/13/2018] [Indexed: 02/07/2023]
Abstract
Gene regulatory networks (GRNs) describe regulatory relationships between transcription factors (TFs) and their target genes. Computational methods to infer GRNs typically combine evidence across different conditions to infer context-agnostic networks. We develop a method, Network Reprogramming using EXpression (NetREX), that constructs a context-specific GRN given context-specific expression data and a context-agnostic prior network. NetREX remodels the prior network to obtain the topology that provides the best explanation for expression data. Because NetREX utilizes prior network topology, we also develop PriorBoost, a method that evaluates a prior network in terms of its consistency with the expression data. We validate NetREX and PriorBoost using the "gold standard" E. coli GRN from the DREAM5 network inference challenge and apply them to construct sex-specific Drosophila GRNs. NetREX constructed sex-specific Drosophila GRNs that, on all applied measures, outperform networks obtained from other methods indicating that NetREX is an important milestone toward building more accurate GRNs.
Affiliation(s)
- Yijie Wang
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA
| | - Dong-Yeon Cho
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA
| | - Hangnoh Lee
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA
| | - Justin Fear
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA
| | - Brian Oliver
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, 50 South Drive, Bethesda, MD, 20892, USA.
| | - Teresa M Przytycka
- National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, 20894, USA.
|
21
|
Sun D, Li A, Tang B, Wang M. Integrating genomic data and pathological images to effectively predict breast cancer clinical outcome. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2018; 161:45-53. [PMID: 29852967 DOI: 10.1016/j.cmpb.2018.04.008] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 01/05/2018] [Revised: 03/31/2018] [Accepted: 04/11/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE: Breast cancer is a leading cause of cancer death in women. Its high mortality rate is largely due to the complexity of invasive breast cancer and its widely varying clinical outcomes. Improving the accuracy of breast cancer survival prediction is therefore important and has become a major research area. Many computational models have been proposed for breast cancer survival prediction; however, most of them build predictive models from genomic data alone, and few consider the complementary information in pathological images. METHODS: We introduce a novel method called GPMKL, based on multiple kernel learning (MKL), which efficiently employs heterogeneous information comprising genomic data (gene expression, copy number alteration, gene methylation, protein expression) and pathological images. GPMKL performs feature fusion on these heterogeneous features, embedded in breast cancer classification. RESULTS: Performance analysis of the GPMKL model indicates that pathological image information plays a critical part in accurately predicting the survival time of breast cancer patients. Furthermore, compared with other existing breast cancer survival prediction methods, the proposed framework with pathological images performs remarkably better. CONCLUSIONS: All results in our study suggest the usefulness and superiority of GPMKL in predicting human breast cancer survival.
Affiliation(s)
- Dongdong Sun
- School of Information Science and Technology, University of Science and Technology of China, 443 Huangshan Road, Hefei 230027, China.
| | - Ao Li
- School of Information Science and Technology, University of Science and Technology of China, 443 Huangshan Road, Hefei 230027, China; Research Centers for Biomedical Engineering, University of Science and Technology of China, 443 Huangshan Road, Hefei 230027, China.
| | - Bo Tang
- School of Information Science and Technology, University of Science and Technology of China, 443 Huangshan Road, Hefei 230027, China.
| | - Minghui Wang
- School of Information Science and Technology, University of Science and Technology of China, 443 Huangshan Road, Hefei 230027, China; Research Centers for Biomedical Engineering, University of Science and Technology of China, 443 Huangshan Road, Hefei 230027, China.
|
22
|
Mat AM, Klopp C, Payton L, Jeziorski C, Chalopin M, Amzil Z, Tran D, Wikfors GH, Hégaret H, Soudant P, Huvet A, Fabioux C. Oyster transcriptome response to Alexandrium exposure is related to saxitoxin load and characterized by disrupted digestion, energy balance, and calcium and sodium signaling. AQUATIC TOXICOLOGY (AMSTERDAM, NETHERLANDS) 2018; 199:127-137. [PMID: 29621672 DOI: 10.1016/j.aquatox.2018.03.030] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/06/2018] [Revised: 03/22/2018] [Accepted: 03/25/2018] [Indexed: 06/08/2023]
Abstract
Harmful Algal Blooms are worldwide occurrences that can cause poisoning in human seafood consumers as well as mortality and sublethal effects in wildlife, propagating economic losses. One of the most widespread toxigenic microalgal taxa is the dinoflagellate genus Alexandrium, which includes species producing neurotoxins referred to as PST (Paralytic Shellfish Toxins). Blooms cause shellfish harvest restrictions to protect human consumers from accumulated toxins. Large inter-individual variability in toxin load within an exposed bivalve population complicates monitoring of shellfish toxicity for ecology and human health regulation. To decipher the physiological pathways involved in the bivalve response to PST, we explored the whole transcriptome of the digestive gland of the Pacific oyster Crassostrea gigas fed experimentally with a toxic Alexandrium minutum culture. The largest differences in transcript abundance were between oysters with contrasting toxin loads (1098 transcripts), rather than between exposed and non-exposed oysters (16 transcripts), emphasizing the importance of toxin load in oyster response to toxic dinoflagellates. Additionally, penalized regressions, innovative in this field, accurately modeled toxin load based upon only 70 transcripts. Transcriptomic differences between oysters with contrasting PST burdens revealed a limited suite of metabolic pathways affected, including ion channels, neuromuscular communication, and digestion, all of which are interconnected and linked to sodium and calcium exchanges. Carbohydrate metabolism, not previously considered in studies of harmful algal effects on shellfish, was also highlighted, suggesting an energy challenge in oysters with high toxin loads. Associations between toxin load, genotype, and mRNA levels were revealed that open new doors for genetic studies identifying genetically based low toxin accumulation.
Affiliation(s)
- Audrey M Mat
- Ifremer, LEMAR UMR 6539 CNRS/UBO/IRD/Ifremer, CS 10070, 29280 Plouzané, France
| | | | - Laura Payton
- UMR 5805 EPOC, CNRS - Université de Bordeaux, F-33120 Arcachon, France
| | | | - Morgane Chalopin
- Ifremer, LEMAR UMR 6539 CNRS/UBO/IRD/Ifremer, CS 10070, 29280 Plouzané, France
| | - Zouher Amzil
- Ifremer, Laboratoire Phycotoxines, rue de l'Ile d'Yeu, BP 21105, F-44311 Nantes, France
| | - Damien Tran
- UMR 5805 EPOC, CNRS - Université de Bordeaux, F-33120 Arcachon, France
| | - Gary H Wikfors
- Northeast Fisheries Science Center, NOAA National Marine Fisheries Service, 212 Rogers Avenue, Milford, CT 06460, USA
| | - Hélène Hégaret
- LEMAR UMR 6539 CNRS/UBO/IRD/Ifremer, IUEM, rue Dumont d'Urville, 29280 Plouzané, France
| | - Philippe Soudant
- LEMAR UMR 6539 CNRS/UBO/IRD/Ifremer, IUEM, rue Dumont d'Urville, 29280 Plouzané, France
| | - Arnaud Huvet
- Ifremer, LEMAR UMR 6539 CNRS/UBO/IRD/Ifremer, CS 10070, 29280 Plouzané, France
| | - Caroline Fabioux
- LEMAR UMR 6539 CNRS/UBO/IRD/Ifremer, IUEM, rue Dumont d'Urville, 29280 Plouzané, France.
|
23
|
Park H, Niida A, Imoto S, Miyano S. Interaction-Based Feature Selection for Uncovering Cancer Driver Genes Through Copy Number-Driven Expression Level. J Comput Biol 2016; 24:138-152. [PMID: 27759426 DOI: 10.1089/cmb.2016.0140] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
Abstract
Driver gene selection is crucial to understanding the heterogeneous system of cancer. To identify cancer driver genes, various statistical strategies have been proposed; L1-type regularization methods in particular have drawn a large amount of attention. However, these statistical approaches have been developed purely from an algorithmic and statistical point of view, and existing studies have applied them to genomic data analysis without consideration of biological knowledge. We consider a statistical strategy that incorporates biological knowledge to identify cancer driver genes. Copy number alterations have been considered to drive cancer pathogenesis processes, and a region of strong interaction between copy number alterations and expression levels is known as a tumor-related symptom. We incorporate the influence of copy number alterations on expression levels into the cancer driver gene-selection process. To quantify the dependence of copy number alterations on expression levels, we consider [Formula: see text] and [Formula: see text] effects of copy number alterations on expression levels of genes, and incorporate the symptom of tumor pathogenesis into gene-selection procedures. We then propose an interaction-based feature-selection strategy based on the adaptive L1-type regularization and random lasso procedures. The proposed method imposes a large penalty on genes corresponding to a low dependency between the two features, so the coefficients of those genes are estimated to be small or exactly 0. This implies that the proposed method can provide biologically relevant results in cancer driver gene selection. Monte Carlo simulations and analysis of The Cancer Genome Atlas (TCGA) data show that the proposed strategy is effective for high-dimensional genomic data analysis. Furthermore, the proposed method provides reliable and biologically relevant results for cancer driver gene selection in TCGA data analysis.
Affiliation(s)
- Heewon Park
- Faculty of Global and Science Studies, Yamaguchi University, Yamaguchi Prefecture, Japan
| | - Atsushi Niida
- Health Intelligence Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
| | - Seiya Imoto
- Health Intelligence Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
| | - Satoru Miyano
- Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
|
24
|
Zhang F, Ren C, Lau KK, Zheng Z, Lu G, Yi Z, Zhao Y, Su F, Zhang S, Zhang B, Sobie EA, Zhang W, Walsh MJ. A network medicine approach to build a comprehensive atlas for the prognosis of human cancer. Brief Bioinform 2016; 17:1044-1059. [PMID: 27559151 DOI: 10.1093/bib/bbw076] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 02/27/2016] [Revised: 04/26/2016] [Indexed: 02/07/2023]
Abstract
The Cancer Genome Atlas project has generated multi-dimensional and highly integrated genomic data from a large number of patient samples with detailed clinical records across many cancer types, but it remains unclear how best to integrate the massive amount of genomic data into clinical practice. We report here our methodology to build a multi-dimensional subnetwork atlas for cancer prognosis to better investigate the potential impact of multiple genetic and epigenetic (gene expression, copy number variation, microRNA expression and DNA methylation) changes on the molecular states of networks that in turn affect complex cancer survivorship. We uncover an average of 38 novel subnetworks in the protein-protein interaction network that correlate with prognosis across four prominent cancer types. The clinical utility of these subnetwork biomarkers was further evaluated by prognostic impact evaluation, functional enrichment analysis, drug target annotation, tumor stratification and independent validation. Some pathways, including the dynactin, cohesion and pyruvate dehydrogenase-related subnetworks, are identified as promising new targets for therapy in specific cancer types. In conclusion, this integrative analysis of existing protein interactome and cancer genomics data allows us to systematically dissect the molecular mechanisms that underlie unexpected outcomes for cancer, which could be used to better understand and predict clinical outcomes, optimize treatment and provide new opportunities for developing therapeutics related to the subnetworks identified.
|
25
|
Wu MY, Zhang XF, Dai DQ, Ou-Yang L, Zhu Y, Yan H. Regularized logistic regression with network-based pairwise interaction for biomarker identification in breast cancer. BMC Bioinformatics 2016; 17:108. [PMID: 26921029 PMCID: PMC4769543 DOI: 10.1186/s12859-016-0951-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/06/2015] [Accepted: 01/28/2016] [Indexed: 12/14/2022]
Abstract
BACKGROUND: To facilitate advances in personalized medicine, it is important to detect predictive, stable and interpretable biomarkers related to different clinical characteristics. These clinical characteristics may be heterogeneous with respect to underlying interactions between genes. Traditional methods usually focus only on detecting differentially expressed genes without taking the interactions between genes into account. Moreover, because the selected biomarkers typically have low reproducibility, it is difficult to give a clear biological interpretation for a specific disease. Therefore, it is necessary to design a robust biomarker identification method that can predict disease-associated interactions with high reproducibility. RESULTS: In this article, we propose a regularized logistic regression model. Unlike previous methods, which focus on individual genes or modules, our model takes into account gene pairs that are connected in a protein-protein interaction network. A line graph is constructed to represent the adjacencies between pairwise interactions. Based on this line graph, we incorporate the degree information in the model via an adaptive elastic net, which makes our model less dependent on the expression data. Experimental results on six publicly available breast cancer datasets show that our method not only achieves competitive performance in classification, but also retains great stability in variable selection. Therefore, our model is able to identify diagnostic and prognostic biomarkers in a more robust way. Moreover, most of the biomarkers discovered by our model have been verified in biochemical or biomedical research. CONCLUSIONS: The proposed method shows promise in the diagnosis of disease pathogenesis with different clinical characteristics. These advances lead to more accurate and stable biomarker discovery, which can monitor the functional changes that are perturbed by diseases. Based on these predictions, researchers may be able to provide suggestions for new therapeutic approaches.
Affiliation(s)
- Meng-Yun Wu
- School of Statistics and Management, Shanghai University of Finance and Economics, Guoding Road, Shanghai, 200433, China; Key Laboratory of Mathematical Economics SUFE, Ministry of Education, Guoding Road, Shanghai, 200433, China.
| | - Xiao-Fei Zhang
- School of Mathematics and Statistics & Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Luoyu Road, Wuhan, 430079, China.
| | - Dao-Qing Dai
- Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xingang West Road, Guangzhou, 510275, China.
| | - Le Ou-Yang
- College of Information Engineering, Shenzhen University, Nanhai Avenue, Shenzhen, 518060, China.
| | - Yuan Zhu
- School of Automation, China University of Geosciences, Lumo Road, Wuhan, 430074, China.
| | - Hong Yan
- Department of Electronic and Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, 999077, China.
|