1
Sheehy J, Rutledge H, Acharya UR, Loh HW, Gururajan R, Tao X, Zhou X, Li Y, Gurney T, Kondalsamy-Chennakesavan S. Gynecological cancer prognosis using machine learning techniques: A systematic review of last three decades (1990–2022). Artif Intell Med 2023; 139:102536. [PMID: 37100507 DOI: 10.1016/j.artmed.2023.102536] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/27/2021] [Revised: 03/19/2023] [Accepted: 03/23/2023] [Indexed: 03/30/2023]
Abstract
OBJECTIVE Many Computer Aided Prognostic (CAP) systems based on machine learning techniques have been proposed in the field of oncology. The objective of this systematic review was to assess and critically appraise the methodologies and approaches used in predicting the prognosis of gynecological cancers using CAPs. METHODS Electronic databases were used to systematically search for studies utilizing machine learning methods in gynecological cancers. Study risk of bias (ROB) and applicability were assessed using the PROBAST tool. 139 studies met the inclusion criteria, of which 71 predicted outcomes for ovarian cancer patients, 41 predicted outcomes for cervical cancer patients, 28 predicted outcomes for uterine cancer patients, and 2 predicted outcomes for gynecological malignancies broadly. RESULTS Random forest (22.30 %) and support vector machine (21.58 %) classifiers were used most commonly. Use of clinicopathological, genomic and radiomic data as predictors was observed in 48.20 %, 51.08 % and 17.27 % of studies, respectively, with some studies using multiple modalities. 21.58 % of studies were externally validated. Twenty-three individual studies compared ML and non-ML methods. Study quality was highly variable and methodologies, statistical reporting and outcome measures were inconsistent, preventing generalized commentary or meta-analysis of performance outcomes. CONCLUSION There is significant variability in model development when prognosticating gynecological malignancies with respect to variable selection, machine learning (ML) methods and endpoint selection. This heterogeneity prevents meta-analysis and conclusions regarding the superiority of ML methods. Furthermore, PROBAST-mediated ROB and applicability analysis demonstrates concern for the translatability of existing models. This review identifies ways that this can be improved upon in future works to develop robust, clinically translatable models within this promising field.
2
Nguyen H, Shrestha S, Tran D, Shafi A, Draghici S, Nguyen T. A Comprehensive Survey of Tools and Software for Active Subnetwork Identification. Front Genet 2019; 10:155. [PMID: 30891064 PMCID: PMC6411791 DOI: 10.3389/fgene.2019.00155] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 09/25/2018] [Accepted: 02/13/2019] [Indexed: 12/13/2022] Open
Abstract
A recent focus of computational biology has been to integrate the complementary information available in molecular profiles as well as in multiple network databases in order to identify connected regions that show significant changes under different conditions. This allows for capturing dynamic and condition-specific mechanisms of the underlying phenomena and disease stages. Here we review 22 such integrative approaches for active module identification published over the last decade. This article only focuses on tools that are currently available for use and are well-maintained. We compare these methods focusing on their primary features, integrative abilities, network structures, mathematical models, and implementations. We also provide real-world scenarios in which these methods have been successfully applied, as well as highlight outstanding challenges in the field that remain to be addressed. The main objective of this review is to help potential users and researchers to choose the best method that is suitable for their data and analysis purpose.
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
- Sangam Shrestha
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
- Duc Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
- Adib Shafi
- Department of Computer Science, Wayne State University, Detroit, MI, United States
- Sorin Draghici
- Department of Computer Science, Wayne State University, Detroit, MI, United States
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, United States
- Tin Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States
3
ARD-PRED: an in silico tool for predicting age-related-disorder-associated proteins. Soft Comput 2019. [DOI: 10.1007/s00500-018-3154-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
4
Ozturk K, Dow M, Carlin DE, Bejar R, Carter H. The Emerging Potential for Network Analysis to Inform Precision Cancer Medicine. J Mol Biol 2018; 430:2875-2899. [PMID: 29908887 PMCID: PMC6097914 DOI: 10.1016/j.jmb.2018.06.016] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 03/12/2018] [Revised: 05/30/2018] [Accepted: 06/06/2018] [Indexed: 12/19/2022]
Abstract
Precision cancer medicine promises to tailor clinical decisions to patients using genomic information. Indeed, successes of drugs targeting genetic alterations in tumors, such as imatinib that targets BCR-ABL in chronic myelogenous leukemia, have demonstrated the power of this approach. However, biological systems are complex, and patients may differ not only by the specific genetic alterations in their tumor, but also by more subtle interactions among such alterations. Systems biology, and more specifically network analysis, provides a framework for advancing precision medicine beyond clinical actionability of individual mutations. Here we discuss applications of network analysis to study tumor biology, early methods for N-of-1 tumor genome analysis, and the path for such tools to the clinic.
Affiliation(s)
- Kivilcim Ozturk
- Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
- Michelle Dow
- Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
- Daniel E Carlin
- Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA
- Rafael Bejar
- Moores Cancer Center, Division of Hematology and Oncology, University of California San Diego, La Jolla, CA 92093, USA
- Hannah Carter
- Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA; Moores Cancer Center and Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92093, USA; CIFAR, MaRS Centre, West Tower, 661 University Ave., Suite 505, Toronto, ON M5G 1M1, Canada.
5
Stenemo M, Teleman J, Sjöström M, Grubb G, Malmström E, Malmström J, Niméus E. Cancer associated proteins in blood plasma: Determining normal variation. Proteomics 2016; 16:1928-37. [PMID: 27121749 DOI: 10.1002/pmic.201500204] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 06/02/2015] [Revised: 03/12/2016] [Accepted: 04/15/2016] [Indexed: 11/07/2022]
Abstract
Protein biomarkers have the potential to improve diagnosis and the stratification of patients into treatment cohorts, and to help follow disease progression and treatment response. One distinct group of potential biomarkers comprises proteins which have been linked to cancer, known as cancer associated proteins (CAPs). We determined the normal variation of 86 CAPs in 72 individual plasma samples collected from ten individuals using SRM mass spectrometry. Samples were collected weekly over 5 weeks from ten volunteers and over one day at nine fixed time points from three volunteers. We determined the degree of normal variation attributable to interpersonal variation, variation due to time of day, and variation over weeks, and observed that the variation dependent on the time of day appeared to be the most important. Subdivision of the proteins resulted in two predominant protein groups containing 21 proteins with relatively high variation in all three factors (day, week and individual), and 22 proteins with relatively low variation in all factors. We present a strategy for prioritizing biomarker candidates for future studies based on stratification over their normal variation and have made all data publicly available. Our findings can be used to improve selection of biomarker candidates in future studies and to determine which proteins are most suitable depending on study design.
Affiliation(s)
- Markus Stenemo
- Department of Clinical Sciences Lund, Division of Infection Medicine, Lund University, Lund, Sweden; Department of Clinical Sciences Lund, Oncology and Pathology, Lund University, Lund, Sweden
- Johan Teleman
- Department of Clinical Sciences Lund, Division of Infection Medicine, Lund University, Lund, Sweden
- Martin Sjöström
- Department of Clinical Sciences Lund, Oncology and Pathology, Lund University, Lund, Sweden
- Gabriel Grubb
- Department of Clinical Sciences Lund, Oncology and Pathology, Lund University, Lund, Sweden
- Erik Malmström
- Department of Clinical Sciences Lund, Division of Infection Medicine, Lund University, Lund, Sweden
- Johan Malmström
- Department of Clinical Sciences Lund, Division of Infection Medicine, Lund University, Lund, Sweden
- Emma Niméus
- Department of Clinical Sciences Lund, Oncology and Pathology, Lund University, Lund, Sweden; Skåne University Hospital, Department of Surgery, Lund, Sweden
6
Paul Y, Hasija Y. Gene Prioritization by Integrated Analysis of Protein Structural and Network Topological Properties for the Protein-Protein Interaction Network of Neurological Disorders. Scientifica 2016; 2016:9589404. [PMID: 27034906 PMCID: PMC4808548 DOI: 10.1155/2016/9589404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2015] [Revised: 02/11/2016] [Accepted: 02/18/2016] [Indexed: 06/05/2023]
Abstract
Neurological disorders are known to show similar phenotypic manifestations such as anxiety, depression, and cognitive impairment. There is a need to identify shared genetic markers and molecular pathways in these diseases, which lead to such comorbid conditions. Our study aims to prioritize novel genetic markers that might increase the susceptibility of patients affected with one neurological disorder to other diseases with similar manifestations. Identification of pathways involving common candidate markers will help in the development of improved diagnosis and treatment strategies for patients affected with neurological disorders. This systems biology study, for the first time, integrates 3D-structural protein interface descriptors and network topological properties that characterize proteins in a neurological protein interaction network, to aid the identification of genes not previously known to be shared between these diseases. Protein prioritization by machine learning identified known as well as new genetic markers which might have direct or indirect involvement in several neurological disorders. Important gene hubs have also been identified that provide evidence for shared molecular pathways in the neurological disease network.
Affiliation(s)
- Yashna Paul
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, New Delhi, Delhi 110042, India
- Yasha Hasija
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, New Delhi, Delhi 110042, India
7
Cheng F, Zhao J, Zhao Z. Advances in computational approaches for prioritizing driver mutations and significantly mutated genes in cancer genomes. Brief Bioinform 2015; 17:642-56. [PMID: 26307061 DOI: 10.1093/bib/bbv068] [Citation(s) in RCA: 91] [Impact Index Per Article: 10.1] [Received: 05/31/2015] [Indexed: 12/27/2022] Open
Abstract
Cancer is often driven by the accumulation of genetic alterations, including single nucleotide variants, small insertions or deletions, gene fusions, copy-number variations, and large chromosomal rearrangements. Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data and catalog somatic mutations in both common and rare cancer types. So far, the somatic mutation landscapes and signatures of >10 major cancer types have been reported; however, pinpointing driver mutations and cancer genes from millions of available cancer somatic mutations remains a monumental challenge. To tackle this important task, many methods and computational tools have been developed during the past several years and, thus, a review of these advances is urgently needed. Here, we first summarize the main features of these methods and tools for whole-exome, whole-genome and whole-transcriptome sequencing data. Then, we discuss major challenges such as tumor intra-heterogeneity, tumor sample saturation and functionality of synonymous mutations in cancer, all of which may result in false-positive discoveries. Finally, we highlight new directions in studying regulatory roles of noncoding somatic mutations and quantitatively measuring circulating tumor DNA in cancer. This review may help investigators find an appropriate tool for detecting potential driver or actionable mutations in rapidly emerging precision cancer medicine.
8
Shi M, Wu M, Pan P, Zhao R. Network-based sub-network signatures unveil the potential for acute myeloid leukemia therapy. Molecular BioSystems 2015; 10:3290-7. [PMID: 25313005 DOI: 10.1039/c4mb00440j] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
Abstract
Although gene expression profiling studies of acute myeloid leukemia (AML) patients have provided key insights into potential diagnostic and prognostic markers and therapeutic targets, it is not clear how the patterns of molecular heterogeneity affect tumor biology and response to treatment. We hypothesized that network-based gene expression signatures of AML represent the mechanistically important genes and may improve the prediction of prognosis and clinical outcome. We used random walk with restart (RWR) analysis to discover the sub-network of genomic alterations. The RWR approach uses the signature genes derived from the random forest (RF) analysis as "seeds" to identify genes critical to the AML recurrence phenotype. To test whether the 81-gene biomarkers could predict AML recurrence, we developed Survival Support Vector Machine (SSVM) models using a gene expression dataset and tested them on an independent dataset. The random forest classifier was built based on the 81-gene biomarkers to separate the AML patients into "recurrence" and "non-recurrence" groups. The 81-gene biomarkers showed significant enrichment related to cancer pathophysiology and provided good coverage of sub-network biomarkers and AML-related signaling pathways. The SSVM-based score was significantly associated with overall survival (hazard ratio [HR], 2.16; 95% confidence interval [CI], 1.18-3.97; p = 0.01). Similar results were obtained with reversed training and testing datasets (hazard ratio [HR], 1.6; 95% confidence interval [CI], 1.08-2.37; p = 0.02). The 81-gene biomarker based RF classifier improved classification performance. Overall, the 81-gene biomarkers might be useful prognostic and predictive molecular markers to predict the clinical outcome of AML patients.
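As a rough illustration of the random walk with restart idea named in this abstract, the following is a minimal sketch and not the authors' implementation. The toy network, the gene labels G1 to G5, the seed choice, and the restart probability of 0.3 are all assumptions made for the example; the abstract does not state the network or parameters that were used.

```python
def random_walk_with_restart(adjacency, seeds, restart_prob=0.3,
                             tol=1e-8, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping node -> list of neighbours (each undirected
               edge listed in both directions).
    seeds:     iterable of seed nodes; all must be keys of `adjacency`.
    """
    seeds = set(seeds)
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability restart_prob the walker jumps back to a seed.
        nxt = {n: restart_prob * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # Isolated node: hand its probability mass back to the seeds.
                for s in seeds:
                    nxt[s] += (1 - restart_prob) * p[n] / len(seeds)
                continue
            # Otherwise the walker moves to a uniformly chosen neighbour.
            share = (1 - restart_prob) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a simple chain G1 - G2 - G3 - G4 - G5.
toy_network = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3", "G5"],
    "G5": ["G4"],
}
scores = random_walk_with_restart(toy_network, seeds=["G1"])
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this sketch, nodes closer to the seed receive higher steady-state scores, which is the property that lets seed genes highlight a surrounding sub-network.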
Affiliation(s)
- Mingguang Shi
- School of Electric Engineering and Automation, Hefei University of Technology, Hefei, Anhui 230009, China.
9
Jain P, Thukral N, Gahlot LK, Hasija Y. CARDIO-PRED: an in silico tool for predicting cardiovascular-disorder associated proteins. Systems and Synthetic Biology 2015; 9:55-66. [PMID: 25972989 DOI: 10.1007/s11693-015-9164-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/04/2015] [Accepted: 03/06/2015] [Indexed: 10/23/2022]
Abstract
Interactions between proteins largely govern cellular processes, and this has led to numerous efforts culminating in an enormous amount of information on proteins, their interactions, and the functions determined by those interactions. The main aim of the present study is to present an interface analysis of cardiovascular-disorder (CVD) related proteins to shed light on details of interactions and to emphasize the importance of using structures in network studies. This study combines the network-centred approach with three-dimensional studies to comprehend the fundamentals of biology. Interface properties were used as descriptors to classify CVD associated proteins and non-CVD associated proteins. A machine learning algorithm was used to generate a classifier based on the training set, which was then used to predict potential CVD related proteins from a set of polymorphic proteins that are not known to be involved in any disease. Among the several classification algorithms applied to generate models, the best performance was achieved using Random Forest, with an accuracy of 69.5 %. The tool, named CARDIO-PRED and based on the prediction model, is available at http://www.genomeinformatics.dce.edu/CARDIO-PRED/. The predicted CVD related proteins may not be the causal factor of a particular disease but may be involved in pathways and reactions as yet unknown, thus permitting a more rational analysis of disease mechanisms. Study of their interactions with other proteins can significantly improve our understanding of the molecular mechanisms of diseases.
Affiliation(s)
- Prerna Jain
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
- Nitin Thukral
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
- Lokesh Kumar Gahlot
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
- Yasha Hasija
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
10
Suo C, Hrydziuszko O, Lee D, Pramana S, Saputra D, Joshi H, Calza S, Pawitan Y. Integration of somatic mutation, expression and functional data reveals potential driver genes predictive of breast cancer survival. Bioinformatics 2015; 31:2607-13. [PMID: 25810432 DOI: 10.1093/bioinformatics/btv164] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 11/18/2014] [Accepted: 03/16/2015] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Genome and transcriptome analyses can be used to explore cancers comprehensively, and it is increasingly common to have multiple omics data measured from each individual. Furthermore, there are rich functional data such as predicted impact of mutations on protein coding and gene/protein networks. However, integration of the complex information across the different omics and functional data is still challenging. Clinical validation, particularly based on patient outcomes such as survival, is important for assessing the relevance of the integrated information and for comparing different procedures. RESULTS An analysis pipeline is built for integrating genomic and transcriptomic alterations from whole-exome and RNA sequence data and functional data from protein function prediction and gene interaction networks. The method accumulates evidence for the functional implications of mutated potential driver genes found within and across patients. A driver-gene score (DGscore) is developed to capture the cumulative effect of such genes. To contribute to the score, a gene has to be frequently mutated, with high or moderate mutational impact at protein level, exhibiting an extreme expression and functionally linked to many differentially expressed neighbors in the functional gene network. The pipeline is applied to 60 matched tumor and normal samples of the same patient from The Cancer Genome Atlas breast-cancer project. In clinical validation, patients with high DGscores have worse survival than those with low scores (P = 0.001). Furthermore, the DGscore outperforms the established expression-based signatures MammaPrint and PAM50 in predicting patient survival. In conclusion, integration of mutation, expression and functional data allows identification of clinically relevant potential driver genes in cancer. AVAILABILITY AND IMPLEMENTATION The documented pipeline including annotated sample scripts can be found in http://fafner.meb.ki.se/biostatwiki/driver-genes/. 
CONTACT yudi.pawitan@ki.se SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chen Suo
- School of Life Sciences, Peking University, Beijing, China; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Olga Hrydziuszko
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Donghwan Lee
- Department of Statistics, Ewha Womans University, Seoul, South Korea
- Setia Pramana
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; Department of Computational Statistics, Institute of Statistics, Jakarta, Indonesia
- Dhany Saputra
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Himanshu Joshi
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Stefano Calza
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Yudi Pawitan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
11
Li H, Liu C, Rwebangira MR, Burge L. Mono-isotope Prediction for Mass Spectra Using Bayes Network. Tsinghua Science and Technology 2014; 19:617-623. [PMID: 25620856 PMCID: PMC4302766 DOI: 10.1109/tst.2014.6961030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/04/2023]
Abstract
Mass spectrometry is a widely used method for studying protein functions and components. Recognizing mono-isotope patterns in large-scale protein mass spectral data requires computational algorithms and tools that speed up the analysis and improve the analytic results. We used a naïve Bayes network as the classifier, under the assumption that the selected features are independent, to predict mono-isotope patterns from mass spectrometry data. Mono-isotopes detected from validated theoretical spectra were used as prior information in the Bayes method. Three main features extracted from the dataset were employed as independent variables in our model. The application of the proposed algorithm to the publicMo dataset demonstrates that our naïve Bayes classifier is advantageous over existing methods in both accuracy and sensitivity.
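To make the independence assumption in this abstract concrete, here is a minimal sketch of a generic Gaussian naïve Bayes classifier over three numeric features. It is not the authors' model: the abstract does not name the three features or say how the priors were derived from theoretical spectra, so the feature values, the class labels (1 for a mono-isotopic peak, 0 otherwise), and the Gaussian likelihood are all assumptions chosen for illustration.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature mean/variance for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[cls] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature row."""
    def log_posterior(cls):
        prior, stats = model[cls]
        total = math.log(prior)
        # Independence assumption: per-feature log likelihoods simply add up.
        for value, (mean, var) in zip(row, stats):
            total += (-0.5 * math.log(2 * math.pi * var)
                      - (value - mean) ** 2 / (2 * var))
        return total
    return max(model, key=log_posterior)


# Hypothetical training data: three made-up features per candidate peak.
train_rows = [
    [0.90, 0.10, 0.80], [0.80, 0.20, 0.90], [0.85, 0.15, 0.70],  # label 1
    [0.20, 0.70, 0.10], [0.30, 0.80, 0.20], [0.10, 0.90, 0.30],  # label 0
]
train_labels = [1, 1, 1, 0, 0, 0]

nb_model = fit_gaussian_nb(train_rows, train_labels)
print(predict(nb_model, [0.82, 0.12, 0.75]))
print(predict(nb_model, [0.20, 0.80, 0.20]))
```

Because the features are treated as independent given the class, the joint likelihood factorizes into a product of per-feature terms, which is why the log posterior above is a simple sum.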
12
Das J, Gayvert KM, Yu H. Predicting cancer prognosis using functional genomics data sets. Cancer Inform 2014; 13:85-8. [PMID: 25392695 PMCID: PMC4218897 DOI: 10.4137/cin.s14064] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/29/2014] [Revised: 09/17/2014] [Accepted: 09/19/2014] [Indexed: 11/06/2022] Open
Abstract
Elucidating the molecular basis of human cancers is an extremely complex and challenging task. A wide variety of computational tools and experimental techniques have been used to address different aspects of this characterization. One major hurdle faced by both clinicians and researchers has been to pinpoint the mechanistic basis underlying a wide range of prognostic outcomes for the same type of cancer. Here, we provide an overview of various computational methods that have leveraged different functional genomics data sets to identify molecular signatures that can be used to predict prognostic outcome for various human cancers. Furthermore, we outline challenges that remain and future directions that may be explored to address them.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Kaitlyn M Gayvert
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY, USA
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
13
Yu C, Boutté A, Yu X, Dutta B, Feala JD, Schmid K, Dave J, Tawa GJ, Wallqvist A, Reifman J. A systems biology strategy to identify molecular mechanisms of action and protein indicators of traumatic brain injury. J Neurosci Res 2014; 93:199-214. [PMID: 25399920 PMCID: PMC4305271 DOI: 10.1002/jnr.23503] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/02/2014] [Revised: 08/26/2014] [Accepted: 09/24/2014] [Indexed: 01/01/2023]
Abstract
The multifactorial nature of traumatic brain injury (TBI), especially the complex secondary tissue injury involving intertwined networks of molecular pathways that mediate cellular behavior, has confounded attempts to elucidate the pathology underlying the progression of TBI. Here, systems biology strategies are exploited to identify novel molecular mechanisms and protein indicators of brain injury. To this end, we performed a meta-analysis of four distinct high-throughput gene expression studies involving different animal models of TBI. By using canonical pathways and a large human protein-interaction network as a scaffold, we separately overlaid the gene expression data from each study to identify molecular signatures that were conserved across the different studies. At 24 hr after injury, the significantly activated molecular signatures were nonspecific to TBI, whereas the significantly suppressed molecular signatures were specific to the nervous system. In particular, we identified a suppressed subnetwork consisting of 58 highly interacting, coregulated proteins associated with synaptic function. We selected three proteins from this subnetwork, postsynaptic density protein 95, nitric oxide synthase 1, and disrupted in schizophrenia 1, and hypothesized that their abundance would be significantly reduced after TBI. In a penetrating ballistic-like brain injury rat model of severe TBI, Western blot analysis confirmed our hypothesis. In addition, our analysis recovered 12 previously identified protein biomarkers of TBI. The results suggest that systems biology may provide an efficient, high-yield approach to generate testable hypotheses that can be experimentally validated to identify novel mechanisms of action and molecular indicators of TBI.
Affiliation(s)
- Chenggang Yu
- Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, Maryland
14
Begum T, Ghosh TC. Elucidating the genotype-phenotype relationships and network perturbations of human shared and specific disease genes from an evolutionary perspective. Genome Biol Evol 2014; 6:2741-53. [PMID: 25287147 PMCID: PMC4224346 DOI: 10.1093/gbe/evu220] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022] Open
Abstract
To date, numerous studies have attempted to determine the extent of variation in evolutionary rates between human disease and nondisease (ND) genes. In the present study, we considered human autosomal monogenic (Mendelian) disease genes, which were classified into two groups according to the number of phenotypic defects: specific disease (SPD) genes (one gene: one defect) and shared disease (SHD) genes (one gene: multiple defects). We compared the evolutionary rates of these two groups of genes with those of ND genes. We observed that the average evolutionary rates are slow in the SHD group, intermediate in the SPD group, and fast in the ND group. Group-to-group evolutionary rate differences remain statistically significant regardless of gene expression levels and number of defects. We demonstrated that disease genes are under strong selective constraint if they emerge through edgetic perturbation or drug-induced perturbation of the interactome network, show tissue-restricted expression, and are involved in transmembrane transport. Among all the factors, our regression analyses interestingly suggest independent effects of 1) drug-induced perturbation and 2) the interaction term of expression breadth and transmembrane transport on protein evolutionary rates. We reasoned that drug-induced network disruption is a combination of several edgetic perturbations and, thus, has a more severe effect on gene phenotypes.
Affiliation(s)
- Tina Begum
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
15
Abstract
The past decade has seen a dramatic expansion in the number and range of techniques available to obtain genome-wide information and to analyze this information so as to infer both the functions of individual molecules and how they interact to modulate the behavior of biological systems. Here, we review these techniques, focusing on the construction of physical protein-protein interaction networks, and highlighting approaches that incorporate protein structure, which is becoming an increasingly important component of systems-level computational techniques. We also discuss how network analyses are being applied to enhance our basic understanding of biological systems and their dysregulation, as well as how these networks are being used in drug development.
Affiliation(s)
- Donald Petrey
- Center for Computational Biology and Bioinformatics, Department of Systems Biology
|
16
|
Hudler P, Kocevar N, Komel R. Proteomic approaches in biomarker discovery: new perspectives in cancer diagnostics. ScientificWorldJournal 2014; 2014:260348. [PMID: 24550697 PMCID: PMC3914447 DOI: 10.1155/2014/260348] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 08/30/2013] [Accepted: 10/08/2013] [Indexed: 12/14/2022]
Abstract
Despite remarkable progress in proteomic methods, including improved detection limits and sensitivity, these methods have not yet been established in routine clinical practice. The main limitations that prevent their integration into clinics are the high cost of equipment, the need for highly trained personnel, and, last but not least, the establishment of reliable and accurate protein biomarkers or panels of protein biomarkers for detection of neoplasms. Furthermore, the complexity and heterogeneity of most solid tumours present obstacles in the discovery of specific protein signatures, which could be used for early detection of cancers, for prediction of disease outcome, and for determining the response to specific therapies. However, the cancer proteome, as the end-point of the pathological processes that underlie cancer development and progression, could represent an important source for the discovery of new biomarkers and molecular targets for tailored therapies.
Affiliation(s)
- Petra Hudler
- Medical Centre for Molecular Biology, Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Vrazov trg 2, 1000 Ljubljana, Slovenia
- Nina Kocevar
- Medical Centre for Molecular Biology, Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Vrazov trg 2, 1000 Ljubljana, Slovenia
- Radovan Komel
- Medical Centre for Molecular Biology, Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Vrazov trg 2, 1000 Ljubljana, Slovenia
|
17
|
Integrative approaches for finding modular structure in biological networks. Nat Rev Genet 2013; 14:719-32. [PMID: 24045689 DOI: 10.1038/nrg3552] [Citation(s) in RCA: 343] [Impact Index Per Article: 31.2] [Indexed: 12/16/2022]
Abstract
A central goal of systems biology is to elucidate the structural and functional architecture of the cell. To this end, large and complex networks of molecular interactions are being rapidly generated for humans and model organisms. A recent focus of bioinformatics research has been to integrate these networks with each other and with diverse molecular profiles to identify sets of molecules and interactions that participate in a common biological function - that is, 'modules'. Here, we classify such integrative approaches into four broad categories, describe their bioinformatic principles and review their applications.
|
18
|
Sehhati MR, Dehnavi AM, Rabbani H, Javanmard SH. Using protein interaction database and support vector machines to improve gene signatures for prediction of breast cancer recurrence. J Med Signals Sens 2013; 3:87-93. [PMID: 24098862 PMCID: PMC3788198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2013] [Accepted: 04/07/2013] [Indexed: 11/03/2022]
Abstract
Numerous studies have used microarray gene expression data to extract metastasis-driving gene signatures for the prediction of breast cancer relapse. However, the accuracy and generality of the previously introduced biomarkers are not acceptable for reliable usage in independent datasets. This inadequacy is attributed to simple feature selection methods ignoring gene interactions, because accounting for them is computationally expensive. In this study, an integrated approach with low computational cost was proposed for identifying a more predictive gene signature for prediction of breast cancer recurrence. First, a small set of genes was initially selected as the signature by an appropriate filter feature selection (FFS) method. Then, a binary sub-class of the protein-protein interaction (PPI) network was used to expand the primary set by adding the proteins adjacent to each signature gene in the PPI network. Subsequently, the support vector machine-based recursive feature elimination (SVMRFE) method was applied to the expression levels of all the genes in the expanded set. Finally, the genes with the highest SVMRFE scores were selected as the new biomarkers. The accuracy of the final selected biomarkers was evaluated by classifying four breast cancer datasets, comprising 800 cases, into two cohorts of poor and good prognosis. The results of the five-fold cross-validation test, using the support vector machine as a classifier, showed more than 13% improvement in the average accuracy after modifying the primary selected signatures. Moreover, the method used in this study showed a lower computational cost compared to the other PPI-based methods. The proposed method demonstrated more robust and accurate biomarkers using the PPI network, at a low computational cost. This approach could be used as a supplementary procedure in microarray studies after applying various gene selection methods.
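The network-expansion step described in this abstract (adding the PPI neighbours of each initially selected gene before SVM-based elimination) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `expand_signature`, the adjacency-dictionary representation of the PPI network, and the placeholder gene labels `G1` to `G5` are all assumptions made for illustration, and the filter selection and SVMRFE steps are not shown.

```python
from typing import Dict, Iterable, Set


def expand_signature(seed_genes: Iterable[str], ppi: Dict[str, Set[str]]) -> Set[str]:
    """Return the seed genes together with their direct neighbours in a PPI network.

    `ppi` is a hypothetical adjacency mapping: gene -> set of interacting genes.
    Seed genes that are absent from the network are kept unchanged.
    """
    seeds = set(seed_genes)
    expanded = set(seeds)
    for gene in seeds:
        # Add every protein adjacent to this signature gene.
        expanded |= ppi.get(gene, set())
    return expanded


# Toy network with placeholder labels (not real genes or interactions).
toy_ppi = {
    "G1": {"G2", "G3"},
    "G2": {"G1"},
    "G3": {"G1", "G4"},
    "G4": {"G3"},
    "G5": set(),
}

expanded = expand_signature({"G1", "G5"}, toy_ppi)
print(sorted(expanded))
```

In a full pipeline the expanded set would then be passed to a recursive feature elimination step, which ranks the candidate genes and keeps the highest-scoring ones; only one-hop neighbours are added here, so `G4` is not included.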
Affiliation(s)
- Mohammad Reza Sehhati
- Department of Biomedical Engineering, Isfahan University of Medical Sciences, Isfahan, Iran
- Alireza Mehri Dehnavi
- Department of Biomedical Engineering, Isfahan University of Medical Sciences, Isfahan, Iran
- Address for correspondence: Dr. Alireza Mehri Dehnavi, Department of Biomedical Engineering, Medical Image & Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Hossein Rabbani
- Department of Biomedical Engineering, Isfahan University of Medical Sciences, Isfahan, Iran
|
19
|
Wu G, Stein L. A network module-based method for identifying cancer prognostic signatures. Genome Biol 2012; 13:R112. [PMID: 23228031 PMCID: PMC3580410 DOI: 10.1186/gb-2012-13-12-r112] [Citation(s) in RCA: 127] [Impact Index Per Article: 10.6] [Received: 06/28/2012] [Revised: 11/21/2012] [Accepted: 12/10/2012] [Indexed: 12/12/2022]
Abstract
Discovering robust prognostic gene signatures as biomarkers using genomics data can be challenging. We have developed a simple but efficient method for discovering prognostic biomarkers in cancer gene expression data sets using modules derived from a highly reliable gene functional interaction network. When applied to breast cancer, we discover a novel 31-gene signature associated with patient survival. The signature replicates across 5 independent gene expression studies, and outperforms 48 published gene signatures. When applied to ovarian cancer, the algorithm identifies a 75-gene signature associated with patient survival. A Cytoscape plugin implementation of the signature discovery method is available at http://wiki.reactome.org/index.php/Reactome_FI_Cytoscape_Plugin.
Affiliation(s)
- Guanming Wu
- Ontario Institute for Cancer Research, MaRS Centre, South Tower, 101 College Street, Suite 800, Toronto, ON M5G 0A3, Canada
- Lincoln Stein
- Ontario Institute for Cancer Research, MaRS Centre, South Tower, 101 College Street, Suite 800, Toronto, ON M5G 0A3, Canada
- Department of Molecular Genetics, University of Toronto, 1 King's College Circle, #4386, Medical Sciences Building, Toronto ON M5S 1A8, Canada
|
20
|
Hüttenhain R, Soste M, Selevsek N, Röst H, Sethi A, Carapito C, Farrah T, Deutsch EW, Kusebauch U, Moritz RL, Niméus-Malmström E, Rinner O, Aebersold R. Reproducible quantification of cancer-associated proteins in body fluids using targeted proteomics. Sci Transl Med 2012; 4:142ra94. [PMID: 22786679 PMCID: PMC3766734 DOI: 10.1126/scitranslmed.3003989] [Citation(s) in RCA: 195] [Impact Index Per Article: 16.3] [Indexed: 12/17/2022]
Abstract
The rigorous testing of hypotheses on suitable sample cohorts is a major limitation in translational research. This is particularly the case for the validation of protein biomarkers; the lack of accurate, reproducible, and sensitive assays for most proteins has precluded the systematic assessment of hundreds of potential marker proteins described in the literature. Here, we describe a high-throughput method for the development and refinement of selected reaction monitoring (SRM) assays for human proteins. The method was applied to generate such assays for more than 1000 cancer-associated proteins, which are functionally related to candidate cancer driver mutations. We used the assays to determine the detectability of the target proteins in two clinically relevant samples: plasma and urine. One hundred eighty-two proteins were detected in depleted plasma, spanning five orders of magnitude in abundance and reaching below a concentration of 10 ng/ml. The narrower concentration range of proteins in urine allowed the detection of 408 proteins. Moreover, we demonstrate that these SRM assays allow reproducible quantification by monitoring 34 biomarker candidates across 83 patient plasma samples. Through public access to the entire assay library, researchers will be able to target their cancer-associated proteins of interest in any sample type using the detectability information in plasma and urine as a guide. The generated expandable reference map of SRM assays for cancer-associated proteins will be a valuable resource for accelerating and planning biomarker verification studies.
Affiliation(s)
- Ruth Hüttenhain
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland.
|
21
|
Affiliation(s)
- Ian W. Taylor
- Samuel Lunenfeld Research Institute; Mount Sinai Hospital; Toronto Ontario Canada
- Department of Molecular Genetics; University of Toronto; Toronto Ontario Canada
- Jeffrey L. Wrana
- Samuel Lunenfeld Research Institute; Mount Sinai Hospital; Toronto Ontario Canada
- Department of Molecular Genetics; University of Toronto; Toronto Ontario Canada
|
22
|
Hanash S, Schliekelman M, Zhang Q, Taguchi A. Integration of proteomics into systems biology of cancer. Wiley Interdiscip Rev Syst Biol Med 2012; 4:327-37. [PMID: 22407608 DOI: 10.1002/wsbm.1169] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022]
Abstract
Deciphering the complexity and heterogeneity of cancer benefits from the integration of proteomic-level data into systems biology efforts. The opportunities available as a result of advances in proteomic technologies, the successes to date, and the challenges involved in integrating diverse datasets are addressed in this review.
Affiliation(s)
- S Hanash
- Molecular Diagnostics Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
|
23
|
Nicolini A, Ferrari P, Fallahi P, Antonelli A. An iron regulatory gene signature in breast cancer: more than a prognostic genetic profile? Future Oncol 2012; 8:131-4. [DOI: 10.2217/fon.11.148] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
Abstract
Miller LD, Coffman LG, Chou JW et al. An iron regulatory gene signature predicts outcome in breast cancer. Cancer Res. 71(21), 6728–6737 (2011). In breast cancer, recent progress in technology has enabled us to define different prognostic genetic signatures. Based upon them, breast tumors have been grouped into the four principal categories: basal-like or triple-negative, erbB2-positive, normal-like, and luminal type (A and B); with luminal types sharing the expression of estrogen receptor- and/or progesterone receptor-related genes and, basal-like and erbB2-positive subgroups associated with worse prognosis. So far, Oncotype DX® (Genomic Health Inc., Redwood City, CA, USA), Mammaprint® (Agendia Inc, Huntington Beach, CA, USA), the Breast Cancer Index® (BCI, Biotheranostics, San Diego, CA, USA) and PAM50 (Expression Analysis Inc., Durham, NC, USA) are the only multigene assays that have been marketed in North America and Europe. However, any genetic signature assay still has to gain acceptance as a validated assay before introduction into current clinical practice. This study describes an iron regulatory gene signature (IRGS) in breast cancer associated with clinical outcome. Within the molecular luminal type, the IRGS provides prognostic information similar to Oncotype DX and gene sets selected to assess proliferation. In spite of this, it is relevant that two complementary pathways that are regulatory of iron metabolism – the iron export (Fp/HAMP) and the iron import (TFRC/HFE) gene dyads – were embedded in the IRGS gene set and were associated with clinical outcome as well. Differences in metabolic pathways between cancer and normal cells have been widely described, and potential applications for more refined therapy have been proposed by expanding genetic signature assessment technology to concomitant metabolic pathways investigation. 
Consistent with this, it is reasonable to imagine that the iron-export and the iron-import gene dyads will be considered potential targets for treatment of breast cancer patients expressing the IRGS genes.
Affiliation(s)
- Paola Ferrari
- Department of Internal Medicine, University of Pisa, Italy
- Pupak Fallahi
- Department of Internal Medicine, University of Pisa, Italy
|