1
Pettitt SJ, Ryan CJ, Lord CJ. Exploiting Cancer Synthetic Lethality in Cancer-Lessons Learnt from PARP Inhibitors. Cancer Treat Res 2023; 186:13-23. [PMID: 37978128 DOI: 10.1007/978-3-031-30065-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023]
Abstract
PARP inhibitors now have proven utility in the treatment of homologous recombination (HR) defective cancers. These drugs, and the synthetic lethality effect they exploit, have not only taught us how to approach the treatment of HR defective cancers but have also illuminated how resistance to a synthetic lethal approach can occur, how cancer-associated synthetic lethal effects are perhaps more complex than we imagine, how the better use of biomarkers could improve the success of treatment and even how drug resistance might be targeted. Here, we discuss some of the lessons learnt from the study of PARP inhibitor synthetic lethality and how these lessons might have wider application. Specifically, we discuss the concept of synthetic lethal penetrance, phenocopy effects in cancer such as BRCAness, synthetic lethal resistance, the polygenic and complex nature of synthetic lethal interactions, how evolutionary double binds could be exploited in treatment as well as future horizons for the field.
Affiliation(s)
- Stephen J Pettitt
  - The CRUK Gene Function Laboratory and Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, SW3 6JB, UK
- Colm J Ryan
  - School of Computer Science and Systems Biology Ireland, University College Dublin, Dublin, Ireland
- Christopher J Lord
  - The CRUK Gene Function Laboratory and Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, SW3 6JB, UK
2
Kingdom R, Wright CF. Incomplete Penetrance and Variable Expressivity: From Clinical Studies to Population Cohorts. Front Genet 2022; 13:920390. [PMID: 35983412 PMCID: PMC9380816 DOI: 10.3389/fgene.2022.920390] [Citation(s) in RCA: 115] [Impact Index Per Article: 38.3] [Received: 04/14/2022] [Accepted: 06/09/2022] [Indexed: 12/20/2022] Open
Abstract
The same genetic variant found in different individuals can cause a range of diverse phenotypes, from no discernible clinical phenotype to severe disease, even among related individuals. Such variants can be said to display incomplete penetrance, a binary phenomenon where the genotype either causes the expected clinical phenotype or it does not, or they can be said to display variable expressivity, in which the same genotype can cause a wide range of clinical symptoms across a spectrum. Both incomplete penetrance and variable expressivity are thought to be caused by a range of factors, including common variants, variants in regulatory regions, epigenetics, environmental factors, and lifestyle. Many thousands of genetic variants have been identified as the cause of monogenic disorders, mostly determined through small clinical studies, and thus, the penetrance and expressivity of these variants may be overestimated when compared to their effect on the general population. With the wealth of population cohort data currently available, the penetrance and expressivity of such genetic variants can be investigated across a much wider contingent, potentially helping to reclassify variants that were previously thought to be completely penetrant. Research into the penetrance and expressivity of such genetic variants is important for clinical classification, both for determining causative mechanisms of disease in the affected population and for providing accurate risk information through genetic counseling. A genotype-based definition of the causes of rare diseases incorporating information from population cohorts and clinical studies is critical for our understanding of incomplete penetrance and variable expressivity. 
This review examines our current knowledge of the penetrance and expressivity of genetic variants in rare disease and across populations, as well as looking into the potential causes of the variation seen, including genetic modifiers, mosaicism, and polygenic factors, among others. We also considered the challenges that come with investigating penetrance and expressivity.
Affiliation(s)
- Caroline F. Wright
  - Institute of Biomedical and Clinical Science, Royal Devon & Exeter Hospital, University of Exeter Medical School, Exeter, United Kingdom
3
Poh J, Ponsford AH, Boyd J, Woodsmith J, Stelzl U, Wanker E, Harper N, MacEwan D, Sanderson CM. A functionally defined high-density NRF2 interactome reveals new conditional regulators of ARE transactivation. Redox Biol 2020; 37:101686. [PMID: 32911434 PMCID: PMC7490560 DOI: 10.1016/j.redox.2020.101686] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/20/2020] [Revised: 08/02/2020] [Accepted: 08/12/2020] [Indexed: 12/15/2022] Open
Abstract
NRF2 (NFE2L2) is a cytoprotective transcription factor associated with >60 human diseases, adverse drug reactions and therapeutic resistance. To provide insight into the complex regulation of NRF2 responses, 1962 predicted NRF2-partner interactions were systematically tested to generate an experimentally defined high-density human NRF2 interactome. Verification and conditional stratification of 46 new NRF2 partners was achieved by co-immunoprecipitation and the novel integration of quantitative data from dual luminescence-based co-immunoprecipitation (DULIP) assays and live-cell fluorescence cross-correlation spectroscopy (FCCS). The functional impact of new partners was then assessed in genetically edited loss-of-function (NRF2-/-) and disease-related gain-of-function (NRF2T80K and KEAP1-/-) cell-lines. Of the new partners investigated >77% (17/22) modified NRF2 responses, including partners that only exhibited effects under disease-related conditions. This experimentally defined binary NRF2 interactome provides a new vision of the complex molecular networks that govern the modulation and consequence of NRF2 activity in health and disease.
Affiliation(s)
- Jonathan Poh
  - Institute of Translational Medicine, University of Liverpool, UK
- Amy H Ponsford
  - Institute of Translational Medicine, University of Liverpool, UK
- James Boyd
  - Institute of Translational Medicine, University of Liverpool, UK
- Jonathan Woodsmith
  - Institute of Pharmaceutical Sciences, Department of Pharmaceutical Chemistry, University of Graz, Austria
- Ulrich Stelzl
  - Institute of Pharmaceutical Sciences, Department of Pharmaceutical Chemistry, University of Graz, Austria
- Erich Wanker
  - Max-Delbrück Center for Molecular Medicine (MDC), Berlin-Buch, Germany
- Nicholas Harper
  - Institute of Translational Medicine, University of Liverpool, UK
- David MacEwan
  - Institute of Translational Medicine, University of Liverpool, UK
4
Byrd JB, Greene AC, Prasad DV, Jiang X, Greene CS. Responsible, practical genomic data sharing that accelerates research. Nat Rev Genet 2020; 21:615-629. [PMID: 32694666 PMCID: PMC7974070 DOI: 10.1038/s41576-020-0257-5] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Accepted: 06/08/2020] [Indexed: 12/13/2022]
Abstract
Data sharing anchors reproducible science, but expectations and best practices are often nebulous. Communities of funders, researchers and publishers continue to grapple with what should be required or encouraged. To illuminate the rationales for sharing data, the technical challenges and the social and cultural challenges, we consider the stakeholders in the scientific enterprise. In biomedical research, participants are key among those stakeholders. Ethical sharing requires considering both the value of research efforts and the privacy costs for participants. We discuss current best practices for various types of genomic data, as well as opportunities to promote ethical data sharing that accelerates science by aligning incentives.
Affiliation(s)
- James Brian Byrd
  - Department of Internal Medicine, Medical School, University of Michigan, Ann Arbor, MI, USA
- Anna C Greene
  - Alex's Lemonade Stand Foundation, Bala Cynwyd, PA, USA
- Xiaoqian Jiang
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Casey S Greene
  - Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, PA, USA
  - Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
5
Radeke LJ, Herman MA. Identification and characterization of differentially expressed genes in Caenorhabditis elegans in response to pathogenic and nonpathogenic Stenotrophomonas maltophilia. BMC Microbiol 2020; 20:170. [PMID: 32560629 PMCID: PMC7304212 DOI: 10.1186/s12866-020-01771-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/16/2019] [Accepted: 03/29/2020] [Indexed: 12/27/2022] Open
Abstract
Background: Stenotrophomonas maltophilia is an emerging nosocomial pathogen that causes infection in immunocompromised patients. S. maltophilia isolates are genetically diverse, contain diverse virulence factors, and are variably pathogenic within several host species. Members of the Stenotrophomonas genus are part of the native microbiome of C. elegans, being found in greater relative abundance within the worm than its environment, suggesting that these bacteria accumulate within C. elegans. Thus, study of the C. elegans-Stenotrophomonas interaction is of both medical and ecological significance. To identify host defense mechanisms, we analyzed the C. elegans transcriptomic response to S. maltophilia strains of varying pathogenicity: K279a, an avirulent clinical isolate, JCMS, a virulent strain isolated in association with soil nematodes near Manhattan, KS, and JV3, an even more virulent environmental isolate.
Results: Overall, we found 145 genes that are commonly differentially expressed in response to pathogenic S. maltophilia strains, 89% of which are upregulated, with many even further upregulated in response to JV3 as compared to JCMS. There are many more JV3-specific differentially expressed genes (225, 11% upregulated) than JCMS-specific differentially expressed genes (14, 86% upregulated), suggesting JV3 has unique pathogenic mechanisms that could explain its increased virulence. We used connectivity within a gene network model to choose pathogen-specific and strain-specific differentially expressed candidate genes for functional analysis. Mutations in 13 of 22 candidate genes caused significant differences in C. elegans survival in response to at least one S. maltophilia strain, although not always the strain that induced differential expression, suggesting a dynamic response to varying levels of pathogenicity.
Conclusions: Variation in observed pathogenicity and differences in host transcriptional responses to S. maltophilia strains reveal that strain-specific mechanisms play important roles in S. maltophilia pathogenesis. Furthermore, utilizing bacteria closely related to strains found in C. elegans natural environment provides a more realistic interaction for understanding host-pathogen response.
Affiliation(s)
- Leah J Radeke
  - School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE, 68588, USA
- Michael A Herman
  - School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE, 68588, USA
6
Rahit KMTH, Tarailo-Graovac M. Genetic Modifiers and Rare Mendelian Disease. Genes (Basel) 2020; 11:E239. [PMID: 32106447 PMCID: PMC7140819 DOI: 10.3390/genes11030239] [Citation(s) in RCA: 105] [Impact Index Per Article: 21.0] [Received: 01/24/2020] [Accepted: 02/21/2020] [Indexed: 12/11/2022] Open
Abstract
Despite advances in high-throughput sequencing that have revolutionized the discovery of gene defects in rare Mendelian diseases, there are still gaps in translating individual genome variation to observed phenotypic outcomes. While we continue to improve genomics approaches to identify primary disease-causing variants, it is evident that no genetic variant acts alone. In other words, some other variants in the genome (genetic modifiers) may alleviate (suppress) or exacerbate (enhance) the severity of the disease, resulting in the variability of phenotypic outcomes. Thus, to truly understand the disease, we need to consider how the disease-causing variants interact with the rest of the genome in an individual. Here, we review the current state-of-the-field in the identification of genetic modifiers in rare Mendelian diseases and discuss the potential for future approaches that could bridge the existing gap.
Affiliation(s)
- K. M. Tahsin Hassan Rahit
  - Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
  - Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
- Maja Tarailo-Graovac
  - Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada
  - Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
7
Ryan CJ, Bajrami I, Lord CJ. Synthetic Lethality and Cancer - Penetrance as the Major Barrier. Trends Cancer 2018; 4:671-683. [PMID: 30292351 DOI: 10.1016/j.trecan.2018.08.003] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 06/22/2018] [Revised: 08/21/2018] [Accepted: 08/22/2018] [Indexed: 12/20/2022]
Abstract
Synthetic lethality has long been proposed as an approach for targeting genetic defects in tumours. Despite a decade of screening efforts, relatively few robust synthetic lethal targets have been identified. Improved genetic perturbation techniques, including CRISPR/Cas9 gene editing, have resulted in renewed enthusiasm for searching for synthetic lethal effects in cancer. An implicit assumption behind this enthusiasm is that the lack of reproducibly identified targets can be attributed to limitations of RNAi technologies. We argue here that a bigger hurdle is that most synthetic lethal interactions (SLIs) are not highly penetrant, in other words they are not robust to the extensive molecular heterogeneity seen in tumours. We outline strategies for identifying and prioritising SLIs that are most likely to be highly penetrant.
Affiliation(s)
- Colm J Ryan
  - School of Computer Science and Systems Biology Ireland, University College Dublin, Belfield, Dublin 4, Ireland
- Ilirjana Bajrami
  - Breast Cancer Now Toby Robins Research Centre and Cancer Research UK (CRUK) Gene Function Laboratory, Institute of Cancer Research (ICR), London SW3 6JB, UK
- Christopher J Lord
  - Breast Cancer Now Toby Robins Research Centre and Cancer Research UK (CRUK) Gene Function Laboratory, Institute of Cancer Research (ICR), London SW3 6JB, UK
8
White CV, Herman MA. Transcriptomic, Functional, and Network Analyses Reveal Novel Genes Involved in the Interaction Between Caenorhabditis elegans and Stenotrophomonas maltophilia. Front Cell Infect Microbiol 2018; 8:266. [PMID: 30177956 PMCID: PMC6109753 DOI: 10.3389/fcimb.2018.00266] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/13/2017] [Accepted: 07/16/2018] [Indexed: 12/12/2022] Open
Abstract
The bacterivorous nematode Caenorhabditis elegans is an excellent model for the study of innate immune responses to a variety of bacterial pathogens, including the emerging nosocomial bacterial pathogen Stenotrophomonas maltophilia. The study of this interaction has ecological and medical relevance as S. maltophilia is found in association with C. elegans and other nematodes in the wild and is an emerging opportunistic bacterial pathogen. We identified 393 genes that were differentially expressed when exposed to virulent and avirulent strains of S. maltophilia and an avirulent strain of E. coli. We then used a probabilistic functional gene network model (WormNet) to determine that 118 of the 393 differentially expressed genes formed an interacting network and identified a set of highly connected genes with eight or more predicted interactions. We hypothesized that these highly connected genes might play an important role in the defense against S. maltophilia and found that mutations of six of seven highly connected genes have a significant effect on nematode survival in response to these bacteria. Of these genes, C48B4.1, mpk-2, cpr-4, clec-67, and lys-6 are needed for combating the virulent S. maltophilia JCMS strain, while dod-22 was solely involved in response to the avirulent S. maltophilia K279a strain. We further found that dod-22 and clec-67 were upregulated in response to JCMS vs. K279a, while C48B4.1, mpk-2, cpr-4, and lys-6 were downregulated. Only dod-22 had a documented role in innate immunity, which demonstrates the merit of our approach in the identification of novel genes that are involved in combating S. maltophilia infection.
Affiliation(s)
- Corin V White
  - Ecological Genomics Institute, Division of Biology, Kansas State University, Manhattan, KS, United States
- Michael A Herman
  - Ecological Genomics Institute, Division of Biology, Kansas State University, Manhattan, KS, United States
9
Ma J, Yu MK, Fong S, Ono K, Sage E, Demchak B, Sharan R, Ideker T. Using deep learning to model the hierarchical structure and function of a cell. Nat Methods 2018; 15:290-298. [PMID: 29505029 PMCID: PMC5882547 DOI: 10.1038/nmeth.4627] [Citation(s) in RCA: 238] [Impact Index Per Article: 34.0] [Received: 07/19/2017] [Accepted: 02/07/2018] [Indexed: 01/20/2023]
Abstract
Although artificial neural networks simulate a variety of human functions, their internal structures are hard to interpret. In the life sciences, extensive knowledge of cell biology provides an opportunity to design visible neural networks (VNNs) which couple the model’s inner workings to those of real systems. Here we develop DCell, a VNN embedded in the hierarchical structure of 2526 subsystems comprising a eukaryotic cell (http://d-cell.ucsd.edu/). Trained on several million genotypes, DCell simulates cellular growth nearly as accurately as laboratory observations. During simulation, genotypes induce patterns of subsystem activities, enabling in-silico investigations of the molecular mechanisms underlying genotype-phenotype associations. These mechanisms can be validated and many are unexpected; some are governed by Boolean logic. Cumulatively, 80% of the importance for growth prediction is captured by 484 subsystems (21%), reflecting the emergence of a complex phenotype. DCell provides a foundation for decoding the genetics of disease, drug resistance, and synthetic life.
Affiliation(s)
- Jianzhu Ma
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
- Michael Ku Yu
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
  - Program in Bioinformatics, University of California San Diego, La Jolla, California, USA
- Samson Fong
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Keiichiro Ono
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
- Eric Sage
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
- Barry Demchak
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
- Roded Sharan
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Trey Ideker
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
  - Program in Bioinformatics, University of California San Diego, La Jolla, California, USA
  - Department of Bioengineering, University of California San Diego, La Jolla, California, USA
10
Murray JI. Systems biology of embryonic development: Prospects for a complete understanding of the Caenorhabditis elegans embryo. Wiley Interdiscip Rev Dev Biol 2018; 7:e314. [PMID: 29369536 DOI: 10.1002/wdev.314] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/05/2017] [Revised: 12/01/2017] [Accepted: 12/12/2017] [Indexed: 01/07/2023]
Abstract
The convergence of developmental biology and modern genomics tools brings the potential for a comprehensive understanding of developmental systems. This is especially true for the Caenorhabditis elegans embryo because its small size, invariant developmental lineage, and powerful genetic and genomic tools provide the prospect of a cellular resolution understanding of messenger RNA (mRNA) expression and regulation across the organism. We describe here how a systems biology framework might allow large-scale determination of the embryonic regulatory relationships encoded in the C. elegans genome. This framework consists of two broad steps: (a) defining the "parts list", that is, all genes expressed in all cells at each time during development, and (b) iterative steps of computational modeling and refinement of these models by experimental perturbation. Substantial progress has been made towards defining the parts list through imaging methods such as large-scale green fluorescent protein (GFP) reporter analysis. Imaging results are now being augmented by high-resolution transcriptome methods such as single-cell RNA sequencing, and it is likely the complete expression patterns of all genes across the embryo will be known within the next few years. In contrast, the modeling and perturbation experiments performed so far have focused largely on individual cell types or genes, and improved methods will be needed to expand them to the full genome and organism. This emerging comprehensive map of embryonic expression and regulatory function will provide a powerful resource for developmental biologists, and would also allow scientists to ask questions not accessible without a comprehensive picture. This article is categorized under: Invertebrate Organogenesis > Worms; Technologies > Analysis of the Transcriptome; Gene Expression and Transcriptional Hierarchies > Gene Networks and Genomics.
Affiliation(s)
- John Isaac Murray
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania
11
Grove C, Cain S, Chen WJ, Davis P, Harris T, Howe KL, Kishore R, Lee R, Paulini M, Raciti D, Tuli MA, Van Auken K, Williams G. Using WormBase: A Genome Biology Resource for Caenorhabditis elegans and Related Nematodes. Methods Mol Biol 2018; 1757:399-470. [PMID: 29761466 DOI: 10.1007/978-1-4939-7737-6_14] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 01/17/2023]
Abstract
WormBase (www.wormbase.org) provides the nematode research community with a centralized database for information pertaining to nematode genes and genomes. As more nematode genome sequences are becoming available and as richer data sets are published, WormBase strives to maintain updated information, displays, and services to facilitate efficient access to and understanding of the knowledge generated by the published nematode genetics literature. This chapter aims to provide an explanation of how to use basic features of WormBase, new features, and some commonly used tools and data queries. Explanations of the curated data and step-by-step instructions of how to access the data via the WormBase website and available data mining tools are provided.
Affiliation(s)
- Christian Grove
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Scott Cain
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Wen J Chen
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Paul Davis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Todd Harris
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Kevin L Howe
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Ranjana Kishore
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Raymond Lee
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Michael Paulini
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Daniela Raciti
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Mary Ann Tuli
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Kimberly Van Auken
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Gary Williams
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
12
Goebels F, Hu L, Bader G, Emili A. Automated Computational Inference of Multi-protein Assemblies from Biochemical Co-purification Data. Methods Mol Biol 2018; 1764:391-399. [PMID: 29605929 DOI: 10.1007/978-1-4939-7759-8_25] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022]
Abstract
Biology has amassed a wealth of information about the function of a multitude of protein-coding genes across species. The challenge now is to understand how all these proteins work together to form a living organism, and a crucial step for gaining this knowledge is a complete description of the molecular "wiring circuits" that underlie cellular processes. In this chapter, we describe a general computational framework for predicting multi-protein assemblies from biochemical co-fractionation data.
Affiliation(s)
- Florian Goebels
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Lucas Hu
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Gary Bader
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Andrew Emili
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
13
Taxonomically Restricted Genes with Essential Functions Frequently Play Roles in Chromosome Segregation in Caenorhabditis elegans and Saccharomyces cerevisiae. G3 (Bethesda) 2017; 7:3337-3347. [PMID: 28839119 PMCID: PMC5633384 DOI: 10.1534/g3.117.300193] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Genes encoding essential components of core cellular processes are typically highly conserved across eukaryotes. However, a small proportion of essential genes are highly taxonomically restricted; there appear to be no similar genes outside the genomes of highly related species. What are the functions of these poorly characterized taxonomically restricted genes (TRGs)? Systematic screens in Saccharomyces cerevisiae and Caenorhabditis elegans previously identified yeast or nematode TRGs that are essential for viability, and we find that these genes share many molecular features, despite having no significant sequence similarity. Specifically, we find that those TRGs with essential phenotypes have an expression profile more similar to highly conserved genes, and that they have more protein–protein interactions and more protein disorder. Surprisingly, many TRGs play central roles in chromosome segregation, a core eukaryotic process. We thus find that genes that appear to be highly evolutionarily restricted do not necessarily play roles in species-specific biological functions but frequently play essential roles in core eukaryotic processes.
14
Abstract
The oncogenic Ras/MAPK pathway is evolutionarily conserved across metazoans. Yet, almost all our knowledge of this pathway comes from studies using single genetic backgrounds, whereas mutational effects can be highly background dependent. Therefore, we lack insight into the interplay between genetic backgrounds and the Ras/MAPK-signaling pathway. Here, we used a Caenorhabditis elegans RIL population containing a gain-of-function mutation in the Ras/MAPK-pathway gene let-60 and measured how gene expression regulation is affected by this mutation. We mapped eQTL and found that the majority (∼73%) of the 1516 detected cis-eQTL were not specific for the let-60 mutation, whereas most (∼76%) of the 898 detected trans-eQTL were associated with the let-60 mutation. We detected six eQTL trans-bands specific for the interaction between the genetic background and the mutation, one of which colocalized with the polymorphic Ras/MAPK modifier amx-2. Comparison between transgenic lines expressing allelic variants of amx-2 showed the involvement of amx-2 in 79% of the trans-eQTL for genes mapping to this trans-band. Together, our results have revealed hidden loci affecting Ras/MAPK signaling using sensitized backgrounds in C. elegans. These loci harbor putative polymorphic modifier genes that would not have been detected using mutant screens in single genetic backgrounds.
15
Garland J. Unravelling the complexity of signalling networks in cancer: A review of the increasing role for computational modelling. Crit Rev Oncol Hematol 2017; 117:73-113. [PMID: 28807238 DOI: 10.1016/j.critrevonc.2017.06.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 11/25/2016] [Revised: 06/01/2017] [Accepted: 06/08/2017] [Indexed: 02/06/2023] Open
Abstract
Cancer induction is a highly complex process involving hundreds of different inducers but whose eventual outcome is the same. Clearly, it is essential to understand how signalling pathways and networks generated by these inducers interact to regulate cell behaviour and create the cancer phenotype. While enormous strides have been made in identifying key networking profiles, the amount of data generated far exceeds our ability to understand how it all "fits together". The number of potential interactions is astronomically large and requires novel approaches and extreme computation methods to dissect them out. However, such methodologies have high intrinsic mathematical and conceptual content which is difficult to follow. This review explains how computation modelling is progressively finding solutions and also revealing unexpected and unpredictable nano-scale molecular behaviours extremely relevant to how signalling and networking are coherently integrated. It is divided into linked sections illustrated by numerous figures from the literature describing different approaches and offering visual portrayals of networking and major conceptual advances in the field. First, the problem of signalling complexity and data collection is illustrated for only a small selection of known oncogenes. Next, new concepts from biophysics, molecular behaviours, kinetics, organisation at the nano level and predictive models are presented. These areas include: visual representations of networking, Energy Landscapes and energy transfer/dissemination (entropy); diffusion, percolation; molecular crowding; protein allostery; quinary structure and fractal distributions; energy management, metabolism and re-examination of the Warburg effect. The importance of unravelling complex network interactions is then illustrated for some widely-used drugs in cancer therapy whose interactions are very extensive. 
Finally, use of computational modelling to develop micro- and nano- functional models ("bottom-up" research) is highlighted. The review concludes that computational modelling is an essential part of cancer research and is vital to understanding network formation and molecular behaviours that are associated with it. Its role is increasingly essential because it is unravelling the huge complexity of cancer induction otherwise unattainable by any other approach.
Affiliation(s)
- John Garland
- Manchester Interdisciplinary Biocentre, Manchester University, Manchester, UK.
16
Abstract
Characterizing genetic interactions is crucial to understanding cellular and organismal response to gene-level perturbations. Such knowledge can inform the selection of candidate disease therapy targets, yet experimentally determining whether genes interact is technically nontrivial and time-consuming. High-fidelity prediction of different classes of genetic interactions in multiple organisms would substantially alleviate this experimental burden. Under the hypothesis that functionally related genes tend to share common genetic interaction partners, we evaluate a computational approach to predict genetic interactions in Homo sapiens, Drosophila melanogaster, and Saccharomyces cerevisiae. By leveraging knowledge of functional relationships between genes, we cross-validate predictions on known genetic interactions and observe high predictive power of multiple classes of genetic interactions in all three organisms. Additionally, our method suggests high-confidence candidate interaction pairs that can be directly experimentally tested. A web application is provided for users to query genes for predicted novel genetic interaction partners. Finally, by subsampling the known yeast genetic interaction network, we found that novel genetic interactions are predictable even when knowledge of currently known interactions is minimal.
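The working hypothesis in this abstract (functionally related genes tend to share genetic interaction partners) can be illustrated with a small guilt-by-association scorer. This is only a sketch under stated assumptions, not the authors' method: the function name, the gene names, the link weights, and the additive scoring rule are all hypothetical.

```python
def predict_interaction_partners(gene, functional_links, known_interactions):
    """Rank candidate genetic-interaction partners for `gene`.

    functional_links: {gene: {related_gene: weight}} (hypothetical functional network)
    known_interactions: {gene: set of known genetic-interaction partners}
    A candidate's score is the summed functional-link weight of the related
    genes that already interact with it (an assumed scoring rule).
    """
    already_known = known_interactions.get(gene, set())
    scores = {}
    for related, weight in functional_links.get(gene, {}).items():
        for partner in known_interactions.get(related, set()):
            # Skip self-interactions and interactions that are already known.
            if partner == gene or partner in already_known:
                continue
            scores[partner] = scores.get(partner, 0.0) + weight
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy data, invented for illustration only.
functional_links = {"geneA": {"geneB": 0.9, "geneC": 0.4}}
known_interactions = {
    "geneA": {"geneX"},
    "geneB": {"geneX", "geneY"},
    "geneC": {"geneY", "geneZ"},
}
ranking = predict_interaction_partners("geneA", functional_links, known_interactions)
print(ranking)  # geneY is ranked above geneZ; geneX is excluded as already known
```

In this toy example geneY is supported by both functionally related genes and therefore outranks geneZ, which is supported by only one.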
17
Vidulin V, Šmuc T, Supek F. Extensive complementarity between gene function prediction methods. Bioinformatics 2016; 32:3645-3653. [PMID: 27522084 DOI: 10.1093/bioinformatics/btw532] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 04/26/2016] [Revised: 07/11/2016] [Accepted: 08/09/2016] [Indexed: 12/22/2022]
Abstract
MOTIVATION: The number of sequenced genomes rises steadily, but we still lack knowledge about the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis of the benefits gained by obtaining and integrating such predictions. RESULTS: Our pipeline amalgamates 5 133 543 genes from 2071 genomes in a single massive analysis that evaluates five established genomic AFP methodologies. While 1227 Gene Ontology (GO) terms yielded reliable predictions, the majority of these functions were accessible to only one or two of the methods. Moreover, different methods tend to assign a GO term to non-overlapping sets of genes. Thus, inferences made by diverse genomic AFP methods display a striking complementarity, both gene-wise and function-wise. Because of this, a viable integration strategy is to rely on a single most-confident prediction per gene/function, rather than enforcing agreement across multiple AFP methods. Using an information-theoretic approach, we estimate that current databases contain 29.2 bits/gene of known Escherichia coli gene functions. This can be increased by up to 5.5 bits/gene using individual AFP methods or by 11 additional bits/gene upon integration, thereby providing a highly-ranking predictor on the Critical Assessment of Function Annotation 2 community benchmark. Availability of more sequenced genomes boosts the predictive accuracy of AFP approaches and also the benefit from integrating them. AVAILABILITY AND IMPLEMENTATION: The individual and integrated GO predictions for the complete set of genes are available from http://gorbi.irb.hr/. CONTACT: fran.supek@irb.hr. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.
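The integration strategy described in this abstract (keep the single most-confident prediction per gene/function rather than requiring agreement between methods) can be sketched as follows. This is a minimal illustration, not the published pipeline: the method names, gene names, GO identifiers, and confidence values are placeholders, and the agreement rule with `min_methods=2` is an assumed baseline for comparison.

```python
def integrate_by_max_confidence(predictions):
    """Keep, for each (gene, go_term) pair, the most confident call from any method.

    predictions: {method_name: {(gene, go_term): confidence}}
    Returns {(gene, go_term): (confidence, method_name)}.
    """
    best = {}
    for method, calls in predictions.items():
        for key, confidence in calls.items():
            if key not in best or confidence > best[key][0]:
                best[key] = (confidence, method)
    return best


def integrate_by_agreement(predictions, min_methods=2):
    """Baseline for comparison: keep only pairs called by at least `min_methods` methods."""
    support = {}
    for calls in predictions.values():
        for key in calls:
            support[key] = support.get(key, 0) + 1
    return {key for key, count in support.items() if count >= min_methods}


# Placeholder predictions from two hypothetical AFP methods.
predictions = {
    "method_A": {("gene1", "GO:0000001"): 0.9},
    "method_B": {("gene1", "GO:0000001"): 0.6, ("gene2", "GO:0000002"): 0.8},
}
by_max = integrate_by_max_confidence(predictions)
by_agreement = integrate_by_agreement(predictions)
print(sorted(by_max))        # both gene/function pairs are retained
print(sorted(by_agreement))  # only the pair called by both methods is retained
```

When methods are largely complementary, as the abstract reports, an agreement rule discards most single-method calls, which is the motivation for the max-confidence rule.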
Affiliation(s)
- Vedrana Vidulin
- Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, Zagreb 10000, Croatia
| | - Tomislav Šmuc
- Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, Zagreb 10000, Croatia
| | - Fran Supek
- Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, Zagreb 10000, Croatia; EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology and UPF, Dr. Aiguader 88, Barcelona 08003, Spain
18
PoplarGene: poplar gene network and resource for mining functional information for genes from woody plants. Sci Rep 2016; 6:31356. [PMID: 27515999 PMCID: PMC4981870 DOI: 10.1038/srep31356] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/21/2016] [Accepted: 07/18/2016] [Indexed: 01/05/2023]
Abstract
Poplar is not only an important resource for the production of paper, timber and other wood-based products, but it has also emerged as an ideal model system for studying woody plants. To better understand the biological processes underlying various traits in poplar, e.g., wood development, a comprehensive functional gene interaction network is highly needed. Here, we constructed a genome-wide functional gene network for poplar (covering ~70% of the 41,335 poplar genes) and created the network web service PoplarGene, offering comprehensive functional interactions and extensive poplar gene functional annotations. PoplarGene incorporates two network-based gene prioritization algorithms, neighborhood-based prioritization and context-based prioritization, which can be used to perform gene prioritization in a complementary manner. Furthermore, the co-functional information in PoplarGene can be applied to other woody plant proteomes with high efficiency via orthology transfer. In addition to poplar gene sequences, the webserver also accepts Arabidopsis reference gene as input to guide the search for novel candidate functional genes in PoplarGene. We believe that PoplarGene (http://bioinformatics.caf.ac.cn/PoplarGene and http://124.127.201.25/PoplarGene) will greatly benefit the research community, facilitating studies of poplar and other woody plants.
19
Burdick J, Walton T, Preston E, Zacharias A, Raj A, Murray JI. Overlapping cell population expression profiling and regulatory inference in C. elegans. BMC Genomics 2016; 17:159. [PMID: 26926147 PMCID: PMC4772325 DOI: 10.1186/s12864-016-2482-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/12/2015] [Accepted: 02/17/2016] [Indexed: 12/30/2022]
Abstract
Background: Understanding gene expression across the diverse metazoan cell types during development is critical to understanding their function and regulation. However, most cell types have not been assayed for expression genome-wide. Results: We applied a novel approach we term “Profiling of Overlapping Populations of cells (POP-Seq)” to assay differential expression across all embryonic cells in the nematode Caenorhabditis elegans. In this approach, we use RNA-seq to define the transcriptome of diverse partially overlapping FACS-sorted cell populations. This identified thousands of transcripts differentially expressed across embryonic cells. Hierarchical clustering analysis identified over 100 sets of coexpressed genes corresponding to distinct patterns of cell type specific expression. We identified thousands of candidate regulators of these clusters based on enrichment of transcription factor motifs and experimentally determined binding sites. Conclusions: Our analysis provides new insight into embryonic gene regulation, and provides a resource for improving our knowledge of tissue-specific expression and its regulation throughout C. elegans development.
Affiliation(s)
- Joshua Burdick
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
| | - Travis Walton
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
| | - Elicia Preston
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
| | - Amanda Zacharias
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
| | - Arjun Raj
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
| | - John Isaac Murray
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 437A Clinical Research Building, 415 Curie Boulevard, Philadelphia, PA, 19104-6145, USA.
20
Yu MK, Kramer M, Dutkowski J, Srivas R, Licon K, Kreisberg J, Ng CT, Krogan N, Sharan R, Ideker T. Translation of Genotype to Phenotype by a Hierarchy of Cell Subsystems. Cell Syst 2016; 2:77-88. [PMID: 26949740 PMCID: PMC4772745 DOI: 10.1016/j.cels.2016.02.003] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Indexed: 11/26/2022]
Abstract
Accurately translating genotype to phenotype requires accounting for the functional impact of genetic variation at many biological scales. Here we present a strategy for genotype-phenotype reasoning based on existing knowledge of cellular subsystems. These subsystems and their hierarchical organization are defined by the Gene Ontology or a complementary ontology inferred directly from previously published datasets. Guided by the ontology's hierarchical structure, we organize genotype data into an "ontotype," that is, a hierarchy of perturbations representing the effects of genetic variation at multiple cellular scales. The ontotype is then interpreted using logical rules generated by machine learning to predict phenotype. This approach substantially outperforms previous, non-hierarchical methods for translating yeast genotype to cell growth phenotype, and it accurately predicts the growth outcomes of two new screens of 2,503 double gene knockouts impacting DNA repair or nuclear lumen. Ontotypes also generalize to larger knockout combinations, setting the stage for interpreting the complex genetics of disease.
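The "ontotype" idea in this abstract, a genotype re-expressed as perturbations at each level of an ontology hierarchy, can be illustrated with a short sketch. It assumes, as one plausible reading, that the value for each term is the number of knocked-out genes annotated to that term or any of its descendants; the term names, gene names, and function names below are hypothetical, and the machine-learning step that maps ontotype to phenotype is not shown.

```python
def ancestors(term, parents):
    """Return `term` together with all of its ancestors in the ontology DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def ontotype(knocked_out, gene_to_terms, parents):
    """Count, for every ontology term, how many knocked-out genes fall under it.

    knocked_out: iterable of gene names (the genotype)
    gene_to_terms: {gene: [directly annotated terms]}
    parents: {term: [parent terms]}
    """
    counts = {}
    for gene in knocked_out:
        affected = set()
        for term in gene_to_terms.get(gene, ()):
            affected |= ancestors(term, parents)
        # Each gene contributes at most once to each term it falls under.
        for term in affected:
            counts[term] = counts.get(term, 0) + 1
    return counts


# Hypothetical three-level hierarchy and annotations.
parents = {
    "DNA repair": ["DNA metabolism"],
    "DNA replication": ["DNA metabolism"],
    "DNA metabolism": ["cell"],
}
gene_to_terms = {"g1": ["DNA repair"], "g2": ["DNA replication"]}
double_knockout = ontotype(["g1", "g2"], gene_to_terms, parents)
print(double_knockout)
```

In this example the two knockouts hit different leaf terms once each but converge on the shared parent "DNA metabolism" with a count of two, which is the kind of multi-scale signal a downstream predictor could use.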
Affiliation(s)
- Michael Ku Yu
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla CA 92093, USA
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
| | - Michael Kramer
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Biomedical Sciences Program, University of California San Diego, La Jolla CA 92093, USA
| | - Janusz Dutkowski
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Data4Cure, La Jolla, CA 92037, USA
| | - Rohith Srivas
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA
| | - Katherine Licon
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
| | - Jason Kreisberg
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
| | | | - Nevan Krogan
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco 94143, USA
| | - Roded Sharan
- Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978, Israel
| | - Trey Ideker
- Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
21
Structural and Functional Characterization of a Caenorhabditis elegans Genetic Interaction Network within Pathways. PLoS Comput Biol 2016; 12:e1004738. [PMID: 26871911 PMCID: PMC4752231 DOI: 10.1371/journal.pcbi.1004738] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/01/2014] [Accepted: 01/05/2016] [Indexed: 12/02/2022]
Abstract
A genetic interaction (GI) is defined when the mutation of one gene modifies the phenotypic expression associated with the mutation of a second gene. Genome-wide efforts to map GIs in yeast revealed structural and functional properties of a GI network. This provided insights into the mechanisms underlying the robustness of yeast to genetic and environmental insults, and also into the link existing between genotype and phenotype. While a significant conservation of GIs and GI network structure has been reported between distant yeast species, such a conservation is not clear between unicellular and multicellular organisms. Structural and functional characterization of a GI network in these latter organisms is consequently of high interest. In this study, we present an in-depth characterization of ~1.5K GIs in the nematode Caenorhabditis elegans. We identify and characterize six distinct classes of GIs by examining a wide-range of structural and functional properties of genes and network, including co-expression, phenotypical manifestations, relationship with protein-protein interaction dense subnetworks (PDS) and pathways, molecular and biological functions, gene essentiality and pleiotropy. Our study shows that GI classes link genes within pathways and display distinctive properties, specifically towards PDS. It suggests a model in which pathways are composed of PDS-centric and PDS-independent GIs coordinating molecular machines through two specific classes of GIs involving pleiotropic and non-pleiotropic connectors. Our study provides the first in-depth characterization of a GI network within pathways of a multicellular organism. It also suggests a model to understand better how GIs control system robustness and evolution. Network biology has focused for years on protein-protein interaction (PPI) networks, identifying nodes with central structural functions and modules associated to bioprocesses, phenotypes and diseases. 
The network biology field moved to a higher level of abstraction and started characterizing a less intuitive kind of interaction, called genetic interactions (GIs) or epistasis. Mostly due to technical challenges associated with the genome-wide mapping of GIs, these studies primarily focused on unicellular organisms. They uncovered modules embedded within the structure of these networks and started characterizing their relationship with PPI networks and biological functions. We provide here the first in-depth characterization of a network composed of ~600 GIs within signaling and metabolic pathways of a multicellular organism, the nematode Caenorhabditis elegans. We characterize the structure of this network, and the function of GI classes found in this network. We also discuss how these GI classes contribute to the genomic robustness and the adaptive evolution of multicellular organisms.
22
Valba OV, Nechaev SK, Sterken MG, Snoek LB, Kammenga JE, Vasieva OO. On predicting regulatory genes by analysis of functional networks in C. elegans. BioData Min 2015; 8:33. [PMID: 26535058 PMCID: PMC4631084 DOI: 10.1186/s13040-015-0066-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 02/02/2015] [Accepted: 10/20/2015] [Indexed: 12/14/2022]
Abstract
Background: Connectivity networks, which reflect multiple interactions between genes and proteins, possess not only descriptive but also predictive value, as new connections can be extrapolated and tested by means of computational analysis. Integration of different types of connectivity data (such as co-expression and genetic interactions) in one network has proven to benefit ‘guilt by association’ analysis. However, the predictive values of connectivity types that differ in functional meaning and topological characteristics were not obvious, and are addressed in this analysis. Methods: eQTL data for 3 experimental C. elegans age groups were retrieved from WormQTL. WormNet was used to obtain pair-wise gene interactions. The Shortest Path Function (SPF) was adopted for statistical validation of the co-expressed gene clusters and for computational prediction of their potential gene expression regulators from a network context. A new SPF-based algorithm was applied to genetic interaction sub-networks adjacent to the clusters of co-expressed genes for ranking the most likely gene expression regulators causal to eQTLs. Results: We have demonstrated that known co-expression and genetic interactions between C. elegans genes can be complementary in predicting gene expression regulators. Several algorithms were compared with respect to their predictive potential in different network connectivity contexts. We found that genes associated with eQTLs are highly clustered in a C. elegans co-expression sub-network, and their adjacent genetic interactions provide the optimal functional connectivity environment for application of the new SPF-based algorithm. It was successfully tested in the reverse-prediction analysis on groups of genes with known regulators and applied to co-expressed genes and experimentally observed expression quantitative trait loci (eQTLs). Conclusions: This analysis demonstrates differences in topology and connectivity of the co-expression and genetic interaction sub-networks in WormNet. The modularity of the less continuous genetic interaction network does not correspond to the modularity of the dense network formed by gene co-expression interactions. However, the genetic interaction network can be used much more efficiently with the SPF method in the prediction of potential regulators of gene expression. The developed method can be used for validation of the functional significance of suggested eQTLs and for discovery of new regulatory modules.
Affiliation(s)
- Olga V Valba
- Laboratory of Nematology, Wageningen University, Wageningen, Netherlands
| | - Sergei K Nechaev
- LPTMS, Université Paris Sud, Orsay Cedex, France ; National Research University, Higher School of Economics, Moscow, Russia
| | - Mark G Sterken
- LPTMS, Université Paris Sud, Orsay Cedex, France ; National Research University, Higher School of Economics, Moscow, Russia ; P.N. Lebedev Physical Institute of the Russian Academy of Sciences, Moscow, Russia
| | - L Basten Snoek
- Laboratory of Nematology, Wageningen University, Wageningen, Netherlands
| | - Jan E Kammenga
- Laboratory of Nematology, Wageningen University, Wageningen, Netherlands
| | - Olga O Vasieva
- Laboratory of Nematology, Wageningen University, Wageningen, Netherlands
23
Gonzalez GH, Tahsin T, Goodale BC, Greene AC, Greene CS. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery. Brief Bioinform 2015; 17:33-42. [PMID: 26420781 PMCID: PMC4719073 DOI: 10.1093/bib/bbv087] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Received: 02/17/2015] [Indexed: 02/06/2023]
Abstract
Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease. We are starting to address this challenge through automatic approaches for information extraction, representation and analysis. Recent advances in text and data mining have been applied to a broad spectrum of key biomedical questions in genomics, pharmacogenomics and other fields. We present an overview of the fundamental methods for text and data mining, as well as recent advances and emerging applications toward precision medicine.
24
Zhu F, Panwar B, Guan Y. Algorithms for modeling global and context-specific functional relationship networks. Brief Bioinform 2015; 17:686-95. [PMID: 26254431 DOI: 10.1093/bib/bbv065] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/26/2015] [Indexed: 02/07/2023]
Abstract
Functional genomics has enormous potential to facilitate our understanding of normal and disease-specific physiology. In the past decade, intensive research efforts have been focused on modeling functional relationship networks, which summarize the probability of gene co-functionality relationships. Such modeling can be based on either expression data only or heterogeneous data integration. Numerous methods have been deployed to infer the functional relationship networks, while most of them target the global (non-context-specific) functional relationship networks. However, it is expected that functional relationships consistently reprogram under different tissues or biological processes. Thus, advanced methods have been developed targeting tissue-specific or developmental stage-specific networks. This article brings together the state-of-the-art functional relationship network modeling methods, emphasizes the need for heterogeneous genomic data integration and context-specific network modeling and outlines future directions for functional relationship networks.
25
Paredes-Sánchez FA, Sifuentes-Rincón AM, Segura Cabrera A, García Pérez CA, Parra Bracamonte GM, Ambriz Morales P. Associations of SNPs located at candidate genes to bovine growth traits, prioritized with an interaction networks construction approach. BMC Genet 2015. [PMID: 26198337 PMCID: PMC4511253 DOI: 10.1186/s12863-015-0247-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/17/2023]
Abstract
Background: For most domestic animal species, including bovines, it is difficult to identify causative genetic variants involved in economically relevant traits. The candidate gene approach is efficient because it investigates genes that are expected to be associated with the expression of a trait and defines whether the genetic variation present in a population is associated with phenotypic diversity. A potential limitation of this approach is the identification of candidates. This study used a bioinformatics approach to identify candidate genes via a search guided by a functional interaction network. Results: A functional interaction network tool, BosNet, was constructed for Bos taurus. Predictions for candidate genes were performed using the guilt-by-association principle in BosNet. Association analyses identified five novel markers within BosNet-prioritized genes that had significant effects on different growth traits in Charolais and Brahman cattle. Conclusions: BosNet is an excellent tool for the identification of single nucleotide polymorphisms that are potentially associated with complex traits.
Affiliation(s)
- Francisco Alejandro Paredes-Sánchez
- Laboratorio de Biotecnología Animal, Centro de Biotecnología Genómica. IPN, Boulevard del Maestro esq. Elías Piña, Col. Narciso Mendoza, Cd. Reynosa, Tam, C.P. 88710, Mexico.
| | - Ana María Sifuentes-Rincón
- Laboratorio de Biotecnología Animal, Centro de Biotecnología Genómica. IPN, Boulevard del Maestro esq. Elías Piña, Col. Narciso Mendoza, Cd. Reynosa, Tam, C.P. 88710, Mexico.
| | - Aldo Segura Cabrera
- Red de Estudios Moleculares Avanzados, Instituto de Ecología, A.C., Xalapa, Mexico.
| | - Carlos Armando García Pérez
- Laboratorio de Bioinformática, Centro de Biotecnología Genómica. IPN, Boulevard del Maestro esq. Elías Piña, Col. Narciso Mendoza, Cd. Reynosa, Tam, C.P. 88710, Mexico.
| | - Gaspar Manuel Parra Bracamonte
- Laboratorio de Biotecnología Animal, Centro de Biotecnología Genómica. IPN, Boulevard del Maestro esq. Elías Piña, Col. Narciso Mendoza, Cd. Reynosa, Tam, C.P. 88710, Mexico.
| | - Pascuala Ambriz Morales
- Laboratorio de Biotecnología Animal, Centro de Biotecnología Genómica. IPN, Boulevard del Maestro esq. Elías Piña, Col. Narciso Mendoza, Cd. Reynosa, Tam, C.P. 88710, Mexico.
26
Shim JE, Hwang S, Lee I. Pathway-Dependent Effectiveness of Network Algorithms for Gene Prioritization. PLoS One 2015; 10:e0130589. [PMID: 26091506 PMCID: PMC4474432 DOI: 10.1371/journal.pone.0130589] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 08/24/2014] [Accepted: 05/22/2015] [Indexed: 01/18/2023]
Abstract
A network-based approach has proven useful for the identification of novel genes associated with complex phenotypes, including human diseases. Because network-based gene prioritization algorithms are based on propagating information of known phenotype-associated genes through networks, the pathway structure of each phenotype might significantly affect the effectiveness of algorithms. We systematically compared two popular network algorithms with distinct mechanisms – direct neighborhood which propagates information to only direct network neighbors, and network diffusion which diffuses information throughout the entire network – in prioritization of genes for worm and human phenotypes. Previous studies reported that network diffusion generally outperforms direct neighborhood for human diseases. Although prioritization power is generally measured for all ranked genes, only the top candidates are significant for subsequent functional analysis. We found that high prioritizing power of a network algorithm for all genes cannot guarantee successful prioritization of top ranked candidates for a given phenotype. Indeed, the majority of the phenotypes that were more efficiently prioritized by network diffusion showed higher prioritizing power for top candidates by direct neighborhood. We also found that connectivity among pathway genes for each phenotype largely determines which network algorithm is more effective, suggesting that the network algorithm used for each phenotype should be chosen with consideration of pathway gene connectivity.
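The two families of algorithms compared in this abstract can be contrasted on a toy network. The sketch below is illustrative only and rests on assumptions: direct neighborhood is implemented as the summed edge weight to known (seed) genes, and network diffusion as a random walk with restart, which is one common diffusion scheme and not necessarily the one the authors used. The gene names, the `restart` value, and the function names are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    graph = defaultdict(dict)
    for u, v, weight in edges:
        graph[u][v] = weight
        graph[v][u] = weight
    return dict(graph)


def direct_neighborhood(graph, seeds):
    """Score each non-seed gene by the summed weight of its edges to seed genes."""
    return {
        gene: sum(w for nbr, w in nbrs.items() if nbr in seeds)
        for gene, nbrs in graph.items()
        if gene not in seeds
    }


def network_diffusion(graph, seeds, restart=0.5, iterations=100):
    """Random walk with restart from the seed genes (assumed diffusion scheme).

    Each step, the walker returns to a seed with probability `restart`,
    otherwise it moves to a neighbor with probability proportional to edge weight.
    """
    start = {g: (1.0 / len(seeds) if g in seeds else 0.0) for g in graph}
    degree = {g: sum(nbrs.values()) for g, nbrs in graph.items()}
    prob = dict(start)
    for _ in range(iterations):
        nxt = {g: restart * start[g] for g in graph}
        for g, nbrs in graph.items():
            for nbr, w in nbrs.items():
                nxt[nbr] += (1.0 - restart) * prob[g] * w / degree[g]
        prob = nxt
    return {g: p for g, p in prob.items() if g not in seeds}


# Toy chain A - B - C - D with A as the only known phenotype gene.
graph = build_graph([("A", "B", 1.0), ("B", "C", 1.0), ("C", "D", 1.0)])
direct = direct_neighborhood(graph, {"A"})
diffusion = network_diffusion(graph, {"A"})
print(direct)     # only B, the direct neighbor of the seed, gets a non-zero score
print(diffusion)  # B, C and D all get scores, decreasing with distance from A
```

On this chain, direct neighborhood cannot distinguish C from D, whereas diffusion ranks every gene; which behavior helps more would depend on how tightly the genes of a pathway are connected, in line with the abstract's conclusion.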
Affiliation(s)
- Jung Eun Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Sohyun Hwang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
27
Hartman JL, Stisher C, Outlaw DA, Guo J, Shah NA, Tian D, Santos SM, Rodgers JW, White RA. Yeast Phenomics: An Experimental Approach for Modeling Gene Interaction Networks that Buffer Disease. Genes (Basel) 2015; 6:24-45. [PMID: 25668739 PMCID: PMC4377832 DOI: 10.3390/genes6010024] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/20/2014] [Accepted: 01/12/2015] [Indexed: 01/10/2023]
Abstract
The genome project increased appreciation of the genetic complexity underlying disease phenotypes: many genes contribute to each phenotype and each gene contributes to multiple phenotypes. The aspiration of predicting common disease in individuals has evolved from seeking primary loci to marginal risk assignments based on many genes. Genetic interaction, defined as contributions to a phenotype that are dependent upon particular digenic allele combinations, could improve prediction of phenotype from complex genotype, but it is difficult to study in human populations. High throughput, systematic analysis of S. cerevisiae gene knockouts or knockdowns in the context of disease-relevant phenotypic perturbations provides a tractable experimental approach to derive gene interaction networks, in order to deduce by cross-species gene homology how phenotype is buffered against disease-risk genotypes. Yeast gene interaction network analysis to date has revealed biology more complex than previously imagined. This has motivated the development of more powerful yeast cell array phenotyping methods to globally model the role of gene interaction networks in modulating phenotypes (which we call yeast phenomic analysis). The article illustrates yeast phenomic technology, which is applied here to quantify gene × media interaction at higher resolution, and supports the use of human-like media for future applications of yeast phenomics for modeling human disease.
Affiliation(s)
- John L Hartman
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - Chandler Stisher
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - Darryl A Outlaw
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - Jingyu Guo
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - Najaf A Shah
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - Dehua Tian
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - Sean M Santos
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - John W Rodgers
- Department of Genetics, University of Alabama at Birmingham, 730 Hugh Kaul Human Genetics Building, 720 20th Street South, Birmingham, AL 35294, USA.
| | - Richard A White
- Department of Statistics and Michael Smith Laboratories, University of British Columbia, 3182 Earth Sciences Building, 2207 Main Mall, Vancouver, BC V6T-1Z4, Canada.
|
28
|
Xu Y, Guo M, Zou Q, Liu X, Wang C, Liu Y. System-level insights into the cellular interactome of a non-model organism: inferring, modelling and analysing functional gene network of soybean (Glycine max). PLoS One 2014; 9:e113907. [PMID: 25423109 PMCID: PMC4244207 DOI: 10.1371/journal.pone.0113907] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/23/2014] [Accepted: 10/24/2014] [Indexed: 01/30/2023] Open
Abstract
Cellular interactome, in which genes and/or their products interact on several levels, forming transcriptional regulatory-, protein interaction-, metabolic-, signal transduction networks, etc., has attracted decades of research focuses. However, such a specific type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and covering different slices of biological information. Functional gene network (FGN), a consolidated interaction network that models fuzzy and more generalized notion of gene-gene relations, have been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are yet no successful precedents of FGNs on sparsely studied non-model organisms, such as soybean (Glycine max), due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs), in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can provide system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis. 
The efforts of the study are the basis of our further comprehensive studies on the soybean functional interactome at the genome and microRNome levels. Additionally, a web tool for information retrieval and analysis of SoyFGNs can be accessed at SoyFN: http://nclab.hit.edu.cn/SoyFN.
Affiliation(s)
- Yungang Xu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Maozu Guo
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Quan Zou
- School of Information Science and Technology, Xiamen University, Xiamen, China
| | - Xiaoyan Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Yang Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
|
29
|
Grennan KS, Chen C, Gershon ES, Liu C. Molecular network analysis enhances understanding of the biology of mental disorders. Bioessays 2014; 36:606-616. [PMID: 24733456 PMCID: PMC4300946 DOI: 10.1002/bies.201300147] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
We provide an introduction to network theory, evidence to support a connection between molecular network structure and neuropsychiatric disease, and examples of how network approaches can expand our knowledge of the molecular bases of these diseases. Without systematic methods to derive their biological meanings and inter-relatedness, the many molecular changes associated with neuropsychiatric disease, including genetic variants, gene expression changes, and protein differences, present an impenetrably complex set of findings. Network approaches can potentially help integrate and reconcile these findings, as well as provide new insights into the molecular architecture of neuropsychiatric diseases. Network approaches to neuropsychiatric disease are still in their infancy, and we discuss what might be done to improve their prospects.
Affiliation(s)
| | | | - Elliot S. Gershon
- Department of Psychiatry, University of Illinois at Chicago, Chicago, IL 60607, USA
| | - Chunyu Liu
- Department of Psychiatry, University of Illinois at Chicago, Chicago, IL 60607, USA
|
30
|
Cho A, Shin J, Hwang S, Kim C, Shim H, Kim H, Kim H, Lee I. WormNet v3: a network-assisted hypothesis-generating server for Caenorhabditis elegans. Nucleic Acids Res 2014; 42:W76-82. [PMID: 24813450 PMCID: PMC4086142 DOI: 10.1093/nar/gku367] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Indexed: 12/02/2022] Open
Abstract
High-throughput experimental technologies gradually shift the paradigm of biological research from hypothesis-validation toward hypothesis-generation science. Translating diverse types of large-scale experimental data into testable hypotheses, however, remains a daunting task. We previously demonstrated that heterogeneous genomics data can be integrated into a single genome-scale gene network with high prediction power for ribonucleic acid interference (RNAi) phenotypes in Caenorhabditis elegans, a popular metazoan model in the study of developmental biology, neurobiology and genetics. Here, we present WormNet version 3 (v3), which is a new network-assisted hypothesis-generating server for C. elegans. WormNet v3 includes major updates to the base gene network, which substantially improved predictions of RNAi phenotypes. The server generates various gene network-based hypotheses using three complementary network methods: (i) a phenotype-centric approach to ‘find new members for a pathway’; (ii) a gene-centric approach to ‘infer functions from network neighbors’ and (iii) a context-centric approach to ‘find context-associated hub genes’, which is a new method to identify key genes that mediate physiology within a specific context. For example, we demonstrated that the context-centric approach can be used to identify potential molecular targets of toxic chemicals. WormNet v3 is freely accessible at http://www.inetbio.org/wormnet.
Affiliation(s)
- Ara Cho
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Junha Shin
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Sohyun Hwang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX, USA
| | - Chanyoung Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Hongseok Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Hyojin Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Hanhae Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
|
31
|
Tsai PW, Chen YT, Yang CY, Chen HF, Tan TS, Lin TW, Hsieh WP, Lan CY. The role of Mss11 in Candida albicans biofilm formation. Mol Genet Genomics 2014; 289:807-19. [PMID: 24752399 DOI: 10.1007/s00438-014-0846-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 09/30/2013] [Accepted: 03/22/2014] [Indexed: 01/08/2023]
Abstract
Candida albicans is an opportunistic human pathogen that can form a biofilm on biotic or inert surfaces such as epithelia and clinical devices. In this study, we examine the formation of C. albicans biofilm by establishing a key gene-centered network based on protein-protein interaction (PPI) and gene expression datasets. Starting from C. albicans Cph1 and Efg1, transcription factors associated with morphogenesis of biofilm formation, a network elucidates the complex cellular process and predicts potential unknown components related to biofilm formation. Subsequently, we analyzed the functions of Mss11 among these identified proteins to test the efficiency of the proposed computational approach. MSS11-deleted mutants were compared with a wild-type strain, indicating that the mutant is defective in forming a mature biofilm and partially attenuates the virulence of C. albicans in an infected mouse model. Finally, a DNA microarray analysis was conducted to identify the potential target genes of C. albicans Mss11. The findings of this study clarify complex gene or protein interaction during the biofilm formation process of C. albicans, supporting the application of a systems biology approach to study fungal pathogenesis.
Affiliation(s)
- Pei-Wen Tsai
- Institute of Molecular and Cellular Biology, National Tsing Hua University, No. 101, Section 2, Kuang-Fu Road, Hsinchu, 30013, Taiwan, ROC
|
32
|
Xu Y, Guo M, Liu X, Wang C, Liu Y. SoyFN: a knowledge database of soybean functional networks. Database (Oxford) 2014; 2014:bau019. [PMID: 24618044 PMCID: PMC3949006 DOI: 10.1093/database/bau019] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 11/21/2013] [Revised: 01/22/2014] [Accepted: 02/06/2014] [Indexed: 01/08/2023]
Abstract
Many databases for soybean genomic analysis have been built and made publicly available, but few of them contain knowledge specifically targeting the omics-level gene-gene, gene-microRNA (miRNA) and miRNA-miRNA interactions. Here, we present SoyFN, a knowledge database of soybean functional gene networks and miRNA functional networks. SoyFN provides user-friendly interfaces to retrieve, visualize, analyze and download the functional networks of soybean genes and miRNAs. In addition, it incorporates much information about KEGG pathways, gene ontology annotations and 3'-UTR sequences as well as many useful tools including SoySearch, ID mapping, Genome Browser, eFP Browser and promoter motif scan. SoyFN is a schema-free database that can be accessed as a Web service from any modern programming language using a simple Hypertext Transfer Protocol call. The Web site is implemented in Java, JavaScript, PHP, HTML and Apache, with all major browsers supported. We anticipate that this database will be useful for members of research communities both in soybean experimental science and bioinformatics. Database URL: http://nclab.hit.edu.cn/SoyFN.
Affiliation(s)
- Yungang Xu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
| | - Maozu Guo
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
| | - Xiaoyan Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
| | - Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
| | - Yang Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
|
33
|
Boucher B, Jenna S. Genetic interaction networks: better understand to better predict. Front Genet 2013; 4:290. [PMID: 24381582 PMCID: PMC3865423 DOI: 10.3389/fgene.2013.00290] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 09/15/2013] [Accepted: 11/28/2013] [Indexed: 12/21/2022] Open
Abstract
A genetic interaction (GI) between two genes generally indicates that the phenotype of a double mutant differs from what is expected from each individual mutant. In the last decade, genome scale studies of quantitative GIs were completed using mainly synthetic genetic array technology and RNA interference in yeast and Caenorhabditis elegans. These studies raised questions regarding the functional interpretation of GIs, the relationship of genetic and molecular interaction networks, the usefulness of GI networks to infer gene function and co-functionality, the evolutionary conservation of GI, etc. While GIs have been used for decades to dissect signaling pathways in genetic models, their functional interpretations are still not trivial. The existence of a GI between two genes does not necessarily imply that these two genes code for interacting proteins or that the two genes are even expressed in the same cell. In fact, a GI only implies that the two genes share a functional relationship. These two genes may be involved in the same biological process or pathway; or they may also be involved in compensatory pathways with unrelated apparent function. Considering the powerful opportunity to better understand gene function, genetic relationship, robustness and evolution, provided by a genome-wide mapping of GIs, several in silico approaches have been employed to predict GIs in unicellular and multicellular organisms. Most of these methods used weighted data integration. In this article, we will review the later knowledge acquired on GI networks in metazoans by looking more closely into their relationship with pathways, biological processes and molecular complexes but also into their modularity and organization. We will also review the different in silico methods developed to predict GIs and will discuss how the knowledge acquired on GI networks can be used to design predictive tools with higher performances.
Affiliation(s)
- Benjamin Boucher
- Laboratory of Integrative Genomics and Cell Signalling, Pharmaqam, Biomed, Department of Chemistry, Université du Québec à Montréal Montréal, QC, Canada
| | - Sarah Jenna
- Laboratory of Integrative Genomics and Cell Signalling, Pharmaqam, Biomed, Department of Chemistry, Université du Québec à Montréal Montréal, QC, Canada
|
34
|
Florido J, Pomares H, Rojas I, Guillén A, Ortuno F, Urquiza J. An effective, practical and low computational cost framework for the integration of heterogeneous data to predict functional associations between proteins by means of Artificial Neural Networks. Neurocomputing 2013. [DOI: 10.1016/j.neucom.2012.11.040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/27/2022]
|
35
|
Engin HB, Guney E, Keskin O, Oliva B, Gursoy A. Integrating structure to protein-protein interaction networks that drive metastasis to brain and lung in breast cancer. PLoS One 2013; 8:e81035. [PMID: 24278371 PMCID: PMC3838352 DOI: 10.1371/journal.pone.0081035] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 06/22/2013] [Accepted: 10/05/2013] [Indexed: 11/18/2022] Open
Abstract
Blocking specific protein interactions can lead to human diseases. Accordingly, protein interactions and the structural knowledge on interacting surfaces of proteins (interfaces) have an important role in predicting the genotype-phenotype relationship. We have built the phenotype specific sub-networks of protein-protein interactions (PPIs) involving the relevant genes responsible for lung and brain metastasis from primary tumor in breast cancer. First, we selected the PPIs most relevant to metastasis causing genes (seed genes), by using the "guilt-by-association" principle. Then, we modeled structures of the interactions whose complex forms are not available in Protein Databank (PDB). Finally, we mapped mutations to interface structures (real and modeled), in order to spot the interactions that might be manipulated by these mutations. Functional analyses performed on these sub-networks revealed the potential relationship between immune system-infectious diseases and lung metastasis progression, but this connection was not observed significantly in the brain metastasis. Besides, structural analyses showed that some PPI interfaces in both metastasis sub-networks are originating from microbial proteins, which in turn were mostly related with cell adhesion. Cell adhesion is a key mechanism in metastasis, therefore these PPIs may be involved in similar molecular pathways that are shared by infectious disease and metastasis. Finally, by mapping the mutations and amino acid variations on the interface regions of the proteins in the metastasis sub-networks we found evidence for some mutations to be involved in the mechanisms differentiating the type of the metastasis.
Affiliation(s)
- H. Billur Engin
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
| | - Emre Guney
- Structural Bioinformatics Group (GRIB), Universitat Pompeu Fabra
| | - Ozlem Keskin
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
| | - Baldo Oliva
- Structural Bioinformatics Group (GRIB), Universitat Pompeu Fabra
| | - Attila Gursoy
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Istanbul, Turkey
|
36
|
Abstract
Proteins are not monolithic entities; rather, they can contain multiple domains that mediate distinct interactions, and their functionality can be regulated through post-translational modifications at multiple distinct sites. Traditionally, network biology has ignored such properties of proteins and has instead examined either the physical interactions of whole proteins or the consequences of removing entire genes. In this Review, we discuss experimental and computational methods to increase the resolution of protein-protein, genetic and drug-gene interaction studies to the domain and residue levels. Such work will be crucial for using interaction networks to connect sequence and structural information, and to understand the biological consequences of disease-associated mutations, which will hopefully lead to more effective therapeutic strategies.
|
37
|
Kim H, Shin J, Kim E, Kim H, Hwang S, Shim JE, Lee I. YeastNet v3: a public database of data-specific and integrated functional gene networks for Saccharomyces cerevisiae. Nucleic Acids Res 2013; 42:D731-6. [PMID: 24165882 PMCID: PMC3965021 DOI: 10.1093/nar/gkt981] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 12/24/2022] Open
Abstract
Saccharomyces cerevisiae, i.e. baker’s yeast, is a widely studied model organism in eukaryote genetics because of its simple protocols for genetic manipulation and phenotype profiling. The high abundance of publicly available data that has been generated through diverse ‘omics’ approaches has led to the use of yeast for many systems biology studies, including large-scale gene network modeling to better understand the molecular basis of the cellular phenotype. We have previously developed a genome-scale gene network for yeast, YeastNet v2, which has been used for various genetics and systems biology studies. Here, we present an updated version, YeastNet v3 (available at http://www.inetbio.org/yeastnet/), that significantly improves the prediction of gene–phenotype associations. The extended genome in YeastNet v3 covers up to 5818 genes (∼99% of the coding genome) wired by 362 512 functional links. YeastNet v3 provides a new web interface to run the tools for network-guided hypothesis generations. YeastNet v3 also provides edge information for all data-specific networks (∼2 million functional links) as well as the integrated networks. Therefore, users can construct alternative versions of the integrated network by applying their own data integration algorithm to the same data-specific links.
Affiliation(s)
- Hanhae Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
|
38
|
Kim E, Kim H, Lee I. JiffyNet: a web-based instant protein network modeler for newly sequenced species. Nucleic Acids Res 2013; 41:W192-7. [PMID: 23685435 PMCID: PMC3692116 DOI: 10.1093/nar/gkt419] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022] Open
Abstract
Revolutionary DNA sequencing technology has enabled affordable genome sequencing for numerous species. Thousands of species already have completely decoded genomes, and tens of thousands more are in progress. Naturally, parallel expansion of the functional parts list library is anticipated, yet genome-level understanding of function also requires maps of functional relationships, such as functional protein networks. Such networks have been constructed for many sequenced species including common model organisms. Nevertheless, the majority of species with sequenced genomes still have no protein network models available. Moreover, biologists might want to obtain protein networks for their species of interest on completion of the genome projects. Therefore, there is high demand for accessible means to automatically construct genome-scale protein networks based on sequence information from genome projects only. Here, we present a public web server, JiffyNet, specifically designed to instantly construct genome-scale protein networks based on associalogs (functional associations transferred from a template network by orthology) for a query species with only protein sequences provided. Assessment of the networks by JiffyNet demonstrated generally high predictive ability for pathway annotations. Furthermore, JiffyNet provides network visualization and analysis pages for wide variety of molecular concepts to facilitate network-guided hypothesis generation. JiffyNet is freely accessible at http://www.jiffynet.org.
Affiliation(s)
- Eiru Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 120-749, Korea
|
39
|
Lee I. Network approaches to the genetic dissection of phenotypes in animals and humans. Anim Cells Syst (Seoul) 2013. [DOI: 10.1080/19768354.2013.789076] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022] Open
|
40
|
Fang H, Gough J. A disease-drug-phenotype matrix inferred by walking on a functional domain network. Mol Biosyst 2013; 9:1686-96. [PMID: 23462907 DOI: 10.1039/c3mb25495j] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022]
Abstract
Protein domains are classified as units of structure, evolution and function, and thus form the molecular backbone of biosphere. Although functional networks at the protein level have been reported to be of value in predicting diseases (phenotypes or drugs), they have not previously been applied at the sub-protein resolution (protein domain in this case). We herein introduce a domain network with a functional perspective. This network has nodes consisting of protein domains (at the superfamily/evolutionary level), with edges weighted by the semantic similarity according to domain-centric Gene Ontology (dcGO) annotations, which henceforth we call "dcGOnet". By globally exploring this network via a random walk, we demonstrate its predictive value on disease, drug, or phenotype-related ontologies. On cross-validation recovering ontology labels for domains, we achieve an overall area under the ROC curve of 89.0% for drugs, 87.3% for diseases, 87.6% for human phenotypes and 88.2% for mouse phenotypes. We show that the performance using global information from this network is significantly better than using local information, and also illustrate that the better performance is not sensitive to network size, or the choice of algorithm parameters, and is universal to different ontologies. Based on the dcGOnet and its global properties, we further develop an approach to build a disease-drug-phenotype matrix. The predicted interconnections are statistically supported using a novel randomization procedure, and are also empirically supported by inspection for biological relevance. Most of the high-ranking predictions recover connections that are well known, but others uncover connections that have only suggestive or obscure support in the literature; we show that these are missed by simpler methods, in particular for drug-disease connections. 
The value of this work is threefold: we describe a general methodology and make the software available, we provide the functional domain network itself, and the ranked drug-disease-phenotype matrix provides rich targets for investigation. All three can be found at .
Affiliation(s)
- Hai Fang
- Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, UK.
|
41
|
Abstract
To what extent can variation in phenotypic traits such as disease risk be accurately predicted in individuals? In this Review, I highlight recent studies in model organisms that are relevant both to the challenge of accurately predicting phenotypic variation from individual genome sequences ('whole-genome reverse genetics') and for understanding why, in many cases, this may be impossible. These studies argue that only by combining genetic knowledge with in vivo measurements of biological states will it be possible to make accurate genetic predictions for individual humans.
|
42
|
Lee J, Lee J. Hypoxia-inducible Factor-1 (HIF-1)-independent hypoxia response of the small heat shock protein hsp-16.1 gene regulated by chromatin-remodeling factors in the nematode Caenorhabditis elegans. J Biol Chem 2012; 288:1582-9. [PMID: 23229554 DOI: 10.1074/jbc.m112.401554] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 01/28/2023] Open
Abstract
Oxygen deprivation is accompanied by the coordinated expression of numerous hypoxia-responsive genes, many of which are controlled by hypoxia-inducible factor-1 (HIF-1). However, the cellular response to hypoxia is not likely to be mediated by HIF-1 alone, and little is known about HIF-1-independent hypoxia responses. To better establish the molecular mechanisms of HIF-1-independent hypoxia responses, we sought to characterize the molecular basis of the hypoxia response of the hsp-16.1 gene in the nematode Caenorhabditis elegans; this gene has been shown to be induced by hypoxia independently of hif-1. Using affinity purification followed by LC-MS/MS, we identified HMG-1.2 as a protein that binds to a specific promoter region under hypoxic conditions. By systematic prediction followed by validation of these interactions through RNAi, we identified the chromatin modifiers isw-1 and hda-1, histone H4, and NURF-1 chromatin-remodeling factors as new components of the hif-1-independent hypoxia response. These data suggest that the modulation of nucleosome positioning at the hsp-16.1 promoter may be important for the hypoxia response. In addition, we found that calcineurin acts independently of hif-1 to modulate the cellular response to hypoxia and that calcium ions are necessary for the induction of hsp-16.1 under hypoxic conditions.
Affiliation(s)
- Jihyun Lee
- Institute of Molecular Biology and Genetics, Research Center for Cellulomics, Department of Biological Sciences, World Class University Department of Biophysics and Chemical Biology, Seoul National University, Seoul 151-742, South Korea
|
43
|
Abstract
Background Co-expression based Cancer Modules (CMs) are sets of genes that act in concert to carry out specific functions in different cancer types, and are constructed by exploiting gene expression profiles related to specific clinical conditions or expression signatures associated to specific processes altered in cancer. Unfortunately, genes involved in cancer are not always detectable using only expression signatures or co-expressed sets of genes, and in principle other types of functional interactions should be exploited to obtain a comprehensive picture of the molecular mechanisms underlying the onset and progression of cancer. Results We propose a novel semi-supervised method to rank genes with respect to CMs using networks constructed from different sources of functional information, not limited to gene expression data. It exploits on the one hand local learning strategies through score functions that extend the guilt-by-association approach, and on the other hand global learning strategies through graph kernels embedded in the score functions, able to take into account the overall topology of the network. The proposed kernelized score functions compare favorably with other state-of-the-art semi-supervised machine learning methods for gene ranking in biological networks and scales well with the number of genes, thus allowing fast processing of very large gene networks. Conclusions The modular nature of kernelized score functions provides an algorithmic scheme from which different gene ranking algorithms can be derived, and the results show that using integrated functional networks we can successfully predict CMs defined mainly through expression signatures obtained from gene expression data profiling. A preliminary analysis of top ranked "false positive" genes shows that our approach could be in perspective applied to discover novel genes involved in the onset and progression of tumors related to specific CMs.
Affiliation(s)
- Matteo Re
- Dipartimento di Informatica, Università degli Studi di Milano, via Comelico 39/41, 20135 Milano MI, Italia
|
44
|
Dissecting the gene network of dietary restriction to identify evolutionarily conserved pathways and new functional genes. PLoS Genet 2012; 8:e1002834. [PMID: 22912585 PMCID: PMC3415404 DOI: 10.1371/journal.pgen.1002834] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 02/22/2012] [Accepted: 06/04/2012] [Indexed: 01/19/2023] Open
Abstract
Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that, if genetically altered, disrupt or prevent the life-extending effects of DR. We called these DR-essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. So that other researchers can benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR-essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR-essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR-essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum conditions, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR-induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple organisms led us to suggest that DR commonly suppresses translation, while stimulating an ancient reproduction-related process.
Dietary restriction has been shown to extend lifespan in diverse, evolutionarily distant species, yet its underlying mechanisms remain unknown. We first constructed a database of genes essential for the life-extending effects of dietary restriction in various model organisms and then studied their interactions using a variety of network and systems biology approaches. This enabled us to predict novel genes related to dietary restriction, which we validated experimentally in yeast. By comparing large-scale data compilations (interactomes and transcriptomes) from multiple organisms, we were able to condense this -omics information to the most conserved essential elements, eliminating species-specific adaptive responses. These results lead us to the rather surprising conclusion that lifespan extension by a restricted diet may commonly exploit an ancient rejuvenation process derived from gametogenesis.
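A claim such as "more molecular interactions than expected by chance" is often checked with a permutation test. The sketch below is an illustrative assumption, not the procedure used in the paper: it compares the mean interaction degree of a gene set with that of random gene sets of the same size, and the gene names and degrees are made up.

```python
import random

def mean_degree(genes, degree):
    """Average number of interaction partners across a list of genes."""
    return sum(degree[g] for g in genes) / len(genes)

def permutation_p_value(gene_set, degree, n_perm=10000, seed=0):
    """One-sided empirical p-value: how often does a random gene set of the
    same size reach a mean degree at least as high as the observed one?"""
    rng = random.Random(seed)
    universe = sorted(degree)
    observed = mean_degree(gene_set, degree)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(universe, len(gene_set))
        if mean_degree(sample, degree) >= observed:
            hits += 1
    # The add-one correction avoids reporting an impossible p-value of zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy interactome: 100 genes, five of which are hubs.
degree = {f"g{i}": 2 for i in range(100)}
for hub in ("g0", "g1", "g2", "g3", "g4"):
    degree[hub] = 30

observed, p = permutation_p_value(["g0", "g1", "g2"], degree)
print(observed, p)
```

A real analysis would usually also control for confounders such as study bias in the interactome, which this sketch ignores.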
|
45
|
Wang X, Castro MA, Mulder KW, Markowetz F. Posterior association networks and functional modules inferred from rich phenotypes of gene perturbations. PLoS Comput Biol 2012; 8:e1002566. [PMID: 22761558 PMCID: PMC3386165 DOI: 10.1371/journal.pcbi.1002566] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 02/23/2012] [Accepted: 05/03/2012] [Indexed: 11/19/2022] Open
Abstract
Combinatorial gene perturbations provide rich information for a systematic exploration of genetic interactions. Despite successful applications to bacteria and yeast, the scalability of this approach remains a major challenge for higher organisms such as humans. Here, we report a novel experimental and computational framework to efficiently address this challenge by limiting the 'search space' for important genetic interactions. We propose to integrate rich phenotypes of multiple single gene perturbations to robustly predict functional modules, which can subsequently be subjected to further experimental investigations such as combinatorial gene silencing. We present posterior association networks (PANs) to predict functional interactions between genes, estimated using a Bayesian mixture modelling approach. The major advantage of this approach over conventional hypothesis tests is that prior knowledge can be incorporated to enhance predictive power. We demonstrate in a simulation study and on biological data that integrating complementary information greatly improves prediction accuracy. To search for significant modules, we perform hierarchical clustering with multiscale bootstrap resampling. We demonstrate the power of the proposed methodologies in applications to Ewing's sarcoma and human adult stem cells using publicly available and custom-generated data, respectively. In the former application, we identify a gene module including many confirmed and highly promising therapeutic targets. Genes in the module are also significantly overrepresented in signalling pathways that are known to be critical for proliferation of Ewing's sarcoma cells. In the latter application, we predict a functional network of chromatin factors controlling epidermal stem cell fate. Further examinations using ChIP-seq, ChIP-qPCR and RT-qPCR reveal that the basis of their genetic interactions may arise from transcriptional cross-regulation. A Bioconductor package implementing PAN is freely available online at http://bioconductor.org/packages/release/bioc/html/PANR.html.
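The first step of such an approach, scoring how similar two genes' perturbation phenotypes are, can be sketched as follows. This is a simplified illustration, not the PANR package's API: it assumes cosine similarity as the association statistic and uses a fixed cutoff as a stand-in for the posterior probabilities that the Bayesian mixture model would provide. All names and values are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two phenotype profiles of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def association_edges(profiles, cutoff=0.8):
    """Keep gene pairs whose profiles are strongly similar (positive score)
    or strongly opposed (negative score). The fixed cutoff is an assumption
    replacing a model-based significance call."""
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        score = cosine(profiles[g1], profiles[g2])
        if abs(score) >= cutoff:
            edges[(g1, g2)] = score
    return edges

# Hypothetical z-scores of four single-gene knockdowns across four phenotypes.
profiles = {
    "geneA": [2.0, -1.0, 0.5, 1.5],
    "geneB": [1.8, -0.9, 0.4, 1.6],
    "geneC": [-2.1, 1.1, -0.6, -1.4],
    "geneD": [0.1, 1.0, -1.0, 0.2],
}
edges = association_edges(profiles)
for pair, score in edges.items():
    print(pair, round(score, 3))
```

The resulting signed edges could then be passed to a clustering step to search for modules.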
Affiliation(s)
- Xin Wang
- Cancer Research UK Cambridge Research Institute, Cambridge, Cambridgeshire, United Kingdom
- Department of Oncology, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom
- Mauro A. Castro
- Cancer Research UK Cambridge Research Institute, Cambridge, Cambridgeshire, United Kingdom
- Department of Oncology, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom
- Klaas W. Mulder
- Cancer Research UK Cambridge Research Institute, Cambridge, Cambridgeshire, United Kingdom
- Florian Markowetz
- Cancer Research UK Cambridge Research Institute, Cambridge, Cambridgeshire, United Kingdom
- Department of Oncology, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom
|
46
|
Fortes MRS, Snelling WM, Reverter A, Nagaraj SH, Lehnert SA, Hawken RJ, DeAtley KL, Peters SO, Silver GA, Rincon G, Medrano JF, Islas-Trejo A, Thomas MG. Gene network analyses of first service conception in Brangus heifers: use of genome and trait associations, hypothalamic-transcriptome information, and transcription factors. J Anim Sci 2012; 90:2894-906. [PMID: 22739780 DOI: 10.2527/jas.2011-4601] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Indexed: 12/12/2022] Open
Abstract
Measures of heifer fertility are economically relevant traits for beef production systems, and knowledge of candidate genes could be incorporated into future genomic selection strategies. Ten traits related to growth and fertility were measured in 890 Brangus heifers (3/8 Brahman × 5/8 Angus, from 67 sires). These traits were: BW and hip height adjusted to 205 and 365 d of age, postweaning ADG, yearling assessment of carcass traits (i.e., back fat thickness, intramuscular fat, and LM area), as well as heifer pregnancy and first service conception (FSC). These fertility traits were collected from controlled breeding seasons initiated with estrous synchronization and AI targeting heifers to calve by 24 mo of age. The BovineSNP50 BeadChip was used to ascertain 53,692 SNP genotypes for ∼802 heifers. Association analyses of genotypes and phenotypes were performed and SNP effects were estimated for each trait. Minimally associated SNP (P < 0.05) and their effects across the 10 traits formed the basis for an association weight matrix and its derived gene network related to FSC (57.3% success and heritability = 0.06 ± 0.05). These analyses yielded 1,555 important SNP, from which genes linked by 113,873 correlations within a network were inferred. Specifically, 1,386 SNP were nodes and the 5,132 strongest correlations (|r| ≥ 0.90) were edges. The network was filtered with genes queried from a transcriptome resource created from deep sequencing of RNA (i.e., RNA-Seq) from the hypothalamus of a prepubertal and a postpubertal Brangus heifer. The remaining hypothalamic-influenced network contained 978 genes connected by 2,560 edges or predicted gene interactions. This hypothalamic gene network was enriched with genes involved in axon guidance, which is a pathway known to influence pulsatile release of LHRH. There were 5 transcription factors with 21 or more connections: ZMAT3, STAT6, RFX4, PLAGL1, and NR6A1 for FSC. The SNP that identified these genes were intragenic and were on chromosomes 1, 5, 9, and 11. Chromosome 5 harbored both STAT6 and RFX4. The large number of interactions and genes observed with network analyses of multiple sources of genomic data (i.e., GWAS and RNA-Seq) supports the concept of FSC being a polygenic trait.
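The step from an association weight matrix to a network with |r| ≥ 0.90 edges can be illustrated with a small sketch. It is a simplified assumption of how such a matrix might be thresholded, not the authors' pipeline: each row holds hypothetical SNP effects across traits, Pearson correlations are computed between rows, and only strong correlations are kept as edges.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length effect vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0

def awm_network(awm, threshold=0.90):
    """Return (snp_a, snp_b, r) for every pair with |r| at or above the threshold."""
    edges = []
    for a, b in combinations(sorted(awm), 2):
        r = pearson(awm[a], awm[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges

# Hypothetical association weight matrix: rows are SNP, columns are traits.
awm = {
    "snp1": [1.0, 2.0, 3.0, 4.0],
    "snp2": [2.1, 3.9, 6.2, 8.0],
    "snp3": [4.0, 3.1, 1.9, 1.0],
    "snp4": [1.0, -1.0, 1.0, -1.0],
}
edges = awm_network(awm)
for a, b, r in edges:
    print(a, b, round(r, 3))
```

In this toy matrix snp4 has no strong correlation with any other row, so it would not appear as a connected node in the network.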
Affiliation(s)
- M R S Fortes
- School of Veterinary Science, The University of Queensland, Gatton Campus, QLD 4343, Australia
|
47
|
A Resource of Quantitative Functional Annotation for Homo sapiens Genes. G3: Genes, Genomes, Genetics 2012; 2:223-33. [PMID: 22384401 PMCID: PMC3284330 DOI: 10.1534/g3.111.000828] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 08/04/2011] [Accepted: 11/23/2011] [Indexed: 01/31/2023]
Abstract
The body of human genomic and proteomic evidence continues to grow at ever-increasing rates, while annotation efforts struggle to keep pace. A surprisingly small fraction of human genes have clear, documented associations with specific functions, and new functions continue to be found for characterized genes. Here we assembled an integrated collection of diverse genomic and proteomic data for 21,341 human genes and make quantitative associations of each to 4333 Gene Ontology terms. We combined guilt-by-profiling and guilt-by-association approaches to exploit features unique to the data types. Performance was evaluated by cross-validation, prospective validation, and by manual evaluation with the biological literature. Functional-linkage networks were also constructed, and their utility was demonstrated by identifying candidate genes related to a glioma FLN using a seed network from genome-wide association studies. Our annotations are presented—alongside existing validated annotations—in a publicly accessible and searchable web interface.
|
48
|
Gunsalus KC, Rhrissorrakrai K. Networks in Caenorhabditis elegans. Curr Opin Genet Dev 2011; 21:787-98. [PMID: 22054717 DOI: 10.1016/j.gde.2011.10.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/01/2011] [Accepted: 10/11/2011] [Indexed: 10/15/2022]
Abstract
The network paradigm has become a pervasive theme in biology over the last decade, as increasingly large functional genomic datasets are being collected to interrogate regulatory influences, physical interactions, and genetic dependencies between genes, transcripts, and proteins. These 'molecular interaction' networks can be analyzed collectively and individually to define their global architecture and local patterns of connectivity. These structural features ultimately underlie functional properties such as robustness, modularity, component circuitry (e.g. feedback loops), dynamics, and responses to perturbations. This review focuses on recent progress in elucidating molecular interaction networks using different kinds of functional assays in the classical genetic model for animal development, the roundworm Caenorhabditis elegans, with representative examples to illustrate current directions in different areas of network biology.
Affiliation(s)
- Kristin C Gunsalus
- Center for Genomics and Systems Biology and Department of Biology, New York University, 12 Waverly Place, 8th floor, New York, NY 10012, USA.
|
49
|
Hwang S, Rhee SY, Marcotte EM, Lee I. Systematic prediction of gene function in Arabidopsis thaliana using a probabilistic functional gene network. Nat Protoc 2011; 6:1429-42. [PMID: 21886106 DOI: 10.1038/nprot.2011.372] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Indexed: 11/09/2022]
Abstract
AraNet is a functional gene network for the reference plant Arabidopsis and has been constructed in order to identify new genes associated with plant traits. It is highly predictive for diverse biological pathways and can be used to prioritize genes for functional screens. Moreover, AraNet provides a web-based tool with which plant biologists can efficiently discover novel functions of Arabidopsis genes (http://www.functionalnet.org/aranet/). This protocol explains how to conduct network-based prediction of gene functions using AraNet and how to interpret the prediction results. Functional discovery in plant biology is facilitated by combining candidate prioritization by AraNet with focused experimental tests.
Affiliation(s)
- Sohyun Hwang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
|
50
|
Elefsinioti A, Saraç ÖS, Hegele A, Plake C, Hubner NC, Poser I, Sarov M, Hyman A, Mann M, Schroeder M, Stelzl U, Beyer A. Large-scale de novo prediction of physical protein-protein association. Mol Cell Proteomics 2011; 10:M111.010629. [PMID: 21836163 DOI: 10.1074/mcp.m111.010629] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 01/01/2023] Open
Abstract
Information about the physical association of proteins is extensively used for studying cellular processes and disease mechanisms. However, complete experimental mapping of the human interactome will remain prohibitively difficult in the near future. Here we present a map of predicted human protein interactions that distinguishes functional association from physical binding. Our network classifies more than 5 million protein pairs, predicting 94,009 new interactions with high confidence. We experimentally tested a subset of these predictions using yeast two-hybrid analysis and affinity purification followed by quantitative mass spectrometry. In this way we identified 462 new protein-protein interactions and confirmed the predictive power of the network. These independent experiments address potential issues of circular reasoning and are a distinctive feature of this work. Analysis of the physical interactome reveals subnetworks mediating between different functional and physical subunits of the cell. Finally, we demonstrate the utility of the network for the analysis of molecular mechanisms of complex diseases by applying it to genome-wide association studies of neurodegenerative diseases. This analysis provides new evidence implicating TOMM40 as a factor involved in Alzheimer's disease. The network provides a high-quality resource for the analysis of genomic data sets and genetic association studies in particular. Our interactome is available via the hPRINT web server at: www.print-db.org.
|