1
|
Use of a combined antibacterial synergy approach and the ANNOgesic tool to identify novel targets within the gene networks of multidrug-resistant Klebsiella pneumoniae. mSystems 2024; 9:e0087723. [PMID: 38349171 PMCID: PMC10949472 DOI: 10.1128/msystems.00877-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 01/13/2024] [Indexed: 03/20/2024]
Abstract
Since the 1980s, the development of new drug classes for the treatment of multidrug-resistant Klebsiella pneumoniae has become limited, highlighting the urgent need for novel antibiotics. To address this challenge, this study aimed to explore the synergistic interactions between chemical compounds and representative antibiotics, such as carbapenem and colistin. The primary objective of this study was not only to mitigate the adverse impact of multidrug-resistant K. pneumoniae on public health but also to establish a sustainable balance among humans, animals, and the environment. Phenotypic measurements were conducted using the broth microdilution technique to determine the drug sensitivity of bacterial strains. Additionally, a genotypic approach was employed, involving traditional RNA sequencing analysis to identify differentially expressed genes and the computational ANNOgesic tool to detect noncoding RNAs. This study revealed the existence of various pathways and regulatory RNA elements that form a functional network. These pathways, characterized by the expression of specific genes, contribute to the combined treatment effect and bacterial survival strategies. The connections between pathways are facilitated by regulatory RNA elements that respond to environmental changes. These findings suggest an adaptive response of bacteria to harsh environmental conditions. IMPORTANCE Noncoding RNAs were identified as key players in post-transcriptional regulation. Moreover, this study predicted the presence of novel small regulatory RNAs that interact with target genes, as well as the involvement of riboswitches and RNA thermometers in conjunction with associated genes. These findings will contribute to the discovery of potential antimicrobial therapeutic candidates. 
Overall, this study offers valuable insights into the synergistic effects of chemical compounds and antibiotics, highlighting the role of regulatory RNA elements in bacterial response and survival strategies. The identification of novel noncoding RNAs and their interactions with target genes, riboswitches, and RNA thermometers holds promise for the development of antimicrobial therapies.
|
2
|
RNA sequencing and bioinformatics analysis revealed PACSIN3 as a potential novel biomarker for platinum resistance in epithelial ovarian cancer. J Gene Med 2022; 24:e3452. [PMID: 36170157 DOI: 10.1002/jgm.3452] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/13/2022] [Revised: 09/08/2022] [Accepted: 09/21/2022] [Indexed: 11/06/2022]
Abstract
BACKGROUND Failure to respond to treatment in epithelial ovarian cancer (EOC) can often be attributed to resistance to platinum-based chemotherapy. However, the mechanisms and candidate biomarkers associated with platinum resistance are yet to be elucidated, even though many researchers have performed related studies. METHODS We performed RNA sequencing of clinical specimens obtained from patients with platinum-sensitive or platinum-resistant EOC. Furthermore, various bioinformatics approaches, including spatial analysis of functional enrichment (SAFE), were used to identify key regulators and associated underlying mechanisms of platinum resistance in EOC. RESULTS Through RNA sequencing, we identified 263 differentially expressed genes, of which 98 were upregulated and 165 were downregulated, and subjected them to Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses, the results of which reflected the traditional characteristics of platinum resistance. Subsequently, the gene interaction network and module analysis by SAFE identified protein kinase C and casein kinase substrate in neurons 3 (PACSIN3) as the only upregulated hub gene, and neurotensin (NTS) and KIAA0319 as downregulated hub genes in platinum-resistant EOC. We selected PACSIN3 for further analysis as it has not been studied in relation to response to platinum-based chemotherapy. PACSIN3 was significantly upregulated in ovarian cancer cells as compared to iHOSE cells. In addition, cisplatin-induced apoptosis was measured in PACSIN3 knockout OVCA433 and the BRCA-mutated EOC cell line SNU251 by a FACS-based Annexin-V/PI double staining assay, which revealed a significant increase in apoptosis. CONCLUSIONS Taken together, this study presents PACSIN3 as a promising predictive biomarker associated with platinum resistance, especially in BRCA-mutated epithelial ovarian cancers.
|
3
|
Abstract
Background Keloid scarring is a fibroproliferative disease caused by aberrant genetic activation with an unclear underlying mechanism. Genetic predisposition, aberrant cellular responses to environmental factors, increased inflammatory cytokines and epithelial–mesenchymal transition (EMT) phenomena are known as major contributors. In this study, we aimed to identify the molecular drivers that initiate keloid pathogenesis. Methods Bulk tissue RNA sequencing analyses of keloid and normal tissues along with ex vivo and in vitro tests were performed to identify the contributing genes to keloid pathogenesis. An animal model of inflammatory keloid scarring was reproduced by replication of a skin fibrosis model with intradermal bleomycin injection in C57BL/6 mice. Results Gene set enrichment analysis revealed upregulation of Wnt family member 5A (WNT5A) expression and genes associated with EMT in keloid tissues. Consistently, human keloid tissues and the bleomycin-induced skin fibrosis animal model showed significantly increased expression of WNT5A and EMT markers. Increased activation of the interleukin (IL)-6/Janus kinase (JAK)/signal transducer and activator of transcription (STAT) pathway and subsequent elevation of EMT markers was also observed in keratinocytes co-cultured with WNT5A-activated fibroblasts or keloid fibroblasts. Furthermore, WNT5A silencing and the blockage of IL-6 secretion via neutralizing IL-6 antibody reversed hyperactivation of the STAT pathway and EMT markers in keratinocytes. Lastly, STAT3 silencing significantly reduced the EMT-like phenotypes in both keratinocytes and IL-6-stimulated keratinocytes. Conclusions Intercellular communication via the WNT5A and STAT pathways possibly underlies a partial mechanism of EMT-like phenomena in keloid pathogenesis. IL-6 secreted from WNT5A-activated fibroblasts or keloid fibroblasts activates the JAK/STAT signaling pathway in adjacent keratinocytes which in turn express EMT markers. 
A better understanding of keloid development and the role of WNT5A in EMT will promote the development of next-generation targeted treatments for keloid scars.
|
4
|
Abstract
Proteins are major functional molecules that physically and functionally interact to carry out cellular processes. The physical interactions are generally mediated by domain-level interactions. Thus, novel protein-protein interactions can be predicted using various computational methods based on domain-domain interactions, using resolved structures of protein complexes. Functional protein interactions can be inferred based on shared domains between proteins, since proteins involved in the same biological processes tend to harbor common domains. We recently developed a method of inferring functional interactions between proteins using associations between their domain compositions, which can be represented as domain profiles. Since the method requires only protein domain annotations, it can be easily applied to any species with a sequenced genome. Here, we describe in detail the method of generating domain profiles for proteins and measuring the association between them to infer functional interactions between proteins. We also demonstrate that domain profile association can be used to successfully construct a large-scale functional network of human proteins.
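The idea of comparing proteins by their domain profiles can be illustrated with a minimal sketch. The protein names, domain identifiers, and the choice of plain mutual information over binary presence/absence vectors are assumptions made for illustration; the abstract does not specify the exact association measure, so this should not be read as the published implementation.

```python
import math
from collections import Counter

# Hypothetical annotations (protein -> set of domain IDs); illustrative only.
ANNOTATIONS = {
    "protA": {"D1", "D2"},
    "protB": {"D1", "D2"},
    "protC": {"D1", "D3"},
    "protD": {"D4"},
}

def domain_profiles(annotations):
    """Return (sorted domain list, protein -> binary presence/absence vector)."""
    domains = sorted(set().union(*annotations.values()))
    profiles = {
        protein: [1 if d in owned else 0 for d in domains]
        for protein, owned in annotations.items()
    }
    return domains, profiles

def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length binary vectors."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi
```

In this toy data set, protA and protB share an identical domain composition and therefore score higher than protA and protC, whose profiles are statistically independent. A real analysis would need a correction for the fact that mutual information also rewards perfectly complementary profiles.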
|
5
|
Pathway-specific protein domains are predictive for human diseases. PLoS Comput Biol 2019; 15:e1007052. [PMID: 31075101 PMCID: PMC6530867 DOI: 10.1371/journal.pcbi.1007052] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/14/2018] [Revised: 05/22/2019] [Accepted: 04/19/2019] [Indexed: 01/04/2023]
Abstract
Protein domains are basic functional units of proteins. Many protein domains are pervasive among diverse biological processes, yet some are associated with specific pathways. Human complex diseases are generally viewed as pathway-level disorders. Therefore, we hypothesized that pathway-specific domains could be highly informative for human diseases. To test the hypothesis, we developed a network-based scoring scheme to quantify specificity of domain-pathway associations. We first generated domain profiles for human proteins, then constructed a co-pathway protein network based on the associations between domain profiles. Based on the score, we classified human protein domains into pathway-specific domains (PSDs) and non-specific domains (NSDs). We found that PSDs contained more pathogenic variants than NSDs. PSDs were also enriched for disease-associated mutations that disrupt protein-protein interactions (PPIs) and tend to have a moderate number of domain interactions. These results suggest that mutations in PSDs are likely to disrupt within-pathway PPIs, resulting in functional failure of pathways. Finally, we demonstrated the prediction capacity of PSDs for disease-associated genes with experimental validations in zebrafish. Taken together, the network-based quantitative method of modeling domain-pathway associations presented herein suggested underlying mechanisms of how protein domains associated with specific pathways influence mutational impacts on diseases via perturbations in within-pathway PPIs, and provided a novel genomic feature for interpreting genetic variants to facilitate the discovery of human disease genes. Protein domains are basic functional units of proteins, yet domain-based pathway annotations for proteins are challenging tasks because many domains are pervasive among diverse pathways. Therefore, we developed a network-based scoring scheme to measure pathway specificity of domains, and then used it to identify pathway-specific domains. 
Surprisingly, we observed substantially more disease mutations in pathway-specific domains than in non-specific domains. We found evidence that mutations in pathway-specific domains tend to perturb pathway integrity by disrupting within-pathway protein-protein interactions. We also demonstrated the prediction capacity of pathway-specific domains for complex diseases with experimental validations. Our study demonstrated the usefulness of pathway information for protein domains in interpreting the non-random distribution of disease mutations among domains and in identifying disease genes and variants.
|
6
|
GWAB: a web server for the network-based boosting of human genome-wide association data. Nucleic Acids Res 2017; 45:W154-W161. [PMID: 28449091 PMCID: PMC5793838 DOI: 10.1093/nar/gkx284] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 02/11/2017] [Revised: 04/01/2017] [Accepted: 04/17/2017] [Indexed: 12/29/2022]
Abstract
During the last decade, genome-wide association studies (GWAS) have represented a major approach to dissect complex human genetic diseases. Due in part to limited statistical power, most studies identify only small numbers of candidate genes that pass the conventional significance thresholds (e.g. P ≤ 5 × 10⁻⁸). This limitation can be partly overcome by increasing the sample size, but this comes at a higher cost. Alternatively, weak association signals can be boosted by incorporating independent data. Previously, we demonstrated the feasibility of boosting GWAS disease associations using gene networks. Here, we present a web server, GWAB (www.inetbio.org/gwab), for the network-based boosting of human GWAS data. Using GWAS summary statistics (P-values) for SNPs along with reference genes for a disease of interest, GWAB reprioritizes candidate disease genes by integrating the GWAS and network data. We found that GWAB could more effectively retrieve disease-associated reference genes than GWAS could alone. As an example, we describe GWAB-boosted candidate genes for coronary artery disease and supporting data in the literature. These results highlight the inherent value in sub-threshold GWAS associations, which are often not publicly released. GWAB offers a feasible general approach to boost such associations for human disease genetics.
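To make the notion of boosting sub-threshold signals concrete, the following sketch adds a weighted share of each network neighbor's GWAS evidence to a gene's own score. The gene names, P-values, edge weights, the `alpha` parameter, and the additive combination rule are all hypothetical; the abstract does not give GWAB's scoring formula, so this is only one plausible way to integrate the two data sources.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS significance threshold

# Hypothetical gene-level P-values and weighted network edges; illustrative only.
GENE_PVALUES = {"geneA": 1e-9, "geneB": 1e-5, "geneC": 1e-5, "geneD": 0.5}
NETWORK = {("geneA", "geneB"): 2.0, ("geneB", "geneC"): 0.5}

def passes_threshold(p_value):
    """True if the P-value meets the conventional genome-wide threshold."""
    return p_value <= GENOME_WIDE_THRESHOLD

def boosted_scores(pvalues, edges, alpha=0.5):
    """Own -log10(P) plus alpha * (edge weight * neighbor's -log10(P))."""
    own = {gene: -math.log10(p) for gene, p in pvalues.items()}
    boosted = dict(own)
    for (u, v), weight in edges.items():
        boosted[u] += alpha * weight * own[v]
        boosted[v] += alpha * weight * own[u]
    return boosted
```

Here geneB and geneC have the same sub-threshold P-value, but geneB is strongly linked to the genome-wide significant geneA and therefore receives the higher boosted score.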
|
7
|
TomatoNet: A Genome-wide Co-functional Network for Unveiling Complex Traits of Tomato, a Model Crop for Fleshy Fruits. MOLECULAR PLANT 2017; 10:652-655. [PMID: 27913317 DOI: 10.1016/j.molp.2016.11.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/16/2016] [Revised: 09/28/2016] [Accepted: 11/19/2016] [Indexed: 05/03/2023]
|
8
|
From sequencing data to gene functions: co-functional network approaches. Anim Cells Syst (Seoul) 2017; 21:77-83. [PMID: 30460054 PMCID: PMC6138336 DOI: 10.1080/19768354.2017.1284156] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/14/2017] [Accepted: 01/15/2017] [Indexed: 01/04/2023]
Abstract
Advances in high-throughput sequencing technology have led to the accumulation of massive amounts of genomics and transcriptomics data in public databases. Owing to their high technical accessibility, DNA and RNA sequencing have huge potential for the study of gene functions in most species, including animals and crops. A proven analytic platform for converting sequencing data into gene functional information is the co-functional network. Because all genes exert their functions through interactions with others, network analysis is a legitimate way to study gene functions. The workflow of a network-based functional study is composed of three steps: (i) inferring co-functional links, (ii) evaluating and integrating the links into genome-scale networks, and (iii) generating functional hypotheses from the networks. Co-functional links can be inferred from DNA sequencing data by using phylogenetic profiling, gene neighborhood, domain profiling, and associalogs, and from RNA sequencing data by using co-expression analysis. The inferred links are then evaluated and integrated into a genome-scale network with the aid of gold-standard co-functional links. Functional hypotheses can be generated from the network based on (i) network connectivity, (ii) network propagation, and (iii) subnetwork analysis. The functional analysis pipeline described here requires only sequencing data, which are readily available for most species through next-generation sequencing technology. Therefore, co-functional networks will greatly potentiate the use of sequencing data for the study of genetics in any cellular organism.
|
9
|
MUFFINN: cancer gene discovery via network analysis of somatic mutation data. Genome Biol 2016; 17:129. [PMID: 27333808 PMCID: PMC4918128 DOI: 10.1186/s13059-016-0989-x] [Citation(s) in RCA: 93] [Impact Index Per Article: 11.6] [Received: 11/17/2015] [Accepted: 05/24/2016] [Indexed: 12/21/2022]
Abstract
A major challenge in distinguishing cancer-causing driver mutations from inconsequential passenger mutations is the long tail of infrequently mutated genes in cancer genomes. Here, we present and evaluate MUFFINN (MUtations For Functional Impact on Network Neighbors), a method for prioritizing cancer genes that accounts not only for mutations in individual genes but also for mutations in their neighbors in functional networks. This pathway-centric method shows high sensitivity compared with gene-centric analyses of mutation data. Notably, only a marginal decrease in performance is observed when using 10% of TCGA patient samples, suggesting the method may potentiate cancer genome projects with small patient populations.
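A neighbor-aware score of the kind described above can be sketched in a few lines. The gene names, mutation counts, network, and the combination rule (a gene's own mutation count plus the largest count among its network neighbors) are assumptions chosen for simplicity and are not necessarily the scoring that MUFFINN uses.

```python
# Hypothetical per-gene mutation counts and network neighbors; illustrative only.
MUTATION_COUNTS = {"g1": 40, "g2": 2, "g3": 1, "g4": 1}
NEIGHBORS = {"g1": {"g2"}, "g2": {"g1", "g3"}, "g3": {"g2"}, "g4": set()}

def neighbor_aware_score(gene, counts, neighbors):
    """Own mutation count plus the highest count among direct neighbors."""
    own = counts.get(gene, 0)
    best_neighbor = max(
        (counts.get(n, 0) for n in neighbors.get(gene, ())), default=0
    )
    return own + best_neighbor

def rank_genes(counts, neighbors):
    """Genes ordered from highest to lowest neighbor-aware score."""
    return sorted(
        counts,
        key=lambda g: neighbor_aware_score(g, counts, neighbors),
        reverse=True,
    )
```

In this toy example the rarely mutated g2 is lifted by its frequently mutated neighbor g1, and g3 outranks the isolated g4 even though both carry a single mutation, which is the long-tail effect the abstract refers to.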
|
10
|
Weighted mutual information analysis substantially improves domain-based functional network models. Bioinformatics 2016; 32:2824-30. [PMID: 27207946 PMCID: PMC5018372 DOI: 10.1093/bioinformatics/btw320] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 01/12/2016] [Accepted: 05/16/2016] [Indexed: 11/30/2022]
Abstract
Motivation: Functional protein–protein interaction (PPI) networks elucidate molecular pathways underlying complex phenotypes, including those of human diseases. Extrapolation of domain–domain interactions (DDIs) from known PPIs is a major domain-based method for inferring functional PPI networks. However, the protein domain is a functional unit of the protein. Therefore, we should be able to effectively infer functional interactions between proteins based on the co-occurrence of domains. Results: Here, we present a method for inferring accurate functional PPIs based on the similarity of domain composition between proteins, measured by weighted mutual information (MI), which assigns different weights to the domains based on their genome-wide frequencies. Weighted MI outperforms other domain-based network inference methods and is highly predictive for pathways as well as phenotypes. A genome-scale human functional network determined by our method reveals numerous communities that are significantly associated with known pathways and diseases. Domain-based functional networks may, therefore, have potential applications in mapping domain-to-pathway or domain-to-phenotype associations. Availability and Implementation: Source code for calculating weighted mutual information based on the domain profile matrix is available from www.netbiolab.org/w/WMI. Contact: Insuklee@yonsei.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online.
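The weighting idea can be sketched as follows, under stated assumptions. The inverse-frequency weights and the way weights enter the joint distribution are illustrative choices; the abstract says only that weights depend on genome-wide domain frequencies, and the authors' source code at the URL above remains the authoritative implementation.

```python
import math
from collections import defaultdict

def inverse_frequency_weights(profiles):
    """Assumed weighting: 1 / (number of proteins that contain the domain)."""
    n_domains = len(next(iter(profiles.values())))
    counts = [sum(vec[i] for vec in profiles.values()) for i in range(n_domains)]
    return [1.0 / c if c else 0.0 for c in counts]

def weighted_mutual_information(x, y, weights):
    """MI (bits) of two binary vectors, with each position counted by its weight."""
    total = sum(weights)
    if total == 0:
        return 0.0
    joint = defaultdict(float)
    px = defaultdict(float)
    py = defaultdict(float)
    for a, b, w in zip(x, y, weights):
        share = w / total
        joint[(a, b)] += share
        px[a] += share
        py[b] += share
    return sum(
        p * math.log2(p / (px[a] * py[b]))
        for (a, b), p in joint.items()
        if p > 0
    )
```

With uniform weights this reduces to ordinary mutual information, so the weighting changes only how much each domain contributes: rare domains count more than ubiquitous ones.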
|
11
|
|
12
|
Pathway-Dependent Effectiveness of Network Algorithms for Gene Prioritization. PLoS One 2015; 10:e0130589. [PMID: 26091506 PMCID: PMC4474432 DOI: 10.1371/journal.pone.0130589] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/24/2014] [Accepted: 05/22/2015] [Indexed: 01/18/2023]
Abstract
A network-based approach has proven useful for the identification of novel genes associated with complex phenotypes, including human diseases. Because network-based gene prioritization algorithms are based on propagating information of known phenotype-associated genes through networks, the pathway structure of each phenotype might significantly affect the effectiveness of algorithms. We systematically compared two popular network algorithms with distinct mechanisms – direct neighborhood which propagates information to only direct network neighbors, and network diffusion which diffuses information throughout the entire network – in prioritization of genes for worm and human phenotypes. Previous studies reported that network diffusion generally outperforms direct neighborhood for human diseases. Although prioritization power is generally measured for all ranked genes, only the top candidates are significant for subsequent functional analysis. We found that high prioritizing power of a network algorithm for all genes cannot guarantee successful prioritization of top ranked candidates for a given phenotype. Indeed, the majority of the phenotypes that were more efficiently prioritized by network diffusion showed higher prioritizing power for top candidates by direct neighborhood. We also found that connectivity among pathway genes for each phenotype largely determines which network algorithm is more effective, suggesting that the network algorithm used for each phenotype should be chosen with consideration of pathway gene connectivity.
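The contrast between the two algorithm families can be shown on a toy network. This sketch assumes a small hypothetical path-shaped network, a sum-of-edge-weights rule for direct neighborhood, and a random walk with restart for network diffusion; the `restart` value and iteration count are arbitrary, and the study may have used different formulations.

```python
# Hypothetical weighted, undirected network and seed set; illustrative only.
EDGES = [("g1", "g2", 1.0), ("g2", "g3", 1.0), ("g3", "g4", 1.0)]
SEEDS = {"g1"}

def build_adjacency(edges):
    """Return gene -> {neighbor: weight} for an undirected network."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def direct_neighborhood(adj, seeds):
    """Score each non-seed gene by the summed weight of its links to seeds."""
    return {
        g: sum(w for n, w in nbrs.items() if n in seeds)
        for g, nbrs in adj.items()
        if g not in seeds
    }

def network_diffusion(adj, seeds, restart=0.5, iterations=100):
    """Random walk with restart: seed information spreads over the whole network."""
    p0 = {g: (1.0 / len(seeds) if g in seeds else 0.0) for g in adj}
    p = dict(p0)
    for _ in range(iterations):
        nxt = {g: restart * p0[g] for g in adj}
        for g, nbrs in adj.items():
            out_weight = sum(nbrs.values())
            for n, w in nbrs.items():
                nxt[n] += (1 - restart) * p[g] * w / out_weight
        p = nxt
    return p
```

On this network, direct neighborhood gives a nonzero score only to g2, the immediate neighbor of the seed, whereas diffusion also assigns smaller, distance-decaying scores to g3 and g4. That difference is why the connectivity among pathway genes can decide which algorithm ranks the top candidates better.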
|
13
|
FlyNet: a versatile network prioritization server for the Drosophila community. Nucleic Acids Res 2015; 43:W91-7. [PMID: 25943544 PMCID: PMC4489278 DOI: 10.1093/nar/gkv453] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/23/2015] [Accepted: 04/24/2015] [Indexed: 12/18/2022]
Abstract
Drosophila melanogaster (fruit fly) has been a popular model organism in animal genetics due to the high accessibility of reverse-genetics tools. In addition, the close relationship between the Drosophila and human genomes rationalizes the use of Drosophila as an invertebrate model for human neurobiology and disease research. A platform technology for predicting candidate genes or functions would further enhance the usefulness of this long-established model organism for gene-to-phenotype mapping. Recently, the power of network prioritization for gene-to-phenotype mapping has been demonstrated in many organisms. Here we present a network prioritization server dedicated to Drosophila that covers ∼95% of the coding genome. This server, dubbed FlyNet, has several distinctive features, including (i) prioritization for both genes and functions; (ii) two complementary network algorithms: direct neighborhood and network diffusion; (iii) spatiotemporal-specific networks as an additional prioritization strategy for traits associated with a specific developmental stage or tissue and (iv) prioritization for human disease genes. FlyNet is expected to serve as a versatile hypothesis-generation platform for genes and functions in the study of basic animal genetics, developmental biology and human disease. FlyNet is available for free at http://www.inetbio.org/flynet.
|
14
|
RiceNet v2: an improved network prioritization server for rice genes. Nucleic Acids Res 2015; 43:W122-7. [PMID: 25813048 PMCID: PMC4489288 DOI: 10.1093/nar/gkv253] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 02/02/2015] [Accepted: 03/12/2015] [Indexed: 11/20/2022]
Abstract
Rice is the most important staple food crop and a model grass for studies of bioenergy crops. We previously published a genome-scale functional network server called RiceNet, constructed by integrating diverse genomics data, and demonstrated the use of the network in genetic dissection of rice biotic stress responses and its usefulness for other grass species. Since the initial construction of the network, there has been a significant increase in the amount of publicly available rice genomics data. Here, we present an updated network prioritization server for Oryza sativa ssp. japonica, RiceNet v2 (http://www.inetbio.org/ricenet), which provides a network of 25 765 genes (70.1% of the coding genome) and 1 775 000 co-functional links. RiceNet v2 also provides two complementary methods for network prioritization based on: (i) network direct neighborhood and (ii) context-associated hubs. RiceNet v2 can use genes of the related subspecies O. sativa ssp. indica and the reference plant Arabidopsis for versatility in generating hypotheses. We demonstrate that RiceNet v2 effectively identifies candidate genes involved in rice root/shoot development and defense responses, demonstrating its usefulness for the grass research community.
|
15
|
Network-assisted genetic dissection of pathogenicity and drug resistance in the opportunistic human pathogenic fungus Cryptococcus neoformans. Sci Rep 2015; 5:8767. [PMID: 25739925 PMCID: PMC4350084 DOI: 10.1038/srep08767] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 12/15/2014] [Accepted: 01/29/2015] [Indexed: 12/12/2022]
Abstract
Cryptococcus neoformans is an opportunistic human pathogenic fungus that causes meningoencephalitis. Due to the increasing global risk of cryptococcosis and the emergence of drug-resistant strains, the development of predictive genetics platforms for the rapid identification of novel genes governing pathogenicity and drug resistance of C. neoformans is imperative. The analysis of functional genomics data and genome-scale mutant libraries may facilitate the genetic dissection of such complex phenotypes but with limited efficiency. Here, we present a genome-scale co-functional network for C. neoformans, CryptoNet, which covers ~81% of the coding genome and provides an efficient intermediary between functional genomics data and reverse-genetics resources for the genetic dissection of C. neoformans phenotypes. CryptoNet is the first genome-scale co-functional network for any fungal pathogen. CryptoNet effectively identified novel genes for pathogenicity and drug resistance using guilt-by-association and context-associated hub algorithms. CryptoNet is also the first genome-scale co-functional network for fungi in the basidiomycota phylum, as Saccharomyces cerevisiae belongs to the ascomycota phylum. CryptoNet may therefore provide insights into pathway evolution between two distinct phyla of the fungal kingdom. The CryptoNet web server (www.inetbio.org/cryptonet) is a public resource that provides an interactive environment of network-assisted predictive genetics for C. neoformans.
|
16
|
EcoliNet: a database of cofunctional gene network for Escherichia coli. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav001. [PMID: 25650278 PMCID: PMC4314589 DOI: 10.1093/database/bav001] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 12/31/2022]
Abstract
During the past several decades, Escherichia coli has been a treasure chest for molecular biology. The molecular mechanisms of many fundamental cellular processes have been discovered through research on this bacterium. Although much basic research now focuses on more complex model organisms, E. coli still remains important in metabolic engineering and synthetic biology. Despite its long history as a subject of molecular investigation, more than one-third of the E. coli genome has no pathway annotation supported by either experimental evidence or manual curation. Recently, a network-assisted genetics approach to the efficient identification of novel gene functions has increased in popularity. To accelerate the speed of pathway annotation for the remaining uncharacterized part of the E. coli genome, we have constructed a database of cofunctional gene network with near-complete genome coverage of the organism, dubbed EcoliNet. We find that EcoliNet is highly predictive for diverse bacterial phenotypes, including antibiotic response, indicating that it will be useful in prioritizing novel candidate genes for a wide spectrum of bacterial phenotypes. We have implemented a web server where biologists can easily run network algorithms over EcoliNet to predict novel genes involved in a pathway or novel functions for a gene. All integrated cofunctional associations can be downloaded, enabling orthology-based reconstruction of gene networks for other bacterial species as well. Database URL: http://www.inetbio.org/ecolinet.
|
17
|
AraNet v2: an improved database of co-functional gene networks for the study of Arabidopsis thaliana and 27 other nonmodel plant species. Nucleic Acids Res 2014; 43:D996-1002. [PMID: 25355510 PMCID: PMC4383895 DOI: 10.1093/nar/gku1053] [Citation(s) in RCA: 104] [Impact Index Per Article: 10.4] [Indexed: 12/17/2022]
Abstract
Arabidopsis thaliana is a reference plant that has been studied intensively for several decades. Recent advances in high-throughput experimental technology have enabled the generation of an unprecedented amount of data from A. thaliana, which has facilitated data-driven approaches to unravel the genetic organization of plant phenotypes. We previously published a description of a genome-scale functional gene network for A. thaliana, AraNet, which was constructed by integrating multiple co-functional gene networks inferred from diverse data types, and we demonstrated the predictive power of this network for complex phenotypes. More recently, we have observed significant growth in the availability of omics data for A. thaliana as well as improvements in data analysis methods that we anticipate will further enhance the integrated database of co-functional networks. Here, we present an updated co-functional gene network for A. thaliana, AraNet v2 (available at http://www.inetbio.org/aranet), which covers approximately 84% of the coding genome. We demonstrate significant improvements in both genome coverage and accuracy. To enhance the usability of the network, we implemented an AraNet v2 web server, which generates functional predictions for A. thaliana and 27 nonmodel plant species using an orthology-based projection of nonmodel plant genes on the A. thaliana gene network.
|
18
|
YeastNet v3: a public database of data-specific and integrated functional gene networks for Saccharomyces cerevisiae. Nucleic Acids Res 2013; 42:D731-6. [PMID: 24165882 PMCID: PMC3965021 DOI: 10.1093/nar/gkt981] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Indexed: 12/24/2022]
Abstract
Saccharomyces cerevisiae, i.e. baker’s yeast, is a widely studied model organism in eukaryote genetics because of its simple protocols for genetic manipulation and phenotype profiling. The high abundance of publicly available data that has been generated through diverse ‘omics’ approaches has led to the use of yeast for many systems biology studies, including large-scale gene network modeling to better understand the molecular basis of the cellular phenotype. We have previously developed a genome-scale gene network for yeast, YeastNet v2, which has been used for various genetics and systems biology studies. Here, we present an updated version, YeastNet v3 (available at http://www.inetbio.org/yeastnet/), that significantly improves the prediction of gene–phenotype associations. The extended genome in YeastNet v3 covers up to 5818 genes (∼99% of the coding genome) wired by 362 512 functional links. YeastNet v3 provides a new web interface to run the tools for network-guided hypothesis generation. YeastNet v3 also provides edge information for all data-specific networks (∼2 million functional links) as well as the integrated networks. Therefore, users can construct alternative versions of the integrated network by applying their own data integration algorithm to the same data-specific links.
|
19
|
Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Res 2011; 21:1109-21. [PMID: 21536720 DOI: 10.1101/gr.118992.110] [Citation(s) in RCA: 488] [Impact Index Per Article: 37.5] [Indexed: 12/19/2022]
Abstract
Network "guilt by association" (GBA) is a proven approach for identifying novel disease genes based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale, human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated, but few are known with very high certainty. Here, we resolve this by explicitly modeling the uncertainty of the associations and incorporating the uncertainty for the seed set into the GBA framework. We observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes by comparing our predictions to results from follow-up meta-analyses, with incorporation of the network serving to highlight the JAK-STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the GWAS study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, both for candidate gene-based and GWAS-based studies.
|
20
|
Exception discovery: a novel method for the identification of differentially expressed proteins. IEEE Trans Inf Technol Biomed 2010; 14:473-80. [PMID: 20659834 DOI: 10.1109/titb.2008.917927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The identification of differentially expressed proteins (DEPs) observed under specific conditions is one of the key issues in proteomics research. There are currently several ways to detect changes in a specific protein's expression level in two-dimensional electrophoresis (2-DE) gel images, such as statistical analysis and graphical visualization. However, it is quite difficult to handle the information of an individual protein manually with these methods because of the large distortions of patterns in 2-DE images. This paper proposes a method of analyzing DEPs for a specific disease. In order to automatically extract meaningful DEPs from a set of 2-DE gel images, we have designed an exception function that is suitable for measuring the anomalous change in the expression level of an individual protein. We present the results of comparing the proposed method with a Wilcoxon paired t-test, which is one of the widely used statistical analysis methods. Several experiments are performed to address not only the effectiveness of the exception function but also the fact that these two methods can complement each other in practice.
|
21
|
Validation and reproducibility of food frequency questionnaire for Korean genome epidemiologic study. Eur J Clin Nutr 2007; 61:1435-41. [PMID: 17299477 DOI: 10.1038/sj.ejcn.1602657] [Citation(s) in RCA: 565] [Impact Index Per Article: 33.2] [Indexed: 12/17/2022]
Abstract
OBJECTIVE To evaluate the validity and reliability of the food-frequency questionnaire (FFQ) developed for the Korean Genome Epidemiologic Study (KoGES). METHODS The FFQ was administered twice at a 1-year interval (first FFQ (FFQ1) at the beginning and second FFQ (FFQ2) at the end of the study), and diet records (DRs) were collected for 3 days during each of the four seasons from December 2002 to May 2004 for those who attended the health examination center. At the end of the study period, we collected the 12-day DRs of 124 participants. The nutrient intakes from the DRs were compared with both FFQ1 and FFQ2. RESULTS The intakes of energy and some nutrients estimated from FFQ1 and FFQ2 were different from those assessed by the DRs. In particular, the consumption of carbohydrates was higher in FFQ1 and FFQ2 than in the DRs. The de-attenuated, age-, sex- and energy intake-adjusted correlation coefficients between the FFQ2 and the 12-day DRs in the Korean population ranged between 0.23 (vitamin A) and 0.64 (carbohydrate). The median for all nutrients was 0.39. The correlations were similar when we compared nutrient densities of both methods. Joint classification of calorie-adjusted nutrient intakes assessed by FFQ2 and 12-day DRs by quartile ranged from 25.8% (vitamin A) to 39.5% (carbohydrate, iron) for exact concordance. Except for vitamin A, the proportion of subjects classified into the distant quartile was less than 7% for all nutrients. The median of the correlations between the two FFQs 1 year apart was 0.45 for all nutrient intakes and 0.39 for nutrient densities. CONCLUSIONS We conclude that the FFQ we have developed appears to be an acceptable tool for assessing nutrient intakes in this population. Further studies for calibration of the FFQ collected from the multiple centers participating in the KoGES are needed. SPONSORSHIP This study was supported by the budget of the National Genome Research Institute, Korea National Institute of Health (2002-347-6111-221).
|
22
|
A landmark extraction method for protein 2DE gel images based on multi-dimensional clustering. Artif Intell Med 2005; 35:157-70. [PMID: 16085402 DOI: 10.1016/j.artmed.2005.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
OBJECTIVE Two-dimensional electrophoresis (2DE) is a separation technique that can identify target proteins existing in a tissue. Its result is represented by a gel image that displays an individual protein in a tissue as a spot. However, because the technique suffers from low reproducibility, a user should manually annotate landmark spots on each gel image to analyze the spots of different images together. This operation is an error-prone and tedious job. For this reason, this paper proposes a method of extracting landmark spots automatically by using a data mining technique. METHOD AND MATERIAL A landmark profile which summarizes the characteristics of landmark spots in a set of training gel images of the same tissue is generated by extracting the common properties of the landmark spots. On the basis of the landmark profile, candidate landmark spots in a new gel image of the same tissue are identified, and final landmark spots are determined by the well-known A* search algorithm. RESULT AND CONCLUSIONS The performance of the proposed method is analyzed through a series of experiments in order to identify its various characteristics.
|
23
|
Abstract
We describe an integrated proteome database, termed the Yonsei Proteome Research Center Proteome Database (YPRC-PDB), which can store, retrieve and analyze a variety of information, including two-dimensional electrophoresis (2-DE) images and associated spot information obtained during studies of hepatocellular carcinoma (HCC). YPRC-PDB is also designed to serve as a laboratory information management system that manages sample information, clinical background, conditions of both sample preparation and 2-DE, and entire sets of experimental results. It also features a query system and data-mining applications, which can automatically analyze expression level changes of a specific protein and link them directly to clinical information. The user interface is web-based, so that results from other laboratories can be shared effectively. In particular, the master gel image query is equipped with a graphic tool that can easily identify the relationship between the specific pathological stage of HCC and the expression levels of a potential marker protein on the master gel image. Thus, YPRC-PDB is a versatile integrated database suitable for subsequent analyses. The information in YPRC-PDB is easily updated, and it is available to authorized users on the World Wide Web (http://yprcpdb.proteomix.org/~damduck/).
|