Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Yu H, Jansen R, Stolovitzky G, Gerstein M. Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications. Bioinformatics 2007;23:2163-73. [PMID: 17540677 DOI: 10.1093/bioinformatics/btm291] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 01/16/2023]

Cited by Other Article(s)

Lastra-Díaz JJ, Lara-Clares A, Garcia-Serrano A. HESML: a real-time semantic measures library for the biomedical domain with a reproducible survey. BMC Bioinformatics 2022;23:23. [PMID: 34991460 PMCID: PMC8734250 DOI: 10.1186/s12859-021-04539-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2020] [Accepted: 12/15/2021] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Ontology-based semantic similarity measures based on SNOMED-CT, MeSH, and Gene Ontology are being extensively used in many applications in biomedical text mining and genomics respectively, which has encouraged the development of semantic measures libraries based on the aforementioned ontologies. However, current state-of-the-art semantic measures libraries have some performance and scalability drawbacks derived from their ontology representations based on relational databases, or naive in-memory graph representations. Likewise, a recent reproducible survey on word similarity shows that one hybrid IC-based measure which integrates a shortest-path computation sets the state of the art in the family of ontology-based semantic measures. However, the lack of an efficient shortest-path algorithm for their real-time computation prevents both their practical use in any application and the use of any other path-based semantic similarity measure.

RESULTS

To bridge the two aforementioned gaps, this work introduces an updated version of the HESML Java software library especially designed for the biomedical domain, which implements the most efficient and scalable ontology representation reported in the literature, together with a new method for approximating Dijkstra's algorithm on taxonomies, called Ancestors-based Shortest-Path Length (AncSPL), which allows the real-time computation of any path-based semantic similarity measure.

CONCLUSIONS

We introduce a set of reproducible benchmarks showing that HESML outperforms the current state-of-the-art libraries by several orders of magnitude in the three aforementioned biomedical ontologies, and demonstrating the real-time performance and approximation quality of the new AncSPL shortest-path algorithm. Likewise, we show that AncSPL scales linearly with the dimension of the common ancestor subgraph regardless of the ontology size. Path-based measures based on the new AncSPL algorithm are up to six orders of magnitude faster than their exact implementation in large ontologies like SNOMED-CT and GO. Finally, we provide a detailed reproducibility protocol and dataset as supplementary material to allow the exact replication of all our experiments and results.
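
The idea of restricting a shortest-path search to the ancestors of two concepts can be illustrated with a minimal Python sketch. This is not the HESML implementation (which is written in Java); it assumes that the search is confined to the subgraph induced by the ancestors of both concepts, uses unit edge weights with a plain breadth-first search, and the `parents` taxonomy and function names are hypothetical.

```python
from collections import deque

def ancestors(parents, node):
    """Return the node together with every ancestor reachable via parent links."""
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def ancestor_restricted_path_length(parents, a, b):
    """Shortest path length between a and b, searching only their ancestors.

    Edges are treated as undirected and unweighted. Returns None when the two
    nodes share no ancestor.
    """
    allowed = ancestors(parents, a) | ancestors(parents, b)
    adjacency = {n: set() for n in allowed}
    for child in allowed:
        # Ancestor sets are upward-closed, so every parent is also in `allowed`.
        for parent in parents.get(child, ()):
            adjacency[child].add(parent)
            adjacency[parent].add(child)
    dist = {a: 0}
    queue = deque([a])
    while queue:
        current = queue.popleft()
        if current == b:
            return dist[current]
        for nxt in adjacency[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return None

# Hypothetical toy taxonomy: child -> list of parents.
parents = {
    "root": [], "A": ["root"], "B": ["root"],
    "A1": ["A"], "A2": ["A"], "B1": ["B"],
    "X": ["A1", "B1"],  # common descendant, never visited by the search
}
```

In this toy taxonomy the restricted search gives a length of 4 between `A1` and `B1` (through `root`), whereas an unrestricted search would find a path of length 2 through the shared descendant `X`. That gap is what makes the ancestor-restricted result an approximation rather than an exact distance.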

Acharya S, Saha S, Pradhan P. Multi-Factored Gene-Gene Proximity Measures Exploiting Biological Knowledge Extracted from Gene Ontology: Application in Gene Clustering. IEEE/ACM Trans Comput Biol Bioinform 2020;17:207-219. [PMID: 29994130 DOI: 10.1109/tcbb.2018.2849362] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/08/2023]

Acharya S, Saha S, Pradhan P. Novel symmetry-based gene-gene dissimilarity measures utilizing Gene Ontology: Application in gene clustering. Gene 2018;679:341-351. [PMID: 30184472 DOI: 10.1016/j.gene.2018.08.062] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/11/2018] [Revised: 08/21/2018] [Accepted: 08/21/2018] [Indexed: 11/25/2022]

Guo Y, Alexander K, Clark AG, Grimson A, Yu H. Integrated network analysis reveals distinct regulatory roles of transcription factors and microRNAs. RNA 2016;22:1663-1672. [PMID: 27604961 PMCID: PMC5066619 DOI: 10.1261/rna.048025.114] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/08/2014] [Accepted: 07/25/2016] [Indexed: 06/06/2023]

Lee WP, Lin CH. Combining Expression Data and Knowledge Ontology for Gene Clustering and Network Reconstruction. Cognit Comput 2015. [DOI: 10.1007/s12559-015-9349-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]

Meng J, Li R, Luan Y. Classification by integrating plant stress response gene expression data with biological knowledge. Math Biosci 2015;266:65-72. [PMID: 26092610 DOI: 10.1016/j.mbs.2015.06.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 03/24/2015] [Revised: 05/03/2015] [Accepted: 06/05/2015] [Indexed: 12/01/2022]

Zhang SB, Lai JH. Semantic similarity measurement between gene ontology terms based on exclusively inherited shared information. Gene 2015;558:108-17. [DOI: 10.1016/j.gene.2014.12.062] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 07/16/2014] [Revised: 12/15/2014] [Accepted: 12/24/2014] [Indexed: 11/25/2022]

Peng J, Uygun S, Kim T, Wang Y, Rhee SY, Chen J. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. BMC Bioinformatics 2015;16:44. [PMID: 25886899 PMCID: PMC4339680 DOI: 10.1186/s12859-015-0474-7] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 06/25/2014] [Accepted: 01/26/2015] [Indexed: 01/18/2023]

Peng J, Li H, Jiang Q, Wang Y, Chen J. An integrative approach for measuring semantic similarities using gene ontology. BMC Syst Biol 2014;8 Suppl 5:S8. [PMID: 25559943 PMCID: PMC4305987 DOI: 10.1186/1752-0509-8-s5-s8] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/04/2022]

OrthoClust: an orthology-based network framework for clustering data across multiple species. Genome Biol 2014;15:R100. [PMID: 25249401 PMCID: PMC4289247 DOI: 10.1186/gb-2014-15-8-r100] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 02/08/2014] [Accepted: 06/26/2014] [Indexed: 01/28/2023]

Peng J, Wang Y, Chen J. Towards integrative gene functional similarity measurement. BMC Bioinformatics 2014;15 Suppl 2:S5. [PMID: 24564710 PMCID: PMC4015993 DOI: 10.1186/1471-2105-15-s2-s5] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 11/25/2022]

CoCiter: an efficient tool to infer gene function by assessing the significance of literature co-citation. PLoS One 2013;8:e74074. [PMID: 24086311 PMCID: PMC3781068 DOI: 10.1371/journal.pone.0074074] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 12/18/2012] [Accepted: 07/30/2013] [Indexed: 01/17/2023]

Guo Y, Wei X, Das J, Grimson A, Lipkin S, Clark A, Yu H. Dissecting disease inheritance modes in a three-dimensional protein network challenges the "guilt-by-association" principle. Am J Hum Genet 2013;93:78-89. [PMID: 23791107 DOI: 10.1016/j.ajhg.2013.05.022] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 12/05/2012] [Revised: 05/02/2013] [Accepted: 05/23/2013] [Indexed: 10/26/2022]

Abstract

To better understand different molecular mechanisms by which mutations lead to various human diseases, we classified 82,833 disease-associated mutations according to their inheritance modes (recessive versus dominant) and molecular types (in-frame [missense point mutations and in-frame indels] versus truncating [nonsense mutations and frameshift indels]) and systematically examined the effects of different classes of disease mutations in a three-dimensional protein interactome network with the atomic-resolution interface resolved for each interaction. We found that although recessive mutations affecting the interaction interface of two interacting proteins tend to cause the same disease, this widely accepted "guilt-by-association" principle does not apply to dominant mutations. Furthermore, recessive truncating mutations in regions encoding the same interface are much more likely to cause the same disease, even for interfaces close to the N terminus of the protein. Conversely, dominant truncating mutations tend to be enriched in regions encoding areas between interfaces. These results suggest that a significant fraction of truncating mutations can generate functional protein products. For example, TRIM27, a known cancer-associated protein, interacts with three proteins (MID2, TRIM42, and SIRPA) through two different interfaces. A dominant truncating mutation (c.1024delT [p.Tyr342Thrfs*30]) associated with ovarian carcinoma is located between the regions encoding the two interfaces; the altered protein retains its interaction with MID2 and TRIM42 through the first interface but loses its interaction with SIRPA through the second interface. Our findings will help clarify the molecular mechanisms of thousands of disease-associated genes and their tens of thousands of mutations, especially for those carrying truncating mutations, often erroneously considered "knockout" alleles.

Teng Z, Guo M, Liu X, Dai Q, Wang C, Xuan P. Measuring gene functional similarity based on group-wise comparison of GO terms. Bioinformatics 2013;29:1424-32. [PMID: 23572412 DOI: 10.1093/bioinformatics/btt160] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Indexed: 11/13/2022]

Das J, Mohammed J, Yu H. Genome-scale analysis of interaction dynamics reveals organization of biological networks. Bioinformatics 2012;28:1873-8. [PMID: 22576179 DOI: 10.1093/bioinformatics/bts283] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 12/27/2022]

Yang H, Nepusz T, Paccanaro A. Improving GO semantic similarity measures by exploring the ontology beneath the terms and modelling uncertainty. Bioinformatics 2012;28:1383-9. [PMID: 22522134 DOI: 10.1093/bioinformatics/bts129] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 11/12/2022]

Abstract

MOTIVATION

Several measures have been recently proposed for quantifying the functional similarity between gene products according to well-structured controlled vocabularies where biological terms are organized in a tree or in a directed acyclic graph (DAG) structure. However, existing semantic similarity measures ignore two important facts. First, when calculating the similarity between two terms, they disregard the descendants of these terms. While this makes no difference when the ontology is a tree, we shall show that it has important consequences when the ontology is a DAG; this is the case, for example, with the Gene Ontology (GO). Second, existing similarity measures do not model the inherent uncertainty which comes from the fact that our current knowledge of the gene annotation and of the ontology structure is incomplete. Here, we propose a novel approach based on downward random walks that can be used to improve any of the existing similarity measures to exhibit these two properties. The approach is computationally efficient: random walks do not need to be simulated as we provide formulas to calculate their stationary distributions.

RESULTS

To show that our approach can potentially improve any semantic similarity measure, we test it on six different semantic similarity measures: three commonly used measures by Resnik (1999), Lin (1998), and Jiang and Conrath (1997); and three recently proposed measures: simUI, simGIC by Pesquita et al. (2008); GraSM by Couto et al. (2007); and Couto and Silva (2011). We applied these improved measures to the GO annotations of the yeast Saccharomyces cerevisiae, and tested how they correlate with sequence similarity, mRNA co-expression and protein-protein interaction data. Our results consistently show that the use of downward random walks leads to more reliable similarity measures.
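
For orientation, the three classical baseline measures named above can be written down compactly. The Python sketch below uses their standard textbook definitions in terms of information content, IC(t) = -ln p(t), and the most informative common ancestor; it does not implement the downward random walk correction proposed in the paper, and the toy `anc` and `prob` tables are hypothetical.

```python
import math

def information_content(probability):
    """IC(t) = -ln p(t): rarer terms carry more information."""
    return -math.log(probability)

def mica_ic(a, b, anc, prob):
    """IC of the most informative common ancestor of terms a and b."""
    common = anc[a] & anc[b]
    return max(information_content(prob[t]) for t in common)

def resnik(a, b, anc, prob):
    """Resnik similarity: IC of the most informative common ancestor."""
    return mica_ic(a, b, anc, prob)

def lin(a, b, anc, prob):
    """Lin similarity: 2 * IC(MICA) / (IC(a) + IC(b))."""
    denom = information_content(prob[a]) + information_content(prob[b])
    if denom == 0:
        return 1.0  # both terms have probability 1 (e.g. both are the root)
    return 2 * mica_ic(a, b, anc, prob) / denom

def jiang_conrath_distance(a, b, anc, prob):
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(MICA)."""
    return (information_content(prob[a]) + information_content(prob[b])
            - 2 * mica_ic(a, b, anc, prob))

# Hypothetical toy ontology: ancestor sets (inclusive) and annotation probabilities.
anc = {
    "root": {"root"},
    "A": {"A", "root"},
    "A1": {"A1", "A", "root"},
    "A2": {"A2", "A", "root"},
}
prob = {"root": 1.0, "A": 0.5, "A1": 0.25, "A2": 0.25}
```

All three functions look only upward, at the ancestors of the two terms. That is the property the abstract singles out: descendants play no role in these definitions.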

Systems analysis of inflammatory bowel disease based on comprehensive gene information. BMC Med Genet 2012;13:25. [PMID: 22480395 PMCID: PMC3368714 DOI: 10.1186/1471-2350-13-25] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/19/2011] [Accepted: 04/05/2012] [Indexed: 12/19/2022]

Wang X, Wei X, Thijssen B, Das J, Lipkin SM, Yu H. Three-dimensional reconstruction of protein networks provides insight into human genetic disease. Nat Biotechnol 2012;30:159-64. [PMID: 22252508 DOI: 10.1038/nbt.2106] [Citation(s) in RCA: 290] [Impact Index Per Article: 22.3] [Received: 10/11/2011] [Accepted: 12/19/2011] [Indexed: 01/13/2023]

Schulz MH, Köhler S, Bauer S, Robinson PN. Exact score distribution computation for ontological similarity searches. BMC Bioinformatics 2011;12:441. [PMID: 22078312 PMCID: PMC3240574 DOI: 10.1186/1471-2105-12-441] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 05/20/2011] [Accepted: 11/12/2011] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score.

RESULTS

In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling.

CONCLUSIONS

The new algorithm enables for the first time exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that makes use only of the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic usage under: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
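
What an exact P-value means here can be shown on a toy problem. The Python sketch below is the brute-force enumeration, that is, the kind of naive approach the paper's subgraph-collapsing algorithm is designed to avoid. It assumes a null model in which every query of `q` distinct terms is equally likely. The per-term similarity table `sim` is a hypothetical stand-in for Resnik-based scores, and summing them is an illustrative scoring rule, not the one used by the authors.

```python
from fractions import Fraction
from itertools import combinations

def exact_p_value(terms, q, score, observed):
    """Exact P(score >= observed) when all q-term queries are equally likely."""
    total = 0
    hits = 0
    for query in combinations(terms, q):
        total += 1
        if score(query) >= observed:
            hits += 1
    return Fraction(hits, total)

# Hypothetical per-term similarity of each ontology term to one target entry.
sim = {"t1": 0, "t2": 1, "t3": 2, "t4": 3}

def query_score(query):
    """Illustrative score: sum of the per-term similarities in the query."""
    return sum(sim[t] for t in query)

# Two of the six possible 2-term queries score at least 4.
p = exact_p_value(sorted(sim), q=2, score=query_score, observed=4)
```

Enumeration grows combinatorially with the number of terms and the query size, which is why it is only feasible on examples this small.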

Lysenko A, Defoin-Platel M, Hassani-Pak K, Taubert J, Hodgman C, Rawlings CJ, Saqi M. Assessing the functional coherence of modules found in multiple-evidence networks from Arabidopsis. BMC Bioinformatics 2011;12:203. [PMID: 21612636 PMCID: PMC3118170 DOI: 10.1186/1471-2105-12-203] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 08/20/2010] [Accepted: 05/25/2011] [Indexed: 12/18/2022]

Abstract

Background

Combining multiple evidence-types from different information sources has the potential to reveal new relationships in biological systems. The integrated information can be represented as a relationship network, and clustering the network can suggest possible functional modules. The value of such modules for gaining insight into the underlying biological processes depends on their functional coherence. The challenges that we wish to address are to define and quantify the functional coherence of modules in relationship networks, so that they can be used to infer function of as yet unannotated proteins, to discover previously unknown roles of proteins in diseases as well as for better understanding of the regulation and interrelationship between different elements of complex biological systems.

Results

We have defined the functional coherence of modules with respect to the Gene Ontology (GO) by considering two complementary aspects: (i) the fragmentation of the GO functional categories into the different modules and (ii) the most representative functions of the modules. We have proposed a set of metrics to evaluate these two aspects and demonstrated their utility in Arabidopsis thaliana. We selected 2355 proteins for which experimentally established protein-protein interaction (PPI) data were available. From these we have constructed five relationship networks, four based on single types of data: PPI, co-expression, co-occurrence of protein names in scientific literature abstracts and sequence similarity and a fifth one combining these four evidence types. The ability of these networks to suggest biologically meaningful grouping of proteins was explored by applying Markov clustering and then by measuring the functional coherence of the clusters.

Conclusions

Relationship networks integrating multiple evidence-types are biologically informative and allow more proteins to be assigned to a putative functional module. Using additional evidence types concentrates the functional annotations in a smaller number of modules without unduly compromising their consistency. These results indicate that integration of more data sources improves the ability to uncover functional association between proteins, both by allowing more proteins to be linked and producing a network where modular structure more closely reflects the hierarchy in the gene ontology.

microRNA-122 as a regulator of mitochondrial metabolic gene network in hepatocellular carcinoma. Mol Syst Biol 2011;6:402. [PMID: 20739924 PMCID: PMC2950084 DOI: 10.1038/msb.2010.58] [Citation(s) in RCA: 162] [Impact Index Per Article: 11.6] [Received: 11/05/2009] [Accepted: 06/29/2010] [Indexed: 02/06/2023]

Abstract

A moderate loss of miR-122 function correlates with up-regulation of seed-matched genes and down-regulation of mitochondrially localized genes in both human hepatocellular carcinoma and in normal mice treated with anti-miR-122 antagomir.

Putative direct targets up-regulated with loss of miR-122 and secondary targets down-regulated with loss of miR-122 are conserved between human beings and mice and are rapidly regulated in vitro in response to miR-122 over- and under-expression.

Loss of miR-122 secondary target expression in either tumorous or adjacent non-tumorous tissue predicts poor survival of hepatocellular carcinoma patients.

Hepatocellular carcinoma (HCC) is one of the most aggressive human malignancies, common in Asia, Africa, and in areas with endemic infections of hepatitis-B or -C viruses (HBV or HCV) (But et al, 2008). Globally, the 5-year survival rate of HCC is <5% and about 600 000 HCC patients die each year. The high mortality associated with this disease is mainly attributed to the failure to diagnose HCC patients at an early stage and a lack of effective therapies for patients with advanced stage HCC. Understanding the relationships between phenotypic and molecular changes in HCC is, therefore, of paramount importance for the development of improved HCC diagnosis and treatment methods.

In this study, we examined mRNA and microRNA (miRNA)-expression profiles of tumor and adjacent non-tumor liver tissue from HCC patients. The patient population was selected from a region of endemic HBV infection, and HBV infection appears to contribute to the etiology of HCC in these patients. A total of 96 HCC patients were included in the study, of which about 88% tested positive for HBV antigen; patients testing positive for HCV antigen were excluded. Among the 220 miRNAs profiled, miR-122 was the most highly expressed miRNA in liver, and its expression was decreased almost two-fold in HCC tissue relative to adjacent non-tumor tissue, confirming earlier observations (Lagos-Quintana et al, 2002; Kutay et al, 2006; Budhu et al, 2008).

Over 1000 transcripts were correlated and over 1000 transcripts were anti-correlated with miR-122 expression. Consistent with the idea that transcripts anti-correlated with miR-122 are potential miR-122 targets, the most highly anti-correlated transcripts were highly enriched for the presence of the miR-122 central seed hexamer, CACTCC, in the 3′UTR. Although the complete set of negatively correlated genes was enriched for cell-cycle genes, the subset of seed-matched genes had no significant KEGG Pathway annotation, suggesting that miR-122 is unlikely to directly regulate the cell cycle in these patients. In contrast, transcripts positively correlated with miR-122 were not enriched for 3′UTR seed matches to miR-122. Interestingly, these 1042 transcripts were enriched for genes coding for mitochondrially localized proteins and for metabolic functions.
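
A seed-match enrichment of this kind can be sketched in Python as follows. The hexamer CACTCC comes from the text; everything else is an assumption for illustration: the text does not state which statistical test the authors used, so a one-sided hypergeometric tail is shown as one common choice, and the example sequences are made up.

```python
from math import comb

SEED = "CACTCC"  # miR-122 central seed hexamer as given in the text

def has_seed(utr, seed=SEED):
    """True if the 3'UTR sequence contains the seed match (DNA or RNA alphabet)."""
    return seed in utr.upper().replace("U", "T")

def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n items from N, of which K are seed-matched."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical 3'UTR sequences keyed by hypothetical transcript names.
utrs = {
    "tx1": "AAGCACTCCTTA",
    "tx2": "GGGCACUCCAAA",  # RNA alphabet, still a seed match
    "tx3": "ACGTACGTACGT",
    "tx4": "TTTTTTTTTTTT",
}
anti_correlated = ["tx1", "tx2"]  # hypothetical "most anti-correlated" subset

N = len(utrs)                                          # all transcripts
K = sum(has_seed(s) for s in utrs.values())            # seed-matched overall
n = len(anti_correlated)                               # size of the subset
k = sum(has_seed(utrs[t]) for t in anti_correlated)    # seed-matched in subset
p_value = hypergeom_tail(k, K, n, N)
```

A small tail probability indicates that the subset contains more seed-matched 3′UTRs than would be expected if it had been drawn at random from all profiled transcripts.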

To analyze the impact of loss of miR-122 in vivo, silencing of miR-122 was performed by antisense inhibition (anti-miR-122) in wild-type mice (Figure 3). As with the genes negatively correlated with miR-122 in HCC patients, no significant biological annotation was associated with the seed-matched genes up-regulated by anti-miR-122 in mouse livers. The most significantly enriched biological annotation for anti-miR-122 down-regulated genes, as for positively correlated genes in HCC, was mitochondrial localization; the down-regulated mitochondrial genes were enriched for metabolic functions. Putative direct and downstream targets with orthologs on both the human and mouse microarrays showed significant overlap for regulations in the same direction. These overlaps defined sets of putative miR-122 primary and secondary targets. The results were further extended in the analysis of a separate dataset from 180 HCC, 40 cirrhotic, and 6 normal liver tissue samples (Figure 4), showing anti-correlation of proposed primary and secondary targets in non-healthy tissues.

To validate the direct correlation between miR-122 and some of the primary and secondary targets, we determined the expression of putative targets after transfection of miR-122 mimetic into PLC/PRF/5 HCC cells, including the putative direct targets SMARCD1 and MAP3K3 (MEKK3), a target described in the literature, CAT-1 (SLC7A1), and three putative secondary targets, PPARGC1A (PGC-1α) and succinate dehydrogenase subunits A and B. As expected, the putative direct targets showed reduced expression, whereas the putative secondary target genes showed increased expression in cells over-expressing miR-122 (Figure 4).

Functional classification of genes using the total ancestry method (Yu et al, 2007) identified PPARGC1A (PGC-1α) as the most connected secondary target. PPARGC1A has been proposed to function as a master regulator of mitochondrial biogenesis (Ventura-Clapier et al, 2008), suggesting that loss of PPARGC1A expression may contribute to the loss of mitochondrial gene expression correlated with loss of miR-122 expression. To further validate the link of miR-122 and PGC-1α protein, we transfected PLC/PRF/5 cells with miR-122-expression vector, and observed an increase in PGC-1α protein levels. Importantly, transfection of both miR-122 mimetic and miR-122-expression vector significantly reduced the lactate content of PLC/PRF/5 cells, whereas anti-miR-122 treatment increased lactate production. Together, the data support the function of miR-122 in mitochondrial metabolic functions.

Patient survival was not directly associated with miR-122-expression levels. However, miR-122 secondary targets were expressed at significantly higher levels in both tumor and adjacent non-tumor tissues among survivors as compared with deceased patients, providing supporting evidence for the potential relevance of loss of miR-122 function in HCC patient morbidity and mortality.

Overall, our findings reveal potentially new biological functions for miR-122 in liver physiology. We observed decreased expression of miR-122, a liver-specific miRNA, in HBV-associated HCC, and loss of miR-122 seemed to correlate with the decrease of mitochondrion-related metabolic pathway gene expression in HCC and in non-tumor liver tissues, a result that is consistent with the outcome of treatment of mice with anti-miR-122 and is of prognostic significance for HCC patients. Further investigation will be conducted to dissect the regulatory function of miR-122 on mitochondrial metabolism in HCC and to test whether increasing miR-122 expression can improve mitochondrial function in liver and perhaps in liver tumor tissues. Moreover, these results support the idea that primary targets of a given miRNA may be distributed over a variety of functional categories while resulting in a coordinated secondary response, potentially through synergistic action (Linsley et al, 2007).

Tumorigenesis involves multistep genetic alterations. To elucidate the microRNA (miRNA)–gene interaction network in carcinogenesis, we examined their genome-wide expression profiles in 96 pairs of tumor/non-tumor tissues from hepatocellular carcinoma (HCC). Comprehensive analysis of the coordinate expression of miRNAs and mRNAs reveals that miR-122 is under-expressed in HCC and that increased expression of miR-122 seed-matched genes leads to a loss of mitochondrial metabolic function. Furthermore, the miR-122 secondary targets, which decrease in expression, are good prognostic markers for HCC. Transcriptome profiling data from additional 180 HCC and 40 liver cirrhotic patients in the same cohort were used to confirm the anti-correlation of miR-122 primary and secondary target gene sets. The HCC findings can be recapitulated in mouse liver by silencing miR-122 with antagomir treatment followed by gene-expression microarray analysis. In vitro miR-122 data further provided a direct link between induction of miR-122-controlled genes and impairment of mitochondrial metabolism. In conclusion, miR-122 regulates mitochondrial metabolism and its loss may be detrimental to sustaining critical liver function and contribute to morbidity and mortality of liver cancer patients.

Kyogoku R, Fujimoto R, Ozaki T, Ohkawa T. A method for supporting retrieval of articles on protein structure analysis considering users' intention. BMC Bioinformatics 2011;12 Suppl 1:S42. [PMID: 21342574 PMCID: PMC3044299 DOI: 10.1186/1471-2105-12-s1-s42] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]

Richards AJ, Muller B, Shotwell M, Cowart LA, Rohrer B, Lu X. Assessing the functional coherence of gene sets with metrics based on the Gene Ontology graph. Bioinformatics 2010;26:i79-87. [PMID: 20529941 PMCID: PMC2881388 DOI: 10.1093/bioinformatics/btq203] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022]

Davis MJ, Sehgal MSB, Ragan MA. Automatic, context-specific generation of Gene Ontology slims. BMC Bioinformatics 2010;11:498. [PMID: 20929524 PMCID: PMC3098080 DOI: 10.1186/1471-2105-11-498] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 05/18/2010] [Accepted: 10/07/2010] [Indexed: 11/10/2022]

Getting started in gene orthology and functional analysis. PLoS Comput Biol 2010;6:e1000703. [PMID: 20361041 PMCID: PMC2845645 DOI: 10.1371/journal.pcbi.1000703] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.4] [Indexed: 11/19/2022]

Voevodski K, Teng SH, Xia Y. Spectral affinity in protein networks. BMC Syst Biol 2009;3:112. [PMID: 19943959 PMCID: PMC2797010 DOI: 10.1186/1752-0509-3-112] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 06/12/2009] [Accepted: 11/29/2009] [Indexed: 01/15/2023]

Abstract

Background

Protein-protein interaction (PPI) networks enable us to better understand the functional organization of the proteome. We can learn a lot about a particular protein by querying its neighborhood in a PPI network to find proteins with similar function. A spectral approach that considers random walks between nodes of interest is particularly useful in evaluating closeness in PPI networks. Spectral measures of closeness are more robust to noise in the data and are more precise than simpler methods based on edge density and shortest path length.

Results

We develop a novel affinity measure for pairs of proteins in PPI networks, which uses personalized PageRank, a random-walk-based method used in context-sensitive search on the Web. Our measure of closeness, which we call PageRank Affinity, is proportional to the number of times the smaller-degree protein is visited in a random walk that restarts at the larger-degree protein. PageRank considers paths of all lengths in a network; therefore, PageRank Affinity is a precise measure that is robust to noise in the data. PageRank Affinity is also provably related to cluster co-membership, making it a meaningful measure. In our experiments on protein networks, we find that our measure is better at predicting co-complex membership and finding functionally related proteins than other commonly used measures of closeness. Moreover, our experiments indicate that PageRank Affinity is very resilient to noise in the network. In addition, based on our method, we build a tool that quickly finds nodes closest to a queried protein in any protein network and easily scales to much larger biological networks.

Conclusion

We define a meaningful way to assess the closeness of two proteins in a PPI network, and show that our closeness measure is more biologically significant than other commonly used methods. We also develop a tool, accessible at http://xialab.bu.edu/resources/pnns, that allows the user to quickly find nodes closest to a queried vertex in any protein network available from BioGRID or specified by the user.
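The idea behind PageRank Affinity described in this abstract can be illustrated with a minimal sketch. It assumes an undirected, unweighted network stored as an adjacency dictionary, a restart probability `alpha = 0.15`, and plain power iteration; none of these choices come from the abstract, and the function names `personalized_pagerank` and `pagerank_affinity` are hypothetical. The abstract states only that the affinity is proportional to the visit frequency, so the raw visit probability is returned here without any normalization.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=200):
    """Approximate how often each node is visited by a random walk that
    jumps back to `source` with probability `alpha` at every step.
    `alpha` and `iters` are illustrative assumptions."""
    nodes = list(adj)
    p = {v: 0.0 for v in nodes}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] += alpha  # restart mass (p always sums to 1)
        for u in nodes:
            if not adj[u]:
                # An isolated node has nowhere to go: send the walk home.
                nxt[source] += (1 - alpha) * p[u]
                continue
            share = (1 - alpha) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


def pagerank_affinity(adj, a, b, alpha=0.15):
    """Visit probability of the smaller-degree node in a walk that
    restarts at the larger-degree node (unnormalized sketch)."""
    hi, lo = (a, b) if len(adj[a]) >= len(adj[b]) else (b, a)
    return personalized_pagerank(adj, hi, alpha)[lo]


# Toy network: two triangles (A-B-C and D-E-F) joined by the edge C-D.
toy = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
}
same_triangle = pagerank_affinity(toy, "A", "B")
across_bridge = pagerank_affinity(toy, "A", "F")
```

On this toy network, the pair inside one triangle receives a higher score than the pair separated by the bridge, which matches the intuition that the measure reflects cluster co-membership.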

Köhler S, Schulz MH, Krawitz P, Bauer S, Dölken S, Ott CE, Mundlos C, Horn D, Mundlos S, Robinson PN. Clinical diagnostics in human genetics with semantic similarity searches in ontologies. Am J Hum Genet 2009;85:457-64. [PMID: 19800049 PMCID: PMC2756558 DOI: 10.1016/j.ajhg.2009.09.003] [Citation(s) in RCA: 355] [Impact Index Per Article: 22.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2009] [Revised: 08/04/2009] [Accepted: 09/01/2009] [Indexed: 10/20/2022] Open

Affiliation(s)

Sebastian Köhler: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
Marcel H. Schulz: Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany; International Max Planck Research School for Computational Biology and Scientific Computing, 14195 Berlin, Germany
Peter Krawitz: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
Sebastian Bauer: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
Sandra Dölken: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
Claus E. Ott: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
Christine Mundlos: Allianz Chronischer Seltener Erkrankungen (ACHSE), 14050 Berlin, Germany
Denise Horn: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
Stefan Mundlos: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany
Peter N. Robinson: Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Berlin-Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany

Voevodski K, Teng SH, Xia Y. Finding local communities in protein networks. BMC Bioinformatics 2009;10:297. [PMID: 19765306 PMCID: PMC2755114 DOI: 10.1186/1471-2105-10-297] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2009] [Accepted: 09/18/2009] [Indexed: 01/27/2023] Open

Abstract

BACKGROUND

Protein-protein interactions (PPIs) play fundamental roles in nearly all biological processes, and provide major insights into the inner workings of cells. A vast amount of PPI data for various organisms is available from BioGRID and other sources. The identification of communities in PPI networks is of great interest because they often reveal previously unknown functional ties between proteins. A large number of global clustering algorithms have been applied to protein networks, where the entire network is partitioned into clusters. Here we take a different approach by looking for local communities in PPI networks.

RESULTS

We develop a tool, named Local Protein Community Finder, which quickly finds a community close to a queried protein in any network available from BioGRID or specified by the user. Our tool uses two new local clustering algorithms, Nibble and PageRank-Nibble, which look for a good cluster among the most popular destinations of a short random walk from the queried vertex. The quality of a cluster is determined by its proportion of outgoing edges, known as conductance, which is a relative measure particularly useful in undersampled networks. We show that the two local clustering algorithms find communities that not only form excellent clusters, but are also likely to be biologically relevant functional components. We compare the performance of Nibble and PageRank-Nibble to other popular and effective graph partitioning algorithms, and show that they find better clusters in the graph. Moreover, Nibble and PageRank-Nibble find communities that are more functionally coherent.

CONCLUSION

The Local Protein Community Finder, accessible at http://xialab.bu.edu/resources/lpcf, allows the user to quickly find a high-quality community close to a queried protein in any network available from BioGRID or specified by the user. We show that the communities found by our tool form good clusters and are functionally coherent, making our application useful for biologists who wish to investigate functional modules that a particular protein is a part of.
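The conductance score mentioned in this abstract can be illustrated with a short sketch. It assumes the common graph-theoretic definition (edges leaving the cluster divided by the smaller of the total degree inside and outside the cluster) on an undirected, unweighted network; the abstract describes conductance only as the proportion of outgoing edges, so the exact normalization and the function name `conductance` are assumptions here, not the tool's implementation.

```python
def conductance(adj, cluster):
    """Edges leaving `cluster` divided by the smaller of the two volumes
    (sum of degrees inside vs. outside). Lower means a better-separated
    cluster. Assumes an undirected adjacency dictionary."""
    cluster = set(cluster)
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol_in = sum(len(adj[u]) for u in cluster)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    denom = min(vol_in, vol_out)
    if denom == 0:
        raise ValueError("conductance is undefined for an empty side")
    return cut / denom


# Toy network: two triangles (A-B-C and D-E-F) joined by the edge C-D.
toy = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
}
triangle_score = conductance(toy, {"A", "B", "C"})  # one cut edge, volume 7
single_score = conductance(toy, {"A"})              # every edge of A leaves
```

Because the score is a ratio rather than a raw edge count, a tight triangle scores well even in a sparse network, which is consistent with the abstract's remark that a relative measure suits undersampled data.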

Chagoyen M, Carazo JM, Pascual-Montano A. Assessment of protein set coherence using functional annotations. BMC Bioinformatics 2008;9:444. [PMID: 18937846 PMCID: PMC2588600 DOI: 10.1186/1471-2105-9-444] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2008] [Accepted: 10/20/2008] [Indexed: 11/23/2022] Open

Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, Hao T, Rual JF, Dricot A, Vazquez A, Murray RR, Simon C, Tardivo L, Tam S, Svrzikapa N, Fan C, de Smet AS, Motyl A, Hudson ME, Park J, Xin X, Cusick ME, Moore T, Boone C, Snyder M, Roth FP, Barabási AL, Tavernier J, Hill DE, Vidal M. High-quality binary protein interaction map of the yeast interactome network. Science 2008;322:104-10. [PMID: 18719252 DOI: 10.1126/science.1158684] [Citation(s) in RCA: 977] [Impact Index Per Article: 57.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]

Tetko IV, Rodchenkov IV, Walter MC, Rattei T, Mewes HW. Beyond the 'best' match: machine learning annotation of protein sequences by integration of different sources of information. Bioinformatics 2008;24:621-8. [PMID: 18174184 DOI: 10.1093/bioinformatics/btm633] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Schlicker A, Albrecht M. FunSimMat: a comprehensive functional similarity database. Nucleic Acids Res 2007;36:D434-9. [PMID: 17932054 PMCID: PMC2238903 DOI: 10.1093/nar/gkm806] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open