Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wu X, Pang E, Lin K, Pei ZM. Improving the measurement of semantic similarity between gene ontology terms and gene products: insights from an edge- and IC-based hybrid method. PLoS One 2013;8:e66745. [PMID: 23741529 PMCID: PMC3669204 DOI: 10.1371/journal.pone.0066745] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 02/06/2013] [Accepted: 05/10/2013] [Indexed: 12/30/2022] Open

Cited by Other Article(s)

Chundru VK, Zhang Z, Walter K, Lindsay SJ, Danecek P, Eberhardt RY, Gardner EJ, Malawsky DS, Wigdor EM, Torene R, Retterer K, Wright CF, Ólafsdóttir H, Guillen Sacoto MJ, Ayaz A, Akbeyaz IH, Türkdoğan D, Al Balushi AI, Bertoli-Avella A, Bauer P, Szenker-Ravi E, Reversade B, McWalter K, Sheridan E, Firth HV, Hurles ME, Samocha KE, Ustach VD, Martin HC. Federated analysis of autosomal recessive coding variants in 29,745 developmental disorder patients from diverse populations. Nat Genet 2024;56:2046-2053. [PMID: 39313616 PMCID: PMC11525179 DOI: 10.1038/s41588-024-01910-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 08/14/2024] [Indexed: 09/25/2024]

Affiliation(s)

V Kartik Chundru Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK Department of Clinical and Biomedical Sciences, University of Exeter Medical School, Royal Devon and Exeter Hospital, Exeter, UK
Zhancheng Zhang GeneDx, Gaithersburg, MD, USA Deka Biosciences, Germantown, MD, USA
Klaudia Walter Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Sarah J Lindsay Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Petr Danecek Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Ruth Y Eberhardt Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Eugene J Gardner Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK MRC Epidemiology Unit, Cambridge, UK
Daniel S Malawsky Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Emilie M Wigdor Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK Institute of Developmental and Regenerative Medicine, Department of Paediatrics, University of Oxford, Oxford, UK
Rebecca Torene GeneDx, Gaithersburg, MD, USA Geisinger, Danville, PA, USA
Kyle Retterer GeneDx, Gaithersburg, MD, USA Geisinger, Danville, PA, USA
Caroline F Wright Department of Clinical and Biomedical Sciences, University of Exeter Medical School, Royal Devon and Exeter Hospital, Exeter, UK
Hildur Ólafsdóttir GeneDx Iceland, Reykjavík, Iceland
Maria J Guillen Sacoto GeneDx, Gaithersburg, MD, USA
Akif Ayaz Istanbul Medipol University, Medical School, Department of Medical Genetics, Istanbul, Turkey
Ismail Hakki Akbeyaz Marmara University Medical Faculty, Pendik Training and Research Hospital, Department of Pediatric Neurology, Istanbul, Turkey
Dilşad Türkdoğan Marmara University Medical Faculty, Pendik Training and Research Hospital, Department of Pediatric Neurology, Istanbul, Turkey
Aaisha Ibrahim Al Balushi The Royal Hospital, Muscat Al Ghubra Area 111 Seeb, Muscat, Oman
Aida Bertoli-Avella Medical Genetics, CENTOGENE GmbH, Rostock, Germany
Peter Bauer Medical Genetics, CENTOGENE GmbH, Rostock, Germany Clinic of Internal Medicine, Department of Hematology, Oncology, and Palliative Medicine, University Medicine Rostock, Rostock, Germany
Emmanuelle Szenker-Ravi Laboratory of Human Genetics & Therapeutics, BESE, KAUST, Thuwal, Saudi Arabia
Bruno Reversade Laboratory of Human Genetics & Therapeutics, BESE, KAUST, Thuwal, Saudi Arabia
Kirsty McWalter GeneDx, Gaithersburg, MD, USA
Eamonn Sheridan Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK Leeds Institute of Medical Research, University of Leeds, St. James's University Hospital, Leeds, UK Yorkshire Regional Genetics Service, Chapel Allerton Hospital, Leeds, UK
Helen V Firth Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK Cambridge University Hospitals Foundation Trust, Addenbrooke's Hospital, Cambridge, UK
Matthew E Hurles Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Kaitlin E Samocha Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
Vincent D Ustach GeneDx, Gaithersburg, MD, USA
Hilary C Martin Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

Koutsandreas T, Felden B, Chevet E, Chatziioannou A. Protein homeostasis imprinting across evolution. NAR Genom Bioinform 2024;6:lqae014. [PMID: 38486886 PMCID: PMC10939379 DOI: 10.1093/nargab/lqae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 10/07/2023] [Accepted: 01/24/2024] [Indexed: 03/17/2024] Open

Muniyappan S, Rayan AXA, Varrieth GT. DTiGNN: Learning drug-target embedding from a heterogeneous biological network based on a two-level attention-based graph neural network. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023;20:9530-9571. [PMID: 37161255 DOI: 10.3934/mbe.2023419] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/11/2023]

Abstract

MOTIVATION

In vitro experiment-based drug-target interaction (DTI) exploration demands more human, financial and data resources. In silico approaches have been recommended for predicting DTIs to reduce time and cost. During the drug development process, one can analyze the therapeutic effect of the drug for a particular disease by identifying how the drug binds to the target for treating that disease. Hence, DTI plays a major role in drug discovery. Many computational methods have been developed for DTI prediction. However, the existing methods have limitations in terms of capturing the interactions via multiple semantics between drug and target nodes in a heterogeneous biological network (HBN).

METHODS

In this paper, we propose a DTiGNN framework for identifying unknown drug-target pairs. The DTiGNN first calculates the similarity between the drug and target from multiple perspectives. Then, the features of drugs and targets from each perspective are learned separately by using a novel method termed an information entropy-based random walk. Next, all of the learned features from different perspectives are integrated into a single drug and target similarity network by using a multi-view convolutional neural network. Using the integrated similarity networks, drug interactions, drug-disease associations, protein interactions and protein-disease associations, the HBN is constructed. Next, a novel embedding algorithm called a meta-graph guided graph neural network is used to learn the embedding of drugs and targets. Then, a convolutional neural network is employed to infer new DTIs after balancing the sample using oversampling techniques.

RESULTS

The DTiGNN is applied to various datasets, and the result shows better performance in terms of the area under receiver operating characteristic curve (AUC) and area under precision-recall curve (AUPR), with scores of 0.98 and 0.99, respectively. There are 23,739 newly predicted DTI pairs in total.

Zhang J, Zhu M, Qian Y. protein2vec: Predicting Protein-Protein Interactions Based on LSTM. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:1257-1266. [PMID: 32750870 DOI: 10.1109/tcbb.2020.3003941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Mallick K, Mallik S, Bandyopadhyay S, Chakraborty S. A Novel Graph Topology-Based GO-Similarity Measure for Signature Detection From Multi-Omics Data and its Application to Other Problems. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:773-785. [PMID: 32866101 DOI: 10.1109/tcbb.2020.3020537] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]

Tenekeci S, Isik Z. Integrative Biological Network Analysis to Identify Shared Genes in Metabolic Disorders. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:522-530. [PMID: 32396100 DOI: 10.1109/tcbb.2020.2993301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Pesaranghader A, Matwin S, Sokolova M, Grenier JC, Beiko RG, Hussin J. OUP accepted manuscript. Bioinformatics 2022;38:3051-3061. [PMID: 35536192 PMCID: PMC9154256 DOI: 10.1093/bioinformatics/btac304] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/21/2021] [Revised: 02/12/2022] [Indexed: 11/24/2022] Open

Paul M, Anand A. A New Family of Similarity Measures for Scoring Confidence of Protein Interactions Using Gene Ontology. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:19-30. [PMID: 34029194 DOI: 10.1109/tcbb.2021.3083150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]

Liu J, Zhu H, Qiu J. Locally Adjust Networks Based on Connectivity and Semantic Similarities for Disease Module Detection. Front Genet 2021;12:726596. [PMID: 34759955 PMCID: PMC8575408 DOI: 10.3389/fgene.2021.726596] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/17/2021] [Accepted: 09/22/2021] [Indexed: 11/13/2022] Open

Kaplanis J, Samocha KE, Wiel L, Zhang Z, Arvai KJ, Eberhardt RY, Gallone G, Lelieveld SH, Martin HC, McRae JF, Short PJ, Torene RI, de Boer E, Danecek P, Gardner EJ, Huang N, Lord J, Martincorena I, Pfundt R, Reijnders MRF, Yeung A, Yntema HG, Vissers LELM, Juusola J, Wright CF, Brunner HG, Firth HV, FitzPatrick DR, Barrett JC, Hurles ME, Gilissen C, Retterer K. Evidence for 28 genetic disorders discovered by combining healthcare and research data. Nature 2020;586:757-762. [PMID: 33057194 PMCID: PMC7116826 DOI: 10.1038/s41586-020-2832-5] [Citation(s) in RCA: 371] [Impact Index Per Article: 74.2] [Received: 10/08/2019] [Accepted: 07/17/2020] [Indexed: 01/28/2023]

Affiliation(s)

Joanna Kaplanis Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Kaitlin E Samocha Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Laurens Wiel Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
Zhancheng Zhang GeneDx, Gaithersburg, MD, USA
Kevin J Arvai GeneDx, Gaithersburg, MD, USA
Ruth Y Eberhardt Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Giuseppe Gallone Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Stefan H Lelieveld Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
Hilary C Martin Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Jeremy F McRae Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Patrick J Short Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Rebecca I Torene GeneDx, Gaithersburg, MD, USA
Elke de Boer Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
Petr Danecek Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Eugene J Gardner Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Ni Huang Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Jenny Lord Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK
Iñigo Martincorena Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Rolph Pfundt Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
Margot R F Reijnders Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands Department of Clinical Genetics, Maastricht University Medical Centre, Maastricht, The Netherlands
Alison Yeung Victorian Clinical Genetics Services, Melbourne, Victoria, Australia Murdoch Children's Research Institute, Melbourne, Victoria, Australia
Helger G Yntema Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
Lisenka E L M Vissers Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
Jane Juusola GeneDx, Gaithersburg, MD, USA
Caroline F Wright Institute of Biomedical and Clinical Science, University of Exeter Medical School, Royal Devon & Exeter Hospital, Exeter, UK
Han G Brunner Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands Department of Clinical Genetics, Maastricht University Medical Centre, Maastricht, The Netherlands GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands MHENS School for Mental Health and Neuroscience, Maastricht University Medical Centre, Maastricht, The Netherlands
Helen V Firth Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
David R FitzPatrick MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Western General Hospital, Edinburgh, UK
Jeffrey C Barrett Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Matthew E Hurles Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.
Christian Gilissen Department of Human Genetics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
Kyle Retterer GeneDx, Gaithersburg, MD, USA

Ikram N, Qadir MA, Afzal MT. SimExact – An Efficient Method to Compute Function Similarity Between Proteins Using Gene Ontology. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017092842] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]

Khorsand B, Savadi A, Zahiri J, Naghibzadeh M. Alpha influenza virus infiltration prediction using virus-human protein-protein interaction network. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2020;17:3109-3129. [PMID: 32987519 DOI: 10.3934/mbe.2020176] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 06/11/2023]

Handling Big Data Scalability in Biological Domain Using Parallel and Distributed Processing: A Case of Three Biological Semantic Similarity Measures. BIOMED RESEARCH INTERNATIONAL 2019;2019:6750296. [PMID: 30809545 PMCID: PMC6369486 DOI: 10.1155/2019/6750296] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/12/2018] [Accepted: 01/13/2019] [Indexed: 11/30/2022]

Abstract

In the field of biology, researchers need to compare genes or gene products using semantic similarity measures (SSM). Continuous data growth and diversity in data characteristics comprise what is called big data; current biological SSMs cannot handle big data. Therefore, these measures need the ability to control the size of big data. We used parallel and distributed processing by splitting data into multiple partitions and applied SSM measures to each partition; this approach helped manage big data scalability and computational problems. Our solution involves three steps: split gene ontology (GO), data clustering, and semantic similarity calculation. To test this method, split GO and data clustering algorithms were defined and assessed for performance in the first two steps. Three of the best SSMs in biology [Resnik, Shortest Semantic Differentiation Distance (SSDD), and SORA] are enhanced by introducing threaded parallel processing, which is used in the third step. Our results demonstrate that introducing threads in SSMs reduced the time of calculating semantic similarity between gene pairs and improved performance of the three SSMs. Average time was reduced by 24.51% for Resnik, 22.93% for SSDD, and 33.68% for SORA. Total time was reduced by 8.88% for Resnik, 23.14% for SSDD, and 39.27% for SORA. Using these threaded measures in the distributed system, combined with using split GO and data clustering algorithms to split input data based on their similarity, reduced the average time more than did the approach of equally dividing input data. Time reduction increased with increasing number of splits. Time reduction percentage was 24.1%, 39.2%, and 66.6% for Threaded SSDD; 33.0%, 78.2%, and 93.1% for Threaded SORA in the case of 2, 3, and 4 slaves, respectively; and 92.04% for Threaded Resnik in the case of four slaves.
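
As a rough illustration of the partition-then-score idea described in this abstract, the sketch below computes Resnik similarity (the information content of the most informative common ancestor) for term pairs that have been split across worker threads. The toy ontology, the annotation counts, and the function names are hypothetical and are not taken from the cited article; the round-robin split stands in for the split GO and clustering steps, which are not reproduced here.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy ontology: each term maps to its set of parents.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
    "AB": {"A", "B"},
}

# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 2, "B": 4, "A1": 3, "A2": 1, "AB": 2}


def ancestors(term):
    """Return the term itself plus every term reachable through its parents."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(N / n_t), where n_t counts annotations to t or its descendants."""
    cumulative = {t: 0 for t in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    total = cumulative["root"]
    return {t: math.log(total / c) for t, c in cumulative.items()}


IC = information_content()


def resnik(pair):
    """Resnik similarity: highest IC among the common ancestors of the two terms."""
    a, b = pair
    return max(IC[t] for t in ancestors(a) & ancestors(b))


def resnik_parallel(pairs, n_partitions=2):
    """Split the pairs round-robin into partitions and score each in its own thread."""
    chunks = [pairs[i::n_partitions] for i in range(n_partitions)]
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        results = pool.map(lambda chunk: [(p, resnik(p)) for p in chunk], chunks)
    return {pair: score for chunk in results for pair, score in chunk}
```

Calling `resnik_parallel([("A1", "A2"), ("A1", "B")])` returns a dictionary keyed by term pair. In CPython, threads do not speed up CPU-bound work of this kind, so the sketch shows only the structure of the partitioning, not the timing gains reported above.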

Shen C, Ding Y, Tang J, Guo F. Multivariate Information Fusion With Fast Kernel Learning to Kernel Ridge Regression in Predicting LncRNA-Protein Interactions. Front Genet 2019;9:716. [PMID: 30697228 PMCID: PMC6340980 DOI: 10.3389/fgene.2018.00716] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 09/22/2018] [Accepted: 12/21/2018] [Indexed: 12/31/2022] Open

Abstract

Long non-coding RNAs (lncRNAs) constitute a large class of transcribed RNA molecules. They are longer than 200 nucleotides and do not encode proteins. They play an important role in regulating gene expression by interacting with the homologous RNA-binding proteins. Because wet-lab experimental methods are laborious and time-consuming, computational approaches for predicting lncRNA-protein interactions (LPIs) deserve more attention from researchers. An in-depth review of state-of-the-art in silico investigations leads to the conclusion that there is still room for improving accuracy and speed. This paper proposes a novel method for identifying LPIs by employing Kernel Ridge Regression based on Fast Kernel Learning (LPI-FKLKRR). This approach uses four distinct similarity measures each for the lncRNA space and the protein space. Notably, we extract Gene Ontology (GO) information for proteins in order to improve the quality of information in the protein space. The integration of heterogeneous kernels applies Fast Kernel Learning (FastKL) to deal with weight optimization. The final predicted associations are then obtained by applying Kernel Ridge Regression (KRR). Experimental outcomes show that LPI-FKLKRR performs better than existing LPI prediction schemes. On the benchmark dataset, the best Area Under Precision Recall Curve (AUPR) of 0.6950 is obtained by our proposed model LPI-FKLKRR, which outperforms the integrated LPLNP (AUPR: 0.4584), RWR (AUPR: 0.2827), CF (AUPR: 0.2357), LPIHN (AUPR: 0.2299), and LPBNI (AUPR: 0.3302). Also, combined with the experimental results of a case study on a novel dataset, it is anticipated that LPI-FKLKRR will be a useful tool for LPI prediction.
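
To make the kernel-combination and regression steps in this abstract more concrete, here is a minimal pure-Python sketch of kernel ridge regression on a weighted sum of kernels, using the closed form alpha = (K + lambda*I)^-1 y. It is an illustration under stated assumptions, not the LPI-FKLKRR method: the kernel weights are fixed by hand instead of being optimized by FastKL, the inputs are one-dimensional toy values, the target is a vector instead of an interaction matrix, and all function names are hypothetical.

```python
import math


def rbf_kernel(xs, ys, gamma):
    """Gaussian (RBF) kernel matrix between two lists of scalar inputs."""
    return [[math.exp(-gamma * (x - y) ** 2) for y in ys] for x in xs]


def combine(kernels, weights):
    """Weighted sum of kernel matrices (fixed weights; FastKL would learn them)."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [
        [sum(w * k[i][j] for k, w in zip(kernels, weights)) for j in range(cols)]
        for i in range(rows)
    ]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def krr_fit(kernel_train, y, lam):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(y)
    regularized = [
        [kernel_train[i][j] + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve(regularized, y)


def krr_predict(kernel_test_train, alpha):
    """Predict as the kernel-weighted sum of dual coefficients."""
    return [sum(k * a for k, a in zip(row, alpha)) for row in kernel_test_train]
```

A typical call builds two or more kernels over the same training items, combines them with `combine`, fits with `krr_fit`, and scores new items by passing the test-versus-training kernel matrix to `krr_predict`. The regularization value `lam` trades fit against smoothness and would normally be chosen by cross-validation.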

Hadarovich A, Anishchenko I, Tuzikov AV, Kundrotas PJ, Vakser IA. Gene ontology improves template selection in comparative protein docking. Proteins 2018;87:245-253. [PMID: 30520123 DOI: 10.1002/prot.25645] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/26/2018] [Revised: 10/21/2018] [Accepted: 11/29/2018] [Indexed: 02/06/2023]

GOGO: An improved algorithm to measure the semantic similarity between gene ontology terms. Sci Rep 2018;8:15107. [PMID: 30305653 PMCID: PMC6180005 DOI: 10.1038/s41598-018-33219-y] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 10/31/2017] [Accepted: 09/24/2018] [Indexed: 01/29/2023] Open

Ding Z, Kihara D. Computational Methods for Predicting Protein-Protein Interactions Using Various Protein Features. CURRENT PROTOCOLS IN PROTEIN SCIENCE 2018;93:e62. [PMID: 29927082 PMCID: PMC6097941 DOI: 10.1002/cpps.62] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 12/24/2022]

Zhang J, Jia K, Jia J, Qian Y. An improved approach to infer protein-protein interaction based on a hierarchical vector space model. BMC Bioinformatics 2018;19:161. [PMID: 29699476 PMCID: PMC5921294 DOI: 10.1186/s12859-018-2152-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/15/2017] [Accepted: 04/09/2018] [Indexed: 02/06/2023] Open

Peng J, Zhang X, Hui W, Lu J, Li Q, Liu S, Shang X. Improving the measurement of semantic similarity by combining gene ontology and co-functional network: a random walk based approach. BMC SYSTEMS BIOLOGY 2018;12:18. [PMID: 29560823 PMCID: PMC5861498 DOI: 10.1186/s12918-018-0539-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 02/08/2023]

Fusing multiple protein-protein similarity networks to effectively predict lncRNA-protein interactions. BMC Bioinformatics 2017;18:420. [PMID: 29072138 PMCID: PMC5657051 DOI: 10.1186/s12859-017-1819-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 11/10/2022] Open

Lastra-Díaz JJ, García-Serrano A, Batet M, Fernández M, Chirigati F. HESML: A scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset. INFORM SYST 2017. [DOI: 10.1016/j.is.2017.02.002] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 10/20/2022]

Exploring Approaches for Detecting Protein Functional Similarity within an Orthology-based Framework. Sci Rep 2017;7:381. [PMID: 28336965 PMCID: PMC5428484 DOI: 10.1038/s41598-017-00465-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/08/2016] [Accepted: 02/28/2017] [Indexed: 11/21/2022] Open

Shui Y, Cho YR. Alignment of PPI Networks Using Semantic Similarity for Conserved Protein Complex Prediction. IEEE Trans Nanobioscience 2017;15:380-389. [PMID: 28113907 DOI: 10.1109/tnb.2016.2555802] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]

Dai H, Liu Q, Liu B. Research Progress on Mechanism of Podocyte Depletion in Diabetic Nephropathy. J Diabetes Res 2017;2017:2615286. [PMID: 28791309 PMCID: PMC5534294 DOI: 10.1155/2017/2615286] [Citation(s) in RCA: 189] [Impact Index Per Article: 23.6] [Received: 12/23/2016] [Revised: 02/05/2017] [Accepted: 03/05/2017] [Indexed: 12/13/2022] Open

Holliday GL, Davidson R, Akiva E, Babbitt PC. Evaluating Functional Annotations of Enzymes Using the Gene Ontology. Methods Mol Biol 2017;1446:111-132. [PMID: 27812939 PMCID: PMC5837055 DOI: 10.1007/978-1-4939-3743-1_9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 03/17/2023]

Pesquita C. Semantic Similarity in the Gene Ontology. Methods Mol Biol 2017;1446:161-173. [PMID: 27812942 DOI: 10.1007/978-1-4939-3743-1_12] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Indexed: 12/03/2022]

Grouping miRNAs of similar functions via weighted information content of gene ontology. BMC Bioinformatics 2016;17:507. [PMID: 28155659 PMCID: PMC5260111 DOI: 10.1186/s12859-016-1367-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open

Peng J, Li H, Liu Y, Juan L, Jiang Q, Wang Y, Chen J. InteGO2: a web tool for measuring and visualizing gene semantic similarities using Gene Ontology. BMC Genomics 2016;17 Suppl 5:530. [PMID: 27586009 PMCID: PMC5009821 DOI: 10.1186/s12864-016-2828-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/10/2022] Open

Bastos HP, Sousa L, Clarke LA, Couto FM. Functional coherence metrics in protein families. J Biomed Semantics 2016;7:41. [PMID: 27338101 PMCID: PMC4917928 DOI: 10.1186/s13326-016-0076-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2015] [Accepted: 05/17/2016] [Indexed: 12/03/2022] Open

Abstract

Background

Biological sequences, such as proteins, have been provided with annotations that assign functional information. These functional annotations are associations of proteins (or other biological sequences) with descriptors characterizing their biological roles. However, not all proteins are fully (or even at all) annotated. This annotation incompleteness limits our ability to make sound assertions about the functional coherence within sets of proteins. Annotation incompleteness is a problematic issue when measuring semantic functional similarity of biological sequences since they can only capture a limited amount of all the semantic aspects the sequences may encompass.

Methods

Instead of relying solely on single (reductive) metrics, this work proposes a comprehensive approach for assessing functional coherence within protein sets. The approach entails using visualization and term enrichment techniques anchored in specific domain knowledge, such as a protein family. For that purpose we evaluate two novel functional coherence metrics, mUI and mGIC, which combine aspects of semantic similarity measures and term enrichment.

Results

These metrics were used to effectively capture and measure the local similarity cores within protein sets. Hence, these metrics coupled with visualization tools allow an improved grasp on three important functional annotation aspects: completeness, agreement and coherence.

Conclusions

Measuring the functional similarity between proteins based on their annotations is a nontrivial task. Several metrics exist, but owing both to characteristics intrinsic to the nature of graphs and to extrinsic factors related to the annotation process, each measure can capture only certain functional annotation aspects of proteins. Hence, when trying to measure the functional coherence of a set of proteins, a single metric is too reductive. Therefore, it is valuable to be aware of how each employed similarity metric works and what similarity aspects it can best capture. Here we test the behaviour and resilience of some similarity metrics.

Electronic supplementary material

The online version of this article (doi:10.1186/s13326-016-0076-y) contains supplementary material, which is available to authorized users.

Zhang SB, Tang QR. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms. J Theor Biol 2016;401:30-7. [PMID: 27117309 DOI: 10.1016/j.jtbi.2016.04.020] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 11/10/2015] [Revised: 03/14/2016] [Accepted: 04/16/2016] [Indexed: 11/29/2022]

Koorman T, Klompstra D, van der Voet M, Lemmens I, Ramalho JJ, Nieuwenhuize S, van den Heuvel S, Tavernier J, Nance J, Boxem M. A combined binary interaction and phenotypic map of C. elegans cell polarity proteins. Nat Cell Biol 2016;18:337-46. [PMID: 26780296 PMCID: PMC4767559 DOI: 10.1038/ncb3300] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 03/03/2015] [Accepted: 12/15/2015] [Indexed: 12/12/2022]

Pesaranghader A, Matwin S, Sokolova M, Beiko RG. simDEF: definition-based semantic similarity measure of gene ontology terms for functional similarity analysis of genes. Bioinformatics 2015;32:1380-7. [PMID: 26708333 DOI: 10.1093/bioinformatics/btv755] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2015] [Accepted: 12/21/2015] [Indexed: 12/19/2022] Open

Liu B, Jin M, Zeng P. Prioritization of candidate disease genes by combining topological similarity and semantic similarity. J Biomed Inform 2015;57:1-5. [DOI: 10.1016/j.jbi.2015.07.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2014] [Revised: 07/01/2015] [Accepted: 07/06/2015] [Indexed: 10/23/2022]

Song M, Jiang Z. Inferring Association between Compound and Pathway with an Improved Ensemble Learning Method. Mol Inform 2015;34:753-60. [DOI: 10.1002/minf.201500033] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2015] [Accepted: 07/03/2015] [Indexed: 12/20/2022]

Bettembourg C, Diot C, Dameron O. Optimal Threshold Determination for Interpreting Semantic Similarity and Particularity: Application to the Comparison of Gene Sets and Metabolic Pathways Using GO and ChEBI. PLoS One 2015;10:e0133579. [PMID: 26230274 PMCID: PMC4521860 DOI: 10.1371/journal.pone.0133579] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2014] [Accepted: 06/30/2015] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

The analysis of gene annotations that refer to the Gene Ontology plays an important role in interpreting the results of high-throughput experiments. This analysis typically involves semantic similarity and particularity measures that quantify the importance of the Gene Ontology annotations. However, there is currently no sound method for interpreting similarity and particularity values in order to determine whether two genes are similar or whether one gene has a significant particular function. Interpretation is frequently based on either an implicit threshold or an arbitrary one (typically 0.5). Here we investigate a method for determining thresholds that support the interpretation of the results of a semantic comparison.

RESULTS

We propose a method for determining the optimal similarity threshold by minimizing the proportions of false-positive and false-negative similarity matches. We compared the distributions of the similarity values of pairs of similar genes and pairs of non-similar genes. These comparisons were performed separately for all three branches of the Gene Ontology. In all situations, we found overlap between the similar and the non-similar distributions, indicating that some similar genes had a lower similarity value than some non-similar genes. We then extended this method to the semantic particularity measure and to a similarity measure applied to the ChEBI ontology. Thresholds were evaluated over the whole HomoloGene database. For each group of homologous genes, we computed all the similarity and particularity values between pairs of genes. Finally, we focused on the PPAR multigene family to show that the similarity and particularity patterns obtained with our thresholds were better at discriminating orthologs and paralogs than those obtained using default thresholds.

CONCLUSION

We developed a method for determining optimal semantic similarity and particularity thresholds. We applied this method to the GO and ChEBI ontologies. Qualitative analysis using the thresholds on the PPAR multigene family yielded biologically relevant patterns.
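The threshold selection described in the abstract above can be illustrated with a minimal Python sketch. It assumes that a pair is called similar when its score is at or above the threshold, and that the quantity minimized is the plain sum of the false-negative and false-positive proportions; the abstract does not state the exact objective, so both choices are assumptions. The function name `optimal_threshold` and the toy scores are hypothetical and are not taken from the article.

```python
def optimal_threshold(similar_scores, non_similar_scores):
    """Return (threshold, error) minimising FN proportion + FP proportion.

    Assumption: a pair is predicted 'similar' when its score >= threshold.
    Candidate thresholds are the observed scores themselves.
    """
    candidates = sorted(set(similar_scores) | set(non_similar_scores))
    best_t, best_err = None, float("inf")
    for t in candidates:
        # Similar pairs scored below the threshold are false negatives.
        fn = sum(s < t for s in similar_scores) / len(similar_scores)
        # Non-similar pairs scored at or above the threshold are false positives.
        fp = sum(s >= t for s in non_similar_scores) / len(non_similar_scores)
        err = fn + fp
        if err < best_err:  # strict '<' keeps the lowest threshold on ties
            best_t, best_err = t, err
    return best_t, best_err


# Toy data (hypothetical): the two distributions overlap, as reported above.
similar = [0.9, 0.8, 0.7, 0.4]
non_similar = [0.1, 0.2, 0.3, 0.5]
threshold, error = optimal_threshold(similar, non_similar)
```

Because the two distributions overlap, no threshold reaches zero error here; the sketch only picks the cut-off with the smallest combined misclassification proportion, which is the kind of trade-off a fixed default such as 0.5 ignores.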


Scoring the correlation of genes by their shared properties using OScal, an improved overlap quantification model. Sci Rep 2015;5:10583. [PMID: 26015386 PMCID: PMC4445036 DOI: 10.1038/srep10583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2015] [Accepted: 04/20/2015] [Indexed: 11/17/2022] Open

Zhang SB, Lai JH. Semantic similarity measurement between gene ontology terms based on exclusively inherited shared information. Gene 2015;558:108-17. [DOI: 10.1016/j.gene.2014.12.062] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2014] [Revised: 12/15/2014] [Accepted: 12/24/2014] [Indexed: 11/25/2022]

Peng J, Uygun S, Kim T, Wang Y, Rhee SY, Chen J. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. BMC Bioinformatics 2015;16:44. [PMID: 25886899 PMCID: PMC4339680 DOI: 10.1186/s12859-015-0474-7] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2014] [Accepted: 01/26/2015] [Indexed: 01/18/2023] Open

Peng J, Li H, Jiang Q, Wang Y, Chen J. An integrative approach for measuring semantic similarities using gene ontology. BMC Syst Biol 2014;8 Suppl 5:S8. [PMID: 25559943 PMCID: PMC4305987 DOI: 10.1186/1752-0509-8-s5-s8] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]

Na D, Son H, Gsponer J. Categorizer: a tool to categorize genes into user-defined biological groups based on semantic similarity. BMC Genomics 2014;15:1091. [PMID: 25495442 PMCID: PMC4298957 DOI: 10.1186/1471-2164-15-1091] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2014] [Accepted: 12/04/2014] [Indexed: 11/10/2022] Open

Information content-based Gene Ontology functional similarity measures: which one to use for a given biological data type? PLoS One 2014;9:e113859. [PMID: 25474538 PMCID: PMC4256219 DOI: 10.1371/journal.pone.0113859] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2014] [Accepted: 10/31/2014] [Indexed: 12/23/2022] Open

Batet M, Harispe S, Ranwez S, Sánchez D, Ranwez V. An information theoretic approach to improve semantic similarity assessments across multiple ontologies. Inf Sci (N Y) 2014. [DOI: 10.1016/j.ins.2014.06.039] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]

Peng J, Wang Y, Chen J. Towards integrative gene functional similarity measurement. BMC Bioinformatics 2014;15 Suppl 2:S5. [PMID: 24564710 PMCID: PMC4015993 DOI: 10.1186/1471-2105-15-s2-s5] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open

Bandyopadhyay S, Mallick K. A New Path Based Hybrid Measure for Gene Ontology Similarity. IEEE/ACM Trans Comput Biol Bioinform 2014;11:116-127. [PMID: 26355512 DOI: 10.1109/tcbb.2013.149] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Measuring the evolution of ontology complexity: the gene ontology case study. PLoS One 2013;8:e75993. [PMID: 24146805 PMCID: PMC3795689 DOI: 10.1371/journal.pone.0075993] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2013] [Accepted: 08/20/2013] [Indexed: 01/09/2023] Open

Abstract

Ontologies support automatic sharing, combination and analysis of life sciences data. They undergo regular curation and enrichment. We studied the impact of ontology evolution on structural complexity. As a case study we used the sixty monthly releases between January 2008 and December 2012 of the Gene Ontology and its three independent branches, i.e. biological processes (BP), cellular components (CC) and molecular functions (MF). For each case, we measured complexity by computing metrics related to size, node connectivity and hierarchical structure. The number of classes and relations increased monotonically for each branch, with different growth rates. BP and CC had similar connectivity, superior to that of MF. Connectivity increased monotonically for BP, decreased for CC and remained stable for MF, with a marked increase for the three branches in November and December 2012. Hierarchy-related measures showed that CC and MF had similar proportions of leaves, average depths and average heights. BP had a lower proportion of leaves, and a higher average depth and average height. For BP and MF, the late 2012 increase in connectivity resulted in an increase in the average depth and average height and a decrease in the proportion of leaves, indicating that a major enrichment effort of the intermediate-level hierarchy occurred. The variation in the number of classes and relations in an ontology does not provide enough information about the evolution of its complexity. However, connectivity and hierarchy-related metrics revealed different patterns of values as well as of evolution for the three branches of the Gene Ontology. CC was similar to BP in terms of connectivity, and similar to MF in terms of hierarchy. Overall, BP complexity increased, CC was refined with the addition of leaves, which provided a finer level of annotation but slightly decreased its complexity, and MF complexity remained stable.
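As a rough illustration of the kinds of size, connectivity and hierarchy metrics mentioned in the abstract above, the following Python sketch computes a few of them on a toy directed acyclic graph. The abstract does not give the exact definitions, so the ones used here are assumptions: connectivity is taken as the number of relations per class, and the depth of a class as the length of its longest path to a root. The function name `complexity_metrics` and the toy ontology are hypothetical.

```python
from functools import lru_cache


def complexity_metrics(parents):
    """Compute simple structural metrics for an ontology given as a DAG.

    `parents` maps each class to the list of its direct superclasses;
    root classes map to an empty list.
    """
    classes = set(parents)
    for ps in parents.values():
        classes.update(ps)
    # A leaf is a class that is nobody's parent.
    has_child = {p for ps in parents.values() for p in ps}
    leaves = classes - has_child
    n_relations = sum(len(ps) for ps in parents.values())

    @lru_cache(maxsize=None)
    def depth(c):
        # Assumption: depth = longest path from the class up to a root.
        ps = parents.get(c, [])
        return 0 if not ps else 1 + max(depth(p) for p in ps)

    n = len(classes)
    return {
        "classes": n,
        "relations": n_relations,
        "relations_per_class": n_relations / n,  # assumed connectivity proxy
        "leaf_proportion": len(leaves) / n,
        "average_depth": sum(depth(c) for c in classes) / n,
    }


# Toy ontology (hypothetical): 'c' has two parents, as is common in GO.
toy = {"root": [], "a": ["root"], "b": ["root"], "c": ["a", "b"], "d": ["c"]}
metrics = complexity_metrics(toy)
```

Running the same function on successive releases of an ontology would give one time series per metric, which is the style of comparison the abstract reports for the BP, CC and MF branches.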
