1
|
Persson E, Kaduk M, Forslund SK, Sonnhammer ELL. Domainoid: domain-oriented orthology inference. BMC Bioinformatics 2019; 20:523. [PMID: 31660857 PMCID: PMC6816169 DOI: 10.1186/s12859-019-3137-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/20/2019] [Accepted: 10/09/2019] [Indexed: 11/18/2022]
Abstract
Background: Orthology inference is normally based on full-length protein sequences. However, most proteins contain independently folding and recurring regions, domains. The domain architecture of a protein is vital for its function, and recombination events mean individual domains can have different evolutionary histories. It has previously been shown that orthologous proteins may differ in domain architecture, creating challenges for orthology inference methods operating on full-length sequences. We have developed Domainoid, a new tool aiming to overcome these challenges faced by full-length orthology methods by inferring orthology on the domain level. It employs the InParanoid algorithm on single domains separately, to infer groups of orthologous domains.
Results: This domain-oriented approach allows detection of discordant domain orthologs, cases where different domains on the same protein have different evolutionary histories. In addition to domain level analysis, protein level orthology based on the fraction of domains that are orthologous can be inferred. Domainoid orthology assignments were compared to those yielded by the conventional full-length approach InParanoid, and were validated in a standard benchmark.
Conclusions: Our results show that domain-based orthology inference can reveal many orthologous relationships that are not found by full-length sequence approaches.
Availability: https://bitbucket.org/sonnhammergroup/domainoid/
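The abstract above mentions protein-level orthology inferred from the fraction of domains that are orthologous. The following is a minimal Python sketch of that idea only; it is not Domainoid's implementation. The function names, the domain identifiers, the way the fraction is normalized (over the domains of both proteins), and the `min_fraction` threshold are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def orthologous_domain_fraction(
    domains_a: List[str],
    domains_b: List[str],
    domain_orthologs: Set[Tuple[str, str]],
) -> float:
    """Fraction of the domains of two proteins that take part in at least
    one orthologous domain pair between them (hypothetical scoring)."""
    total = len(domains_a) + len(domains_b)
    if total == 0:
        return 0.0
    matched_a, matched_b = set(), set()
    for da in domains_a:
        for db in domains_b:
            if (da, db) in domain_orthologs:
                matched_a.add(da)
                matched_b.add(db)
    return (len(matched_a) + len(matched_b)) / total


def are_protein_orthologs(
    domains_a: List[str],
    domains_b: List[str],
    domain_orthologs: Set[Tuple[str, str]],
    min_fraction: float = 0.5,  # assumed cutoff, not taken from the paper
) -> bool:
    """Call two proteins orthologous if enough of their domains are."""
    fraction = orthologous_domain_fraction(domains_a, domains_b, domain_orthologs)
    return fraction >= min_fraction


# Toy data: protein A has two domains whose orthologs sit on different
# proteins (B and C), i.e. a discordant case in the abstract's terms.
protein_a = ["A:d1", "A:d2"]
protein_b = ["B:d1", "B:d2"]
protein_c = ["C:d1"]
pairs = {("A:d1", "B:d1"), ("A:d2", "C:d1")}

frac_ab = orthologous_domain_fraction(protein_a, protein_b, pairs)
frac_ac = orthologous_domain_fraction(protein_a, protein_c, pairs)
```

In this toy case one domain of protein A is orthologous to a domain on B and the other to a domain on C, which a single full-length comparison could not express.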
Affiliation(s)
- Emma Persson
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 17121, Solna, Sweden
- Mateusz Kaduk
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 17121, Solna, Sweden
- Sofia K Forslund
- Experimental and Clinical Research Center, a joint cooperation of Max-Delbrück Center for Molecular Medicine and Charité-Universitätsmedizin Berlin, 13125, Berlin, Germany; European Molecular Biology Laboratory, Structural and Computational Biology Unit, 69117, Heidelberg, Germany
- Erik L L Sonnhammer
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 17121, Solna, Sweden
|
2
|
Prediction of Protein-Protein Interactions Based on Domain. Computational and Mathematical Methods in Medicine 2019; 2019:5238406. [PMID: 31531123 PMCID: PMC6720845 DOI: 10.1155/2019/5238406] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/20/2019] [Revised: 07/09/2019] [Accepted: 07/30/2019] [Indexed: 11/17/2022]
Abstract
Protein-protein interactions (PPIs) play a crucial role in various biological processes. To better comprehend the pathogenesis and treatments of various diseases, it is necessary to learn the detail of these interactions. However, the current experimental method still has many false-positive and false-negative problems. Computational prediction of protein-protein interaction has become a more important prediction method which can overcome the obstacles of the experimental method. In this work, we proposed a novel computational domain-based method for PPI prediction, and an SVM model for the prediction was built based on the physicochemical property of the domain. The outcomes of SVM and the domain-domain score were used to construct the prediction model for protein-protein interaction. The predicted results demonstrated the domain-based research can enhance the ability to predict protein interactions.
|
3
|
Abstract
This chapter reviews current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this will directly impact which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and larger evolutionary groups. These findings were subsequently extended to multi-domain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly). We end by a discussion of some available tools for computational analysis or exploitation of protein domain architectures and their evolution.
|
4
|
Park JW, Lee JH, Kim SW, Han JS, Kang KS, Kim SJ, Park TS. Muscle differentiation induced up-regulation of calcium-related gene expression in quail myoblasts. Asian-Australasian Journal of Animal Sciences 2018; 31:1507-1515. [PMID: 29879808 PMCID: PMC6127575 DOI: 10.5713/ajas.18.0302] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 04/16/2018] [Accepted: 05/29/2018] [Indexed: 11/27/2022]
Abstract
Objective: In the poultry industry, the most important economic traits are meat quality and carcass yield. Thus, many studies were conducted to investigate the regulatory pathways during muscle differentiation. To gain insight of muscle differentiation mechanism during growth period, we identified and validated calcium-related genes which were highly expressed during muscle differentiation through mRNA sequencing analysis.
Methods: We conducted next-generation-sequencing (NGS) analysis of mRNA from undifferentiated QM7 cells and differentiated QM7 cells (day 1 to day 3 of differentiation periods). Subsequently, we obtained calcium related genes related to muscle differentiation process and examined the expression patterns by quantitative reverse-transcription polymerase chain reaction (qRT-PCR).
Results: Through RNA sequencing analysis, we found that the transcription levels of six genes (troponin C1, slow skeletal and cardiac type [TNNC1], myosin light chain 1 [MYL1], MYL3, phospholamban [PLN], caveolin 3 [CAV3], and calsequestrin 2 [CASQ2]) particularly related to calcium regulation were gradually increased according to days of myotube differentiation. Subsequently, we validated the expression patterns of calcium-related genes in quail myoblasts. These results indicated that TNNC1, MYL1, MYL3, PLN, CAV3, CASQ2 responded to differentiation and growth performance in quail muscle.
Conclusion: These results indicated that calcium regulation might play a critical role in muscle differentiation. Thus, these findings suggest that further studies would be warranted to investigate the role of calcium ion in muscle differentiation and could provide a useful biomarker for muscle differentiation and growth.
Affiliation(s)
- Jeong-Woong Park
- Institute of Green-Bio Science and Technology, Seoul National University, Pyeongchang 25354, Korea
- Jeong Hyo Lee
- Institute of Green-Bio Science and Technology, Seoul National University, Pyeongchang 25354, Korea
- Seo Woo Kim
- Graduate School of International Agricultural Technology, Seoul National University, Pyeongchang 25354, Korea
- Ji Seon Han
- Graduate School of International Agricultural Technology, Seoul National University, Pyeongchang 25354, Korea
- Kyung Soo Kang
- Bio Division, Medikinetics, Inc., Pyeongtaek 17792, Korea
- Sung-Jo Kim
- Division of Cosmetics and Biotechnology, Hoseo University, Asan 31499, Korea
- Tae Sub Park
- Institute of Green-Bio Science and Technology, Seoul National University, Pyeongchang 25354, Korea; Graduate School of International Agricultural Technology, Seoul National University, Pyeongchang 25354, Korea
|
5
|
Nichio BTL, Marchaukoski JN, Raittz RT. New Tools in Orthology Analysis: A Brief Review of Promising Perspectives. Front Genet 2017; 8:165. [PMID: 29163633 PMCID: PMC5674930 DOI: 10.3389/fgene.2017.00165] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 06/26/2017] [Accepted: 10/16/2017] [Indexed: 11/23/2022]
Abstract
Nowadays defining homology relationships among sequences is essential for biological research. Within homology, the analysis of orthologous sequences is of great importance for computational biology, annotation of genomes and for phylogenetic inference. Since 2007, with the increase in the number of new sequences being deposited in large biological databases, researchers have begun to analyse computerized methodologies and tools aimed at selecting the most promising ones in the prediction of orthologous groups. Literature in this field of research describes the problems that the majority of available tools show, such as those encountered in accuracy, time required for analysis (especially in light of the increasing volume of data being submitted, which requires faster techniques) and the automation of the process without requiring manual intervention. Conducting our search through BMC, Google Scholar, NCBI PubMed, and Expasy, we examined more than 600 articles pursuing the most recent techniques and tools developed to solve most of the problems still existing in orthology detection. We listed the main computational tools created and developed between 2011 and 2017, taking into consideration the differences in the type of orthology analysis, outlining the main features of each tool and pointing to the problems that each one tries to address. We also observed that several tools still use as their main algorithm the BLAST "all-against-all" methodology, which entails some limitations, such as limited number of queries, computational cost, and high processing time to complete the analysis. However, new promising tools are being developed, like OrthoVenn (which uses the Venn diagram to show the relationship of ortholog groups generated by its algorithm); or proteinOrtho (which improves the accuracy of ortholog groups); or ReMark (tackling the integration of the pipeline to make the entry process automatic); or OrthAgogue (using algorithms developed to minimize processing time); and proteinOrtho (developed for dealing with large amounts of biological data). We made a comparison among the main features of four tools and tested them using four prokaryotic genomes. We hope that our review can be useful for researchers and will help them in selecting the most appropriate tool for their work in the field of orthology.
Affiliation(s)
- Roberto Tadeu Raittz
- Department of Bioinformatics, Professional and Technical Education Sector, Federal University of Paraná, Curitiba, Brazil
|
6
|
Dohmen E, Kremer LPM, Bornberg-Bauer E, Kemena C. DOGMA: domain-based transcriptome and proteome quality assessment. Bioinformatics 2016; 32:2577-81. [PMID: 27153665 DOI: 10.1093/bioinformatics/btw231] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 01/29/2016] [Accepted: 04/21/2016] [Indexed: 12/29/2022]
Abstract
MOTIVATION: Genome studies have become cheaper and easier than ever before, due to the decreased costs of high-throughput sequencing and the free availability of analysis software. However, the quality of genome or transcriptome assemblies can vary a lot. Therefore, quality assessment of assemblies and annotations are crucial aspects of genome analysis pipelines.
RESULTS: We developed DOGMA, a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. DOGMA measures the completeness of a given transcriptome or proteome and provides information about domain content for further analysis. DOGMA provides a very fast way to do quality assessment within seconds.
AVAILABILITY AND IMPLEMENTATION: DOGMA is implemented in Python and published under GNU GPL v.3 license. The source code is available on https://ebbgit.uni-muenster.de/domainWorld/DOGMA/
CONTACTS: e.dohmen@wwu.de or c.kemena@wwu.de
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
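The notion of domain-based completeness in the abstract above can be illustrated with a small Python sketch. This is a simplified stand-in and not DOGMA's actual scoring or interface: the function name `domain_completeness`, the idea of a flat "core" set of expected domains, and the Pfam accessions used as sample data are assumptions chosen for illustration.

```python
from typing import Iterable, Set


def domain_completeness(found_domains: Iterable[str], core_domains: Set[str]) -> float:
    """Percentage of an expected core set of conserved domains that is
    present among the domains annotated in a data set (hypothetical metric)."""
    if not core_domains:
        raise ValueError("core_domains must not be empty")
    # Duplicates and domains outside the core set do not affect the score.
    found = set(found_domains) & core_domains
    return 100.0 * len(found) / len(core_domains)


# Illustrative input: accessions stand in for a real core set.
core = {"PF00069", "PF00076", "PF00400", "PF00018"}
annotated = ["PF00069", "PF00400", "PF00069", "PF07714"]

score = domain_completeness(annotated, core)
print(f"{score:.1f}% of core domains found")  # prints "50.0% of core domains found"
```

A set intersection keeps the measure insensitive to how often a domain occurs, which is one possible design choice; a real tool may weigh domain arrangements or copy numbers differently.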
Affiliation(s)
- Elias Dohmen
- Institute for Evolution and Biodiversity, University of Münster, Münster 48149, Germany; Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, Recklinghausen 45665, Germany
- Lukas P M Kremer
- Institute for Evolution and Biodiversity, University of Münster, Münster 48149, Germany
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster 48149, Germany
- Carsten Kemena
- Institute for Evolution and Biodiversity, University of Münster, Münster 48149, Germany
|
7
|
Cromar GL, Zhao A, Xiong X, Swapna LS, Loughran N, Song H, Parkinson J. PhyloPro2.0: a database for the dynamic exploration of phylogenetically conserved proteins and their domain architectures across the Eukarya. Database: The Journal of Biological Databases and Curation 2016; 2016:baw013. [PMID: 26980519 PMCID: PMC4792532 DOI: 10.1093/database/baw013] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 10/27/2015] [Accepted: 01/29/2016] [Indexed: 11/13/2022]
Abstract
PhyloPro is a database and accompanying web-based application for the construction and exploration of phylogenetic profiles across the Eukarya. In this update article, we present six major new developments in PhyloPro: (i) integration of Pfam-A domain predictions for all proteins; (ii) new summary heatmaps and detailed level views of domain conservation; (iii) an interactive, network-based visualization tool for exploration of domain architectures and their conservation; (iv) ability to browse based on protein functional categories (GOSlim); (v) improvements to the web interface to enhance drill down capability from the heatmap view; and (vi) improved coverage including 164 eukaryotes and 12 reference species. In addition, we provide improved support for downloading data and images in a variety of formats. Among the existing tools available for phylogenetic profiles, PhyloPro provides several innovative domain-based features including a novel domain adjacency visualization tool. These are designed to allow the user to identify and compare proteins with similar domain architectures across species and thus develop hypotheses about the evolution of lineage-specific trajectories. Database URL: http://www.compsysbio.org/phylopro/.
Affiliation(s)
- Graham L Cromar
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Anthony Zhao
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Xuejian Xiong
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Lakshmipuram S Swapna
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Noeleen Loughran
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- Hongyan Song
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada
- John Parkinson
- Program in Molecular Structure and Function, Hospital for Sick Children, 21-9830 PGCRL, 686 Bay Street, Toronto, ON M5G 0A4, Canada; Departments of Biochemistry, Computer Science and Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
|
8
|
Tekaia F. Inferring Orthologs: Open Questions and Perspectives. Genomics Insights 2016; 9:17-28. [PMID: 26966373 PMCID: PMC4778853 DOI: 10.4137/gei.s37925] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 11/18/2015] [Revised: 12/30/2015] [Accepted: 01/02/2016] [Indexed: 01/25/2023]
Abstract
With the increasing number of sequenced genomes and their comparisons, the detection of orthologs is crucial for reliable functional annotation and evolutionary analyses of genes and species. Yet, the dynamic remodeling of genome content through gain, loss, transfer of genes, and segmental and whole-genome duplication hinders reliable orthology detection. Moreover, the lack of direct functional evidence and the questionable quality of some available genome sequences and annotations present additional difficulties to assess orthology. This article reviews the existing computational methods and their potential accuracy in the high-throughput era of genome sequencing and anticipates open questions in terms of methodology, reliability, and computation. Appropriate taxon sampling together with combination of methods based on similarity, phylogeny, synteny, and evolutionary knowledge that may help detecting speciation events appears to be the most accurate strategy. This review also raises perspectives on the potential determination of orthology throughout the whole species phylogeny.
Affiliation(s)
- Fredj Tekaia
- Institut Pasteur, Unit of Structural Microbiology, CNRS URA 3528 and University Paris Diderot, Sorbonne Paris Cité, Paris, France
|