1
Zhang Y, Zhang Q, Zhou J, Zou Q. A survey on the algorithm and development of multiple sequence alignment. Brief Bioinform 2022; 23:6546258. [PMID: 35272347 DOI: 10.1093/bib/bbac069] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/09/2021] [Revised: 01/30/2022] [Accepted: 02/09/2022] [Indexed: 12/21/2022]
Abstract
Multiple sequence alignment (MSA) is a cornerstone of bioinformatics: it can reveal information latent in biological sequences, such as function, evolution and structure. MSA is widely used in many bioinformatics scenarios, such as phylogenetic analysis, protein analysis and genomic analysis. However, MSA faces new challenges as sequence datasets grow and the demand for alignment accuracy increases. Therefore, developing an efficient and accurate strategy for MSA has become one of the research hotspots in bioinformatics. In this work, we mainly summarize the algorithms for MSA and its applications in bioinformatics. To provide a structured and clear perspective, we systematically introduce the fundamentals of MSA, including background, databases, metrics and benchmarks. We also list the most common applications of MSA in bioinformatics, including database searching, phylogenetic analysis, genomic analysis, metagenomic analysis and protein analysis. Furthermore, we categorize and analyze classical and state-of-the-art algorithms, divided into progressive alignment, iterative algorithms, heuristics, machine learning and divide-and-conquer. We also discuss the challenges and opportunities of MSA in bioinformatics. Our work provides a comprehensive survey of MSA applications and their relevant algorithms, and may offer useful insights for researchers working on MSA and related studies.
Affiliation(s)
- Yongqing Zhang
- School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Qiang Zhang
- School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Jiliu Zhou
- School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 610054, Chengdu, China
2
MitoZoa: A curated mitochondrial genome database of metazoans for comparative genomics studies. Mitochondrion 2010; 10:192-9. [DOI: 10.1016/j.mito.2010.01.004] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 09/04/2009] [Revised: 12/14/2009] [Accepted: 01/08/2010] [Indexed: 11/23/2022]
3
Nagase M, Maeta K, Aimi T, Suginaka K, Morinaga T. Analytical Methods for Quantification of Relative Flying Fish Paste Content in Processed Sea Food (ago-noyaki). Food Science and Technology Research 2010. [DOI: 10.3136/fstr.16.403] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/03/2022]
4
Cameron JM, Hurd T, Robinson BH. Computational identification of human mitochondrial proteins based on homology to yeast mitochondrially targeted proteins. Bioinformatics 2005; 21:1825-30. [PMID: 15671119 DOI: 10.1093/bioinformatics/bti280] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 01/22/2023]
Abstract
MOTIVATION: Patients with defects of the mitochondrial respiratory chain due to mutations in nuclear genes are often undiagnosable due to the lack of information about the role of these genes. We therefore sought to produce a novel dataset of human nuclear-encoded mitochondrial proteins. RESULTS: We have used the web-based computer program Mitoprot to predict which proteins in the Saccharomyces cerevisiae genome are targeted to mitochondria. We then used this protein dataset to identify the homologous human proteins in the Unigene database using TBLASTN from NCBI. Human proteins with an Expectation value < 10^-5 and an Identity > 30% were accepted as true homologues of the yeast proteins. These human proteins were then reanalyzed with Mitoprot. The final set of proteins comprises a dataset of 361 human mitochondrially targeted proteins with homology to all S. cerevisiae mitochondrially targeted proteins. One hundred twenty-eight of these proteins are novel and are of unknown function. SUPPLEMENTARY INFORMATION: Supplementary tables will be available from http://www.sickkids.ca/Robinsonlab/
Affiliation(s)
- J M Cameron
- Metabolism Programme, Research Institute, The Hospital for Sick Children, Toronto, ON Canada
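The acceptance criterion stated in the Cameron et al. abstract (Expectation value < 10^-5 and identity > 30%) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Hit` record, the `accept_homologues` function and the sequence identifiers are hypothetical names introduced here, and the hits are assumed to have already been parsed from the search output.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One search hit (hypothetical record; field names are assumptions)."""
    query_id: str            # yeast protein used as the query
    subject_id: str          # human sequence that was matched
    percent_identity: float  # identity over the aligned region, in percent
    evalue: float            # expectation value reported for the hit


def accept_homologues(hits: List[Hit],
                      max_evalue: float = 1e-5,
                      min_identity: float = 30.0) -> List[Hit]:
    """Keep hits with E-value strictly below max_evalue and
    identity strictly above min_identity, as worded in the abstract."""
    return [h for h in hits
            if h.evalue < max_evalue and h.percent_identity > min_identity]


# Made-up example hits, for illustration only.
hits = [
    Hit("yeast_1", "human_A", 45.2, 1e-20),  # passes both thresholds
    Hit("yeast_1", "human_B", 28.0, 1e-30),  # identity too low
    Hit("yeast_2", "human_C", 60.0, 1e-3),   # E-value too high
    Hit("yeast_3", "human_D", 30.0, 1e-10),  # identity not strictly above 30%
]
accepted = accept_homologues(hits)
print([h.subject_id for h in accepted])
```

Both comparisons are strict because the abstract uses "<" and ">"; whether boundary values were included in the original study is not stated.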
5
Andreoli C, Prokisch H, Hörtnagel K, Mueller JC, Münsterkötter M, Scharfe C, Meitinger T. MitoP2, an integrated database on mitochondrial proteins in yeast and man. Nucleic Acids Res 2004; 32:D459-62. [PMID: 14681457 PMCID: PMC308871 DOI: 10.1093/nar/gkh137] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.3] [Received: 09/17/2003] [Revised: 10/27/2003] [Accepted: 10/27/2003] [Indexed: 11/14/2022]
Abstract
The aim of the MitoP2 database (http://ihg.gsf.de/mitop2) is to provide a comprehensive list of mitochondrial proteins of yeast and man. Based on the current literature we created an annotated reference set of yeast and human proteins. In addition, data sets relevant to the study of the mitochondrial proteome are integrated and accessible via search tools and links. They include computational predictions of signalling sequences, and summarize results from proteome mapping, mutant screening, expression profiling, protein-protein interaction and cellular sublocalization studies. For each individual approach, specificity and sensitivity for allocating mitochondrial proteins was calculated. By providing the evidence for mitochondrial candidate proteins the MitoP2 database lends itself to the genetic characterization of human mitochondriopathies.
Affiliation(s)
- C Andreoli
- Institute of Human Genetics, GSF National Research Center for Environment and Health, Neuherberg, Germany
6
Abstract
The mammalian mitochondrial genome encodes 37 genes that are involved in a broad range of cellular functions. The mitochondrial DNA (mtDNA) molecule is commonly assumed to be inherited through oocyte cytoplasm in a clonal manner, and apparently species-specific mechanisms have evolved to eliminate the contribution of sperm mitochondria after natural fertilization. However, recent evidence for paternal mtDNA inheritance in embryos and offspring questions the general validity of this model, particularly in the context of assisted reproduction and embryo biotechnology. In addition to normal mtDNA haplotype variation, oocytes and spermatozoa show remarkable differences in mtDNA content and may be affected by inherited or acquired mtDNA aberrations. All these parameters have been correlated with gamete quality and reproductive success rates. Nuclear transfer (NT) technology provides experimental models for studying interactions between nuclear and mitochondrial genomes. Recent studies demonstrated (i) a significant effect of mtDNA haplotype or other maternal cytoplasmic factors on the efficiency of NT; (ii) phenotypic differences between transmitochondrial clones pointing to functionally relevant nuclear-cytoplasmic interactions; and (iii) neutral or non-neutral selection of mtDNA haplotypes in heteroplasmic conditions. Mitochondria form a dynamic reticulum, enabling complementation of mitochondrial components and possibly mixing of different mtDNA populations in heteroplasmic individuals. Future directions of research on mtDNA in the context of reproductive biotechnology range from the elimination of adverse effects of artificial heteroplasmy, e.g. created by ooplasm transfer, to engineering of optimized constellations of nuclear and cytoplasmic genes for the production of superior livestock.
Affiliation(s)
- S Hiendleder
- Institut für Molekulare Tierzucht und Biotechnologie, Genzentrum der Ludwig-Maximilians-Universität München, Germany.
7
Entelis N, Kolesnikova O, Kazakova H, Brandina I, Kamenski P, Martin RP, Tarassov I. Import of nuclear encoded RNAs into yeast and human mitochondria: experimental approaches and possible biomedical applications. Genetic Engineering 2002; 24:191-213. [PMID: 12416306 DOI: 10.1007/978-1-4615-0721-5_9] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 04/19/2023]
Abstract
Mitochondria import from the cytoplasm the vast majority of proteins and some RNAs. Although there exists extended knowledge concerning the mechanisms of protein import, the import of RNA is poorly understood. It was almost exclusively studied on the model of tRNA import, in several protozoans, plants and yeast. Mammalian mitochondria, which do not import tRNAs naturally, are hypothesized to import other small RNA molecules from the cytoplasm. We studied tRNA import in the yeast system, both in vitro and in vivo, and applied similar approaches to study 5S rRNA import into human mitochondria. Despite the obvious divergence of RNA import systems suggested for different species, we find that in yeast and human cells this pathway involves similar mechanisms exploiting cytosolic proteins to target the RNA to the organelle and requiring the integrity of pre-protein import apparatus. The import pathway might be of interest from a biomedical point of view, to target into mitochondria RNAs that could suppress pathological mutations in mitochondrial DNA. Yeast represents a good model to elaborate such a gene therapy approach. We have described here the various approaches and protocols to study RNA import into mitochondria of yeast and human cells in vitro and in vivo.
Affiliation(s)
- N Entelis
- FRE 2375 of the CNRS (MEPH), Institut de Physiologie et Chimie Biologique 21, rue René Descartes, 67084 Strasbourg, France
8
Tommaseo-Ponzetta M, Attimonelli M, De Robertis M, Tanzariello F, Saccone C. Mitochondrial DNA variability of West New Guinea populations. American Journal of Physical Anthropology 2002; 117:49-67. [PMID: 11748562 DOI: 10.1002/ajpa.10010] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/11/2022]
Abstract
This paper reports human mitochondrial DNA variability in West New Guinea (the least known, western side of the island of New Guinea), not yet described from a molecular perspective. The study was carried out on 202 subjects from 12 ethnic groups, belonging to six different Papuan language families, representative of both mountain and coastal plain areas. Mitochondrial DNA hypervariable region 1 (HVS 1) and the presence of the 9-bp deletion (intergenic region COII-tRNA(Lys)) were investigated. HVS 1 sequencing identified 73 polymorphic sites defining 89 haplotypes; the 9-bp deletion, which is considered a marker of Austronesian migration in the Pacific, was found to be absent in the whole West New Guinea study sample. Statistical analysis applied to the resulting haplotypes reveals high heterogeneity and an intersecting distribution of genetic variability in these populations, despite their cultural and geographic diversity. The results of subsequent phylogenetic approaches subdivide mtDNA diversity in West New Guinea into three main clusters (groups I-III), defined by sets of polymorphisms which are also shared by some individuals from Papua New Guinea. Comparisons with worldwide HVS 1 sequences stored in the MitBASE database show the absence of these patterns outside Oceania and a few Indonesian subjects, who also lack the 9-bp deletion. This finding, which is consistent with the effects of genetic drift and prolonged isolation of West New Guinea populations, leads us to regard these patterns as New Guinea population markers, which may harbor the genetic memory of the earliest human migrations to the island.
9
Affiliation(s)
- T G Wolfsberg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA.
10
Pesole G, Saccone C. A novel method for estimating substitution rate variation among sites in a large dataset of homologous DNA sequences. Genetics 2001; 157:859-65. [PMID: 11157002 PMCID: PMC1461530 DOI: 10.1093/genetics/157.2.859] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
We present here a novel method to estimate the site-specific relative variability in large sets of homologous sequences. It is based on the simple idea that the more closely related are the compared sequences, the higher the probability of observing nucleotide changes at rapidly evolving sites. A simulation study has been carried out to support the reliability of the method, which has been applied also to analyzing the site variability of all available human sequences corresponding to the two hypervariable regions of the mitochondrial D-loop.
Affiliation(s)
- G Pesole
- Dipartimento di Fisiologia e Biochimica Generali, Università di Milano, via Celoria 26, 20133 Milano, Italy.
11
Abstract
GOBASE (http://megasun.bch.umontreal.ca/gobase/) is a network-accessible biological database, which is unique in bringing together diverse biological data on organelles with taxonomically broad coverage, and in furnishing data that have been exhaustively verified and completed by experts. So far, we have focused on mitochondrial data: GOBASE contains all published nucleotide and protein sequences encoded by mitochondrial genomes, selected RNA secondary structures of mitochondria-encoded molecules, genetic maps of completely sequenced genomes, taxonomic information for all species whose sequences are present in the database and organismal descriptions of key protistan eukaryotes. All of these data have been integrated and organized in a formal database structure to allow sophisticated biological queries using terms that are inherent in biological concepts. Most importantly, data have been validated, completed, corrected and standardized, a prerequisite of meaningful analysis. In addition, where critical data are lacking, such as genetic maps and RNA secondary structures, they are generated by the GOBASE team and collaborators, and added to the database. The database is implemented in a relational database management system, but features an object-oriented view of the biological data through a Web/Genera-generated World Wide Web interface. Finally, we have developed software for database curation (i.e. data updates, validation and correction), which will be described in some detail in this paper.
Affiliation(s)
- N Shimko
- Program in Evolutionary Biology, Canadian Institute for Advanced Research, Département de Biochimie, Université de Montréal, 2900 Boulevard Edouard-Montpetit, Montréal, Québec, H3T 1J4, Canada