Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Thompson JD, Holbrook SR, Katoh K, Koehl P, Moras D, Westhof E, Poch O. MAO: a Multiple Alignment Ontology for nucleic acid and protein sequences. Nucleic Acids Res 2005;33:4164-71. [PMID: 16043635 PMCID: PMC1180671 DOI: 10.1093/nar/gki735] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

For:	Thompson JD, Holbrook SR, Katoh K, Koehl P, Moras D, Westhof E, Poch O. MAO: a Multiple Alignment Ontology for nucleic acid and protein sequences. Nucleic Acids Res 2005;33:4164-71. [PMID: 16043635 PMCID: PMC1180671 DOI: 10.1093/nar/gki735] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Number

Cited by Other Article(s)

Benítez-Hidalgo A, Aldana-Montes JF, Navas-Delgado I, Roldán-García MDM. SALON ontology for the formal description of sequence alignments. BMC Bioinformatics 2023;24:69. [PMID: 36849882 PMCID: PMC9972671 DOI: 10.1186/s12859-023-05190-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2021] [Accepted: 02/15/2023] [Indexed: 03/01/2023] Open

Zhan Q, Fu Y, Jiang Q, Liu B, Peng J, Wang Y. SpliVert: A Protein Multiple Sequence Alignment Refinement Method Based on Splitting-Splicing Vertically. Protein Pept Lett 2020;27:295-302. [PMID: 31385760 DOI: 10.2174/0929866526666190806143959] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2019] [Revised: 04/26/2019] [Accepted: 06/14/2019] [Indexed: 11/22/2022]

Abstract

BACKGROUND

Multiple Sequence Alignment (MSA) is a fundamental task in bioinformatics and is required for many biological analysis tasks. The more accurate the alignments are, the more credible the downstream analyses. Most protein MSA algorithms realign an alignment to refine it by dividing it into two groups horizontally and then realign the two groups. However, this strategy does not consider that different regions of the sequences have different conservation; this property may lead to incorrect residue-residue or residue-gap pairs, which cannot be corrected by this strategy.

OBJECTIVE

In this article, our motivation is to develop a novel refinement method based on splitting- splicing vertically.

METHODS

Here, we present a novel refinement method based on splitting-splicing vertically, called SpliVert. For an alignment, we split it vertically into 3 parts, remove the gap characters in the middle, realign the middle part alone, and splice the realigned middle parts with the other two initial pieces to obtain a refined alignment. In the realign procedure of our method, the aligner will only focus on a certain part, ignoring the disturbance of the other parts, which could help fix the incorrect pairs.

RESULTS

We tested our refinement strategy for 2 leading MSA tools on 3 standard benchmarks, according to the commonly used average SP (and TC) score. The results show that given appropriate proportions to split the initial alignment, the average scores are increased comparably or slightly after using our method. We also compared the alignments refined by our method with alignments directly refined by the original alignment tools. The results suggest that using our SpliVert method to refine alignments can also outperform direct use of the original alignment tools.

CONCLUSION

The results reveal that splitting vertically and realigning part of the alignment is a good strategy for the refinement of protein multiple sequence alignments.

Collapse

Zhan Q, Wang N, Jin S, Tan R, Jiang Q, Wang Y. ProbPFP: a multiple sequence alignment algorithm combining hidden Markov model optimized by particle swarm optimization with partition function. BMC Bioinformatics 2019;20:573. [PMID: 31760933 PMCID: PMC6876095 DOI: 10.1186/s12859-019-3132-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

During procedures for conducting multiple sequence alignment, that is so essential to use the substitution score of pairwise alignment. To compute adaptive scores for alignment, researchers usually use Hidden Markov Model or probabilistic consistency methods such as partition function. Recent studies show that optimizing the parameters for hidden Markov model, as well as integrating hidden Markov model with partition function can raise the accuracy of alignment. The combination of partition function and optimized HMM, which could further improve the alignment's accuracy, however, was ignored by these researches.

RESULTS

A novel algorithm for MSA called ProbPFP is presented in this paper. It intergrate optimized HMM by particle swarm with partition function. The algorithm of PSO was applied to optimize HMM's parameters. After that, the posterior probability obtained by the HMM was combined with the one obtained by partition function, and thus to calculate an integrated substitution score for alignment. In order to evaluate the effectiveness of ProbPFP, we compared it with 13 outstanding or classic MSA methods. The results demonstrate that the alignments obtained by ProbPFP got the maximum mean TC scores and mean SP scores on these two benchmark datasets: SABmark and OXBench, and it got the second highest mean TC scores and mean SP scores on the benchmark dataset BAliBASE. ProbPFP is also compared with 4 other outstanding methods, by reconstructing the phylogenetic trees for six protein families extracted from the database TreeFam, based on the alignments obtained by these 5 methods. The result indicates that the reference trees are closer to the phylogenetic trees reconstructed from the alignments obtained by ProbPFP than the other methods.

CONCLUSIONS

We propose a new multiple sequence alignment method combining optimized HMM and partition function in this paper. The performance validates this method could make a great improvement of the alignment's accuracy.

Collapse

McSkimming DI, Dastgheib S, Baffi TR, Byrne DP, Ferries S, Scott ST, Newton AC, Eyers CE, Kochut KJ, Eyers PA, Kannan N. KinView: a visual comparative sequence analysis tool for integrated kinome research. MOLECULAR BIOSYSTEMS 2016;12:3651-3665. [PMID: 27731453 PMCID: PMC5508867 DOI: 10.1039/c6mb00466k] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Abstract

Multiple sequence alignments (MSAs) are a fundamental analysis tool used throughout biology to investigate relationships between protein sequence, structure, function, evolutionary history, and patterns of disease-associated variants. However, their widespread application in systems biology research is currently hindered by the lack of user-friendly tools to simultaneously visualize, manipulate and query the information conceptualized in large sequence alignments, and the challenges in integrating MSAs with multiple orthogonal data such as cancer variants and post-translational modifications, which are often stored in heterogeneous data sources and formats. Here, we present the Multiple Sequence Alignment Ontology (MSAOnt), which represents a profile or consensus alignment in an ontological format. Subsets of the alignment are easily selected through the SPARQL Protocol and RDF Query Language for downstream statistical analysis or visualization. We have also created the Kinome Viewer (KinView), an interactive integrative visualization that places eukaryotic protein kinase cancer variants in the context of natural sequence variation and experimentally determined post-translational modifications, which play central roles in the regulation of cellular signaling pathways. Using KinView, we identified differential phosphorylation patterns between tyrosine and serine/threonine kinases in the activation segment, a major kinase regulatory region that is often mutated in proliferative diseases. We discuss cancer variants that disrupt phosphorylation sites in the activation segment, and show how KinView can be used as a comparative tool to identify differences and similarities in natural variation, cancer variants and post-translational modifications between kinase groups, families and subfamilies. Based on KinView comparisons, we identify and experimentally characterize a regulatory tyrosine (Y177PLK4) in the PLK4 C-terminal activation segment region termed the P+1 loop. To further demonstrate the application of KinView in hypothesis generation and testing, we formulate and validate a hypothesis explaining a novel predicted loss-of-function variant (D523NPKCβ) in the regulatory spine of PKCβ, a recently identified tumor suppressor kinase. KinView provides a novel, extensible interface for performing comparative analyses between subsets of kinases and for integrating multiple types of residue specific annotations in user friendly formats.

Collapse

Stoddard CD, Widmann J, Trausch JJ, Marcano-Velázquez JG, Knight R, Batey RT. Nucleotides adjacent to the ligand-binding pocket are linked to activity tuning in the purine riboswitch. J Mol Biol 2013;425:1596-611. [PMID: 23485418 DOI: 10.1016/j.jmb.2013.02.023] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2012] [Revised: 01/31/2013] [Accepted: 02/02/2013] [Indexed: 12/20/2022]

Widmann J, Stombaugh J, McDonald D, Chocholousova J, Gardner P, Iyer MK, Liu Z, Lozupone CA, Quinn J, Smit S, Wikman S, Zaneveld JR, Knight R. RNASTAR: an RNA STructural Alignment Repository that provides insight into the evolution of natural and artificial RNAs. RNA (NEW YORK, N.Y.) 2012;18:1319-27. [PMID: 22645380 PMCID: PMC3383963 DOI: 10.1261/rna.032052.111] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]

Thompson JD, Linard B, Lecompte O, Poch O. A comprehensive benchmark study of multiple sequence alignment methods: current challenges and future perspectives. PLoS One 2011;6:e18093. [PMID: 21483869 PMCID: PMC3069049 DOI: 10.1371/journal.pone.0018093] [Citation(s) in RCA: 129] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2010] [Accepted: 02/21/2011] [Indexed: 12/18/2022] Open

Abstract

Multiple comparison or alignmentof protein sequences has become a fundamental tool in many different domains in modern molecular biology, from evolutionary studies to prediction of 2D/3D structure, molecular function and inter-molecular interactions etc. By placing the sequence in the framework of the overall family, multiple alignments can be used to identify conserved features and to highlight differences or specificities. In this paper, we describe a comprehensive evaluation of many of the most popular methods for multiple sequence alignment (MSA), based on a new benchmark test set. The benchmark is designed to represent typical problems encountered when aligning the large protein sequence sets that result from today's high throughput biotechnologies. We show that alignmentmethods have significantly progressed and can now identify most of the shared sequence features that determine the broad molecular function(s) of a protein family, even for divergent sequences. However,we have identified a number of important challenges. First, the locally conserved regions, that reflect functional specificities or that modulate a protein's function in a given cellular context,are less well aligned. Second, motifs in natively disordered regions are often misaligned. Third, the badly predicted or fragmentary protein sequences, which make up a large proportion of today's databases, lead to a significant number of alignment errors. Based on this study, we demonstrate that the existing MSA methods can be exploited in combination to improve alignment accuracy, although novel approaches will still be needed to fully explore the most difficult regions. We then propose knowledge-enabled, dynamic solutions that will hopefully pave the way to enhanced alignment construction and exploitation in future evolutionary systems biology studies.

Collapse

Friedrich A, Garnier N, Gagnière N, Nguyen H, Albou LP, Biancalana V, Bettler E, Deléage G, Lecompte O, Muller J, Moras D, Mandel JL, Toursel T, Moulinier L, Poch O. SM2PH-db: an interactive system for the integrated analysis of phenotypic consequences of missense mutations in proteins involved in human genetic diseases. Hum Mutat 2010;31:127-35. [PMID: 19921752 DOI: 10.1002/humu.21155] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]

Brown JW, Birmingham A, Griffiths PE, Jossinet F, Kachouri-Lafond R, Knight R, Lang BF, Leontis N, Steger G, Stombaugh J, Westhof E. The RNA structure alignment ontology. RNA (NEW YORK, N.Y.) 2009;15:1623-31. [PMID: 19622678 PMCID: PMC2743057 DOI: 10.1261/rna.1601409] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/15/2009] [Accepted: 05/26/2009] [Indexed: 05/19/2023]

Prosdocimi F, Chisham B, Pontelli E, Thompson JD, Stoltzfus A. Initial implementation of a comparative data analysis ontology. Evol Bioinform Online 2009;5:47-66. [PMID: 19812726 PMCID: PMC2747124 DOI: 10.4137/ebo.s2320] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Antezana E, Kuiper M, Mironov V. Biological knowledge management: the emerging role of the Semantic Web technologies. Brief Bioinform 2009;10:392-407. [PMID: 19457869 DOI: 10.1093/bib/bbp024] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Levasseur A, Pontarotti P, Poch O, Thompson JD. Strategies for reliable exploitation of evolutionary concepts in high throughput biology. Evol Bioinform Online 2008;4:121-37. [PMID: 19204813 PMCID: PMC2614184 DOI: 10.4137/ebo.s597] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022] Open

Do CB, Katoh K. Protein multiple sequence alignment. Methods Mol Biol 2008;484:379-413. [PMID: 18592193 DOI: 10.1007/978-1-59745-398-1_25] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]

Thompson JD, Muller A, Waterhouse A, Procter J, Barton GJ, Plewniak F, Poch O. MACSIMS: multiple alignment of complete sequences information management system. BMC Bioinformatics 2006;7:318. [PMID: 16792820 PMCID: PMC1539025 DOI: 10.1186/1471-2105-7-318] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2006] [Accepted: 06/23/2006] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family.

RESULTS

MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIM results, while a graphical display using the JalView program allows manual analysis.

CONCLUSION

MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at http://bips.u-strasbg.fr/MACSIMS/.

Collapse

Leontis NB, Altman RB, Berman HM, Brenner SE, Brown JW, Engelke DR, Harvey SC, Holbrook SR, Jossinet F, Lewis SE, Major F, Mathews DH, Richardson JS, Williamson JR, Westhof E. The RNA Ontology Consortium: an open invitation to the RNA community. RNA (NEW YORK, N.Y.) 2006;12:533-41. [PMID: 16484377 PMCID: PMC1421088 DOI: 10.1261/rna.2343206] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2023]

Antezana E, Tsiporkova E, Mironov V, Kuiper M. A Cell-Cycle Knowledge Integration Framework. LECTURE NOTES IN COMPUTER SCIENCE 2006. [DOI: 10.1007/11799511_4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]