Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Choi JH, Cho HG, Kim S. GAME: a simple and efficient whole genome alignment method using maximal exact match filtering. Comput Biol Chem 2005;29:244-53. [PMID: 15979044 DOI: 10.1016/j.compbiolchem.2005.04.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 02/16/2005] [Revised: 04/17/2005] [Accepted: 04/18/2005] [Indexed: 11/30/2022]

Cited by Other Article(s)

Lin HN, Hsu WL. DART: a fast and accurate RNA-seq mapper with a partitioning strategy. Bioinformatics 2018;34:190-197. [PMID: 28968831 PMCID: PMC5860201 DOI: 10.1093/bioinformatics/btx558] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/02/2017] [Revised: 08/29/2017] [Accepted: 09/03/2017] [Indexed: 01/13/2023] Open

Khelik K, Lagesen K, Sandve GK, Rognes T, Nederbragt AJ. NucDiff: in-depth characterization and annotation of differences between two sets of DNA sequences. BMC Bioinformatics 2017;18:338. [PMID: 28701187 PMCID: PMC5508607 DOI: 10.1186/s12859-017-1748-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 09/23/2016] [Accepted: 07/04/2017] [Indexed: 12/05/2022] Open

Abstract

Background

Comparing sets of sequences is a situation frequently encountered in bioinformatics, examples being comparing an assembly to a reference genome, or two genomes to each other. The purpose of the comparison is usually to find where the two sets differ, e.g. to find where a subsequence is repeated or deleted, or where insertions have been introduced. Such comparisons can be done using whole-genome alignments. Several tools for making such alignments exist, but none of them 1) provides detailed information about the types and locations of all differences between the two sets of sequences, 2) enables visualisation of alignment results at different levels of detail, and 3) carefully takes genomic repeats into consideration.

Results

We here present NucDiff, a tool aimed at locating and categorizing differences between two sets of closely related DNA sequences. NucDiff is able to deal with very fragmented genomes, repeated sequences, and various local differences and structural rearrangements. NucDiff determines differences by a rigorous analysis of alignment results obtained by the NUCmer, delta-filter and show-snps programs in the MUMmer sequence alignment package. All differences found are categorized according to a carefully defined classification scheme covering all possible differences between two sequences. Information about the differences is made available as GFF3 files, thus enabling visualisation using genome browsers as well as usage of the results as a component in an analysis pipeline. NucDiff was tested with varying parameters for the alignment step and compared with existing alternatives, called QUAST and dnadiff.
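As an illustration of the GFF3 output mentioned above, the following minimal sketch serializes difference records into the nine tab-separated GFF3 columns. The `Difference` record, the `to_gff3` helper, the `source` value, and the use of `sequence_alteration` as the feature type are assumptions made for this example; they are not NucDiff's actual data model or output schema.

```python
from dataclasses import dataclass


@dataclass
class Difference:
    """Hypothetical record for one difference between two sequence sets."""
    seqid: str   # name of the sequence the difference lies on
    kind: str    # illustrative category label, e.g. "insertion"
    start: int   # 1-based inclusive start, as GFF3 requires
    end: int     # 1-based inclusive end
    ident: str   # unique feature ID


def to_gff3(diffs, source="example"):
    """Serialize differences as GFF3 text (nine tab-separated columns)."""
    lines = ["##gff-version 3"]
    for d in diffs:
        attributes = f"ID={d.ident};Name={d.kind}"
        columns = [
            d.seqid,                # column 1: seqid
            source,                 # column 2: source (assumed label)
            "sequence_alteration",  # column 3: type (assumed for this sketch)
            str(d.start),           # column 4: start
            str(d.end),             # column 5: end
            ".",                    # column 6: score (undefined)
            ".",                    # column 7: strand (undefined)
            ".",                    # column 8: phase (not a CDS feature)
            attributes,             # column 9: attributes
        ]
        lines.append("\t".join(columns))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_gff3([Difference("chr1", "insertion", 10, 12, "d1")]))
```

A file in this layout can be loaded as an annotation track in a genome browser or parsed by a downstream pipeline step.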

Conclusions

We have developed a whole genome alignment difference classification scheme together with the program NucDiff for finding such differences. The proposed classification scheme is comprehensive and can be used by other tools. NucDiff performs comparably to QUAST and dnadiff but gives much more detailed results that can easily be visualized. NucDiff is freely available on https://github.com/uio-cels/NucDiff under the MPL license.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-017-1748-z) contains supplementary material, which is available to authorized users.

Khiste N, Ilie L. E-MEM: efficient computation of maximal exact matches for very large genomes. Bioinformatics 2014;31:509-14. [DOI: 10.1093/bioinformatics/btu687] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/21/2023] Open

Heuristic alignment methods. Methods in Molecular Biology (Clifton, N.J.) 2013;1079:29-43. [PMID: 24170393 DOI: 10.1007/978-1-62703-646-7_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/26/2022]

Liu Y, Schmidt B. Long read alignment based on maximal exact match seeds. Bioinformatics 2012;28:i318-i324. [PMID: 22962447 PMCID: PMC3436841 DOI: 10.1093/bioinformatics/bts414] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Indexed: 11/13/2022]

Choi JH, Li Y, Guo J, Pei L, Rauch TA, Kramer RS, Macmil SL, Wiley GB, Bennett LB, Schnabel JL, Taylor KH, Kim S, Xu D, Sreekumar A, Pfeifer GP, Roe BA, Caldwell CW, Bhalla KN, Shi H. Genome-wide DNA methylation maps in follicular lymphoma cells determined by methylation-enriched bisulfite sequencing. PLoS One 2010;5:e13020. [PMID: 20927367 PMCID: PMC2947499 DOI: 10.1371/journal.pone.0013020] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 02/06/2010] [Accepted: 08/21/2010] [Indexed: 12/16/2022] Open

Abstract

BACKGROUND

Follicular lymphoma (FL) is a form of non-Hodgkin's lymphoma (NHL) that arises from germinal center (GC) B-cells. Despite the significant advances in immunotherapy, FL is still not curable. Beyond transcriptional profiling and genomics datasets, there currently is no epigenome-scale dataset or integrative biology approach that can adequately model this disease and therefore identify novel mechanisms and targets for successful prevention and treatment of FL.

METHODOLOGY/PRINCIPAL FINDINGS

We performed methylation-enriched genome-wide bisulfite sequencing of FL cells and normal CD19(+) B-cells using 454 sequencing technology. The methylated DNA fragments were enriched with methyl-binding proteins, treated with bisulfite, and sequenced using the Roche-454 GS FLX sequencer. The total number of bases covered in the human genome was 18.2 million in FL cells and 49.3 million in CD19(+) B-cells, including 726,003 and 1.3 million CpGs, respectively. In total, 11,971 and 7,882 methylated regions of interest (MRIs) were identified in FL and CD19(+) B-cells, respectively. The genome-wide distribution of these MRIs differed significantly between FL and normal B-cells. A reverse trend in the distribution of MRIs between the promoter and the gene body was observed in FL and CD19(+) B-cells. The MRIs identified in FL cells also correlated well with transcriptomic data and ChIP-on-Chip analyses of genome-wide histone modifications such as tri-methyl-H3K27 and tri-methyl-H3K4, indicating a concerted epigenetic alteration in FL cells.
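The principle behind bisulfite sequencing can be sketched in a few lines: bisulfite treatment converts unmethylated cytosines so that they read as T, while methylated cytosines still read as C. The sketch below simulates that conversion and then calls the methylation state at CpG sites from an ungapped, same-strand read. The function names, the toy sequence, and the ungapped-read simplification are assumptions for illustration only, not the analysis pipeline used in the study.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment on one strand.

    Unmethylated C is read as T; methylated C (0-based positions given in
    methylated_positions) is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_cpg_methylation(reference, read, offset=0):
    """Call methylation at reference CpG sites covered by an ungapped read.

    Returns {reference_position: True/False}, where True means the read
    still shows C (methylated) and False means it shows T (unmethylated).
    """
    calls = {}
    for i, base in enumerate(read):
        pos = offset + i
        if reference[pos:pos + 2] == "CG":  # cytosine of a CpG dinucleotide
            if base == "C":
                calls[pos] = True
            elif base == "T":
                calls[pos] = False
    return calls


if __name__ == "__main__":
    reference = "ACGTCGA"
    read = bisulfite_convert(reference, methylated_positions={1})
    print(read)
    print(call_cpg_methylation(reference, read))
```

Real reads additionally require bisulfite-aware alignment and handling of both strands, which this sketch deliberately leaves out.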

CONCLUSIONS/SIGNIFICANCE

This study is the first to provide a large scale and comprehensive analysis of the DNA methylation sequence composition and distribution in the FL epigenome. These integrated approaches have led to the discovery of novel and frequent targets of aberrant epigenetic alterations. The genome-wide bisulfite sequencing approach developed here can be a useful tool for profiling DNA methylation in clinical samples.

Affiliation(s)

Jeong-Hyeon Choi, Center of Genomics and Bioinformatics, Indiana University, Bloomington, Indiana, United States of America
Yajun Li, Medical College of Georgia Cancer Center, Medical College of Georgia, Augusta, Georgia, United States of America
Juyuan Guo, Department of Pathology and Anatomical Sciences, University of Missouri, Columbia, Missouri, United States of America
Lirong Pei, Medical College of Georgia Cancer Center, Medical College of Georgia, Augusta, Georgia, United States of America
Tibor A. Rauch, Division of Biology, City of Hope Beckman Research Institute, Duarte, California, United States of America
Robin S. Kramer, Department of Computer Sciences, University of Missouri, Columbia, Missouri, United States of America
Simone L. Macmil, Advanced Center for Genome Technology, University of Oklahoma, Norman, Oklahoma, United States of America
Graham B. Wiley, Advanced Center for Genome Technology, University of Oklahoma, Norman, Oklahoma, United States of America
Lynda B. Bennett, Department of Pathology and Anatomical Sciences, University of Missouri, Columbia, Missouri, United States of America
Jennifer L. Schnabel, Department of Pathology and Anatomical Sciences, University of Missouri, Columbia, Missouri, United States of America
Kristen H. Taylor, Department of Pathology and Anatomical Sciences, University of Missouri, Columbia, Missouri, United States of America
Sun Kim, Center of Genomics and Bioinformatics, Indiana University, Bloomington, Indiana, United States of America
Dong Xu, Department of Computer Sciences, University of Missouri, Columbia, Missouri, United States of America
Arun Sreekumar, Medical College of Georgia Cancer Center, Medical College of Georgia, Augusta, Georgia, United States of America
Gerd P. Pfeifer, Division of Biology, City of Hope Beckman Research Institute, Duarte, California, United States of America
Bruce A. Roe, Advanced Center for Genome Technology, University of Oklahoma, Norman, Oklahoma, United States of America
Charles W. Caldwell, Department of Pathology and Anatomical Sciences, University of Missouri, Columbia, Missouri, United States of America
Kapil N. Bhalla, Medical College of Georgia Cancer Center, Medical College of Georgia, Augusta, Georgia, United States of America
Huidong Shi, Medical College of Georgia Cancer Center, Medical College of Georgia, Augusta, Georgia, United States of America

Khan Z, Bloom JS, Kruglyak L, Singh M. A practical algorithm for finding maximal exact matches in large sequence datasets using sparse suffix arrays. Bioinformatics 2009;25:1609-16. [PMID: 19389736 DOI: 10.1093/bioinformatics/btp275] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022]

Rho M, Choi JH, Kim S, Lynch M, Tang H. De novo identification of LTR retrotransposons in eukaryotic genomes. BMC Genomics 2007;8:90. [PMID: 17407597 PMCID: PMC1858694 DOI: 10.1186/1471-2164-8-90] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Received: 02/21/2007] [Accepted: 04/03/2007] [Indexed: 12/03/2022] Open

Abstract

Background

LTR retrotransposons are a class of mobile genetic elements containing two similar long terminal repeats (LTRs). Currently, LTR retrotransposons are annotated in eukaryotic genomes mainly through the conventional homology searching approach, which is therefore limited to annotating known elements.

Results

In this paper, we report a de novo computational method that can identify new LTR retrotransposons without relying on a library of known elements. Specifically, our method identifies intact LTR retrotransposons by using an approximate string matching technique and protein domain analysis. In addition, it identifies partially deleted or solo LTRs using profile Hidden Markov Models (pHMMs). As a result, this method can identify all types of LTR retrotransposons de novo. We tested this method on two pairs of eukaryotic genomes, C. elegans vs. C. briggsae and D. melanogaster vs. D. pseudoobscura. LTR retrotransposons in C. elegans and D. melanogaster have been intensively studied using conventional annotation methods. Compared with previous work, we identified new intact LTR retroelements and new putative families, which suggests that retroelements remain to be discovered even in well-studied organisms. To assess the sensitivity and accuracy of our method, we compared our results with a previously published method, LTR_STRUC, which predominantly identifies full-length LTR retrotransposons. In summary, both methods identified a comparable number of intact LTR retroelements. However, our method identified nearly all known elements in C. elegans, while LTR_STRUC missed about one third of them. Our method also identified more known LTR retroelements than LTR_STRUC in the D. melanogaster genome. We also identified some LTR retroelements in the other two genomes, C. briggsae and D. pseudoobscura, which have not been completely finished. In contrast, the conventional method failed to identify those elements. Finally, the phylogenetic and chromosomal distributions of the identified elements are discussed.
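The general idea of locating an intact element by its two similar terminal repeats can be illustrated with a simple seed-and-compare heuristic: index exact k-mers, pair up occurrences that lie a plausible distance apart, and score a candidate pair by its identity. This is a generic sketch and not the approximate string matching technique of the paper; `candidate_ltr_seeds`, `hamming_identity`, and all parameter values are hypothetical.

```python
from collections import defaultdict


def candidate_ltr_seeds(genome, k=8, min_dist=20, max_dist=100):
    """Return (left, right) start positions of identical k-mers whose
    distance lies in [min_dist, max_dist]; each pair is a seed for a
    candidate pair of terminal repeats."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)

    pairs = []
    for positions in index.values():
        for n, left in enumerate(positions):
            for right in positions[n + 1:]:
                if min_dist <= right - left <= max_dist:
                    pairs.append((left, right))
    return sorted(pairs)


def hamming_identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    ltr = "ACGTTGCA"
    genome = ltr + "G" * 20 + ltr  # toy element: repeat, internal region, repeat
    print(candidate_ltr_seeds(genome))
    print(hamming_identity("ACGTTGCA", "ACGTAGCA"))
```

The distance window stands in for the expected element length; a realistic search would extend each seed into a gapped alignment and use far larger windows.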

Conclusion

We report a novel method for de novo identification of LTR retrotransposons in eukaryotic genomes with favorable performance over the existing methods.

Uchiyama I, Higuchi T, Kobayashi I. CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes. BMC Bioinformatics 2006;7:472. [PMID: 17062155 PMCID: PMC1643837 DOI: 10.1186/1471-2105-7-472] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 07/21/2006] [Accepted: 10/24/2006] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The recent accumulation of closely related genomic sequences provides a valuable resource for the elucidation of the evolutionary histories of various organisms. However, although numerous alignment calculation and visualization tools have been developed to date, the analysis of complex genomic changes, such as large insertions, deletions, inversions, translocations and duplications, still presents certain difficulties.

RESULTS

We have developed a comparative genome analysis tool, named CGAT, which allows detailed comparisons of closely related bacteria-sized genomes mainly through visualizing middle-to-large-scale changes to infer underlying mechanisms. CGAT displays precomputed pairwise genome alignments on both dotplot and alignment viewers with scrolling and zooming functions, and allows users to move along the pre-identified orthologous alignments. Users can place several types of information on this alignment, such as the presence of tandem repeats or interspersed repetitive sequences and changes in G+C contents or codon usage bias, thereby facilitating the interpretation of the observed genomic changes. In addition to displaying precomputed alignments, the viewer can dynamically calculate the alignments between specified regions; this feature is especially useful for examining the alignment boundaries, as these boundaries are often obscure and can vary between programs. Besides the alignment browser functionalities, CGAT also contains an alignment data construction module, which contains various procedures that are commonly used for pre- and post-processing for large-scale alignment calculation, such as the split-and-merge protocol for calculating long alignments, chaining adjacent alignments, and ortholog identification. Indeed, CGAT provides a general framework for the calculation of genome-scale alignments using various existing programs as alignment engines, which allows users to compare the outputs of different alignment programs. Earlier versions of this program have been used successfully in our research to infer the evolutionary history of apparently complex genome changes between closely related eubacteria and archaea.
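Chaining adjacent alignments, one of the post-processing steps named above, can be sketched as merging collinear alignment blocks that are separated by only a small gap in both genomes. The block representation (half-open `(q_start, q_end, r_start, r_end)` tuples on the forward strand), the `chain_adjacent` name, and the `max_gap` threshold are assumptions for illustration, not CGAT's implementation.

```python
def chain_adjacent(blocks, max_gap=50):
    """Merge forward-strand alignment blocks into chains.

    Each block is (q_start, q_end, r_start, r_end) with half-open
    coordinates. A block is appended to the current chain when it starts
    at or after the chain's end in both sequences and both gaps are at
    most max_gap; otherwise it opens a new chain.
    """
    chains = []
    for block in sorted(blocks):
        if chains:
            q0, q1, r0, r1 = chains[-1]
            q_gap = block[0] - q1
            r_gap = block[2] - r1
            if 0 <= q_gap <= max_gap and 0 <= r_gap <= max_gap:
                chains[-1] = (q0, block[1], r0, block[3])
                continue
        chains.append(tuple(block))
    return chains


if __name__ == "__main__":
    blocks = [(0, 100, 10, 110), (120, 200, 130, 210), (500, 600, 50, 150)]
    print(chain_adjacent(blocks))
```

In this toy input the first two blocks are collinear with a 20-base gap and are merged, while the third is far away in the query and out of order in the reference, so it remains a separate chain.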

CONCLUSION

CGAT is a practical tool for analyzing complex genomic changes between closely related genomes using existing alignment programs and other sequence analysis tools combined with extensive manual inspection.
