1
|
Camacho E, González-de la Fuente S, Solana JC, Rastrojo A, Carrasco-Ramiro F, Requena JM, Aguado B. Gene Annotation and Transcriptome Delineation on a De Novo Genome Assembly for the Reference Leishmania major Friedlin Strain. Genes (Basel) 2021; 12:genes12091359. [PMID: 34573340 PMCID: PMC8468144 DOI: 10.3390/genes12091359] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/03/2021] [Revised: 08/20/2021] [Accepted: 08/27/2021] [Indexed: 01/05/2023] Open
Abstract
Leishmania major is the main causative agent of cutaneous leishmaniasis in humans. The Friedlin strain of this species (LmjF) was chosen when a multi-laboratory consortium undertook the objective of deciphering the first genome sequence for a parasite of the genus Leishmania. The objective was successfully attained in 2005, and this represented a milestone for Leishmania molecular biology studies around the world. Although the LmjF genome sequence was obtained following a shotgun strategy and using classical Sanger sequencing, the results were excellent, and this genome assembly served as the reference for subsequent genome assemblies in other Leishmania species. Here, we present a new assembly for the genome of this strain (named LMJFC for clarity), generated by the combination of two high-throughput sequencing platforms, Illumina short-read sequencing and PacBio Single Molecule Real-Time (SMRT) sequencing, which provides long-read sequences. Apart from resolving uncertain nucleotide positions, several genomic regions were reorganized and a more precise composition of tandemly repeated gene loci was attained. Additionally, the genome annotation was improved by adding 542 genes and defining more accurate coding sequences for around two hundred genes, based on the transcriptome delimitation also carried out in this work. As a result, we are providing gene models (including untranslated regions and introns) for 11,238 genes. Genomic information ultimately determines the biology of every organism; therefore, our understanding of molecular mechanisms will depend on the availability of precise genome sequences and accurate gene annotations. In this regard, this work is providing an improved genome sequence and updated transcriptome annotations for the reference L. major Friedlin strain.
|
2
|
Mukherjee K, Rossi M, Salmela L, Boucher C. Fast and efficient Rmap assembly using the Bi-labelled de Bruijn graph. Algorithms Mol Biol 2021; 16:6. [PMID: 34034751 PMCID: PMC8147420 DOI: 10.1186/s13015-021-00182-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2021] [Accepted: 04/13/2021] [Indexed: 11/10/2022] Open
Abstract
Genome-wide optical maps are high-resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling hundreds of thousands of single molecule optical maps, which are called Rmaps. Unfortunately, there are very few choices for assembling Rmap data. There exists only one publicly-available non-proprietary method for assembly and one proprietary software that is available via an executable. Furthermore, the publicly-available method, by Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006), follows the overlap-layout-consensus (OLC) paradigm, and therefore, is unable to scale for relatively large genomes. The algorithm behind the proprietary method, Bionano Genomics' Solve, is largely unknown. In this paper, we extend the definition of bi-labels in the paired de Bruijn graph to the context of optical mapping data, and present the first de Bruijn graph based method for Rmap assembly. We implement our approach, which we refer to as RMAPPER, and compare its performance against the assembler of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006) and Solve by Bionano Genomics on data from three genomes: E. coli, human, and climbing perch fish (Anabas testudineus). Our method was able to successfully run on all three genomes. The method of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006) only successfully ran on E. coli. Moreover, on the human genome RMAPPER was at least 130 times faster than Bionano Solve, used five times less memory and produced the highest genome fraction with zero mis-assemblies. Our software, RMAPPER, is written in C++ and is publicly available under GNU General Public License at https://github.com/kingufl/Rmapper.
|
3
|
The genome of opportunistic fungal pathogen Fusarium oxysporum carries a unique set of lineage-specific chromosomes. Commun Biol 2020; 3:50. [PMID: 32005944 PMCID: PMC6994591 DOI: 10.1038/s42003-020-0770-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 03/19/2019] [Accepted: 01/10/2020] [Indexed: 12/12/2022] Open
Abstract
Fusarium oxysporum is a cross-kingdom fungal pathogen that infects plants and humans. Horizontally transferred lineage-specific (LS) chromosomes were reported to determine host-specific pathogenicity among phytopathogenic F. oxysporum. However, the existence and functional importance of LS chromosomes among human pathogenic isolates are unknown. Here we report four unique LS chromosomes in a human pathogenic strain NRRL 32931, isolated from a leukemia patient. These LS chromosomes were devoid of housekeeping genes, but were significantly enriched in genes encoding metal ion transporters and cation transporters. Homologs of NRRL 32931 LS genes, including a homolog of ceruloplasmin and the genes that contribute to the expansion of the alkaline pH-responsive transcription factor PacC/Rim1p, were also present in the genome of NRRL 47514, a strain associated with Fusarium keratitis outbreak. This study provides the first evidence, to our knowledge, for genomic compartmentalization in two human pathogenic fungal genomes and suggests an important role of LS chromosomes in niche adaptation. Zhang, Yang et al. compare a Fusarium oxysporum isolate obtained clinically to a phytopathogenic strain to examine transfer of lineage-specific chromosomes in determining host specificity. They find four unique lineage-specific chromosomes that seem to contribute to fungal adaptation to human hosts.
|
4
|
Muggli MD, Puglisi SJ, Boucher C. Kohdista: an efficient method to index and query possible Rmap alignments. Algorithms Mol Biol 2019; 14:25. [PMID: 31867049 PMCID: PMC6907254 DOI: 10.1186/s13015-019-0160-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/01/2018] [Accepted: 11/19/2019] [Indexed: 11/23/2022] Open
Abstract
Background Genome-wide optical maps are ordered high-resolution restriction maps that give the position of occurrence of restriction cut sites corresponding to one or more restriction enzymes. These genome-wide optical maps are assembled using an overlap-layout-consensus approach using raw optical map data, which are referred to as Rmaps. Due to the high error rate of Rmap data, finding the overlap between Rmaps remains challenging. Results We present Kohdista, which is an index-based algorithm for finding pairwise alignments between single molecule maps (Rmaps). The novelty of our approach is the formulation of the alignment problem as automaton path matching, and the application of modern index-based data structures. In particular, we combine the use of the Generalized Compressed Suffix Array (GCSA) index with the wavelet tree in order to build Kohdista. We validate Kohdista on simulated E. coli data, showing the approach successfully finds alignments between Rmaps simulated from overlapping genomic regions. Conclusion We demonstrate that Kohdista is the only method that is capable of finding a significant number of high-quality pairwise Rmap alignments for large eukaryote organisms in reasonable time.
|
5
|
Mukherjee K, Washimkar D, Muggli MD, Salmela L, Boucher C. Error correcting optical mapping data. Gigascience 2018; 7:5005021. [PMID: 29846578 PMCID: PMC6007263 DOI: 10.1093/gigascience/giy061] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/22/2017] [Accepted: 05/16/2018] [Indexed: 12/31/2022] Open
Abstract
Optical mapping is a unique system that is capable of producing high-resolution, high-throughput genomic map data that gives information about the structure of a genome. Recently it has been used for scaffolding contigs and for assembly validation for large-scale sequencing projects, including the maize, goat, and Amborella genomes. However, a major impediment in the use of this data is the variety and quantity of errors in the raw optical mapping data, which are called Rmaps. The challenges associated with using Rmap data are analogous to dealing with insertions and deletions in the alignment of long reads. Moreover, they are arguably harder to tackle since the data are numerical and susceptible to inaccuracy. We develop cOMet to error correct Rmap data, which to the best of our knowledge is the only optical mapping error correction method. Our experimental results demonstrate that cOMet has high precision and corrects 82.49% of insertion errors and 77.38% of deletion errors in Rmap data generated from the Escherichia coli K-12 reference genome. Out of the deletion errors corrected, 98.26% are true errors. Similarly, out of the insertion errors corrected, 82.19% are true errors. It also successfully scales to large genomes, improving the quality of 78% and 99% of the Rmaps in the plum and goat genomes, respectively. Last, we show the utility of error correction by demonstrating how it improves the assembly of Rmap data. Error corrected Rmap data results in an assembly that is more contiguous and covers a larger fraction of the genome.
Affiliation(s)
- Kingshuk Mukherjee
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville
- Darshan Washimkar
- Department of Computer Science, Colorado State University, Fort Collins
- Martin D Muggli
- Department of Computer Science, Colorado State University, Fort Collins
- Leena Salmela
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki
- Christina Boucher
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville
|
6
|
Yuan Y, Bayer PE, Batley J, Edwards D. Improvements in Genomic Technologies: Application to Crop Genomics. Trends Biotechnol 2017; 35:547-558. [DOI: 10.1016/j.tibtech.2017.02.009] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 11/16/2016] [Revised: 02/10/2017] [Accepted: 02/14/2017] [Indexed: 12/13/2022]
|
7
|
Maschmann A, Kounovsky-Shafer KL. Determination of restriction enzyme activity when cutting DNA labeled with the TOTO dye family. NUCLEOSIDES NUCLEOTIDES & NUCLEIC ACIDS 2017; 36:406-417. [PMID: 28362164 DOI: 10.1080/15257770.2017.1300665] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 10/19/2022]
Abstract
Optical mapping, a single DNA molecule genome analysis platform that can determine methylation profiles, uses fluorescently labeled DNA molecules that are elongated on the surface and digested with a restriction enzyme to produce a barcode of that molecule. Understanding how the cyanine fluorochromes affect enzyme activity can lead to other fluorochromes used in the optical mapping system. The effects of restriction digestion on fluorochrome labeled DNA (Ethidium Bromide, DAPI, H33258, EthD-1, TOTO-1) have been analyzed previously. However, TOTO-1 is a part of a family of cyanine fluorochromes (YOYO-1, TOTO-1, BOBO-1, POPO-1, YOYO-3, TOTO-3, BOBO-3, and POPO-3) and the rest of the fluorochromes have not been examined in terms of their effects on restriction digestion. In order to determine if the other dyes in the TOTO-1 family inhibit restriction enzymes in the same way as TOTO-1, lambda DNA was stained with a dye from the TOTO family and digested. The restriction enzyme activity in regards to each dye, as well as each restriction enzyme, was compared to determine the extent of digestion. YOYO-1, TOTO-1, and POPO-1 fluorochromes inhibited ScaI-HF, PmlI, and EcoRI restriction enzymes. Additionally, the mobility of labeled DNA fragments in an agarose gel changed depending on which dye was intercalated.
Affiliation(s)
- April Maschmann
- Department of Chemistry, University of Nebraska-Kearney, Kearney, NE, USA
|
8
|
Alonso G, Rastrojo A, López-Pérez S, Requena JM, Aguado B. Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome. Parasit Vectors 2016; 9:74. [PMID: 26857920 PMCID: PMC4746890 DOI: 10.1186/s13071-016-1329-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 10/05/2015] [Accepted: 01/20/2016] [Indexed: 01/22/2023] Open
Abstract
Background Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of Leishmania major, and subsequently the genome of other related species, was paramount for highlighting these peculiar molecular aspects. Recently, we carried out an analysis of gene expression by massive sequencing of RNA in the L. major promastigote, and data derived from that analysis were suggestive of possible errors in the current genome assembly for this Leishmania species. Results During the analysis by RNA-Seq of the transcriptome for L. major Friedlin strain, 163,714 reads could not be aligned with the reference genome. Thus, de novo assembly with these reads was carried out and the resulting contigs were further analyzed. After detailed homology searches using available databases, it was postulated that 15 contigs might correspond to genomic sequences lost during the initial genome assembly of the L. major Friedlin strain. This was experimentally confirmed by PCR amplification, cloning and sequencing of the new genomic regions. As a result, we have identified seven regions of the L. major (Friedlin) genome that were lost during the sequence assembly. This led to the uncovering of six new genes (LmjF.15.1475, LmjF.15.0285, LmjF.24.0765, LmjF.14.0860, LmjF.19.0305, and LmjF.27.2035), and correction of the annotation for two others (LmjF.15.1480 and LmjF.27.2030). Our data suggest that these genomic regions probably collapsed during the genome assembly due to the existence of gene duplications and/or repeated regions surrounding the missed genes. Conclusion RNA-seq data helped to reconstruct some genomic regions misassembled during the L. major Friedlin genome assembly, which is otherwise quite robust. On the other hand, this study shows that data derived from massive sequencing approaches, including RNA-Seq, should be carefully inspected to improve current genome definition and gene annotations.
Affiliation(s)
- Graciela Alonso
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain.
- Alberto Rastrojo
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain.
- Sara López-Pérez
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain.
- Jose M Requena
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain.
- Begoña Aguado
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain.
|
9
|
Mendelowitz LM, Schwartz DC, Pop M. Maligner: a fast ordered restriction map aligner. Bioinformatics 2015; 32:1016-22. [PMID: 26637292 DOI: 10.1093/bioinformatics/btv711] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 08/20/2015] [Accepted: 12/01/2015] [Indexed: 12/28/2022] Open
Abstract
MOTIVATION The Optical Mapping System discovers structural variants and potentiates sequence assembly of genomes via scaffolding and comparisons that globally validate or correct sequence assemblies. Despite its utility, there are few publicly available tools for aligning optical mapping datasets. RESULTS Here we present software, named 'Maligner', for the alignment of both single molecule restriction maps (Rmaps) and in silico restriction maps of sequence contigs to a reference. Maligner provides two modes of alignment: an efficient, sensitive dynamic programming implementation that scales to large eukaryotic genomes, and a faster index-based implementation for finding alignments with unmatched sites in the reference but not the query. We compare our software to other publicly available tools on Rmap datasets and show that Maligner finds more correct alignments in comparable runtime. Lastly, we introduce the M-Score statistic for normalizing alignment scores across restriction maps and demonstrate its utility for selecting high-quality alignments. AVAILABILITY AND IMPLEMENTATION The Maligner software is written in C++ and is available at https://github.com/LeeMendelowitz/maligner under the GNU General Public License. CONTACT mpop@umiacs.umd.edu.
Affiliation(s)
- Lee M Mendelowitz
- Center for Bioinformatics and Computational Biology, Applied Math & Statistics, and Scientific Computation
- David C Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, USA and the UW-Biotechnology Center, University of Wisconsin-Madison, WI 53706, USA
- Mihai Pop
- Center for Bioinformatics and Computational Biology, Applied Math & Statistics, and Scientific Computation, Department of Computer Science, University of Maryland, College Park, MD 20742, USA and
|
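Several of the entries in this list rely on in silico restriction maps, that is, the ordered fragment sizes obtained by cutting a known sequence at every occurrence of an enzyme's recognition site. The sketch below is a minimal illustration of that idea only; it is not code from Maligner or any other tool cited here. The function name `in_silico_restriction_map`, the `cut_offset` parameter, and the toy sequence are assumptions made for the example, and real optical maps additionally contain sizing errors and missing or extra cuts that this sketch ignores.

```python
def in_silico_restriction_map(sequence, site, cut_offset=0):
    """Return the ordered fragment lengths produced by cutting `sequence`
    at every occurrence of the recognition `site`.

    `cut_offset` (hypothetical parameter) is the position within the site
    where the cut falls, e.g. 1 for an enzyme that cuts G^GATCC.
    """
    seq = sequence.upper()
    site = site.upper()

    # Collect the cut coordinate for every (possibly overlapping) site match.
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)

    # Fragment lengths are the gaps between consecutive boundaries.
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Toy example: two GGATCC sites in a 19-base sequence, cut after the first G.
toy_sequence = "AAGGATCCTTTTGGATCCA"
fragments = in_silico_restriction_map(toy_sequence, "GGATCC", cut_offset=1)
print(fragments)
```

The resulting list of fragment lengths is the kind of numeric representation that an aligner would compare against single-molecule Rmaps; the fragment lengths always sum to the length of the input sequence.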
10
|
Muggli MD, Puglisi SJ, Ronen R, Boucher C. Misassembly detection using paired-end sequence reads and optical mapping data. Bioinformatics 2015; 31:i80-8. [PMID: 26072512 PMCID: PMC4542784 DOI: 10.1093/bioinformatics/btv262] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Indexed: 11/22/2022] Open
Abstract
Motivation: A crucial problem in genome assembly is the discovery and correction of misassembly errors in draft genomes. We develop a method called misSEQuel that enhances the quality of draft genomes by identifying misassembly errors and their breakpoints using paired-end sequence reads and optical mapping data. Our method also fulfills the critical need for open source computational methods for analyzing optical mapping data. We apply our method to various assemblies of the loblolly pine, Francisella tularensis, rice and budgerigar genomes. We generated and used simulated optical mapping data for loblolly pine and F. tularensis and used real optical mapping data for rice and budgerigar. Results: Our results demonstrate that we detect more than 54% of extensively misassembled contigs and more than 60% of locally misassembled contigs in assemblies of F. tularensis and between 31% and 100% of extensively misassembled contigs and between 57% and 73% of locally misassembled contigs in assemblies of loblolly pine. Using the real optical mapping data, we correctly identified 75% of extensively misassembled contigs and 100% of locally misassembled contigs in rice, and 77% of extensively misassembled contigs and 80% of locally misassembled contigs in budgerigar. Availability and implementation: misSEQuel can be used as a post-processing step in combination with any genome assembler and is freely available at http://www.cs.colostate.edu/seq/. Contact: muggli@cs.colostate.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin D Muggli
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Simon J Puglisi
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Roy Ronen
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Christina Boucher
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
|
11
|
Zhou S, Goldstein S, Place M, Bechner M, Patino D, Potamousis K, Ravindran P, Pape L, Rincon G, Hernandez-Ortiz J, Medrano JF, Schwartz DC. A clone-free, single molecule map of the domestic cow (Bos taurus) genome. BMC Genomics 2015; 16:644. [PMID: 26314885 PMCID: PMC4551733 DOI: 10.1186/s12864-015-1823-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 01/17/2015] [Accepted: 08/07/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource--BtOM1.0--to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. RESULTS The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost double the number of discordances than UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts). CONCLUSION Alignments of UMD3.1 and Btau4.6 to BtOM1.0 reveal discordances commensurate with previous reports, and affirm the NCBI's current designation of UMD3.1 sequence assembly as the "reference assembly" and the Btau4.6 as the "alternate assembly." The cattle genome optical map, BtOM1.0, when used as a comprehensive and largely independent guide, will greatly assist improvements to existing sequence builds, and later serve as an accurate physical scaffold for studies concerning the comparative genomics of cattle breeds.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
- Steve Goldstein
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
- Michael Place
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
- Michael Bechner
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
- Diego Patino
- Departamento de Materiales, Facultad de Minas, Universidad Nacional de Colombia, Sede Medellin, Calle 75 # 79A-51, Bloque M17, Medellin, Colombia, SA.
- Konstantinos Potamousis
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
- Prabu Ravindran
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
- Louise Pape
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
- Gonzalo Rincon
- Department of Animal Science, University of California-Davis, Davis, CA, 95616, USA.
- Juan Hernandez-Ortiz
- Departamento de Materiales, Facultad de Minas, Universidad Nacional de Colombia, Sede Medellin, Calle 75 # 79A-51, Bloque M17, Medellin, Colombia, SA.
- Juan F Medrano
- Department of Animal Science, University of California-Davis, Davis, CA, 95616, USA.
- David C Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
|
12
|
Mendelowitz L, Pop M. Computational methods for optical mapping. Gigascience 2014; 3:33. [PMID: 25671093 PMCID: PMC4323141 DOI: 10.1186/2047-217x-3-33] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 09/22/2014] [Accepted: 12/02/2014] [Indexed: 11/10/2022] Open
Abstract
Optical mapping and newer genome mapping technologies based on nicking enzymes provide low resolution but long-range genomic information. The optical mapping technique has been successfully used for assessing the quality of genome assemblies and for detecting large-scale structural variants and rearrangements that cannot be detected using current paired end sequencing protocols. Here, we review several algorithms and methods for building consensus optical maps and aligning restriction patterns to a reference map, as well as methods for using optical maps with sequence assemblies.
Affiliation(s)
- Lee Mendelowitz
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD USA; Applied Math & Statistics, and Scientific Computation, University of Maryland, College Park, MD USA
- Mihai Pop
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD USA; Department of Computer Science, University of Maryland, College Park, MD USA
|
13
|
Fluorescence in situ hybridization and optical mapping to correct scaffold arrangement in the tomato genome. G3-GENES GENOMES GENETICS 2014; 4:1395-405. [PMID: 24879607 PMCID: PMC4132171 DOI: 10.1534/g3.114.011197] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 12/14/2022]
Abstract
The order and orientation (arrangement) of all 91 sequenced scaffolds in the 12 pseudomolecules of the recently published tomato (Solanum lycopersicum, 2n = 2x = 24) genome sequence were positioned based on marker order in a high-density linkage map. Here, we report the arrangement of these scaffolds determined by two independent physical methods, bacterial artificial chromosome–fluorescence in situ hybridization (BAC-FISH) and optical mapping. By localizing BACs at the ends of scaffolds to spreads of tomato synaptonemal complexes (pachytene chromosomes), we showed that 45 scaffolds, representing one-third of the tomato genome, were arranged differently than predicted by the linkage map. These scaffolds occur mostly in pericentric heterochromatin where 77% of the tomato genome is located and where linkage mapping is less accurate due to reduced crossing over. Although useful for only part of the genome, optical mapping results were in complete agreement with scaffold arrangement by FISH but often disagreed with scaffold arrangement based on the linkage map. The scaffold arrangement based on FISH and optical mapping changes the positions of hundreds of markers in the linkage map, especially in heterochromatin. These results suggest that similar errors exist in pseudomolecules from other large genomes that have been assembled using only linkage maps to predict scaffold arrangement, and these errors can be corrected using FISH and/or optical mapping. Of note, BAC-FISH also permits estimates of the sizes of gaps between scaffolds, and unanchored BACs are often visualized by FISH in gaps between scaffolds and thus represent starting points for filling these gaps.
|
14
|
|
15
|
Whole genome mapping and re-organization of the nuclear and mitochondrial genomes of Babesia microti isolates. PLoS One 2013; 8:e72657. [PMID: 24023759 PMCID: PMC3762879 DOI: 10.1371/journal.pone.0072657] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 05/01/2013] [Accepted: 07/12/2013] [Indexed: 11/19/2022] Open
Abstract
Babesia microti is the primary causative agent of human babesiosis, an emerging pathogen that causes a malaria-like illness with possible fatal outcome in immunocompromised patients. The genome sequence of the B. microti R1 strain was reported in 2012 and revealed a distinct evolutionary path for this pathogen relative to that of other apicomplexa. Lacking from the first genome assembly and initial molecular analyses was information about the terminal ends of each chromosome, and both the exact number of chromosomes in the nuclear genome and the organization of the mitochondrial genome remained ambiguous. We have now performed various molecular analyses to characterize the nuclear and mitochondrial genomes of the B. microti R1 and Gray strains and generated high-resolution Whole Genome maps. These analyses show that the genome of B. microti consists of four nuclear chromosomes and a linear mitochondrial genome present in four different structural types. Furthermore, Whole Genome mapping allowed resolution of the chromosomal ends, identification of areas of misassembly in the R1 genome, and genomic differences between the R1 and Gray strains, which occur primarily in the telomeric regions. These studies set the stage for a better understanding of the evolution and diversity of this important human pathogen.
|
16
|
Mazurie AJ, Alves JM, Ozaki LS, Zhou S, Schwartz DC, Buck GA. Comparative genomics of cryptosporidium. Int J Genomics 2013; 2013:832756. [PMID: 23738321 PMCID: PMC3659464 DOI: 10.1155/2013/832756] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 01/24/2013] [Accepted: 04/10/2013] [Indexed: 11/18/2022]
Abstract
Until recently, the apicomplexan parasites, Cryptosporidium hominis and C. parvum, were considered the same species. However, the two parasites, now considered distinct species, exhibit significant differences in host range, infectivity, and pathogenicity, and their sequenced genomes exhibit only 95-97% identity. The availability of the complete genome sequences of these organisms provides the potential to identify the genetic variations that are responsible for the phenotypic differences between the two parasites. We compared the genome organization and structure, gene composition, the metabolic and other pathways, and the local sequence identity between the genes of these two Cryptosporidium species. Our observations show that the phenotypic differences between C. hominis and C. parvum are not due to gross genome rearrangements, structural alterations, gene deletions or insertions, metabolic capabilities, or other obvious genomic alterations. Rather, the results indicate that these genomes exhibit a remarkable structural and compositional conservation and suggest that the phenotypic differences observed are due to subtle variations in the sequences of proteins that act at the interface between the parasite and its host.
Affiliation(s)
- Aurélien J. Mazurie
- Department of Microbiology, Montana State University, Bozeman, MT 59717, USA
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
- João M. Alves
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
- Luiz S. Ozaki
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- David C. Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Gregory A. Buck
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
|
17
|
Dorfman KD, King SB, Olson DW, Thomas JDP, Tree DR. Beyond gel electrophoresis: microfluidic separations, fluorescence burst analysis, and DNA stretching. Chem Rev 2013; 113:2584-667. [PMID: 23140825 PMCID: PMC3595390 DOI: 10.1021/cr3002142] [Citation(s) in RCA: 149] [Impact Index Per Article: 13.5] [Indexed: 12/18/2022]
Affiliation(s)
- Kevin D. Dorfman
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
- Scott B. King
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
- Daniel W. Olson
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
- Joel D. P. Thomas
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
- Douglas R. Tree
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
|
18
|
AGORA: Assembly Guided by Optical Restriction Alignment. BMC Bioinformatics 2012; 13:189. [PMID: 22856673 PMCID: PMC3431216 DOI: 10.1186/1471-2105-13-189] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 03/28/2012] [Accepted: 06/28/2012] [Indexed: 11/10/2022]
Abstract
Background: Genome assembly is difficult due to repeated sequences within the genome, which create ambiguities and cause the final assembly to be broken up into many separate sequences (contigs). Long range linking information, such as mate-pairs or mapping data, is necessary to help assembly software resolve repeats, thereby leading to a more complete reconstruction of genomes. Prior work has used optical maps for validating assemblies and scaffolding contigs, after an initial assembly has been produced. However, optical maps have not previously been used within the genome assembly process. Here, we use optical map information within the popular de Bruijn graph assembly paradigm to eliminate paths in the de Bruijn graph which are not consistent with the optical map and help determine the correct reconstruction of the genome.
Results: We developed a new algorithm called AGORA: Assembly Guided by Optical Restriction Alignment. AGORA is the first algorithm to use optical map information directly within the de Bruijn graph framework to help produce an accurate assembly of a genome that is consistent with the optical map information provided. Our simulations on bacterial genomes show that AGORA is effective at producing assemblies closely matching the reference sequences. Additionally, we show that noise in the optical map can have a strong impact on the final assembly quality for some complex genomes, and we also measure how various characteristics of the starting de Bruijn graph may impact the quality of the final assembly. Lastly, we show that a proper choice of restriction enzyme for the optical map may substantially improve the quality of the final assembly.
Conclusions: Our work shows that optical maps can be used effectively to assemble genomes within the de Bruijn graph assembly framework. Our experiments also provide insights into the characteristics of the mapping data that most affect the performance of our algorithm, indicating the potential benefit of more accurate optical mapping technologies, such as nano-coding.
|
19
|
Xing MN, Zhang XZ, Huang H. Application of metagenomic techniques in mining enzymes from microbial communities for biofuel synthesis. Biotechnol Adv 2012; 30:920-9. [DOI: 10.1016/j.biotechadv.2012.01.021] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 10/14/2022]
|
20
|
Bannantine JP, Wu CW, Hsu C, Zhou S, Schwartz DC, Bayles DO, Paustian ML, Alt DP, Sreevatsan S, Kapur V, Talaat AM. Genome sequencing of ovine isolates of Mycobacterium avium subspecies paratuberculosis offers insights into host association. BMC Genomics 2012; 13:89. [PMID: 22409516 PMCID: PMC3337245 DOI: 10.1186/1471-2164-13-89] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 10/07/2011] [Accepted: 03/12/2012] [Indexed: 01/09/2023]
Abstract
Background: The genome of Mycobacterium avium subspecies paratuberculosis (MAP) is remarkably homogeneous among the genomes of bovine, human and wildlife isolates. However, previous work in our laboratories with the bovine K-10 strain has revealed substantial differences compared to sheep isolates. To systematically characterize all genomic differences that may be associated with the specific hosts, we sequenced the genomes of three U.S. sheep isolates and also obtained an optical map.
Results: Our analysis of one of the isolates, MAP S397, revealed a genome 4.8 Mb in size with 4,700 open reading frames (ORFs). Comparative analysis of the MAP S397 isolate showed it acquired approximately 10 large sequence regions that are shared with the human M. avium subsp. hominissuis strain 104 and lost 2 large regions that are present in the bovine strain. In addition, optical mapping defined the presence of 7 large inversions between the bovine and ovine genomes (~2.36 Mb). Whole-genome sequencing of 2 additional sheep strains of MAP (JTC1074 and JTC7565) further confirmed genomic homogeneity of the sheep isolates despite the presence of polymorphisms on the nucleotide level.
Conclusions: Comparative sequence analysis employed here provided a better understanding of the host association, evolution of members of the M. avium complex and could help in deciphering the phenotypic differences observed among sheep and cattle strains of MAP. A similar approach based on whole-genome sequencing combined with optical mapping could be employed to examine closely related pathogens. We propose an evolutionary scenario for M. avium complex strains based on these genome sequences.
Affiliation(s)
- John P Bannantine
- National Animal Disease Center, USDA-Agricultural Research Service, Ames, Iowa, USA.
|
21
|
Teixeira SM, de Paiva RMC, Kangussu-Marcolino MM, Darocha WD. Trypanosomatid comparative genomics: Contributions to the study of parasite biology and different parasitic diseases. Genet Mol Biol 2012; 35:1-17. [PMID: 22481868 PMCID: PMC3313497 DOI: 10.1590/s1415-47572012005000008] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 08/08/2011] [Accepted: 10/18/2011] [Indexed: 01/23/2023]
Abstract
In 2005, draft sequences of the genomes of Trypanosoma brucei, Trypanosoma cruzi and Leishmania major, also known as the Tri-Tryp genomes, were published. These protozoan parasites are the causative agents of three distinct insect-borne diseases, namely sleeping sickness, Chagas disease and leishmaniasis, all with a worldwide distribution. Despite the large estimated evolutionary distance among them, a conserved core of ~6,200 trypanosomatid genes was found among the Tri-Tryp genomes. Extensive analysis of these genomic sequences has greatly increased our understanding of the biology of these parasites and their host-parasite interactions. In this article, we review the recent advances in the comparative genomics of these three species. This analysis also includes data on additional sequences derived from other trypanosomatid species, as well as recent data on gene expression and functional genomics. In addition to facilitating the identification of key parasite molecules that may provide a better understanding of these complex diseases, genome studies offer a rich source of new information that can be used to define potential new drug targets and vaccine candidates for controlling these parasitic infections.
Affiliation(s)
- Santuza M Teixeira
- Departamento de Bioquímica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
|
22
|
Riley MC, Kirkup BC, Johnson JD, Lesho EP, Ockenhouse CF. Rapid whole genome optical mapping of Plasmodium falciparum. Malar J 2011; 10:252. [PMID: 21871093 PMCID: PMC3173401 DOI: 10.1186/1475-2875-10-252] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 06/01/2011] [Accepted: 08/26/2011] [Indexed: 11/21/2022]
Abstract
Background: Immune evasion and drug resistance in malaria have been linked to chromosomal recombination and gene copy number variation (CNV). These events are ideally studied using comparative genomic analyses; however, in malaria these analyses are not as common or thorough as in other infectious diseases, partly due to the difficulty in sequencing and assembling complete genome drafts. Recently, whole genome optical mapping has gained wide use in support of genomic sequence assembly and comparison. Here, a rapid technique for producing whole genome optical maps of Plasmodium falciparum is described and the results of mapping four genomes are presented.
Methods: Four laboratory strains of P. falciparum were analysed using the Argus™ optical mapping system to produce ordered restriction fragment maps of all 14 chromosomes in each genome. Plasmodium falciparum DNA was isolated directly from blood culture, visualized using the Argus™ system and assembled in a manner analogous to next generation sequence assembly into maps (AssemblyViewer™, OpGen Inc.®). Full coverage maps were generated for P. falciparum strains 3D7, FVO, D6 and C235. A reference P. falciparum in silico map was created by the digestion of the genomic sequence of P. falciparum with the restriction enzyme AflII, for comparisons to genomic optical maps. Maps were then compared using the MapSolver™ software.
Results: Genomic variation was observed among the mapped strains, as well as between the map of the reference strain and the map derived from the putative sequence of that same strain. Duplications, deletions, insertions, inversions and misassemblies of sizes ranging from 3,500 base pairs up to 78,000 base pairs were observed. Many genomic events occurred in areas of known repetitive sequence or high copy number genes, including var gene clusters and rifin complexes.
Conclusions: This technique for optical mapping of multiple malaria genomes allows for whole genome comparison of multiple strains and can assist in identifying genetic variation and sequence contig assembly. New protocols and technology allowed us to produce high quality contigs spanning four P. falciparum genomes in six weeks for less than $1,000.00 per genome. This relatively low cost and quick turnaround makes the technique valuable compared to other genomic sequencing technologies for studying genetic variation in malaria.
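As an illustration of the in silico digestion step mentioned in the Methods above, the following is a minimal sketch, not the authors' pipeline or the MapSolver software. It assumes a linear sequence held in memory as a string and assumes the commonly cited AflII recognition site CTTAAG, cut after the first base; the function name `in_silico_digest` and its parameters are hypothetical, chosen only for this example.

```python
def in_silico_digest(sequence, site="CTTAAG", cut_offset=1):
    """Return the ordered fragment lengths obtained by cutting a linear
    sequence at every occurrence of a recognition site.

    Assumptions (not taken from the cited paper): the site is CTTAAG with
    the cut one base into the site; the site is palindromic, so scanning
    a single strand finds every occurrence.
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)      # cut position on this strand
        start = seq.find(site, start + 1)    # continue the scan
    bounds = [0] + cuts + [len(seq)]
    # Differences between consecutive boundaries are the fragment sizes,
    # i.e. the ordered restriction map that an optical map is compared to.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Toy sequence with two sites: 4 bases, site, 4 bases, site, 2 bases.
toy = "AAAA" + "CTTAAG" + "GGGG" + "CTTAAG" + "TT"
print(in_silico_digest(toy))  # [5, 10, 7]
```

The ordered list of fragment sizes is the quantity that would be aligned against the measured optical map; real chromosomes would be read from a FASTA file and digested one record at a time.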
Affiliation(s)
- Matthew C Riley
- Walter Reed Army Institute of Research, Division of Malaria Vaccine Development, Silver Spring, Maryland, USA.
|
23
|
Neely RK, Deen J, Hofkens J. Optical mapping of DNA: Single-molecule-based methods for mapping genomes. Biopolymers 2011; 95:298-311. [DOI: 10.1002/bip.21579] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Received: 10/25/2010] [Revised: 12/15/2010] [Accepted: 12/15/2010] [Indexed: 11/09/2022]
|
24
|
Giongo A, Tyler HL, Zipperer UN, Triplett EW. Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission. Stand Genomic Sci 2010; 2:309-17. [PMID: 21304715 PMCID: PMC3035290 DOI: 10.4056/sigs.972221] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/20/2022]
Abstract
Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures.
Affiliation(s)
- Adriana Giongo
- Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida, PO Box 110700, Gainesville, FL 32611-0700 USA
|
25
|
Neely RK, Dedecker P, Hotta JI, Urbanavičiūtė G, Klimašauskas S, Hofkens J. DNA fluorocode: A single molecule, optical map of DNA with nanometre resolution. Chem Sci 2010. [DOI: 10.1039/c0sc00277a] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.4] [Indexed: 11/21/2022]
|
26
|
Zhou S, Wei F, Nguyen J, Bechner M, Potamousis K, Goldstein S, Pape L, Mehan MR, Churas C, Pasternak S, Forrest DK, Wise R, Ware D, Wing RA, Waterman MS, Livny M, Schwartz DC. A single molecule scaffold for the maize genome. PLoS Genet 2009; 5:e1000711. [PMID: 19936062 PMCID: PMC2774507 DOI: 10.1371/journal.pgen.1000711] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.7] [Received: 07/01/2009] [Accepted: 10/05/2009] [Indexed: 11/18/2022]
Abstract
About 85% of the maize genome consists of highly repetitive sequences that are interspersed by low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by the construction of an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging 1 site/∼23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb/∼2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60/66 (2,032.42 Mb) optical map contigs onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts that would make it a valuable reference for comparative studies among cereals, or other maize inbred lines and cultivars.

The maize genome contains abundant repeats interspersed by low-copy, gene-coding sequences that make it a challenge to sequence; consequently, current BAC sequence assemblies average 11 contigs per clone. The iMap deals with such complexity by the judicious integration of IBM genetic and B73 physical maps, but the B73 genome structure could differ from the IBM population because of genetic recombination and subsequent rearrangements. Accordingly, we report a genome-wide, high-resolution optical map of maize B73 genome that was constructed from the direct analysis of genomic DNA molecules without using genetic markers. The integration of optical and iMap resources with comparisons to FPC maps enabled a uniquely comprehensive and scalable assessment of a given BAC's sequence assembly, its placement within a FPC contig, and the location of this FPC contig within a chromosome-wide pseudomolecule. As such, the overall utility of the maize optical map for the validation of sequence assemblies has been significant and demonstrates the inherent advantages of single molecule platforms. Construction of the maize optical map represents the first physical map of a eukaryotic genome larger than 400 Mb that was created de novo from individual genomic DNA molecules.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Fusheng Wei
- Department of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- John Nguyen
- Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
- Mike Bechner
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Konstantinos Potamousis
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Steve Goldstein
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Louise Pape
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Michael R. Mehan
- Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
- Chris Churas
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Shiran Pasternak
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Dan K. Forrest
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Roger Wise
- Corn Insects and Crop Genetics Research, United States Department of Agriculture–Agricultural Research Service and Department of Plant Pathology, Iowa State University, Ames, Iowa, United States of America
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Plant, Soil, and Nutrition Research, United States Department of Agriculture–Agricultural Research Service, Ithaca, New York, United States of America
- Rod A. Wing
- Department of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Michael S. Waterman
- Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
- Miron Livny
- Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- David C. Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
|
27
|
Ananiev GE, Goldstein S, Runnheim R, Forrest DK, Zhou S, Potamousis K, Churas CP, Bergendahl V, Thomson JA, Schwartz DC. Optical mapping discerns genome wide DNA methylation profiles. BMC Mol Biol 2008; 9:68. [PMID: 18667073 PMCID: PMC2516518 DOI: 10.1186/1471-2199-9-68] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 03/26/2008] [Accepted: 07/30/2008] [Indexed: 11/23/2022]
Abstract
Background: Methylation of CpG dinucleotides is a fundamental mechanism of epigenetic regulation in eukaryotic genomes. Development of methods for rapid genome wide methylation profiling will greatly facilitate both hypothesis and discovery driven research in the field of epigenetics. In this regard, a single molecule approach to methylation profiling offers several unique advantages that include elimination of chemical DNA modification steps and PCR amplification.
Results: A single molecule approach is presented for the discernment of methylation profiles, based on optical mapping. We report results from a series of pilot studies demonstrating the capabilities of optical mapping as a platform for methylation profiling of whole genomes. Optical mapping was used to discern the methylation profile from both an engineered and wild type Escherichia coli. Furthermore, the methylation status of selected loci within the genome of human embryonic stem cells was profiled using optical mapping.
Conclusion: The optical mapping platform effectively detects DNA methylation patterns. Due to single molecule detection, optical mapping offers significant advantages over other technologies. This advantage stems from obviation of DNA modification steps, such as bisulfite treatment, and the ability of the platform to assay repeat dense regions within mammalian genomes inaccessible to techniques using array-hybridization technologies.
Affiliation(s)
- Gene E Ananiev
- Department of Chemistry, Laboratory for Molecular and Computational Genomics, University of Wisconsin Biotechnology Center, University of Wisconsin-Madison, Madison, WI 53706, USA.
|
28
|
Nagarajan N, Read TD, Pop M. Scaffolding and validation of bacterial genome assemblies using optical restriction maps. Bioinformatics 2008; 24:1229-35. [PMID: 18356192 PMCID: PMC2373919 DOI: 10.1093/bioinformatics/btn102] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Received: 12/18/2007] [Revised: 03/05/2008] [Accepted: 03/16/2008] [Indexed: 11/14/2022]
Abstract
MOTIVATION: New, high-throughput sequencing technologies have made it feasible to cheaply generate vast amounts of sequence information from a genome of interest. The computational reconstruction of the complete sequence of a genome is complicated by specific features of these new sequencing technologies, such as the short length of the sequencing reads and absence of mate-pair information. In this article we propose methods to overcome such limitations by incorporating information from optical restriction maps.
RESULTS: We demonstrate the robustness of our methods to sequencing and assembly errors using extensive experiments on simulated datasets. We then present the results obtained by applying our algorithms to data generated from two bacterial genomes, Yersinia aldovae and Yersinia kristensenii. The resulting assemblies contain a single scaffold covering a large fraction of the respective genomes, suggesting that the careful use of optical maps can provide a cost-effective framework for the assembly of genomes.
AVAILABILITY: The tools described here are available as an open-source package at ftp://ftp.cbcb.umd.edu/pub/software/soma
|
29
|
Zhou S, Bechner MC, Place M, Churas CP, Pape L, Leong SA, Runnheim R, Forrest DK, Goldstein S, Livny M, Schwartz DC. Validation of rice genome sequence by optical mapping. BMC Genomics 2007; 8:278. [PMID: 17697381 PMCID: PMC2048515 DOI: 10.1186/1471-2164-8-278] [Citation(s) in RCA: 103] [Impact Index Per Article: 6.1] [Received: 05/02/2007] [Accepted: 08/15/2007] [Indexed: 11/30/2022]
Abstract
Background: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data.
Results: To facilitate ongoing sequencing finishing and validation efforts, we have constructed a whole-genome SwaI optical restriction map of the rice genome. The physical map consists of 14 contigs, covering 12 chromosomes, with a total genome size of 382.17 Mb; this value is about 11% smaller than original estimates. 9 of the 14 optical map contigs are without gaps, covering chromosomes 1, 2, 3, 4, 5, 7, 8, 10, and 12 in their entirety – including centromeres and telomeres. Alignments between optical and in silico restriction maps constructed from IRGSP (International Rice Genome Sequencing Project) and TIGR (The Institute for Genomic Research) genome sequence sources are comprehensive and informative, evidenced by map coverage across virtually all published gaps, discovery of new ones, and characterization of sequence misassemblies; all totalling ~14 Mb. Furthermore, since optical maps are ordered restriction maps, identified discordances are pinpointed on a reliable physical scaffold providing an independent resource for closure of gaps and rectification of misassemblies.
Conclusion: Analysis of sequence and optical mapping data effectively validates genome sequence assemblies constructed from large, repeat-rich genomes. Given this conclusion we envision new applications of such single molecule analysis that will merge advantages offered by high-resolution optical maps with inexpensive, but short sequence reads generated by emerging sequencing platforms. Lastly, the map construction techniques presented here point the way to new types of comparative genome analysis that would focus on discernment of structural differences revealed by optical maps constructed from a broad range of rice subspecies and varieties.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Michael C Bechner
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Michael Place
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Chris P Churas
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Louise Pape
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Sally A Leong
- USDA-ARS, CCRU, Department of Plant Pathology, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Rod Runnheim
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Dan K Forrest
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Steve Goldstein
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Miron Livny
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- David C Schwartz
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
30
|
Xiao M, Phong A, Ha C, Chan TF, Cai D, Leung L, Wan E, Kistler AL, DeRisi JL, Selvin PR, Kwok PY. Rapid DNA mapping by fluorescent single molecule detection. Nucleic Acids Res 2006; 35:e16. [PMID: 17175538 PMCID: PMC1807959 DOI: 10.1093/nar/gkl1044] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.8] [Indexed: 11/22/2022] Open
Abstract
DNA mapping is an important analytical tool in genomic sequencing, medical diagnostics and pathogen identification. Here we report an optical DNA mapping strategy based on direct imaging of individual DNA molecules and localization of multiple sequence motifs on the molecules. Individual genomic DNA molecules were labeled with fluorescent dyes at specific sequence motifs by the action of nicking endonuclease followed by the incorporation of dye terminators with DNA polymerase. The labeled DNA molecules were then stretched into linear form on a modified glass surface and imaged using total internal reflection fluorescence (TIRF) microscopy. By determining the positions of the fluorescent labels with respect to the DNA backbone, the distribution of the sequence motif recognized by the nicking endonuclease can be established with good accuracy, in a manner similar to reading a barcode. With this approach, we constructed a specific sequence motif map of lambda-DNA. We further demonstrated the capability of this approach to rapidly type a human adenovirus and several strains of human rhinovirus.
Affiliation(s)
- Ming Xiao
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- To whom correspondence should be addressed at: 513, Parnassus Avenue, HSW-901A, San Francisco, CA 94143, USA. Tel: +1 415 514 3876; Fax: +1 415 476 2956
- Angie Phong
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Connie Ha
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Ting-Fung Chan
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Dongmei Cai
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Lucinda Leung
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Eunice Wan
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Amy L. Kistler
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94115, USA
- Joseph L. DeRisi
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94115, USA
- Paul R. Selvin
- Department of Physics and Center of Biophysics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Pui-Yan Kwok
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Department of Dermatology, University of California, San Francisco, CA 94115, USA
|
31
|
Wu T, Schwartz DC. Transchip: single-molecule detection of transcriptional elongation complexes. Anal Biochem 2006; 361:31-46. [PMID: 17187751 PMCID: PMC1945215 DOI: 10.1016/j.ab.2006.10.042] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 06/30/2006] [Revised: 10/30/2006] [Accepted: 10/30/2006] [Indexed: 11/24/2022]
Abstract
A new single-molecule system, Transchip, was developed for analysis of transcription products at their genomic origins. The bacteriophage T7 RNA polymerase and its promoters were used in a model system, and resultant RNAs were imaged and detected at their positions along single template DNA molecules. The Transchip system has drawn from critical aspects of Optical Mapping, a single-molecule system that enables the construction of high-resolution ordered restriction maps of whole genomes from single DNA molecules. Through statistical analysis of hundreds of single-molecule template/transcript complexes, Transchip enables analysis of the locations and strength of promoters, the direction and processivity of transcription reactions, and the termination of transcription. These novel results suggest that the new system may serve as a high-throughput platform to investigate transcriptional events on a large genome-wide scale.
Affiliation(s)
- Tian Wu
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
|
32
|
Chen Q, Savarino SJ, Venkatesan MM. Subtractive hybridization and optical mapping of the enterotoxigenic Escherichia coli H10407 chromosome: isolation of unique sequences and demonstration of significant similarity to the chromosome of E. coli K-12. MICROBIOLOGY-SGM 2006; 152:1041-1054. [PMID: 16549668 DOI: 10.1099/mic.0.28648-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
Enterotoxigenic Escherichia coli (ETEC) is a primary cause of diarrhoea in infants in developing countries and in travellers to endemic regions. While several virulence genes have been identified on ETEC plasmids, little is known about the ETEC chromosome, although it is expected to share significant homology in backbone sequences with E. coli K-12. In the absence of genomic sequence information, the subtractive hybridization method and the more recently described optical mapping technique were carried out to determine the degree of genomic variation between virulent ETEC strain H10407 and the non-pathogenic E. coli K-12 strain MG1655. In one round of PCR-based suppression subtractive hybridization, 153 fragments representing sequences unique to strain H10407 were identified. BLAST searches indicated that few unique sequences showed homology to known pathogenicity island genes identified in related E. coli pathogens. A total of 65 fragments contained sequences that were either linked to hypothetical proteins or showed no homology to any known sequence in the database. The remaining sequences were either phage or prophage related or displayed homology to classifiable genes that function in various aspects of bacterial metabolism. The 153 unique sequences showed variable distribution across different ETEC strains including ETEC strain B7A, which is attenuated in virulence and lacked several H10407-specific sequences. Restriction-enzyme-based optical maps of strain H10407 were compared to in silico restriction maps of strain MG1655 and related E. coli pathogens. The 5.1 Mb ETEC chromosome was approximately 500 kb greater in length than the chromosome of E. coli K-12, collinear with it and indicated several discrete regions where insertions and/or deletions had occurred relative to the chromosome of strain MG1655. No major inversions, transpositions or gross rearrangements were observed on the ETEC chromosome. Based on comparisons with known genomic sequences and related optical-map-based restriction site similarity, the sequence of the H10407 chromosome is expected to demonstrate approximately 96% identity with that of E. coli K-12.
Affiliation(s)
- Qing Chen
- Department of Enteric Infections, Division of Communicable Diseases and Immunology, Walter Reed Army Institute of Research, Silver Spring, MD, USA
- Stephen J Savarino
- Enteric Diseases Department, Naval Medical Research Center, Silver Spring, MD, USA
- Malabi M Venkatesan
- Department of Enteric Infections, Division of Communicable Diseases and Immunology, Walter Reed Army Institute of Research, Silver Spring, MD, USA
|
33
|
Abstract
The present review considered: (a) the factors that conditioned the early transition from non-life to life; (b) genome structure and complexity in prokaryotes, eukaryotes, and organelles; (c) comparative human chromosome genomics; and (d) the Brazilian contribution to some of these studies. Understanding the dialectical conflict between freedom and organization is fundamental to give meaning to the patterns and processes of organic evolution.
Affiliation(s)
- Francisco M Salzano
- Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Caixa Postal 15053, 91501-970 Porto Alegre, RS, Brazil.
|
34
|
Galagan JE, Henn MR, Ma LJ, Cuomo CA, Birren B. Genomics of the fungal kingdom: Insights into eukaryotic biology. Genome Res 2005; 15:1620-31. [PMID: 16339359 DOI: 10.1101/gr.3767105] [Citation(s) in RCA: 233] [Impact Index Per Article: 12.3] [Indexed: 11/24/2022]
Abstract
The last decade has witnessed a revolution in the genomics of the fungal kingdom. Since the sequencing of the first fungus in 1996, the number of available fungal genome sequences has increased by an order of magnitude. Over 40 complete fungal genomes have been publicly released with an equal number currently being sequenced, representing the widest sampling of genomes from any eukaryotic kingdom. Moreover, many of these sequenced species form clusters of related organisms designed to enable comparative studies. These data provide an unparalleled opportunity to study the biology and evolution of this medically, industrially, and environmentally important kingdom. In addition, fungi also serve as model organisms for all eukaryotes. The available fungal genomic resource, coupled with the experimental tractability of the fungi, is accelerating research into the fundamental aspects of eukaryotic biology. We provide here an overview of available fungal genomes and highlight some of the biological insights that have been derived through their analysis. We also discuss insights into the fundamental cellular biology shared between fungi and other eukaryotic organisms.
Affiliation(s)
- James E Galagan
- The Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts 02141, USA.
|
35
|
Coppel RL, Black CG. Parasite genomes. Int J Parasitol 2005; 35:465-79. [PMID: 15826640 DOI: 10.1016/j.ijpara.2005.01.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 01/24/2005] [Revised: 02/24/2005] [Accepted: 02/24/2005] [Indexed: 01/01/2023]
Abstract
The availability of genome sequences and the associated transcriptome and proteome mapping projects has revolutionised research in the field of parasitology. As more parasite species are sequenced, comparative and phylogenetic comparisons are improving the quality of gene prediction and annotation. Genome sequences of parasites are also providing important data sets for understanding parasite biology and identifying new vaccine candidates and drug targets. We review some of the preliminary conclusions from examination of parasite genome sequences and discuss some of the bioinformatics approaches taken in this analysis.
Affiliation(s)
- Ross L Coppel
- Department of Microbiology and the Victorian Bioinformatics Consortium, Monash University, Melbourne, Vic. 3800, Australia.
|
36
|
Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G, Berriman M, Sisk E, Rajandream MA, Adlem E, Aert R, Anupama A, Apostolou Z, Attipoe P, Bason N, Bauser C, Beck A, Beverley SM, Bianchettin G, Borzym K, Bothe G, Bruschi CV, Collins M, Cadag E, Ciarloni L, Clayton C, Coulson RMR, Cronin A, Cruz AK, Davies RM, De Gaudenzi J, Dobson DE, Duesterhoeft A, Fazelina G, Fosker N, Frasch AC, Fraser A, Fuchs M, Gabel C, Goble A, Goffeau A, Harris D, Hertz-Fowler C, Hilbert H, Horn D, Huang Y, Klages S, Knights A, Kube M, Larke N, Litvin L, Lord A, Louie T, Marra M, Masuy D, Matthews K, Michaeli S, Mottram JC, Müller-Auer S, Munden H, Nelson S, Norbertczak H, Oliver K, O'neil S, Pentony M, Pohl TM, Price C, Purnelle B, Quail MA, Rabbinowitsch E, Reinhardt R, Rieger M, Rinta J, Robben J, Robertson L, Ruiz JC, Rutter S, Saunders D, Schäfer M, Schein J, Schwartz DC, Seeger K, Seyler A, Sharp S, Shin H, Sivam D, Squares R, Squares S, Tosato V, Vogt C, Volckaert G, Wambutt R, Warren T, Wedler H, Woodward J, Zhou S, Zimmermann W, Smith DF, Blackwell JM, Stuart KD, Barrell B, Myler PJ. The genome of the kinetoplastid parasite, Leishmania major. Science 2005; 309:436-42. [PMID: 16020728 PMCID: PMC1470643 DOI: 10.1126/science.1112680] [Citation(s) in RCA: 1043] [Impact Index Per Article: 54.9] [Indexed: 12/18/2022]
Abstract
Leishmania species cause a spectrum of human diseases in tropical and subtropical regions of the world. We have sequenced the 36 chromosomes of the 32.8-megabase haploid genome of Leishmania major (Friedlin strain) and predict 911 RNA genes, 39 pseudogenes, and 8272 protein-coding genes, of which 36% can be ascribed a putative function. These include genes involved in host-pathogen interactions, such as proteolytic enzymes, and extensive machinery for synthesis of complex surface glycoconjugates. The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II-directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling. Abundant RNA-binding proteins are encoded in the Tritryp genomes, consistent with active posttranscriptional regulation of gene expression.
MESH Headings
- Animals
- Chromatin/genetics
- Chromatin/metabolism
- Gene Expression Regulation
- Genes, Protozoan
- Genes, rRNA
- Genome, Protozoan
- Glycoconjugates/biosynthesis
- Glycoconjugates/metabolism
- Leishmania major/chemistry
- Leishmania major/genetics
- Leishmania major/metabolism
- Leishmaniasis, Cutaneous/parasitology
- Lipid Metabolism
- Membrane Proteins/biosynthesis
- Membrane Proteins/chemistry
- Membrane Proteins/genetics
- Membrane Proteins/metabolism
- Molecular Sequence Data
- Multigene Family
- Protein Biosynthesis
- Protein Processing, Post-Translational
- Protozoan Proteins/biosynthesis
- Protozoan Proteins/chemistry
- Protozoan Proteins/genetics
- Protozoan Proteins/metabolism
- RNA Processing, Post-Transcriptional
- RNA Splicing
- RNA, Protozoan/genetics
- RNA, Protozoan/metabolism
- Sequence Analysis, DNA
- Transcription, Genetic
Affiliation(s)
- Alasdair C Ivens
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
|
37
|
Abstract
This article reviews the recent advances made in the field of human leishmaniasis. Special emphasis is placed upon the application of various molecular tools for accurate and rapid diagnosis, understanding the mechanisms of drug resistance and identification of vaccine candidates. The focus will be on the major role played by recombinant antigens in the immunoserodiagnosis and progress of the Leishmania genome project, which has enabled researchers to design better PCR primers and molecular probes for microarrays. A special interest is placed on the recombinant antigen (rK39) cloned from the Leishmania chagasi kinesin gene and a very recently cloned recombinant antigen (KE16) from the Old World Leishmania donovani species with high sensitivity and specificity. Advances made in the specific PCR primer designed to diagnose and differentiate various species and strains of Leishmania causing visceral and post-kala-azar-dermal leishmaniasis have been covered. Molecular methods (e.g., DNA and protein microarrays) applied to understanding the pathobiology of the parasite, mechanism of host invasion, drug interaction and drug resistance to develop effective therapeutic molecules, gene expression profiling studies that have opened doors to understand many host-parasite relations, effective therapy and vaccine candidates are extensively covered in this review.
Affiliation(s)
- Sarman Singh
- All India Institute of Medical Sciences, New Delhi-110029, India.
|
38
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2005. [PMCID: PMC2447482 DOI: 10.1002/cfg.421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
|