1
Raeisi Dehkordi S, Luebeck J, Bafna V. FaNDOM: Fast nested distance-based seeding of optical maps. Patterns (New York, N.Y.) 2021; 2:100248. [PMID: 34027500 PMCID: PMC8134938 DOI: 10.1016/j.patter.2021.100248] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/20/2021] [Revised: 03/08/2021] [Accepted: 04/01/2021] [Indexed: 12/25/2022]
Abstract
Optical mapping (OM) provides single-molecule readouts of fluorescently labeled sequence motifs on long fragments of DNA, resolved to nucleotide-level coordinates. With the advent of microfluidic technologies for analysis of DNA molecules, it is possible to inexpensively generate long OM data (>150 kbp) at high coverage. In addition to scaffolding for de novo assembly, OM data can be aligned to a reference genome for identification of genomic structural variants. We introduce FaNDOM (Fast Nested Distance Seeding of Optical Maps), an optical map alignment tool that greatly reduces the search space of the alignment process. On four benchmark human datasets, FaNDOM was significantly (4-14×) faster than competing tools while maintaining comparable sensitivity and specificity. We used FaNDOM to map variants in three cancer cell lines and identified many biologically interesting structural variants, including deletions, duplications, gene fusions and gene-disrupting rearrangements. FaNDOM is publicly available at https://github.com/jluebeck/FaNDOM.
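As a rough illustration of what an optical map represents, the sketch below turns a DNA sequence into an ordered list of label positions and the distances between neighbouring labels. It is not FaNDOM's algorithm: the function names and the example motif are hypothetical, and real OM data also carry sizing error and missing or spurious labels, which this toy version ignores.

```python
def label_positions(sequence, motif):
    """Return 0-based start positions of every occurrence of motif (overlaps allowed)."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def inter_label_distances(positions):
    """Distances between consecutive labels; the quantity map aligners typically compare."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical motif and toy sequence, chosen only for illustration.
toy_sequence = "AACTTAAGGGCTTAAGT"
toy_labels = label_positions(toy_sequence, "CTTAAG")
toy_distances = inter_label_distances(toy_labels)
```

An aligner would compare such distance lists between a molecule and an in silico labeled reference rather than the nucleotide sequences themselves.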
Affiliation(s)
- Siavash Raeisi Dehkordi
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Jens Luebeck
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Bioinformatics & Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Vineet Bafna
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
2
Luebeck J, Coruh C, Dehkordi SR, Lange JT, Turner KM, Deshpande V, Pai DA, Zhang C, Rajkumar U, Law JA, Mischel PS, Bafna V. AmpliconReconstructor integrates NGS and optical mapping to resolve the complex structures of focal amplifications. Nat Commun 2020; 11:4374. [PMID: 32873787 PMCID: PMC7463033 DOI: 10.1038/s41467-020-18099-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 01/14/2020] [Accepted: 07/31/2020] [Indexed: 12/15/2022]
Abstract
Oncogene amplification, a major driver of cancer pathogenicity, is often mediated through focal amplification of genomic segments. Recent results implicate extrachromosomal DNA (ecDNA) as the primary driver of focal copy number amplification (fCNA) - enabling gene amplification, rapid tumor evolution, and the rewiring of regulatory circuitry. Resolving an fCNA's structure is a first step in deciphering the mechanisms of its genesis and the fCNA's subsequent biological consequences. We introduce a computational method, AmpliconReconstructor (AR), for integrating optical mapping (OM) of long DNA fragments (>150 kb) with next-generation sequencing (NGS) to resolve fCNAs at single-nucleotide resolution. AR uses an NGS-derived breakpoint graph alongside OM scaffolds to produce high-fidelity reconstructions. After validating its performance through multiple simulation strategies, AR reconstructed fCNAs in seven cancer cell lines to reveal the complex architecture of ecDNA, a breakage-fusion-bridge and other complex rearrangements. By reconstructing the rearrangement signatures associated with an fCNA's generative mechanism, AR enables a more thorough understanding of the origins of fCNAs.
Affiliation(s)
- Jens Luebeck
- Bioinformatics and Systems Biology Graduate Program, University of California at San Diego, La Jolla, CA, 92093, USA
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA, 92093, USA
- Ceyda Coruh
- Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
- Siavash R Dehkordi
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA, 92093, USA
- Joshua T Lange
- Biomedical Sciences Graduate Program, University of California at San Diego, La Jolla, CA, 92093, USA
- Ludwig Institute for Cancer Research, University of California at San Diego, La Jolla, CA, 92093, USA
- Kristen M Turner
- Ludwig Institute for Cancer Research, University of California at San Diego, La Jolla, CA, 92093, USA
- Viraj Deshpande
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA, 92093, USA
- Dave A Pai
- Bionano Genomics, Inc., San Diego, CA, 92121, USA
- Chao Zhang
- Bioinformatics and Systems Biology Graduate Program, University of California at San Diego, La Jolla, CA, 92093, USA
- Utkrisht Rajkumar
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA, 92093, USA
- Julie A Law
- Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
- Paul S Mischel
- Ludwig Institute for Cancer Research, University of California at San Diego, La Jolla, CA, 92093, USA
- Moores Cancer Center, University of California at San Diego, La Jolla, CA, 92093, USA
- Department of Pathology, University of California at San Diego, La Jolla, CA, 92093, USA
- Vineet Bafna
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA, 92093, USA
3
Yuan Y, Chung CYL, Chan TF. Advances in optical mapping for genomic research. Comput Struct Biotechnol J 2020; 18:2051-2062. [PMID: 32802277 PMCID: PMC7419273 DOI: 10.1016/j.csbj.2020.07.018] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 05/10/2020] [Revised: 07/08/2020] [Accepted: 07/24/2020] [Indexed: 12/28/2022]
Abstract
Recent advances in optical mapping have allowed the construction of improved genome assemblies with greater contiguity. Optical mapping also enables genome comparison and identification of large-scale structural variations. Association of these large-scale genomic features with biological functions is an important goal in plant and animal breeding and in medical research. Optical mapping has also been used in microbiology and still plays an important role in strain typing and epidemiological studies. Here, we review the development of optical mapping in recent decades to illustrate its importance in genomic research. We detail its applications and algorithms to show its specific advantages. Finally, we discuss the challenges required to facilitate the optimization of optical mapping and improve its future development and application.
Key Words
- 3D, three-dimensional
- DBG, de Bruijn graph
- DLS, direct label and stain
- DNA, deoxyribonucleic acid
- Genome assembly
- Hi-C, high-throughput chromosome conformation capture
- Mb, million base pair
- Next generation sequencing
- OLC, overlap-layout-consensus
- Optical mapping
- PCR, polymerase chain reaction
- PacBio, Pacific Biosciences
- SRS, short-read sequencing
- SV, structural variation
- Structural variation
- bp, base pair
- kb, kilobase pair
Affiliation(s)
- Yuxuan Yuan
- School of Life Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
- State Key Laboratory for Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong SAR, China
- AoE Centre for Genomic Studies on Plant-Environment Interaction for Sustainable Agriculture and Food Security, The Chinese University of Hong Kong, Hong Kong SAR, China
- Claire Yik-Lok Chung
- School of Life Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
- State Key Laboratory for Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong SAR, China
- Ting-Fung Chan
- School of Life Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
- State Key Laboratory for Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong SAR, China
- AoE Centre for Genomic Studies on Plant-Environment Interaction for Sustainable Agriculture and Food Security, The Chinese University of Hong Kong, Hong Kong SAR, China
4
Abstract
The computational reconstruction of genome sequences from shotgun sequencing data has been greatly simplified by the advent of sequencing technologies that generate long reads. In the case of relatively small genomes (e.g., bacterial or viral), complete genome sequences can frequently be reconstructed computationally without the need for further experiments. However, large and complex genomes, such as those of most animals and plants, continue to pose significant challenges. In such genomes, assembly software produces incomplete and fragmented reconstructions that require additional experimentally derived information and manual intervention in order to reconstruct individual chromosome arms. Recent technologies originally designed to capture chromatin structure have been shown to effectively complement sequencing data, leading to much more contiguous reconstructions of genomes than previously possible. Here, we survey these technologies and the algorithms used to assemble and analyze large eukaryotic genomes, placed within the historical context of genome scaffolding technologies that have been in existence since the dawn of the genomic era.
Affiliation(s)
- Jay Ghurye
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
- Mihai Pop
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
5
6
Mikheikin A, Olsen A, Leslie K, Russell-Pavier F, Yacoot A, Picco L, Payton O, Toor A, Chesney A, Gimzewski JK, Mishra B, Reed J. DNA nanomapping using CRISPR-Cas9 as a programmable nanoparticle. Nat Commun 2017; 8:1665. [PMID: 29162844 PMCID: PMC5698298 DOI: 10.1038/s41467-017-01891-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 09/15/2017] [Accepted: 10/24/2017] [Indexed: 01/26/2023]
Abstract
Progress in whole-genome sequencing using short-read (e.g., <150 bp), next-generation sequencing technologies has reinvigorated interest in high-resolution physical mapping to fill technical gaps that are not well addressed by sequencing. Here, we report two technical advances in DNA nanotechnology and single-molecule genomics: (1) we describe a labeling technique (CRISPR-Cas9 nanoparticles) for high-speed AFM-based physical mapping of DNA and (2) the first successful demonstration of using DVD optics to image DNA molecules with high-speed AFM. As a proof of principle, we used this new “nanomapping” method to detect and map precisely BCL2–IGH translocations present in lymph node biopsies of follicular lymphoma patients. This HS-AFM “nanomapping” technique can be complementary to both sequencing and other physical mapping approaches. Physical mapping of DNA can be used to detect structural variants and for whole-genome haplotype assembly. Here, the authors use CRISPR-Cas9 and high-speed atomic force microscopy to ‘nanomap’ single molecules of DNA.
Affiliation(s)
- Andrey Mikheikin
- Department of Physics, Virginia Commonwealth University, Richmond, 23284, VA, USA
- Anita Olsen
- Department of Physics, Virginia Commonwealth University, Richmond, 23284, VA, USA
- Kevin Leslie
- Department of Physics, Virginia Commonwealth University, Richmond, 23284, VA, USA
- Freddie Russell-Pavier
- National Physical Laboratory, Hampton Road, Teddington, TW11 0LW, Middlesex, UK
- Interface Analysis Centre, H. H. Wills Physics Laboratory, Tyndall Avenue, Bristol, BS8 1TL, UK
- Andrew Yacoot
- National Physical Laboratory, Hampton Road, Teddington, TW11 0LW, Middlesex, UK
- Loren Picco
- Interface Analysis Centre, H. H. Wills Physics Laboratory, Tyndall Avenue, Bristol, BS8 1TL, UK
- Oliver Payton
- Interface Analysis Centre, H. H. Wills Physics Laboratory, Tyndall Avenue, Bristol, BS8 1TL, UK
- Amir Toor
- Department of Internal Medicine, VCU School of Medicine, Richmond, 23284, VA, USA
- VCU Massey Cancer Center, Richmond, 23284, VA, USA
- Alden Chesney
- VCU Massey Cancer Center, Richmond, 23284, VA, USA
- Department of Pathology, VCU School of Medicine, Richmond, 23284, VA, USA
- James K Gimzewski
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, 90095, CA, USA
- Bud Mishra
- Departments of Computer Science and Mathematics, Courant Institute of Mathematical Sciences, New York University, New York, 10012, NY, USA
- Jason Reed
- Department of Physics, Virginia Commonwealth University, Richmond, 23284, VA, USA
- VCU Massey Cancer Center, Richmond, 23284, VA, USA
7
Abstract
Optical mapping (OM) has been used in microbiology for the past 20 years, initially as a technique to facilitate DNA sequence-based studies; however, with decreases in DNA sequencing costs and increases in sequence output from automated sequencing platforms, OM has grown into an important auxiliary tool for genome assembly and comparison. Currently, there are a number of new and exciting applications for OM in the field of microbiology, including investigation of disease outbreaks, identification of specific genes of clinical and/or epidemiological relevance, and the possibility of single-cell analysis when combined with cell-sorting approaches. In addition, designing lab-on-a-chip systems based on OM is now feasible and will allow the integrated and automated microbiological analysis of biological fluids. Here, we review the basic technology of OM, detail the current state of the art of the field, and look ahead to possible future developments in OM technology for microbiological applications.
8
Maschmann A, Kounovsky-Shafer KL. Determination of restriction enzyme activity when cutting DNA labeled with the TOTO dye family. Nucleosides Nucleotides & Nucleic Acids 2017; 36:406-417. [PMID: 28362164 DOI: 10.1080/15257770.2017.1300665] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 10/19/2022]
Abstract
Optical mapping, a single DNA molecule genome analysis platform that can determine methylation profiles, uses fluorescently labeled DNA molecules that are elongated on the surface and digested with a restriction enzyme to produce a barcode of that molecule. Understanding how the cyanine fluorochromes affect enzyme activity can lead to other fluorochromes used in the optical mapping system. The effects of restriction digestion on fluorochrome labeled DNA (Ethidium Bromide, DAPI, H33258, EthD-1, TOTO-1) have been analyzed previously. However, TOTO-1 is a part of a family of cyanine fluorochromes (YOYO-1, TOTO-1, BOBO-1, POPO-1, YOYO-3, TOTO-3, BOBO-3, and POPO-3) and the rest of the fluorochromes have not been examined in terms of their effects on restriction digestion. In order to determine if the other dyes in the TOTO-1 family inhibit restriction enzymes in the same way as TOTO-1, lambda DNA was stained with a dye from the TOTO family and digested. The restriction enzyme activity in regards to each dye, as well as each restriction enzyme, was compared to determine the extent of digestion. YOYO-1, TOTO-1, and POPO-1 fluorochromes inhibited ScaI-HF, PmlI, and EcoRI restriction enzymes. Additionally, the mobility of labeled DNA fragments in an agarose gel changed depending on which dye was intercalated.
Affiliation(s)
- April Maschmann
- Department of Chemistry, University of Nebraska-Kearney, Kearney, NE, USA
9
Abstract
Optical mapping is a new technique to generate restriction maps of DNA easily and quickly. DNA restriction maps can be aligned by comparing corresponding restriction fragment lengths. To relate, organize, and analyse these maps it is necessary to rapidly compare maps. The issue of the statistical significance of approximately matching maps then becomes central, as in BLAST with sequence scoring. In this paper, we study the approximation to the distribution of counts of matched regions of specified length when comparing two DNA restriction maps. Distributional results are given to enable us to compute p-values and hence to determine whether or not the two restriction maps are related. The key tool used is the Chen-Stein method of Poisson approximation. Certain open problems are described.
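The idea of counting matched regions and converting the count to a p-value can be sketched as follows. This is only an illustration under stated assumptions, not the paper's derivation: the relative tolerance `tol`, the function names, and especially the null mean `lam` are hypothetical inputs, and the paper's Chen-Stein analysis is what would justify using a Poisson distribution and supply the appropriate mean.

```python
import math


def count_matched_regions(map_a, map_b, k, tol=0.1):
    """Count start pairs (i, j) where k consecutive fragment lengths agree
    within a relative tolerance tol (an assumed matching rule)."""
    def close(x, y):
        return abs(x - y) <= tol * max(x, y)

    count = 0
    for i in range(len(map_a) - k + 1):
        for j in range(len(map_b) - k + 1):
            if all(close(map_a[i + d], map_b[j + d]) for d in range(k)):
                count += 1
    return count


def poisson_tail(observed, lam):
    """P(X >= observed) for X ~ Poisson(lam); lam is an assumed null mean."""
    below = sum(math.exp(-lam) * lam ** x / math.factorial(x)
                for x in range(observed))
    return max(0.0, 1.0 - below)


# Toy fragment lengths (arbitrary units) for two restriction maps.
map_a = [10.0, 20.0, 30.0, 40.0]
map_b = [5.0, 10.5, 19.0, 31.0, 80.0]
matches = count_matched_regions(map_a, map_b, k=3)
p_value = poisson_tail(matches, lam=0.5)
```

A small p-value would suggest that the observed number of matched regions is unlikely for unrelated maps, provided the assumed null mean is realistic for the data at hand.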
10
Vij S, Kuhl H, Kuznetsova IS, Komissarov A, Yurchenko AA, Van Heusden P, Singh S, Thevasagayam NM, Prakki SRS, Purushothaman K, Saju JM, Jiang J, Mbandi SK, Jonas M, Hin Yan Tong A, Mwangi S, Lau D, Ngoh SY, Liew WC, Shen X, Hon LS, Drake JP, Boitano M, Hall R, Chin CS, Lachumanan R, Korlach J, Trifonov V, Kabilov M, Tupikin A, Green D, Moxon S, Garvin T, Sedlazeck FJ, Vurture GW, Gopalapillai G, Kumar Katneni V, Noble TH, Scaria V, Sivasubbu S, Jerry DR, O'Brien SJ, Schatz MC, Dalmay T, Turner SW, Lok S, Christoffels A, Orbán L. Chromosomal-Level Assembly of the Asian Seabass Genome Using Long Sequence Reads and Multi-layered Scaffolding. PLoS Genet 2016; 12:e1005954. [PMID: 27082250 PMCID: PMC4833346 DOI: 10.1371/journal.pgen.1005954] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 12/19/2015] [Accepted: 03/03/2016] [Indexed: 11/18/2022]
Abstract
We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
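For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of the usual definition follows: the length L such that contigs (or scaffolds) of length at least L together cover at least half of the total assembled length. The function name and the toy lengths are illustrative only and are not taken from the seabass assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total length 54, so half is 27; 10 + 9 + 8 reaches 27.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about whether the sequences are assembled correctly.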
Affiliation(s)
- Shubha Vij
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Heiner Kuhl
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Inna S. Kuznetsova
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Laboratory of Chromosome Structure and Function, Department of Cytology and Histology, Biological Faculty, Saint Petersburg State University, St. Petersburg, Russia
- Aleksey Komissarov
- Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, St. Petersburg, Russia
- Andrey A. Yurchenko
- Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, St. Petersburg, Russia
- Peter Van Heusden
- South African MRC Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
- Siddharth Singh
- Pacific Biosciences, Menlo Park, California, United States of America
- Jolly M. Saju
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Junhui Jiang
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Stanley Kimbung Mbandi
- South African MRC Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
- Mario Jonas
- South African MRC Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
- Amy Hin Yan Tong
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
- Sarah Mwangi
- South African MRC Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
- Doreen Lau
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Si Yan Ngoh
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Woei Chang Liew
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Xueyan Shen
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Lawrence S. Hon
- Pacific Biosciences, Menlo Park, California, United States of America
- James P. Drake
- Pacific Biosciences, Menlo Park, California, United States of America
- Matthew Boitano
- Pacific Biosciences, Menlo Park, California, United States of America
- Richard Hall
- Pacific Biosciences, Menlo Park, California, United States of America
- Chen-Shan Chin
- Pacific Biosciences, Menlo Park, California, United States of America
- Jonas Korlach
- Pacific Biosciences, Menlo Park, California, United States of America
- Vladimir Trifonov
- Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
- Marsel Kabilov
- Genomics Core Facility, Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Alexey Tupikin
- Genomics Core Facility, Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Darrell Green
- Norwich Medical School, University of East Anglia, Norwich Research Park, Norwich, United Kingdom
- Simon Moxon
- The Genome Analysis Centre, Norwich, United Kingdom
- Tyler Garvin
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, New York, United States of America
- Fritz J. Sedlazeck
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, New York, United States of America
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Gregory W. Vurture
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, New York, United States of America
- Gopikrishna Gopalapillai
- Nutrition, Genetics & Biotechnology Division, ICAR-Central Institute of Brackishwater Aquaculture, Tamil Nadu, India
- Vinaya Kumar Katneni
- Nutrition, Genetics & Biotechnology Division, ICAR-Central Institute of Brackishwater Aquaculture, Tamil Nadu, India
- Tansyn H. Noble
- College of Marine and Environmental Sciences and Center for Sustainable Tropical Fisheries and Aquaculture, James Cook University, Townsville, Queensland, Australia
- Vinod Scaria
- CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
- Sridhar Sivasubbu
- CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
- Dean R. Jerry
- College of Marine and Environmental Sciences and Center for Sustainable Tropical Fisheries and Aquaculture, James Cook University, Townsville, Queensland, Australia
- Stephen J. O'Brien
- Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, St. Petersburg, Russia
- Oceanographic Center, Nova Southeastern University Ft. Lauderdale, Ft. Lauderdale, Florida, United States of America
- Michael C. Schatz
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, New York, United States of America
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Tamás Dalmay
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, United Kingdom
- Stephen W. Turner
- Pacific Biosciences, Menlo Park, California, United States of America
- Si Lok
- The Centre for Applied Genomics, The Hospital for Sick Children, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada
- Alan Christoffels
- South African MRC Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
- László Orbán
- Reproductive Genomics Group, Temasek Life Sciences Laboratory, Singapore
- Department of Animal Sciences and Animal Husbandry, Georgikon Faculty, University of Pannonia, Keszthely, Hungary
- Centre for Comparative Genomics, Murdoch University, Murdoch, Australia
11
Verzotto D, Teo ASM, Hillmer AM, Nagarajan N. OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis. Gigascience 2016; 5:2. [PMID: 26793302 PMCID: PMC4719737 DOI: 10.1186/s13742-016-0110-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/09/2015] [Accepted: 01/06/2016] [Indexed: 12/16/2022]
Abstract
BACKGROUND Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and genome-mapping technologies (for example, optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kilo base pairs (kbp) to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging because of the lack of efficient and sensitive map-alignment algorithms for robustly aligning error-prone maps to sequences. RESULTS We introduce a novel seed-and-extend glocal (short for global-local) alignment method, OPTIMA (and a sliding-window extension for overlap alignment, OPTIMA-Overlap), which is the first to create indexes for continuous-valued mapping data while accounting for mapping errors. We also present a novel statistical model, agnostic with respect to technology-dependent error rates, for conservatively evaluating the significance of alignments without relying on expensive permutation-based tests. CONCLUSIONS We show that OPTIMA and OPTIMA-Overlap outperform other state-of-the-art approaches (1.6-2 times more sensitive) and are more efficient (170-200 %) and precise in their alignments (nearly 99 % precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust, provide improved sensitivity and guarantee high precision.
Affiliation(s)
- Davide Verzotto
- Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
- Audrey S. M. Teo
- Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
- Axel M. Hillmer
- Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
- Niranjan Nagarajan
- Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
12
Towards a More Accurate Error Model for BioNano Optical Maps. Bioinformatics Research and Applications 2016. [DOI: 10.1007/978-3-319-38782-6_6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/13/2022]
13
Teo ASM, Verzotto D, Yao F, Nagarajan N, Hillmer AM. Single-molecule optical genome mapping of a human HapMap and a colorectal cancer cell line. Gigascience 2015; 4:65. [PMID: 26719794 PMCID: PMC4696294 DOI: 10.1186/s13742-015-0106-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/14/2015] [Accepted: 12/17/2015] [Indexed: 11/21/2022]
Abstract
Background Next-generation sequencing (NGS) technologies have changed our understanding of the variability of the human genome. However, the identification of genome structural variations based on NGS approaches with read lengths of 35–300 bases remains a challenge. Single-molecule optical mapping technologies allow the analysis of DNA molecules of up to 2 Mb and as such are suitable for the identification of large-scale genome structural variations, and for de novo genome assemblies when combined with short-read NGS data. Here we present optical mapping data for two human genomes: the HapMap cell line GM12878 and the colorectal cancer cell line HCT116. Findings High molecular weight DNA was obtained by embedding GM12878 and HCT116 cells, respectively, in agarose plugs, followed by DNA extraction under mild conditions. Genomic DNA was digested with KpnI and 310,000 and 296,000 DNA molecules (≥150 kb and 10 restriction fragments), respectively, were analyzed per cell line using the Argus optical mapping system. Maps were aligned to the human reference by OPTIMA, a new glocal alignment method. Genome coverage of 6.8× and 5.7× was obtained, respectively; 2.9× and 1.7× more than the coverage obtained with previously available software. Conclusions Optical mapping allows the resolution of large-scale structural variations of the genome, and the scaffold extension of NGS-based de novo assemblies. OPTIMA is an efficient new alignment method; our optical mapping data provide a resource for genome structure analyses of the human HapMap reference cell line GM12878, and the colorectal cancer cell line HCT116.
Affiliation(s)
- Audrey S M Teo
- Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
- Davide Verzotto
- Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
- Fei Yao
- Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
- Niranjan Nagarajan
- Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
- Axel M Hillmer
- Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
14
Olsen RA, Bunikis I, Tiukova I, Holmberg K, Lötstedt B, Pettersson OV, Passoth V, Käller M, Vezzi F. De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping. Gigascience 2015; 4:56. [PMID: 26617983 PMCID: PMC4661999 DOI: 10.1186/s13742-015-0094-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 06/10/2015] [Accepted: 11/04/2015] [Indexed: 12/31/2022]
Abstract
BACKGROUND It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome. METHODS In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work. RESULTS We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. With the inclusion of PacBio data we were able to fill about 5 % of the optical mapped genome not covered by the Illumina data.
Affiliation(s)
- Remi-Andre Olsen
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Ignas Bunikis
- Uppsala Genome Center, NGI/SciLifeLab, Department of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, SE-752 37 Uppsala, Sweden
| | - Ievgeniia Tiukova
- Department of Microbiology, Swedish University of Agricultural Sciences, Box 7025, SE-75007 Uppsala, Sweden
| | - Kicki Holmberg
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Britta Lötstedt
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Olga Vinnere Pettersson
- Uppsala Genome Center, NGI/SciLifeLab, Department of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, SE-752 37 Uppsala, Sweden
| | - Volkmar Passoth
- Department of Microbiology, Swedish University of Agricultural Sciences, Box 7025, SE-75007 Uppsala, Sweden
| | - Max Käller
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Francesco Vezzi
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
|
15
|
Chamala S, Chanderbali AS, Der JP, Lan T, Walts B, Albert VA, dePamphilis CW, Leebens-Mack J, Rounsley S, Schuster SC, Wing RA, Xiao N, Moore R, Soltis PS, Soltis DE, Barbazuk WB. Assembly and Validation of the Genome of the Nonmodel Basal Angiosperm Amborella. Science 2013; 342:1516-7. [DOI: 10.1126/science.1241130] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.0] [Indexed: 12/22/2022]
|
16
|
Mazurie AJ, Alves JM, Ozaki LS, Zhou S, Schwartz DC, Buck GA. Comparative genomics of Cryptosporidium. Int J Genomics 2013; 2013:832756. [PMID: 23738321 PMCID: PMC3659464 DOI: 10.1155/2013/832756] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 01/24/2013] [Accepted: 04/10/2013] [Indexed: 11/18/2022] Open
Abstract
Until recently, the apicomplexan parasites, Cryptosporidium hominis and C. parvum, were considered the same species. However, the two parasites, now considered distinct species, exhibit significant differences in host range, infectivity, and pathogenicity, and their sequenced genomes exhibit only 95-97% identity. The availability of the complete genome sequences of these organisms provides the potential to identify the genetic variations that are responsible for the phenotypic differences between the two parasites. We compared the genome organization and structure, gene composition, the metabolic and other pathways, and the local sequence identity between the genes of these two Cryptosporidium species. Our observations show that the phenotypic differences between C. hominis and C. parvum are not due to gross genome rearrangements, structural alterations, gene deletions or insertions, metabolic capabilities, or other obvious genomic alterations. Rather, the results indicate that these genomes exhibit a remarkable structural and compositional conservation and suggest that the phenotypic differences observed are due to subtle variations in the sequences of proteins that act at the interface between the parasite and its host.
Affiliation(s)
- Aurélien J. Mazurie
- Department of Microbiology, Montana State University, Bozeman, MT 59717, USA
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
| | - João M. Alves
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
| | - Luiz S. Ozaki
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
| | - Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - David C. Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Gregory A. Buck
- Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284-2030, USA
|
17
|
Dorfman KD, King SB, Olson DW, Thomas JDP, Tree DR. Beyond gel electrophoresis: microfluidic separations, fluorescence burst analysis, and DNA stretching. Chem Rev 2013; 113:2584-667. [PMID: 23140825 PMCID: PMC3595390 DOI: 10.1021/cr3002142] [Citation(s) in RCA: 149] [Impact Index Per Article: 13.5] [Indexed: 12/18/2022]
Affiliation(s)
- Kevin D. Dorfman
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
| | - Scott B. King
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
| | - Daniel W. Olson
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
| | - Joel D. P. Thomas
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
| | - Douglas R. Tree
- Department of Chemical Engineering and Materials Science, University of Minnesota — Twin Cities, 421 Washington Ave. SE, Minneapolis, MN 55455, Phone: 1-612-624-5560. Fax: 1-612-626-7246
|
18
|
Reevaluating assembly evaluations with feature response curves: GAGE and assemblathons. PLoS One 2012; 7:e52210. [PMID: 23284938 PMCID: PMC3532452 DOI: 10.1371/journal.pone.0052210] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Received: 10/01/2012] [Accepted: 11/16/2012] [Indexed: 11/19/2022] Open
Abstract
In just the last decade, a multitude of bio-technologies and software pipelines have emerged to revolutionize genomics. To further their central goal, they aim to accelerate and improve the quality of de novo whole-genome assembly starting from short DNA sequences/reads. However, the performance of each of these tools is contingent on the length and quality of the sequencing data, the structure and complexity of the genome sequence, and the resolution and quality of long-range information. Furthermore, in the absence of any metric that captures the most fundamental “features” of a high-quality assembly, there is no obvious recipe for users to select the most desirable assembler/assembly. This situation has prompted the scientific community to rely on crowd-sourcing through international competitions, such as Assemblathons or GAGE, with the intention of identifying the best assembler(s) and their features. Somewhat circuitously, the only available approach to gauge de novo assemblies and assemblers relies solely on the availability of a high-quality fully assembled reference genome sequence. Still worse, reference-guided evaluations are often both difficult to analyze, leading to conclusions that are difficult to interpret. In this paper, we circumvent many of these issues by relying upon a tool, dubbed , which is capable of evaluating de novo assemblies from the read-layouts even when no reference exists. We extend the FRCurve approach to cases where lay-out information may have been obscured, as is true in many deBruijn-graph-based algorithms. As a by-product, FRCurve now expands its applicability to a much wider class of assemblers – thus, identifying higher-quality members of this group, their inter-relations as well as sensitivity to carefully selected features, with or without the support of a reference sequence or layout for the reads. 
The paper concludes by reevaluating several recently conducted assembly competitions and the datasets that have resulted from them.
|
19
|
AGORA: Assembly Guided by Optical Restriction Alignment. BMC Bioinformatics 2012; 13:189. [PMID: 22856673 PMCID: PMC3431216 DOI: 10.1186/1471-2105-13-189] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 03/28/2012] [Accepted: 06/28/2012] [Indexed: 11/10/2022] Open
Abstract
Background Genome assembly is difficult due to repeated sequences within the genome, which create ambiguities and cause the final assembly to be broken up into many separate sequences (contigs). Long range linking information, such as mate-pairs or mapping data, is necessary to help assembly software resolve repeats, thereby leading to a more complete reconstruction of genomes. Prior work has used optical maps for validating assemblies and scaffolding contigs, after an initial assembly has been produced. However, optical maps have not previously been used within the genome assembly process. Here, we use optical map information within the popular de Bruijn graph assembly paradigm to eliminate paths in the de Bruijn graph which are not consistent with the optical map and help determine the correct reconstruction of the genome. Results We developed a new algorithm called AGORA: Assembly Guided by Optical Restriction Alignment. AGORA is the first algorithm to use optical map information directly within the de Bruijn graph framework to help produce an accurate assembly of a genome that is consistent with the optical map information provided. Our simulations on bacterial genomes show that AGORA is effective at producing assemblies closely matching the reference sequences. Additionally, we show that noise in the optical map can have a strong impact on the final assembly quality for some complex genomes, and we also measure how various characteristics of the starting de Bruijn graph may impact the quality of the final assembly. Lastly, we show that a proper choice of restriction enzyme for the optical map may substantially improve the quality of the final assembly. Conclusions Our work shows that optical maps can be used effectively to assemble genomes within the de Bruijn graph assembly framework. 
Our experiments also provide insights into the characteristics of the mapping data that most affect the performance of our algorithm, indicating the potential benefit of more accurate optical mapping technologies, such as nano-coding.
|
20
|
Comparing de novo genome assembly: the long and short of it. PLoS One 2011; 6:e19175. [PMID: 21559467 PMCID: PMC3084767 DOI: 10.1371/journal.pone.0019175] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.5] [Received: 11/22/2010] [Accepted: 03/29/2011] [Indexed: 01/30/2023] Open
Abstract
Recent advances in DNA sequencing technology and their focal role in Genome Wide Association Studies (GWAS) have rekindled a growing interest in the whole-genome sequence assembly (WGSA) problem, thereby, inundating the field with a plethora of new formalizations, algorithms, heuristics and implementations. And yet, scant attention has been paid to comparative assessments of these assemblers' quality and accuracy. No commonly accepted and standardized method for comparison exists yet. Even worse, widely used metrics to compare the assembled sequences emphasize only size, poorly capturing the contig quality and accuracy. This paper addresses these concerns: it highlights common anomalies in assembly accuracy through a rigorous study of several assemblers, compared under both standard metrics (N50, coverage, contig sizes, etc.) as well as a more comprehensive metric (Feature-Response Curves, FRC) that is introduced here; FRC transparently captures the trade-offs between contigs' quality against their sizes. For this purpose, most of the publicly available major sequence assemblers--both for low-coverage long (Sanger) and high-coverage short (Illumina) reads technologies--are compared. These assemblers are applied to microbial (Escherichia coli, Brucella, Wolbachia, Staphylococcus, Helicobacter) and partial human genome sequences (Chr. Y), using sequence reads of various read-lengths, coverages, accuracies, and with and without mate-pairs. It is hoped that, based on these evaluations, computational biologists will identify innovative sequence assembly paradigms, bioinformaticists will determine promising approaches for developing "next-generation" assemblers, and biotechnologists will formulate more meaningful design desiderata for sequencing technology platforms. A new software tool for computing the FRC metric has been developed and is available through the AMOS open-source consortium.
|
21
|
Mir KU. Sequencing genomes: from individuals to populations. BRIEFINGS IN FUNCTIONAL GENOMICS AND PROTEOMICS 2010; 8:367-78. [PMID: 19808932 DOI: 10.1093/bfgp/elp040] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
The whole genome sequences of Jim Watson and Craig Venter are early examples of personalized genomics, which promises to change how we approach healthcare in the future. Before personal sequencing can have practical medical benefits, however, and before it should be advocated for implementation at the population-scale, there needs to be a better understanding of which genetic variants influence which traits and how their effects are modified by epigenetic factors. Nonetheless, for forging links between DNA sequence and phenotype, efforts to sequence the genomes of individuals need to continue; this includes sequencing sub-populations for association studies which analyse the difference in sequence between disease affected and unaffected individuals. Such studies can only be applied on a large enough scale to be effective if the massive strides in sequencing technology that have recently occurred also continue.
Affiliation(s)
- Kalim U Mir
- The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.
|
22
|
Ananiev GE, Goldstein S, Runnheim R, Forrest DK, Zhou S, Potamousis K, Churas CP, Bergendahl V, Thomson JA, Schwartz DC. Optical mapping discerns genome wide DNA methylation profiles. BMC Mol Biol 2008; 9:68. [PMID: 18667073 PMCID: PMC2516518 DOI: 10.1186/1471-2199-9-68] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 03/26/2008] [Accepted: 07/30/2008] [Indexed: 11/23/2022] Open
Abstract
Background Methylation of CpG dinucleotides is a fundamental mechanism of epigenetic regulation in eukaryotic genomes. Development of methods for rapid genome wide methylation profiling will greatly facilitate both hypothesis and discovery driven research in the field of epigenetics. In this regard, a single molecule approach to methylation profiling offers several unique advantages that include elimination of chemical DNA modification steps and PCR amplification. Results A single molecule approach is presented for the discernment of methylation profiles, based on optical mapping. We report results from a series of pilot studies demonstrating the capabilities of optical mapping as a platform for methylation profiling of whole genomes. Optical mapping was used to discern the methylation profile from both an engineered and wild type Escherichia coli. Furthermore, the methylation status of selected loci within the genome of human embryonic stem cells was profiled using optical mapping. Conclusion The optical mapping platform effectively detects DNA methylation patterns. Due to single molecule detection, optical mapping offers significant advantages over other technologies. This advantage stems from obviation of DNA modification steps, such as bisulfite treatment, and the ability of the platform to assay repeat dense regions within mammalian genomes inaccessible to techniques using array-hybridization technologies.
Affiliation(s)
- Gene E Ananiev
- Department of Chemistry, Laboratory for Molecular and Computational Genomics, University of Wisconsin Biotechnology Center, University of Wisconsin-Madison, Madison, WI 53706, USA.
|
23
|
Nagarajan N, Read TD, Pop M. Scaffolding and validation of bacterial genome assemblies using optical restriction maps. Bioinformatics 2008; 24:1229-35. [PMID: 18356192 PMCID: PMC2373919 DOI: 10.1093/bioinformatics/btn102] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Received: 12/18/2007] [Revised: 03/05/2008] [Accepted: 03/16/2008] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION New, high-throughput sequencing technologies have made it feasible to cheaply generate vast amounts of sequence information from a genome of interest. The computational reconstruction of the complete sequence of a genome is complicated by specific features of these new sequencing technologies, such as the short length of the sequencing reads and absence of mate-pair information. In this article we propose methods to overcome such limitations by incorporating information from optical restriction maps. RESULTS We demonstrate the robustness of our methods to sequencing and assembly errors using extensive experiments on simulated datasets. We then present the results obtained by applying our algorithms to data generated from two bacterial genomes Yersinia aldovae and Yersinia kristensenii. The resulting assemblies contain a single scaffold covering a large fraction of the respective genomes, suggesting that the careful use of optical maps can provide a cost-effective framework for the assembly of genomes. AVAILABILITY The tools described here are available as an open-source package at ftp://ftp.cbcb.umd.edu/pub/software/soma
|
24
|
Zhou S, Bechner MC, Place M, Churas CP, Pape L, Leong SA, Runnheim R, Forrest DK, Goldstein S, Livny M, Schwartz DC. Validation of rice genome sequence by optical mapping. BMC Genomics 2007; 8:278. [PMID: 17697381 PMCID: PMC2048515 DOI: 10.1186/1471-2164-8-278] [Citation(s) in RCA: 103] [Impact Index Per Article: 6.1] [Received: 05/02/2007] [Accepted: 08/15/2007] [Indexed: 11/30/2022] Open
Abstract
Background Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. Results To facilitate ongoing sequencing finishing and validation efforts, we have constructed a whole-genome SwaI optical restriction map of the rice genome. The physical map consists of 14 contigs, covering 12 chromosomes, with a total genome size of 382.17 Mb; this value is about 11% smaller than original estimates. Nine of the 14 optical map contigs are without gaps, covering chromosomes 1, 2, 3, 4, 5, 7, 8, 10, and 12 in their entirety – including centromeres and telomeres. Alignments between optical and in silico restriction maps constructed from IRGSP (International Rice Genome Sequencing Project) and TIGR (The Institute for Genomic Research) genome sequence sources are comprehensive and informative, evidenced by map coverage across virtually all published gaps, discovery of new ones, and characterization of sequence misassemblies; all totalling ~14 Mb. Furthermore, since optical maps are ordered restriction maps, identified discordances are pinpointed on a reliable physical scaffold providing an independent resource for closure of gaps and rectification of misassemblies. Conclusion Analysis of sequence and optical mapping data effectively validates genome sequence assemblies constructed from large, repeat-rich genomes. Given this conclusion we envision new applications of such single molecule analysis that will merge advantages offered by high-resolution optical maps with inexpensive, but short sequence reads generated by emerging sequencing platforms.
Lastly, the map construction techniques presented here point the way to new types of comparative genome analysis that would focus on discernment of structural differences revealed by optical maps constructed from a broad range of rice subspecies and varieties.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Michael C Bechner
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Michael Place
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Chris P Churas
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Louise Pape
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Sally A Leong
- USDA-ARS, CCRU, Department of Plant Pathology, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Rod Runnheim
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Dan K Forrest
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Steve Goldstein
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Miron Livny
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - David C Schwartz
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA
- Department of Chemistry, Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Laboratory of Genetics; University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
25
|
Reed J, Mishra B, Pittenger B, Magonov S, Troke J, Teitell MA, Gimzewski JK. Single molecule transcription profiling with AFM. NANOTECHNOLOGY 2007; 18:44032. [PMID: 20721301 PMCID: PMC2922717 DOI: 10.1088/0957-4484/18/4/044032] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 05/27/2023]
Abstract
Established techniques for global gene expression profiling, such as microarrays, face fundamental sensitivity constraints. Due to greatly increasing interest in examining minute samples from micro-dissected tissues, including single cells, unorthodox approaches, including molecular nanotechnologies, are being explored in this application. Here, we examine the use of single molecule, ordered restriction mapping, combined with AFM, to measure gene transcription levels from very low abundance samples. We frame the problem mathematically, using coding theory, and present an analysis of the critical error sources that may serve as a guide to designing future studies. We follow with experiments detailing the construction of high density, single molecule, ordered restriction maps from plasmids and from cDNA molecules, using two different enzymes, a result not previously reported. We discuss these results in the context of our calculations.
Affiliation(s)
- Jason Reed
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA 90095, USA
| | - Bud Mishra
- Department of Computer Science and Mathematics, Courant Institute of Mathematical Sciences, New York University, New York, NY 10012, USA
| | - Joshua Troke
- Department of Pathology and the Center for Cell Control, an NIH Nanomedicine Development Center, UCLA, Los Angeles, CA 90095, USA
| | - Michael A Teitell
- Department of Pathology and the Center for Cell Control, an NIH Nanomedicine Development Center, UCLA, Los Angeles, CA 90095, USA
- California Nanosystems Institute (CNSI), Los Angeles, CA 90095, USA
| | - James K Gimzewski
- Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA 90095, USA
- California Nanosystems Institute (CNSI), Los Angeles, CA 90095, USA
|
26
|
Xiao M, Phong A, Ha C, Chan TF, Cai D, Leung L, Wan E, Kistler AL, DeRisi JL, Selvin PR, Kwok PY. Rapid DNA mapping by fluorescent single molecule detection. Nucleic Acids Res 2006; 35:e16. [PMID: 17175538 PMCID: PMC1807959 DOI: 10.1093/nar/gkl1044] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.8] [Indexed: 11/22/2022] Open
Abstract
DNA mapping is an important analytical tool in genomic sequencing, medical diagnostics and pathogen identification. Here we report an optical DNA mapping strategy based on direct imaging of individual DNA molecules and localization of multiple sequence motifs on the molecules. Individual genomic DNA molecules were labeled with fluorescent dyes at specific sequence motifs by the action of nicking endonuclease followed by the incorporation of dye terminators with DNA polymerase. The labeled DNA molecules were then stretched into linear form on a modified glass surface and imaged using total internal reflection fluorescence (TIRF) microscopy. By determining the positions of the fluorescent labels with respect to the DNA backbone, the distribution of the sequence motif recognized by the nicking endonuclease can be established with good accuracy, in a manner similar to reading a barcode. With this approach, we constructed a specific sequence motif map of lambda-DNA. We further demonstrated the capability of this approach to rapidly type a human adenovirus and several strains of human rhinovirus.
Affiliation(s)
- Ming Xiao
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- To whom correspondence should be addressed at: 513, Parnassus Avenue, HSW-901A, San Francisco, CA 94143, USA. Tel: +1 41 551 43876; Fax: +1 41 547 62956;
| | - Angie Phong
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
| | - Connie Ha
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
| | - Ting-Fung Chan
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
| | - Dongmei Cai
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
| | - Lucinda Leung
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
| | - Eunice Wan
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
| | - Amy L. Kistler
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94115, USA
| | - Joseph L. DeRisi
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA 94115, USA
| | - Paul R. Selvin
- Department of Physics and Center of Biophysics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Pui-Yan Kwok
- Cardiovascular Research Institute and Center for Human Genetics, University of California, San Francisco, CA 94115, USA
- Department of Dermatology, University of California, San Francisco, CA 94115, USA
| |
Collapse
|
27
|
Valouev A, Schwartz DC, Zhou S, Waterman MS. An algorithm for assembly of ordered restriction maps from single DNA molecules. Proc Natl Acad Sci U S A 2006; 103:15770-5. [PMID: 17043225 PMCID: PMC1635078 DOI: 10.1073/pnas.0604040103] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.9] [Received: 05/17/2006] [Indexed: 11/18/2022]
Abstract
The restriction mapping of a massive number of individual DNA molecules by optical mapping enables assembly of physical maps spanning mammalian and plant genomes; however, not through computational means permitting completely de novo assembly. Existing algorithms are not practical for genomes larger than lower eukaryotes due to their high time and space complexity. In many ways, sequence assembly parallels map assembly, so that the overlap-layout-consensus strategy, recently shown effective in assembling very large genomes in feasible time, sheds new light on solving map construction issues associated with single molecule substrates. Accordingly, we report an adaptation of this approach as the formal basis for de novo optical map assembly and demonstrate its computational feasibility for assembly of very large genomes. As such, we discuss assembly results for a series of genomes: human, plant, lower eukaryote and bacterial. Unlike sequence assembly, the optical map assembly problem is actually more complex because restriction maps from single molecules are constructed, manifesting errors stemming from: missing cuts, false cuts, and high variance of estimated fragment sizes; chimeric maps resulting from artifactually merged molecules; and true overlap scores that are "in the noise" or "slightly above the noise." We address these problems, fundamental to many single molecule measurements, by an effective error correction method using global overlap information to eliminate spurious overlaps and chimeric maps that are otherwise difficult to identify.
Affiliation(s)
- Anton Valouev
- Department of Mathematics, University of Southern California, 3620 South Vermont Avenue, KAP 108, Los Angeles, CA 90089-2532, USA.
|
28
|
Yu W, Li X, Liu J, Wu B, Williams KR, Zhao H. Multiple peak alignment in sequential data analysis: a scale-space-based approach. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2006; 3:208-19. [PMID: 17048459 DOI: 10.1109/tcbb.2006.41] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 05/12/2023]
Abstract
In this paper, we address the multiple peak alignment problem in sequential data analysis with an approach based on the Gaussian scale-space theory. We assume that multiple sets of detected peaks are the observed samples of a set of common peaks. We also assume that the locations of the observed peaks follow unimodal distributions (e.g., normal distribution) with their means equal to the corresponding locations of the common peaks and variances reflecting the extension of their variations. Under these assumptions, we convert the problem of estimating locations of the unknown number of common peaks from multiple sets of detected peaks into a much simpler problem of searching for local maxima in the scale-space representation. The optimization of the scale parameter is achieved using an energy minimization approach. We compare our approach with a hierarchical clustering method using both simulated data and real mass spectrometry data. We also demonstrate the merit of extending the binary peak detection method (i.e., a candidate is considered either as a peak or as a nonpeak) with a quantitative scoring measure-based approach (i.e., we assign to each candidate a possibility of being a peak).
Affiliation(s)
- Weichuan Yu
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Sai Kung, Kowloon, Hong Kong.
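As an illustration of the scale-space idea summarized in the abstract above, the sketch below pools peak locations from several peak sets, smooths them with a Gaussian kernel at a single scale, and reports the local maxima of the smoothed density as estimated common peaks. It is a minimal sketch, not the authors' implementation: the function names, the grid step, and the hand-picked `sigma` are assumptions, and the energy-minimization step that the paper uses to choose the scale is omitted.

```python
import math


def scale_space_density(peaks, sigma, grid):
    """Sum of Gaussian kernels of width sigma centred on every observed peak."""
    return [
        sum(math.exp(-0.5 * ((x - p) / sigma) ** 2) for p in peaks)
        for x in grid
    ]


def common_peaks(peak_sets, sigma, step=0.1):
    """Estimate common peak locations as local maxima of the smoothed density.

    peak_sets: list of lists of detected peak locations (one list per sample).
    sigma:     scale parameter, fixed by hand here (an assumption).
    step:      spacing of the evaluation grid (hypothetical default).
    """
    pooled = [p for peak_set in peak_sets for p in peak_set]
    lo = min(pooled) - 3 * sigma
    hi = max(pooled) + 3 * sigma
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    dens = scale_space_density(pooled, sigma, grid)
    # A grid point is a local maximum if it rises above its left neighbour
    # and is not exceeded by its right neighbour.
    return [
        grid[i]
        for i in range(1, n - 1)
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]
    ]


if __name__ == "__main__":
    samples = [[10.0, 20.1, 30.2], [9.8, 19.9, 29.9], [10.2, 20.0]]
    print(common_peaks(samples, sigma=0.5))
```

A larger `sigma` merges nearby peaks into one common peak, while a smaller one splits them, which is why the choice of scale matters in this family of methods.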
|
29
|
Dimalanta ET, Lim A, Runnheim R, Lamers C, Churas C, Forrest DK, de Pablo JJ, Graham MD, Coppersmith SN, Goldstein S, Schwartz DC. A microfluidic system for large DNA molecule arrays. Anal Chem 2004; 76:5293-301. [PMID: 15362885 DOI: 10.1021/ac0496401] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.7] [Indexed: 11/30/2022]
Abstract
Single molecule approaches offer the promise of large, exquisitely miniature ensembles for the generation of equally large data sets. Although microfluidic devices have previously been designed to manipulate single DNA molecules, many of the functionalities they embody are not applicable to very large DNA molecules, normally extracted from cells. Importantly, such microfluidic devices must work within an integrated system to enable high-throughput biological or biochemical analysis-a key measure of any device aimed at the chemical/biological interface and required if large data sets are to be created for subsequent analysis. The challenge here was to design an integrated microfluidic device to control the deposition or elongation of large DNA molecules (up to millimeters in length), which would serve as a general platform for biological/biochemical analysis to function within an integrated system that included massively parallel data collection and analysis. The approach we took was to use replica molding to construct silastic devices to consistently deposit oriented, elongated DNA molecules onto charged surfaces, creating massive single molecule arrays, which we analyzed for both physical and biochemical insights within an integrated environment that created large data sets. The overall efficacy of this approach was demonstrated by the restriction enzyme mapping and identification of single human genomic DNA molecules.
Affiliation(s)
- Eileen T Dimalanta
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, and Laboratory of Genetics, University of Wisconsin-Madison, 425 Henry Mall, Madison, Wisconsin 53706, USA
|
30
|
Valouev A, Li L, Liu YC, Schwartz DC, Yang Y, Zhang Y, Waterman MS. Alignment of Optical Maps. J Comput Biol 2006; 13:442-62. [PMID: 16597251 DOI: 10.1089/cmb.2006.13.442] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Indexed: 11/12/2022]
Abstract
We introduce a new scoring method for calculation of alignments of optical maps. Missing cuts, false cuts, and sizing errors present in optical maps are addressed by our alignment score through calculation of corresponding likelihoods. The size error model is derived through the application of the Central Limit Theorem and validated by residual plots collected from real data. Missing cuts and false cuts are modeled as Bernoulli and Poisson events, respectively, as suggested by previous studies. Likelihoods are used to derive an alignment score through calculation of likelihood ratios for a certain hypothesis test. This allows us to achieve maximal discriminative power for the alignment score. Our scoring method is naturally embedded within a well-known DP framework for finding optimal alignments.
Affiliation(s)
- Anton Valouev
- Department of Mathematics, University of Southern California, Los Angeles, 90089-1113, USA.
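The dynamic-programming framework mentioned in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the likelihood-ratio score derived in the paper: a flat reward per matched site, a flat penalty per skipped site (standing in for missing and false cuts), and a squared sizing-error term are placeholders, and all names and default values are hypothetical.

```python
from itertools import accumulate


def align_maps(query_frags, ref_frags, max_skip=2, site_reward=2.0,
               skip_penalty=1.0, sigma=1.0):
    """Local alignment of two optical maps given as fragment-size lists.

    Returns (best_score, path), where path is a list of matched
    (query_site, reference_site) index pairs.
    """
    q = [0.0] + list(accumulate(query_frags))  # site positions on the query
    r = [0.0] + list(accumulate(ref_frags))    # site positions on the reference
    m, n = len(q), len(r)
    # score[i][j]: best alignment ending with query site i matched to
    # reference site j; 0.0 means "start a new alignment here".
    score = [[0.0] * n for _ in range(m)]
    back = [[None] * n for _ in range(m)]
    best, best_cell = 0.0, None
    for i in range(1, m):
        for j in range(1, n):
            for pi in range(max(0, i - 1 - max_skip), i):
                for pj in range(max(0, j - 1 - max_skip), j):
                    skipped = (i - pi - 1) + (j - pj - 1)  # unmatched sites
                    size_err = (q[i] - q[pi]) - (r[j] - r[pj])
                    s = (score[pi][pj] + site_reward
                         - skip_penalty * skipped
                         - (size_err / sigma) ** 2)
                    if s > score[i][j]:
                        score[i][j], back[i][j] = s, (pi, pj)
            if score[i][j] > best:
                best, best_cell = score[i][j], (i, j)
    # Trace back from the best-scoring cell to recover the matched sites.
    path, cell = [], best_cell
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    return best, path[::-1]


if __name__ == "__main__":
    reference = [10, 25, 7, 40, 12, 18, 30]  # fragment sizes in kb
    query = [7.2, 39.5, 30.4]                # 30.4 kb spans 12 + 18 (missed cut)
    print(align_maps(query, reference))
```

In the example the last query fragment covers two reference fragments, so the alignment skips one reference site and pays one skip penalty, which is how a missing cut shows up in this kind of recurrence.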
|
31
|
Reslewic S, Zhou S, Place M, Zhang Y, Briska A, Goldstein S, Churas C, Runnheim R, Forrest D, Lim A, Lapidus A, Han CS, Roberts GP, Schwartz DC. Whole-genome shotgun optical mapping of Rhodospirillum rubrum. Appl Environ Microbiol 2005; 71:5511-22. [PMID: 16151144 PMCID: PMC1214604 DOI: 10.1128/aem.71.9.5511-5522.2005] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.9] [Received: 01/09/2005] [Accepted: 04/11/2005] [Indexed: 11/20/2022]
Abstract
Rhodospirillum rubrum is a phototrophic purple nonsulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems and as a source of hydrogen and biodegradable plastic production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction endonuclease maps (XbaI, NheI, and HindIII) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction endonuclease maps from randomly sheared genomic DNA molecules extracted from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the HindIII map acted as a scaffold for high-resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and confirmation of genome sequence, this work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.
Affiliation(s)
- Susan Reslewic
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW-Biotechnology Center, 425 Henry Mall, Madison, WI 53706, USA
|
32
|
Zhou S, Kile A, Bechner M, Place M, Kvikstad E, Deng W, Wei J, Severin J, Runnheim R, Churas C, Forrest D, Dimalanta ET, Lamers C, Burland V, Blattner FR, Schwartz DC. Single-molecule approach to bacterial genomic comparisons via optical mapping. J Bacteriol 2004; 186:7773-82. [PMID: 15516592 PMCID: PMC524920 DOI: 10.1128/jb.186.22.7773-7782.2004] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.2] [Indexed: 11/20/2022]
Abstract
Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximizes discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, Madison, WI 53706, USA
|
33
|
Zhou S, Kile A, Kvikstad E, Bechner M, Severin J, Forrest D, Runnheim R, Churas C, Anantharaman TS, Myler P, Vogt C, Ivens A, Stuart K, Schwartz DC. Shotgun optical mapping of the entire Leishmania major Friedlin genome. Mol Biochem Parasitol 2004; 138:97-106. [PMID: 15500921 DOI: 10.1016/j.molbiopara.2004.08.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Received: 05/05/2004] [Accepted: 08/02/2004] [Indexed: 11/21/2022]
Abstract
Leishmania is a group of protozoan parasites which causes a broad spectrum of diseases resulting in widespread human suffering and death, as well as economic loss from the infection of some domestic animals and wildlife. To further understand the fundamental genomic architecture of this parasite, and to accelerate the on-going sequencing project, a whole-genome XbaI restriction map was constructed using the optical mapping system. This map supplemented traditional physical maps that were generated by fingerprinting and hybridization of cosmid and P1 clone libraries. Thirty-six optical map contigs were constructed for the corresponding known 36 chromosomes of the Leishmania major Friedlin genome. The chromosome sizes ranged from 326.9 to 2821.3 kb, with a total genome size of 34.7 Mb; the average XbaI restriction fragment was 25.3 kb, and ranged from 15.7 to 77.8 kb on a per-chromosome basis. Comparison between the optical maps and the in silico maps of sequence drawn from completed, nearly finished, or large sequence contigs showed that optical maps served several useful functions within the path to create finished sequence by: guiding aspects of the sequence assembly, identifying misassemblies, detecting misplacement of cosmid or PAC clones on chromosomes, and validating sequence stemming from varying degrees of finishing. Our results also showed the potential use of optical maps as a means to detect and characterize map segmental duplication within genomes.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI 53706, USA
|
34
|
Zhou S, Kvikstad E, Kile A, Severin J, Forrest D, Runnheim R, Churas C, Hickman JW, Mackenzie C, Choudhary M, Donohue T, Kaplan S, Schwartz DC. Whole-genome shotgun optical mapping of Rhodobacter sphaeroides strain 2.4.1 and its use for whole-genome shotgun sequence assembly. Genome Res 2003; 13:2142-51. [PMID: 12952882 PMCID: PMC403714 DOI: 10.1101/gr.1128803] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.2] [Received: 12/23/2002] [Accepted: 06/30/2003] [Indexed: 11/24/2022]
Abstract
Rhodobacter sphaeroides 2.4.1 is a facultative photoheterotrophic bacterium with tremendous metabolic diversity, which has significantly contributed to our understanding of the molecular genetics of photosynthesis, photoheterotrophy, nitrogen fixation, hydrogen metabolism, carbon dioxide fixation, taxis, and tetrapyrrole biosynthesis. To further understand this remarkable bacterium, and to accelerate an ongoing sequencing project, two whole-genome restriction maps (EcoRI and HindIII) of R. sphaeroides strain 2.4.1 were constructed using shotgun optical mapping. The approach directly mapped genomic DNA by the random mapping of single molecules. The two maps were used to facilitate sequence assembly by providing an optical scaffold for high-resolution alignment and verification of sequence contigs. Our results show that such maps facilitated the closure of sequence gaps by the early detection of nascent sequence contigs during the course of the whole-genome shotgun sequencing process.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Center, Madison, Wisconsin 53706, USA
|
35
|
Zhou S, Deng W, Anantharaman TS, Lim A, Dimalanta ET, Wang J, Wu T, Chunhong T, Creighton R, Kile A, Kvikstad E, Bechner M, Yen G, Garic-Stankovic A, Severin J, Forrest D, Runnheim R, Churas C, Lamers C, Perna NT, Burland V, Blattner FR, Mishra B, Schwartz DC. A whole-genome shotgun optical map of Yersinia pestis strain KIM. Appl Environ Microbiol 2002; 68:6321-31. [PMID: 12450857 PMCID: PMC134435 DOI: 10.1128/aem.68.12.6321-6331.2002] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.5] [Received: 06/12/2002] [Accepted: 09/12/2002] [Indexed: 11/20/2022]
Abstract
Yersinia pestis is the causative agent of the bubonic, septicemic, and pneumonic plagues (also known as black death) and has been responsible for recurrent devastating pandemics throughout history. To further understand this virulent bacterium and to accelerate an ongoing sequencing project, two whole-genome restriction maps (XhoI and PvuII) of Y. pestis strain KIM were constructed using shotgun optical mapping. This approach constructs ordered restriction maps from randomly sheared individual DNA molecules directly extracted from cells. The two maps served different purposes; the XhoI map facilitated sequence assembly by providing a scaffold for high-resolution alignment, while the PvuII map verified genome sequence assembly. Our results show that such maps facilitated the closure of sequence gaps and, most importantly, provided a purely independent means for sequence validation. Given the recent advancements to the optical mapping system, increased resolution and throughput are enabling such maps to guide sequence assembly at a very early stage of a microbial sequencing project.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, 53706, USA
|
36
|
Lim A, Dimalanta ET, Potamousis KD, Yen G, Apodoca J, Tao C, Lin J, Qi R, Skiadas J, Ramanathan A, Perna NT, Plunkett G, Burland V, Mau B, Hackett J, Blattner FR, Anantharaman TS, Mishra B, Schwartz DC. Shotgun optical maps of the whole Escherichia coli O157:H7 genome. Genome Res 2001; 11:1584-93. [PMID: 11544203 PMCID: PMC311123 DOI: 10.1101/gr.172101] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.0] [Received: 11/27/2000] [Accepted: 06/04/2001] [Indexed: 11/24/2022]
Abstract
We have constructed NheI and XhoI optical maps of Escherichia coli O157:H7 solely from genomic DNA molecules to provide a uniquely valuable scaffold for contig closure and sequence validation. E. coli O157:H7 is a common pathogen found in contaminated food and water. Our approach obviated the need for the analysis of clones, PCR products, and hybridizations, because maps were constructed from ensembles of single DNA molecules. Shotgun sequencing of bacterial genomes remains labor-intensive, despite advances in sequencing technology. This is partly due to manual intervention required during the last stages of finishing. The applicability of optical mapping to this problem was enhanced by advances in machine vision techniques that improved mapping throughput and created a path to full automation of mapping. Comparisons were made between maps and sequence data that characterized sequence gaps and guided nascent assemblies.
Affiliation(s)
- A Lim
- Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
37
|
Karp RM, Pe'er I, Shamir R. An algorithm combining discrete and continuous methods for optical mapping. J Comput Biol 2000; 7:745-60. [PMID: 11153097 DOI: 10.1089/106652701446189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Optical mapping is a novel technique for generating the restriction map of a DNA molecule by observing many single, partially digested copies of it, using fluorescence microscopy. The real-life problem is complicated by numerous factors: false positive and false negative cut observations, inaccurate location measurements, unknown orientations, and faulty molecules. We present an algorithm for solving the real-life problem. The algorithm combines continuous optimization and combinatorial algorithms applied to a nonuniform discretization of the data. We present encouraging results on real experimental data and on simulated data.
Affiliation(s)
- R M Karp
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720-1776, USA.
|
38
|
Abstract
Optical mapping is a new technique to generate restriction maps of DNA easily and quickly. DNA restriction maps can be aligned by comparing corresponding restriction fragment lengths. To relate, organize, and analyse these maps it is necessary to rapidly compare maps. The issue of the statistical significance of approximately matching maps then becomes central, as in BLAST with sequence scoring. In this paper, we study the approximation to the distribution of counts of matched regions of specified length when comparing two DNA restriction maps. Distributional results are given to enable us to compute p-values and hence to determine whether or not the two restriction maps are related. The key tool used is the Chen-Stein method of Poisson approximation. Certain open problems are described.
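To make the p-value step concrete: once the count of matched regions is approximated by a Poisson distribution, the p-value for observing at least k matches is a Poisson tail probability. The sketch below assumes the Poisson mean `lam` (the expected number of chance matches) has already been obtained by some other calculation, which is the hard part and is not shown; the function name is hypothetical.

```python
import math


def poisson_tail(k, lam):
    """Return P(N >= k) for N ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(N = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(N = i) from P(N = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


if __name__ == "__main__":
    # Assumed example: 0.5 chance matches expected, 3 matched regions observed.
    print(poisson_tail(3, 0.5))
```

A small tail probability suggests the observed matches are unlikely to arise by chance under the assumed mean, so the two maps are plausibly related.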
|
39
|
Giacalone J, Delobette S, Gibaja V, Ni L, Skiadas Y, Qi R, Edington J, Lai Z, Gebauer D, Zhao H, Anantharaman T, Mishra B, Brown LG, Saxena R, Page DC, Schwartz DC. Optical mapping of BAC clones from the human Y chromosome DAZ locus. Genome Res 2000; 10:1421-9. [PMID: 10984460 PMCID: PMC310922 DOI: 10.1101/gr.112100] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Received: 06/25/1999] [Accepted: 07/12/2000] [Indexed: 11/25/2022]
Abstract
The accurate mapping of clones derived from genomic regions containing complex arrangements of repeated elements presents special problems for DNA sequencers. Recent advances in the automation of optical mapping have enabled us to map a set of 16 BAC clones derived from the DAZ locus of the human Y chromosome long arm, a locus in which the entire DAZ gene as well as subsections within the gene copies have been duplicated. High-resolution optical mapping employing seven enzymes places these clones into two contigs representing four distinct copies of the DAZ gene and highlights a number of differences between individual copies of DAZ.
Affiliation(s)
- J Giacalone
- W.M. Keck Laboratory for Biomolecular Imaging, Department of Chemistry, New York University, New York, New York 10003, USA
|
40
|
Abstract
Optical mapping is a novel technique for determining the restriction sites on a DNA molecule by directly observing a number of partially digested copies of the molecule under a light microscope. The problem is complicated by uncertainty as to the orientation of the molecules and by erroneous detection of cuts. In this paper we study the problem of constructing a restriction map based on optical mapping data. We give several variants of a polynomial reconstruction algorithm, as well as an algorithm that is exponential in the number of cut sites, and hence is appropriate only for small number of cut sites. We give a simple probabilistic model for data generation and for the errors and prove probabilistic upper and lower bounds on the number of molecules needed by each algorithm in order to obtain a correct map, expressed as a function of the number of cut sites and the error parameters. To the best of our knowledge, this is the first probabilistic analysis of algorithms for the problem. We also provide experimental results confirming that our algorithms are highly effective on simulated data.
Affiliation(s)
- R M Karp
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley 94720, USA.
|
41
|
Lin J, Qi R, Aston C, Jing J, Anantharaman TS, Mishra B, White O, Daly MJ, Minton KW, Venter JC, Schwartz DC. Whole-genome shotgun optical mapping of Deinococcus radiodurans. Science 1999; 285:1558-62. [PMID: 10477518 DOI: 10.1126/science.285.5433.1558] [Citation(s) in RCA: 150] [Impact Index Per Article: 6.0] [Indexed: 11/02/2022]
Abstract
A whole-genome restriction map of Deinococcus radiodurans, a radiation-resistant bacterium able to survive up to 15,000 grays of ionizing radiation, was constructed without using DNA libraries, the polymerase chain reaction, or electrophoresis. Very large, randomly sheared, genomic DNA fragments were used to construct maps from individual DNA molecules that were assembled into two circular overlapping maps (2.6 and 0.415 megabases), without gaps. A third smaller chromosome (176 kilobases) was identified and characterized. Aberrant nonlinear DNA structures that may define chromosome structure and organization, as well as intermediates in DNA repair, were directly visualized by optical mapping techniques after gamma irradiation.
Affiliation(s)
- J Lin
- W. M. Keck Laboratory for Biomolecular Imaging, Department of Chemistry, New York University, 31 Washington Place, New York, NY 10003, USA
|
42
|
Aston C, Mishra B, Schwartz DC. Optical mapping and its potential for large-scale sequencing projects. Trends Biotechnol 1999; 17:297-302. [PMID: 10370237 DOI: 10.1016/s0167-7799(99)01326-8] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.3] [Indexed: 11/25/2022]
Abstract
Physical mapping has been rediscovered as an important component of large-scale sequencing projects. Restriction maps provide landmark sequences at defined intervals, and high-resolution restriction maps can be assembled from ensembles of single molecules by optical means. Such optical maps can be constructed from both large-insert clones and genomic DNA, and are used as a scaffold for accurately aligning sequence contigs generated by shotgun sequencing.
Affiliation(s)
- C Aston
- Wyeth-Ayerst Research, CNS Disorders, Princeton, NJ 08543, USA.
|
43
|
Affiliation(s)
- C Aston
- Department of Chemistry, W. M. Keck Laboratory for Biomolecular Imaging, New York University, New York 10003, USA
|
44
|
Abstract
Optical Mapping is an emerging technology for constructing ordered restriction maps of DNA molecules. The underlying computational problems for this technology have been studied and several models have been proposed in recent literature. Most of these propose combinatorial models; some of them also present statistical approaches. However, it is not a priori clear as to how these models relate to one another and to the underlying problem. We present a uniform framework for the restriction map problems where each of these various models is a specific instance of the basic framework. We achieve this by identifying two "signature" functions f() and g() that characterize the models. We identify the constraints these two functions must satisfy, thus opening up the possibility of exploring other plausible models. We show that for all of the combinatorial models proposed in literature, the signature functions are semi-algebraic. We also analyze a proposed statistical method in this framework and show that the signature functions are transcendental for this model. We also believe that this framework would provide useful guidelines for dealing with other inferencing problems arising in practice. Finally, we indicate the open problems by including a survey of the best known results for these problems.
Affiliation(s)
- L Parida
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York 10012, USA.
|
45
|
Abstract
Detailed restriction maps of microbial genomes are a valuable resource in genome sequencing studies but are toilsome to construct by contig construction of maps derived from cloned DNA. Analysis of genomic DNA enables large stretches of the genome to be mapped and circumvents library construction and associated cloning artifacts. We used pulsed-field gel electrophoresis purified Plasmodium falciparum chromosome 2 DNA as the starting material for optical mapping, a system for making ordered restriction maps from ensembles of individual DNA molecules. DNA molecules were bound to derivatized glass surfaces, cleaved with NheI or BamHI, and imaged by digital fluorescence microscopy. Large pieces of the chromosome containing ordered DNA restriction fragments were mapped. Maps were assembled from 50 molecules producing an average contig depth of 15 molecules and high-resolution restriction maps covering the entire chromosome. Chromosome 2 was found to be 976 kb by optical mapping with NheI, and 946 kb with BamHI, which compares closely to the published size of 947 kb from large-scale sequencing. The maps were used to further verify assemblies from the plasmid library used for sequencing. Maps generated in silico from the sequence data were compared to the optical mapping data, and good correspondence was found. Such high-resolution restriction maps may become an indispensable resource for large-scale genome sequencing projects.
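The in silico maps mentioned in this abstract can be illustrated with a short sketch that digests a linear sequence at every occurrence of a recognition site and returns the ordered fragment lengths. The function name and toy sequence are hypothetical; the recognition sites used (NheI G^CTAGC, BamHI G^GATCC) are the standard ones, and because both are palindromic only one strand is scanned. This is an assumption-laden illustration, not the procedure used in the paper.

```python
def in_silico_digest(sequence, site, cut_offset):
    """Return ordered fragment lengths from cutting a linear sequence.

    site:       recognition sequence, e.g. "GCTAGC" for NheI.
    cut_offset: position of the cut within the site (1 for G^CTAGC).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AAAA" + "GCTAGC" + "TTTTT" + "GCTAGC" + "CC"
    print(in_silico_digest(toy, "GCTAGC", 1))  # NheI-style digest
    print(in_silico_digest(toy, "GGATCC", 1))  # BamHI: no site in this toy sequence
```

The resulting fragment-length list is the same kind of object an optical map provides, so the two can be compared fragment by fragment.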
|
46
|
Lee JK, Dancík V, Waterman MS. Estimation for restriction sites observed by optical mapping using reversible-jump Markov Chain Monte Carlo. J Comput Biol 1998; 5:505-15. [PMID: 9773346 DOI: 10.1089/cmb.1998.5.505] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
A fundamentally new molecular-biology approach in constructing restriction maps, Optical Mapping, has been developed by Schwartz et al. (1993). Using this method restriction maps are constructed by measuring the relevant fluorescence intensity and length measurements. However, it is difficult to directly estimate the restriction site locations of single DNA molecules based on these optical mapping data because of the precision of length measurements and the unknown number of true restriction sites in the data. We propose the use of a hierarchical Bayes model based on a mixture model with normals and random noise. In this model we explicitly consider the missing observation structure of the data, such as the orientations of molecules, the allocations of cutting sites to restriction sites, and the indicator variables of whether observed cut sites are true or false. Because of the complexity of the model, the large number of missing data, and the unknown number of restriction sites, we use Reversible-Jump Markov Chain Monte Carlo (MCMC) to estimate the number and the locations of the restriction sites. Since there exists a high multimodality due to unknown orientations of molecules, we also use a combination of our MCMC approach and the flipping algorithm suggested by Dancík and Waterman (1997). The study is highly computer-intensive and the development of an efficient algorithm is required.
Affiliation(s)
- J K Lee
- National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
47
|
Jing J, Reed J, Huang J, Hu X, Clarke V, Edington J, Housman D, Anantharaman TS, Huff EJ, Mishra B, Porter B, Shenker A, Wolfson E, Hiort C, Kantor R, Aston C, Schwartz DC. Automated high resolution optical mapping using arrayed, fluid-fixed DNA molecules. Proc Natl Acad Sci U S A 1998; 95:8046-51. [PMID: 9653137 PMCID: PMC20926 DOI: 10.1073/pnas.95.14.8046] [Citation(s) in RCA: 229] [Impact Index Per Article: 8.8] [Received: 02/15/1998] [Accepted: 04/23/1998] [Indexed: 02/08/2023]
Abstract
New mapping approaches construct ordered restriction maps from fluorescence microscope images of individual, endonuclease-digested DNA molecules. In optical mapping, molecules are elongated and fixed onto derivatized glass surfaces, preserving biochemical accessibility and fragment order after enzymatic digestion. Measurements of relative fluorescence intensity and apparent length determine the sizes of restriction fragments, enabling ordered map construction without electrophoretic analysis. The optical mapping system reported here is based on our physical characterization of an effect using fluid flows developed within tiny, evaporating droplets to elongate and fix DNA molecules onto derivatized surfaces. Such evaporation-driven molecular fixation produces well elongated molecules accessible to restriction endonucleases, and notably, DNA polymerase I. We then developed the robotic means to grid DNA spots in well defined arrays that are digested and analyzed in parallel. To effectively harness this effect for high-throughput genome mapping, we developed: (i) machine vision and automatic image acquisition techniques to work with fixed, digested molecules within gridded samples, and (ii) Bayesian inference approaches that are used to analyze machine vision data, automatically producing high-resolution restriction maps from images of individual DNA molecules. The aggregate significance of this work is the development of an integrated system for mapping small insert clones allowing biochemical data obtained from engineered ensembles of individual molecules to be automatically accumulated and analyzed for map construction. These approaches are sufficiently general for varied biochemical analyses of individual molecules using statistically meaningful population sizes.
Affiliation(s)
- J Jing
- W. M. Keck Laboratory for Biomolecular Imaging, Department of Chemistry, New York University, 31 Washington Place, New York, NY 10003, USA
|
48
|
Reed J, Singer E, Kresbach G, Schwartz DC. A quantitative study of optical mapping surfaces by atomic force microscopy and restriction endonuclease digestion assays. Anal Biochem 1998; 259:80-8. [PMID: 9606147 DOI: 10.1006/abio.1998.2640] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
Abstract
Many new techniques in biomolecular chemistry and genomic analysis require the immobilization of molecular reagents on specially prepared surfaces. However, the process of molecular fixation often interferes with or precludes the use of standard in vitro biochemical assays. Optical mapping is an emergent technology for genomic analysis which relies on the biochemical activity of DNA fixed to silanized glass surfaces. Optical mapping surfaces have been shown to be compatible with restriction endonucleases and a variety of DNA polymerases. The essential properties of biochemically active surfaces are poorly understood in most of the current technologies which utilize molecular fixation, including optical mapping. The purpose of this study is to use the powerful technique of atomic force microscopy, in combination with informative enzymatic assays, to correlate biochemical activity with microscopic surface structure. The results presented provide meaningful insight into the effect of surface preparation on the biochemical accessibility of surface-bound molecules. Novel analysis which may facilitate the automation of optical mapping is presented.
Affiliation(s)
- J Reed
- Department of Chemistry, W. M. Keck Laboratory for Biomolecular Imaging, New York University, New York 10003, USA
|
49
|
Abstract
Genome maps have been constructed for the mycobacterial pathogens Mycobacterium leprae and Mycobacterium tuberculosis, as well as for the attenuated vaccine strain Mycobacterium bovis BCG Pasteur. While the chromosomes of M. tuberculosis and M. bovis BCG Pasteur show extensive conservation at the gross level, comparison with M. leprae revealed a high degree of diversification, with a mosaic-like pattern apparent. The ordered libraries of M. tuberculosis and M. leprae produced during the course of these studies played a central role in the genome sequencing projects of these two bacilli, showing the utility of this approach for systematic sequencing of bacterial genomes.
Affiliation(s)
- W J Philipp
- Institute for Medical Microbiology, University of Berne, Switzerland.
|
50
|
Cai W, Jing J, Irvin B, Ohler L, Rose E, Shizuya H, Kim UJ, Simon M, Anantharaman T, Mishra B, Schwartz DC. High-resolution restriction maps of bacterial artificial chromosomes constructed by optical mapping. Proc Natl Acad Sci U S A 1998; 95:3390-5. [PMID: 9520376 PMCID: PMC19846 DOI: 10.1073/pnas.95.7.3390] [Citation(s) in RCA: 49] [Impact Index Per Article: 1.9] [Accepted: 01/09/1998] [Indexed: 02/06/2023]
Abstract
Large insert clone libraries have been the primary resource used for the physical mapping of the human genome. Research directions in the genome community are now shifting from purely mapping to large-scale sequencing, which in turn requires new standards to be met by physical maps and large insert libraries. Bacterial artificial chromosome libraries offer enormous potential as the chosen substrate for both mapping and sequencing studies. Physical mapping, however, has come under some scrutiny as being "redundant" in the age of large-scale automated sequencing. We report the development and applications of nonelectrophoretic, optical approaches for high-resolution mapping of bacterial artificial chromosomes that offer the potential to complement and thereby advance large-scale sequencing projects.
Affiliation(s)
- W Cai
- W. M. Keck Laboratory for Biomolecular Imaging, Department of Chemistry, New York University, 31 Washington Place, New York, NY 10003, USA
|