1
|
Wendl MC, Wilson RK. Statistical aspects of discerning indel-type structural variation via DNA sequence alignment. BMC Genomics 2009; 10:359. [PMID: 19656394 PMCID: PMC2748092 DOI: 10.1186/1471-2164-10-359] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 04/13/2009] [Accepted: 08/05/2009] [Indexed: 01/10/2023]
Abstract
Background: Structural variations in the form of DNA insertions and deletions are an important aspect of human genetics and especially relevant to medical disorders. Investigations have shown that such events can be detected via tell-tale discrepancies in the aligned lengths of paired-end DNA sequencing reads. Quantitative aspects underlying this method remain poorly understood, despite its importance and conceptual simplicity. We report the statistical theory characterizing the length-discrepancy scheme for Gaussian libraries, including coverage-related effects that preceding models are unable to account for. Results: Deletion and insertion statistics both depend heavily on physical coverage, but otherwise differ dramatically, refuting a commonly held doctrine of symmetry. Specifically, coverage restrictions render insertions much more difficult to capture. Increased read length has the counterintuitive effect of worsening insertion detection characteristics of short inserts. Variance in library insert length is also a critical factor here and should be minimized to the greatest degree possible. Conversely, no significant improvement would be realized in lowering fosmid variances beyond current levels. Detection power is examined under a straightforward alternative hypothesis and found to be generally acceptable. We also consider the proposition of characterizing variation over the entire spectrum of variant sizes under constant risk of false-positive errors. At 1% risk, many designs will leave a significant gap in the 100 to 200 bp neighborhood, requiring unacceptably high redundancies to compensate. We show that a few modifications largely close this gap and we give a few examples of feasible spectrum-covering designs. Conclusion: The theory resolves several outstanding issues and furnishes a general methodology for designing future projects from the standpoint of a spectrum-wide constant risk.
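The length-discrepancy idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' theory, which also accounts for coverage effects; it only assumes a Gaussian library and applies a plain two-sided z-test to a single aligned span. The function name, the 3,000 bp mean, and the 100 bp standard deviation are hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def flag_length_discrepancy(aligned_span, mean_insert, sd_insert, alpha=0.01):
    """Classify one paired-end span against an assumed Gaussian library.

    aligned_span: distance between the mapped read ends on the reference (bp).
    mean_insert, sd_insert: assumed library mean and standard deviation (bp).
    alpha: two-sided false-positive risk for this single test (assumption).
    """
    z = (aligned_span - mean_insert) / sd_insert
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    if z > z_crit:
        # Reads map farther apart than expected: sequence is missing in the sample.
        return "deletion candidate"
    if z < -z_crit:
        # Reads map closer together than expected: extra sequence in the sample.
        return "insertion candidate"
    return "consistent"


# Hypothetical library: mean 3,000 bp, standard deviation 100 bp.
print(flag_length_discrepancy(3500, 3000, 100))
print(flag_length_discrepancy(2600, 3000, 100))
print(flag_length_discrepancy(3050, 3000, 100))
```

A larger library standard deviation widens the band of spans labeled consistent, which is in line with the abstract's point that insert-length variance is a critical factor.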
Affiliation(s)
- Michael C Wendl
- The Genome Center and Department of Genetics, Washington University, St Louis, MO 63108, USA.
|
2
|
Lamoureux D, Bernole A, Le Clainche I, Tual S, Thareau V, Paillard S, Legeai F, Dossat C, Wincker P, Oswald M, Merdinoglu D, Vignault C, Delrot S, Caboche M, Chalhoub B, Adam-Blondon AF. Anchoring of a large set of markers onto a BAC library for the development of a draft physical map of the grapevine genome. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2006; 113:344-56. [PMID: 16791700 DOI: 10.1007/s00122-006-0301-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 01/23/2006] [Accepted: 04/21/2006] [Indexed: 05/10/2023]
Abstract
Five hundred and six EST-derived markers, 313 SSR markers and 26 BAC end-derived or SCAR markers were anchored by PCR on a subset of a Cabernet Sauvignon BAC library representing six genome equivalents pooled in three dimensions. In parallel, the 12,351 EST clusters of the grapevine UniGene set (build #11) from NCBI were used to design 12,125 primer pairs and perform electronic PCR on 67,543 nonredundant BAC-end sequences. This in silico experiment yielded 1,140 positive results concerning 638 different markers, among which 602 had not already been anchored by PCR. The data obtained will provide easier access to the regulatory sequences surrounding important genes (represented by ESTs). In total, 1,731 islands of BAC clones (sets of overlapping BAC clones containing at least one common marker) were obtained and 226 of them contained at least one genetically mapped anchor. These assigned islands are very useful because they will link the genetic map and the future fingerprint-based physical map and because they allowed us to indirectly place 93 ESTs on the genetic map. The islands containing two or more mapped SSR markers were also used to assess the quality of the integrated genetic map of the grapevine genome.
Affiliation(s)
- Didier Lamoureux
- UMR INRA-CNRS-UEVE de Recherches en Génomique Végétale, 2 rue Gaston Crémieux, BP5708, 91057 Evry Cedex, France
|
3
|
Wendl MC, Marra MA, Hillier LW, Chinwalla AT, Wilson RK, Waterston RH. Theories and Applications for Sequencing Randomly Selected Clones. Genome Res 2001. [DOI: 10.1101/gr.133901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Theory is developed for the process of sequencing randomly selected large-insert clones. Genome size, library depth, clone size, and clone distribution are considered relevant properties and perfect overlap detection for contig assembly is assumed. Genome-specific and nonrandom effects are neglected. Order of magnitude analysis indicates library depth is of secondary importance compared to the other variables, especially as clone size diminishes. In such cases, the well-known Poisson coverage law is a good approximation. Parameters derived from these models are used to examine performance for the specific case of sequencing random human BAC clones. We compare coverage and redundancy rates for libraries possessing uniform and nonuniform clone distributions. Results are measured against data from map-based human-chromosome-2 sequencing. We conclude that the map-based approach outperforms random clone sequencing, except early in a project. However, simultaneous use of both strategies can be beneficial if a performance-based estimate for halting random clone sequencing is made. Results further show that the random approach yields maximum effectiveness using nonbiased rather than biased libraries.
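The "well-known Poisson coverage law" mentioned in this abstract is commonly written as an expected covered fraction of 1 − e^(−c), where c is the redundancy (total clone length divided by genome size). The sketch below shows only that standard approximation, not the more detailed theory the paper develops; the function names and the example numbers are illustrative assumptions.

```python
import math


def redundancy(num_clones, clone_size, genome_size):
    """Redundancy c = N * L / G, with clone and genome sizes in the same units."""
    return num_clones * clone_size / genome_size


def poisson_coverage(c):
    """Expected fraction of the genome covered at redundancy c (Poisson law)."""
    return 1.0 - math.exp(-c)


# Hypothetical project: 20,000 clones of 150 kb on a 3,000,000 kb genome.
c = redundancy(20_000, 150, 3_000_000)
print(f"redundancy = {c:.2f}, expected coverage = {poisson_coverage(c):.3f}")
```

At c = 1 the expected coverage is about 63%, and each additional unit of redundancy leaves a factor of e fewer uncovered bases, which is why coverage gains flatten as a project proceeds.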
|
4
|
Wendl MC, Marra MA, Hillier LW, Chinwalla AT, Wilson RK, Waterston RH. Theories and applications for sequencing randomly selected clones. Genome Res 2001; 11:274-80. [PMID: 11157790 PMCID: PMC311021 DOI: 10.1101/gr.gr-1339r] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/24/2022]
Affiliation(s)
- M C Wendl
- Genome Sequencing Center, Washington University, St. Louis, Missouri 63108, USA.
|
5
|
Abstract
The aim of this paper is to provide general results for predicting progress in a physical mapping project by anchoring random clones, when clones and anchors are not homogeneously distributed along the genome. A complete physical map of the DNA of an organism consists of overlapping clones spanning the entire genome. Several schemes can be used to construct such a map, depending on the way that clones overlap. We focus here on the approach consisting of assembling clones sharing a common random short sequence called an anchor. Some mathematical analyses providing statistical properties of anchored clones have been developed in the stationary case. Modeling the clone and anchor processes as nonhomogeneous Poisson processes provides such an analysis in a general nonstationary framework. We apply our results to two natural nonhomogeneous models to illustrate the effect of inhomogeneity. This study reveals that using homogeneous processes for clones and anchors provides an overly optimistic assessment of the progress of the mapping project.
Affiliation(s)
- S Schbath
- I.N.R.A., Unité de Biométrie, Jouy-en-Josas, France.
|
6
|
Xiong M, Chen HJ, Prade RA, Wang Y, Griffith J, Timberlake WE, Arnold J. On the consistency of a physical mapping method to reconstruct a chromosome in vitro. Genetics 1996; 142:267-84. [PMID: 8770604 PMCID: PMC1206956 DOI: 10.1093/genetics/142.1.267] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 02/02/2023]
Abstract
During recent years considerable effort has been invested in creating physical maps for a variety of organisms as part of the Human Genome Project and in creating various methods for physical mapping. The statistical consistency of a physical mapping method to reconstruct a chromosome, however, has not been investigated. In this paper, we first establish that a model of physical mapping by binary fingerprinting of DNA fragments is identifiable using the key assumption that, for a large randomly generated recombinant DNA library, there exists a staircase of DNA fragments across the chromosomal region of interest. Then we briefly introduce epi-convergence theory of variational analysis and transform the physical mapping problem into a constrained stochastic optimization problem. By doing so, we prove epi-convergence of the physical mapping model and epi-convergence of the physical mapping method. Finally, combining the identifiability of our physical mapping model and the epi-convergence of a physical mapping method, we establish strong consistency of a physical mapping method.
Affiliation(s)
- M Xiong
- Department of Mathematics and Molecular Biology, University of Southern California, Los Angeles 90089, USA
|
7
|
Abstract
Arabidopsis thaliana is a small flowering plant that is a member of the family Cruciferae. It has many characteristics (diploid genetics, rapid growth cycle, relatively low repetitive DNA content, and small genome size) that recommend it as the model for a plant genome project. The current status of the genetic and physical maps, as well as efforts to sequence the genome, are presented. Examples are given of genes isolated by using map-based cloning. The importance of the Arabidopsis project for plant biology in general is discussed.
Affiliation(s)
- H M Goodman
- Department of Genetics, Harvard Medical School, Massachusetts General Hospital, Boston, MA 02114, USA
|
8
|
Port E, Sun F, Martin D, Waterman MS. Genomic mapping by end-characterized random clones: a mathematical analysis. Genomics 1995; 26:84-100. [PMID: 7782090 DOI: 10.1016/0888-7543(95)80086-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.7] [Indexed: 01/27/2023]
Abstract
Physical maps can be constructed by "fingerprinting" a large number of random clones and inferring overlap between clones when the fingerprints are sufficiently similar. E. Lander and M. Waterman (Genomics 2: 231-239, 1988) gave a mathematical analysis of such mapping strategies. The analysis is useful for comparing various fingerprinting methods. Recently it has been proposed that ends of clones rather than the entire clone be fingerprinted or characterized. Such fingerprints, which include sequenced clone ends, require a mathematical analysis deeper than that of Lander-Waterman. This paper studies clone islands, which can include uncharacterized regions, and also the islands that are formed entirely from the ends of clones.
Affiliation(s)
- E Port
- Department of Mathematics, University of Southern California, Los Angeles 90089-1113, USA
|
9
|
Goldberg PW, Golumbic MC, Kaplan H, Shamir R. Four strikes against physical mapping of DNA. J Comput Biol 1995; 2:139-52. [PMID: 7497116 DOI: 10.1089/cmb.1995.2.139] [Citation(s) in RCA: 119] [Impact Index Per Article: 4.1] [Indexed: 01/25/2023]
Abstract
Physical mapping is a central problem in molecular biology and the human genome project. The problem is to reconstruct the relative position of fragments of DNA along the genome from information on their pairwise overlaps. We show that four simplified models of the problem lead to NP-complete decision problems: colored unit interval graph completion, the maximum interval (or unit interval) subgraph, the pathwidth of a bipartite graph, and the k-consecutive ones problem for k ≥ 2. These models have been chosen to reflect various features typical in biological data, including false-negative and positive errors, small width of the map, and chimericism.
Affiliation(s)
- P W Goldberg
- Sandia National Labs, Albuquerque, New Mexico 87185, USA
|
10
|
Greenberg D, Istrail S. The chimeric mapping problem: algorithmic strategies and performance evaluation on synthetic genomic data. Computers & Chemistry 1994; 18:207-20. [PMID: 7952891 DOI: 10.1016/0097-8485(94)85015-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 01/28/2023]
Abstract
The Human Genome Project requires better software for the creation of physical maps of chromosomes. Current mapping techniques involve breaking large segments of DNA into smaller, more-manageable pieces, gathering information on all the small pieces, and then constructing a map of the original large piece from the information about the small pieces. Unfortunately, in the process of breaking up the DNA some information is lost and noise of various types is introduced; in particular, the order of the pieces is not preserved. Thus, the map maker must solve a combinatorial problem in order to reconstruct the map. Good software is indispensable for quick, accurate reconstruction. The reconstruction is complicated by various experimental errors. A major source of difficulty, which seems to be inherent to the recombination technology, is the presence of chimeric DNA clones. It is fairly common for two disjoint DNA pieces to form a chimera, i.e., a fusion of two pieces which appears as a single piece. Attempts to order chimeras will fail unless they are algorithmically divided into their constituent pieces. Despite consensus within the genomic mapping community of the critical importance of correcting chimerism, algorithms for solving the chimeric clone problem have received only passing attention in the literature. Based on a model proposed by Lander (1992a, b), this paper presents the first algorithms for analyzing chimerism. We construct physical maps in the presence of chimerism by creating optimization functions which have minimizations which correlate with map quality. Despite the fact that these optimization functions are invariably NP-complete, our algorithms are guaranteed to produce solutions which are close to the optimum. The practical import of using these algorithms depends on the strength of the correlation of the function to the map quality as well as on the accuracy of the approximations.
We employ two fundamentally different optimization functions as a means of avoiding biases likely to decorrelate the solutions from the desired map. Experiments on simulated data show that both our algorithm which minimizes the number of chimeric fragments in a solution and our algorithm which minimizes the maximum number of fragments per clone in a solution do, in fact, correlate to high quality solutions. Furthermore, tests on simulated data using parameters set to mimic real experiments show that the algorithms have the potential to find high quality solutions with real data. We plan to test our software against real data from the Whitehead Institute and from Los Alamos Genomic Research Center in the near future.
Affiliation(s)
- D Greenberg
- Sandia National Laboratories, Algorithms and Discrete Mathematics Department, Albuquerque, NM
|
11
|
Balding DJ. Design and analysis of chromosome physical mapping experiments. Philos Trans R Soc Lond B Biol Sci 1994; 344:329-35. [PMID: 7800702 DOI: 10.1098/rstb.1994.0071] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]
Abstract
Mathematical and statistical aspects of constructing ordered-clone physical maps of chromosomes are reviewed. Three broad problems are addressed: analysis of fingerprint data to identify configurations of overlapping clones, prediction of the rate of progress of a mapping strategy and optimal design of pooling schemes for screening large clone libraries.
Affiliation(s)
- D J Balding
- School of Mathematical Sciences, Queen Mary & Westfield College, University of London, U.K
|
12
|
Abstract
Sets of ordered overlapping cloned genomic DNA fragments that span each of the human chromosomes are urgently needed for identification of human disease genes. Such a physical map also provides unique material to study the structure and function of the genome. We have therefore exhaustively analysed the CEPH yeast artificial chromosome (YAC) library, which contains 33,000 clones, whose insert size was individually determined. These YACs have an average length of 0.9 megabases and cover the equivalent of 10 haploid genomes. Several mapping techniques were combined to provide multiple sources of structural information for most of these clones. Finally, the library was screened with more than 2,000 genetic markers quasiuniformly distributed over 90% of the genome. These results should allow the scientific community to construct detailed maps of all human chromosomes. Moreover, we propose a data analysis strategy that produces a first-generation integrated map covering most of the human genome.
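The stated coverage of about 10 haploid genomes can be checked with a short worked equation. The clone count and mean insert size come from the abstract; the haploid genome size G of roughly 3,000 Mb is an assumption not stated there.

```latex
c \;=\; \frac{N\,\bar{L}}{G}
  \;=\; \frac{33{,}000 \times 0.9\ \text{Mb}}{3{,}000\ \text{Mb}}
  \;\approx\; 9.9
```

Here N is the number of clones, \bar{L} the mean insert length, and c the resulting number of genome equivalents, which agrees with the figure quoted in the abstract.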
Affiliation(s)
- D Cohen
- Fondation Jean Dausset-CEPH, Paris, France
|
13
|
Zhang MQ, Marr TG. Genome mapping by nonrandom anchoring: a discrete theoretical analysis. Proc Natl Acad Sci U S A 1993; 90:600-4. [PMID: 8421694 PMCID: PMC45711 DOI: 10.1073/pnas.90.2.600] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023]
Abstract
As part of our effort to construct a physical map of the genome of the fission yeast Schizosaccharomyces pombe, we have made theoretical predictions for the progress expected, as measured by the expected length fraction of island coverage and by the expected properties of the anchored islands such as the number and the size of islands. Our experimental strategy is to construct a random clone library and screen the library for clones having unique sequence at both ends. This scheme is essentially the same as the clone-limited double sequence-tagged-site selection scheme which was used in a computer simulation by Palazzolo et al. [Palazzolo, M. J., Sawyer, S. A., Martin, C. H., Smoller, D. A. & Hartl, D. L. (1991) Proc. Natl. Acad. Sci. USA 88, 8034-8038]. Both simulation and ongoing experiments in our laboratory have shown that the nonrandom anchoring method is far superior to random anchoring. In this paper, we propose a theoretical model to explain the simulated data and the experimental data.
Affiliation(s)
- M Q Zhang
- Cold Spring Harbor Laboratory, NY 11724
|
14
|
Abstract
Statistical approaches help in the determination of significant configurations in protein and nucleic acid sequence data. Three recent statistical methods are discussed: (i) score-based sequence analysis that provides a means for characterizing anomalies in local sequence text and for evaluating sequence comparisons; (ii) quantile distributions of amino acid usage that reveal general compositional biases in proteins and evolutionary relations; and (iii) r-scan statistics that can be applied to the analysis of spacings of sequence markers.
Affiliation(s)
- S Karlin
- Department of Mathematics, Stanford University, CA 94305
|
15
|
Evans GA, McElligott DL. Physical mapping of human chromosomes. Genetic Engineering 1992; 14:269-78. [PMID: 1368280 DOI: 10.1007/978-1-4615-3424-2_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Affiliation(s)
- G A Evans
- Molecular Genetics Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037
|
16
|
Marr TG, Yan X, Yu Q. Genomic mapping by single copy landmark detection: a predictive model with a discrete mathematical approach. Mamm Genome 1992; 3:644-9. [PMID: 1450514 DOI: 10.1007/bf00352482] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]
Abstract
One of the goals of the Human Genome Project is to produce libraries of largely contiguous, ordered sets of molecular clones for use in sequencing and gene mapping projects. This is planned to be done for human and many model organisms. Theory and practice have shown that long-range contiguity and the degree to which the entire genome is covered by ordered clones can be affected by many biological variables. Many laboratories are currently experimenting with different experimental strategies and theoretical models to help plan strategies for accomplishing long-range molecular mapping of genomes. Here we describe a new mathematical model and formulas for helping to plan genome mapping projects, using various single-copy landmark (SCL) detection, or "anchoring", strategies. We derive formulas that allow us to examine the effects of interactions among the following variables: average insert size of the cloning vector, average size of SCL, the number of SCL, and the redundancy in coverage of the clone library. We also examine and compare three different ways in which anchoring can be implemented: (1) anchors are selected independently of the library to be ordered (random anchoring); (2) anchors are made from end probes from both ends of clones in the library to be ordered (nonrandom anchoring); and (3) anchors are made from one end or the other, randomly, from clones in the library to be ordered (nonrandom anchoring). Our results show that, for biologically realistic conditions, nonrandom anchoring is always more effective than random anchoring for contig building, and there is little to be gained from making SCL from both ends of clones vs. only one end of clones. (ABSTRACT TRUNCATED AT 250 WORDS)
Affiliation(s)
- T G Marr
- Cold Spring Harbor Laboratory, New York 11724
|
17
|
Barillot E, Lacroix B, Cohen D. Theoretical analysis of library screening using a N-dimensional pooling strategy. Nucleic Acids Res 1991; 19:6241-7. [PMID: 1956784 PMCID: PMC329134 DOI: 10.1093/nar/19.22.6241] [Citation(s) in RCA: 96] [Impact Index Per Article: 2.9] [Indexed: 12/29/2022]
Abstract
A solution to the problem of library screening is analysed. We examine how to retrieve those clones that are positive for a single copy landmark from a whole library while performing only a minimum number of laboratory tests: the clones are arranged on a matrix (i.e., in 2 dimensions) and pooled according to the rows and columns. A fingerprint is determined for each pool and an analysis allows selection of a list containing all the positive clones, plus a few false positives. These false positives are eliminated by using another matrix (or several others), which has to be reconfigured in a way as different as possible from the previous one. We examine the use of cubes (3 dimensions) or hypercubes of any dimension instead of matrices and analyse how to reconfigure them in order to eliminate the false positives as efficiently as possible. The advantage of the method proposed is the low number of tests required and the low number of pools that need to be prepared [only 258 pools and 282 tests (258 + 24 verifications) are needed to screen the 72,000 clones of the CEPH YAC library (1) with a sequence-tagged site]. Furthermore, this method allows easy and systematic screenings and can be applied to a large physical mapping project, which will lead to an interesting map with a low, precisely known, rate of error: when fingerprinting a 150 Mb chromosome with the CEPH YAC library and 1750 sequence-tagged sites, 903,000 tests would be necessary to obtain about 20 contigs of an average length of 6.7 Mb, while only about one false positive would be expected in the resultant map. Finally, STSs can be ordered by dividing a clone library into sublibraries (corresponding to groups of microplates, for example) and testing each STS on pooled clones from each sublibrary. This makes it possible to assign to each STS a fingerprint that consists of the list of the positive pools. In many cases these fingerprints will be enough to order the STSs.
Indeed, if large YACs (greater than 1 Mb) can be obtained, the combined screening of DNA families and YAC DNA pools would allow an integrated construction of both genetic and physical maps of the human genome, which will also reduce the optimal number of meioses needed for a 1 centimorgan linkage map.
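The two-dimensional case of the pooling scheme described above can be sketched as follows. This is only an illustration of row and column pooling on a toy matrix, assuming error-free pool tests; the function names, the 4 × 5 layout, and the choice of positive clones are hypothetical, and the paper's reconfiguration strategy for removing false positives is not reproduced here.

```python
from itertools import product


def pool_2d(positives, n_cols):
    """Return the sets of positive row pools and column pools.

    Clones are numbered 0..N-1 and laid out row-major on a matrix with
    n_cols columns; a pool tests positive if it contains any positive clone.
    """
    rows = {clone // n_cols for clone in positives}
    cols = {clone % n_cols for clone in positives}
    return rows, cols


def candidate_clones(rows, cols, n_cols):
    """Clones at the intersections of positive rows and positive columns."""
    return {r * n_cols + c for r, c in product(rows, cols)}


# Hypothetical 4 x 5 matrix (20 clones) with two truly positive clones.
true_pos = {2, 13}
rows, cols = pool_2d(true_pos, n_cols=5)
cand = candidate_clones(rows, cols, n_cols=5)

print("positive rows:", sorted(rows))
print("positive columns:", sorted(cols))
print("candidates:", sorted(cand))
print("false positives:", sorted(cand - true_pos))
```

Only 4 + 5 = 9 pool tests are needed here instead of 20 individual tests. The candidate list always contains every true positive, and the extra intersections are the false positives that the abstract proposes to eliminate with a second, differently arranged matrix.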
Affiliation(s)
- E Barillot
- Centre d'Etude du Polymorphisme Humain (CEPH), Paris, France
|
18
|
Green ED, Green P. Sequence-tagged site (STS) content mapping of human chromosomes: theoretical considerations and early experiences. PCR Methods and Applications 1991; 1:77-90. [PMID: 1842934 DOI: 10.1101/gr.1.2.77] [Citation(s) in RCA: 75] [Impact Index Per Article: 2.3] [Indexed: 12/29/2022]
Abstract
The magnitude of the effort required to complete the human genome project will require constant refinements of the tools available for the large-scale study of DNA. Such improvements must include both the development of more powerful technologies and the reformulation of the theoretical strategies that account for the changing experimental capabilities. The two technological advances described here, PCR and YAC cloning, have rapidly become incorporated into the standard armamentarium of genome analysis and represent key examples of how technological developments continue to drive experimental strategies in molecular biology. Because of its high sensitivity, specificity, and potential for automation, PCR is transforming many aspects of DNA mapping. Similarly, by providing the means to isolate and study larger pieces of DNA, YAC cloning has made practical the achievement of megabase-level continuity in physical maps. Taken together, these two technologies can be envisioned as providing a powerful strategy for constructing physical maps of whole chromosomes. Undoubtedly, future technological developments will promote even more effective mapping strategies. Nonetheless, the theoretical projections and practical experience described here suggest that constructing YAC-based STS-content maps of whole human chromosomes is now possible. Random STSs can be efficiently generated and used to screen collections of YAC clones, and contiguous YAC coverage of regions exceeding 2 Mb can be readily obtained. While the predicted laboratory effort required for mapping whole human chromosomes remains daunting, it is clearly feasible.
Affiliation(s)
- E D Green
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110
|