1. Genomic landscapes of bacterial transposons and their applications in strain improvement. Appl Microbiol Biotechnol 2022; 106:6383-6396. [PMID: 36094654 DOI: 10.1007/s00253-022-12170-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 08/19/2022] [Accepted: 09/01/2022] [Indexed: 11/02/2022]
Abstract
Transposons are mobile genetic elements that can give rise to gene mutation and genome rearrangement. Due to their mobility, transposons have been exploited as genetic tools for modification of plants, animals, and microbes. Although a plethora of reviews have summarized families of transposons, the transposons from fermentation bacteria have not been systematically documented, which constrains their exploitation for metabolic engineering and synthetic biology purposes. In this review, we summarize the transposons from the most used fermentation bacteria including Escherichia coli, Bacillus subtilis, Lactococcus lactis, Corynebacterium glutamicum, Klebsiella pneumoniae, and Zymomonas mobilis by literature retrieval and data mining from GenBank and KEGG. We also outline the state-of-the-art advances in basic research and industrial applications especially when allied with other genetic tools. Overall, this review aims to provide valuable insights for transposon-mediated strain improvement. KEY POINTS: • The transposons from the most-used fermentation bacteria are systematically summarized. • The applications of transposons in strain improvement are comprehensively reviewed.
2. Applications of the Bacteriophage Mu In Vitro Transposition Reaction and Genome Manipulation via Electroporation of DNA Transposition Complexes. Methods Mol Biol 2018; 1681:279-286. [PMID: 29134602 DOI: 10.1007/978-1-4939-7343-9_20] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
The capacity of transposable elements to insert into genomes has been harnessed during the past decades for various in vitro and in vivo applications. This chapter describes in detail the general protocols and principles applicable for the Mu in vitro transposition reaction as well as the assembly of DNA transposition complexes that can be electroporated into bacterial cells to accomplish efficient gene delivery. These techniques with their modifications potentiate various gene and genome modification applications, which are discussed briefly here, and the reader is referred to the original publications for further details.
3. Morelli A, Cabezas Y, Mills LJ, Seelig B. Extensive libraries of gene truncation variants generated by in vitro transposition. Nucleic Acids Res 2017; 45:e78. [PMID: 28130425 PMCID: PMC5449547 DOI: 10.1093/nar/gkx030] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/01/2016] [Accepted: 01/20/2017] [Indexed: 11/14/2022]
Abstract
The detailed analysis of the impact of deletions on proteins or nucleic acids can reveal important functional regions and lead to variants with improved macromolecular properties. We present a method to generate large libraries of mutants with deletions of varying length that are randomly distributed throughout a given gene. This technique facilitates the identification of crucial sequence regions in nucleic acids or proteins. The approach utilizes in vitro transposition to generate 5′ and 3′ fragment sub-libraries of a given gene, which are then randomly recombined to yield a final library comprising both terminal and internal deletions. The method is easy to implement and can generate libraries in three to four days. We used this approach to produce a library of >9000 random deletion mutants of an artificial RNA ligase enzyme representing 32% of all possible deletions. The quality of the library was assessed by next-generation sequencing and detailed bioinformatics analysis. Finally, we subjected this library to in vitro selection and obtained fully functional variants with deletions of up to 18 amino acids of the parental enzyme that had been 95 amino acids in length.
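As a rough illustration of what "fraction of all possible deletions" can mean, the sketch below enumerates every contiguous deletion in a sequence of a given length and reports how much of that space an observed library covers. The counting convention here (any start and end position, including removal of the whole sequence) is an assumption made for illustration; the abstract does not state how the authors counted, and the function names are hypothetical.

```python
def all_contiguous_deletions(length):
    """Return every (start, end) deletion, end exclusive, for a sequence of this length."""
    if length <= 0:
        raise ValueError("length must be positive")
    return {(s, e) for s in range(length) for e in range(s + 1, length + 1)}


def library_coverage(observed, length):
    """Fraction of possible contiguous deletions seen at least once in the library."""
    possible = all_contiguous_deletions(length)
    # Duplicates and out-of-range deletions do not add to coverage.
    found = set(observed) & possible
    return len(found) / len(possible)


if __name__ == "__main__":
    # Toy example: 4 positions give 4 * 5 / 2 = 10 possible deletions.
    toy_library = [(0, 1), (1, 3), (1, 3), (0, 4)]
    print(library_coverage(toy_library, 4))
```

Under this convention a sequence of n positions has n(n+1)/2 possible deletions, so the number of distinct mutants needed for a given coverage grows quadratically with sequence length.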
Affiliation(s)
- Aleardo Morelli
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA; BioTechnology Institute, University of Minnesota, St. Paul, MN 55108, USA
- Yari Cabezas
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA; BioTechnology Institute, University of Minnesota, St. Paul, MN 55108, USA
- Lauren J Mills
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA
- Burckhard Seelig
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA; BioTechnology Institute, University of Minnesota, St. Paul, MN 55108, USA
4. Pulkkinen E, Haapa-Paananen S, Turakainen H, Savilahti H. A set of mini-Mu transposons for versatile cloning of circular DNA and novel dual-transposon strategy for increased efficiency. Plasmid 2016; 86:46-53. [PMID: 27387339 DOI: 10.1016/j.plasmid.2016.07.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/03/2016] [Revised: 06/29/2016] [Accepted: 07/02/2016] [Indexed: 12/22/2022]
Abstract
Mu transposition-based cloning of DNA circles employs in vitro transposition reaction to deliver both the plasmid origin of replication and a selectable marker into the target DNA of interest. We report here the construction of a platform for the purpose that contains ten mini-Mu transposons with five different replication origins, enabling a variety of research approaches for the discovery and study of circular DNA. We also demonstrate that the simultaneous use of two transposons, one with the origin of replication and the other with selectable marker, is beneficial as it improves the cloning efficiency by reducing the fraction of autointegration-derived plasmid clones. The constructed transposons now provide a set of new tools for the studies on DNA circles and widen the applicability of Mu transposition based approaches to clone circular DNA from various sources.
Affiliation(s)
- Elsi Pulkkinen
  - Division of Genetics and Physiology, Department of Biology, University of Turku, Vesilinnantie 5, FI-20500 Turku, Finland
- Saija Haapa-Paananen
  - Division of Genetics and Physiology, Department of Biology, University of Turku, Vesilinnantie 5, FI-20500 Turku, Finland
- Hilkka Turakainen
  - Institute of Biotechnology, Viikki Biocenter, P.O. Box 56, Viikinkaari 9, FI-00014, University of Helsinki, Helsinki, Finland
- Harri Savilahti
  - Division of Genetics and Physiology, Department of Biology, University of Turku, Vesilinnantie 5, FI-20500 Turku, Finland; Institute of Biotechnology, Viikki Biocenter, P.O. Box 56, Viikinkaari 9, FI-00014, University of Helsinki, Helsinki, Finland
5. Liu SS, Wei X, Ji Q, Xin X, Jiang B, Liu J. A facile and efficient transposon mutagenesis method for generation of multi-codon deletions in protein sequences. J Biotechnol 2016; 227:27-34. [DOI: 10.1016/j.jbiotec.2016.03.038] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/30/2015] [Revised: 03/17/2016] [Accepted: 03/21/2016] [Indexed: 12/17/2022]
6. MuA-mediated in vitro cloning of circular DNA: transpositional autointegration and the effect of MuB. Mol Genet Genomics 2016; 291:1181-91. [DOI: 10.1007/s00438-016-1175-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/02/2015] [Accepted: 01/21/2016] [Indexed: 11/26/2022]
7. Pulkkinen E, Haapa-Paananen S, Savilahti H. An assay to monitor the activity of DNA transposition complexes yields a general quality control measure for transpositional recombination reactions. Mob Genet Elements 2014; 4:1-8. [PMID: 26442171 PMCID: PMC4590003 DOI: 10.4161/21592543.2014.969576] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/01/2014] [Revised: 08/22/2014] [Accepted: 09/01/2014] [Indexed: 12/20/2022]
Abstract
Transposon-based technologies have many applications in molecular biology and can be used for gene delivery into prokaryotic and eukaryotic cells. Common transpositional activity measurement assays suitable for many types of transposons would be beneficial, as diverse transposon systems could be compared for their performance attributes. Therefore, we developed a general-purpose assay to enable and standardize the activity measurement for DNA transposition complexes (transpososomes), using phage Mu transposition as a test platform. This assay quantifies transpositional recombination efficiency and is based on an in vitro transposition reaction with a target plasmid carrying a lethal ccdB gene. If transposition targets ccdB, this gene becomes inactivated, enabling plasmid-receiving Escherichia coli cells to survive and to be scored as colonies on selection plates. The assay was validated with 3 mini-Mu transposons varying in size and differing in their marker gene constitution. Tests with different amounts of transposon DNA provided a linear response and yielded a 10-fold operational range for the assay. The colony formation capacity was linearly correlated with the competence status of the E. coli cells, enabling normalization of experimental data obtained with different batches of recipient cells. The developed assay can now be used to directly compare transpososome activities with all types of mini-Mu transposons, regardless of their aimed use. Furthermore, the assay should be directly applicable to other transposition-based systems with a functional in vitro reaction, and it provides a dependable quality control measure that previously has been lacking but is highly important for the evaluation of current and emerging transposon-based applications.
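The normalization described in this abstract can be sketched as a simple linear rescaling. The example below assumes that colony yield scales linearly with both transposon DNA amount and cell competence, as the abstract reports within the assay's operational range. The function name, parameter names, and the reference competence value are hypothetical and not taken from the paper.

```python
def normalized_activity(colonies, transposon_ng, competence_cfu_per_ug,
                        reference_competence=1e8):
    """Colonies per ng of transposon DNA, rescaled to a reference competence.

    competence_cfu_per_ug is the measured competence of the recipient cell
    batch (for example, colonies per microgram of a control plasmid).
    reference_competence is an arbitrary value chosen for illustration.
    """
    if transposon_ng <= 0 or competence_cfu_per_ug <= 0:
        raise ValueError("DNA amount and competence must be positive")
    colonies_per_ng = colonies / transposon_ng
    # Linear correction: a half-as-competent batch yields half the colonies.
    return colonies_per_ng * (reference_competence / competence_cfu_per_ug)


if __name__ == "__main__":
    # Two batches of cells with different competence, same underlying activity.
    print(normalized_activity(200, 10, 1e8))   # batch A
    print(normalized_activity(100, 10, 5e7))   # batch B, half as competent
```

Both calls give the same normalized value, which is the point of the correction: results from different batches of recipient cells become directly comparable.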
Affiliation(s)
- Elsi Pulkkinen
  - Division of Genetics and Physiology; Department of Biology; University of Turku; Turku, Finland
- Saija Haapa-Paananen
  - Division of Genetics and Physiology; Department of Biology; University of Turku; Turku, Finland
- Harri Savilahti
  - Division of Genetics and Physiology; Department of Biology; University of Turku; Turku, Finland
8. Hall RN, Meers J, Mitter N, Fowler EV, Mahony TJ. The Meleagrid herpesvirus 1 genome is partially resistant to transposition. Avian Dis 2013; 57:380-6. [PMID: 23901750 DOI: 10.1637/10339-082912-reg.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
Abstract
The propagation of herpesvirus genomes as infectious bacterial artificial chromosomes (iBAC) has enabled the application of highly efficient strategies to investigate gene function across the genome. One of these strategies, transposition, has been used successfully on a number of herpesvirus iBACs to generate libraries of gene disruption mutants. Gene deletion studies aimed at determining the dispensable gene repertoire of the Meleagrid herpesvirus 1 (MeHV-1) genome to enhance the utility of this virus as a vaccine vector have been conducted in this report. A MeHV-1 iBAC was used in combination with the Tn5 and MuA transposition systems in an attempt to generate MeHV-1 gene interruption libraries. However, these studies demonstrated that Tn5 transposition events into the MeHV-1 genome occurred at unexpectedly low frequencies. Furthermore, characterization of genomic locations of the rare Tn5 transposon insertion events indicated a nonrandom distribution within the viral genome, with seven of the 24 insertions occurring within the gene encoding infected cell protein 4. Although insertion events with the MuA system occurred at higher frequency compared with the Tn5 system, fewer insertion events were generated than has previously been reported with this system. The characterization and distribution of these MeHV-1 iBAC transposed mutants is discussed at both the nucleotide and genomic level, and the properties of the MeHV-1 genome that could influence transposition frequency are discussed.
Affiliation(s)
- Robyn N Hall
  - School of Veterinary Science, The University of Queensland, Gatton, QLD 4343, Australia
9. Uenishi H, Morozumi T, Toki D, Eguchi-Ogawa T, Rund LA, Schook LB. Large-scale sequencing based on full-length-enriched cDNA libraries in pigs: contribution to annotation of the pig genome draft sequence. BMC Genomics 2012; 13:581. [PMID: 23150988 PMCID: PMC3499286 DOI: 10.1186/1471-2164-13-581] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 12/12/2011] [Accepted: 08/09/2012] [Indexed: 12/28/2022]
Abstract
BACKGROUND: Along with the draft sequencing of the pig genome, which has been completed by an international consortium, collection of the nucleotide sequences of genes expressed in various tissues and determination of entire cDNA sequences are necessary for investigations of gene function. The sequences of expressed genes are also useful for genome annotation, which is important for isolating the genes responsible for particular traits. RESULTS: We performed a large-scale expressed sequence tag (EST) analysis in pigs by using 32 full-length-enriched cDNA libraries derived from 28 kinds of tissues and cells, including seven tissues (brain, cerebellum, colon, hypothalamus, inguinal lymph node, ovary, and spleen) derived from pigs that were cloned from a sow subjected to genome sequencing. We obtained more than 330,000 EST reads from the 5'-ends of the cDNA clones. Comparison with human and bovine gene catalogs revealed that the ESTs corresponded to at least 15,000 genes. cDNA clones representing contigs and singlets generated by assembly of the EST reads were subjected to full-length determination of inserts. We have finished sequencing 31,079 cDNA clones corresponding to more than 12,000 genes. Mapping of the sequences of these cDNA clones on the draft sequence of the pig genome has indicated that the clones are derived from about 15,000 independent loci on the pig genome. CONCLUSIONS: ESTs and cDNA sequences derived from full-length-enriched libraries are valuable for annotation of the draft sequence of the pig genome. This information will also contribute to the exploration of promoter sequences on the genome and to molecular biology-based analyses in pigs.
Affiliation(s)
- Hirohide Uenishi
  - Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2 Ikenodai, Tsukuba, Ibaraki, 305-8602, Japan
  - Division of Animal Sciences, National Institute of Agrobiological Sciences, 2 Ikenodai, Tsukuba, Ibaraki, 305-8602, Japan
  - Animal Genome Research Program, 2 Ikenodai, Tsukuba, Ibaraki, 305-8602, Japan
- Takeya Morozumi
  - Animal Genome Research Program, 2 Ikenodai, Tsukuba, Ibaraki, 305-8602, Japan
  - Animal Research Division, Japan Institute of Association for Techno-innovation in Agriculture, Forestry and Fisheries, 446-1 Ippaizuka, Kamiyokoba, Tsukuba, Ibaraki, 305-0854, Japan
- Daisuke Toki
  - Animal Genome Research Program, 2 Ikenodai, Tsukuba, Ibaraki, 305-8602, Japan
  - Animal Research Division, Japan Institute of Association for Techno-innovation in Agriculture, Forestry and Fisheries, 446-1 Ippaizuka, Kamiyokoba, Tsukuba, Ibaraki, 305-0854, Japan
- Tomoko Eguchi-Ogawa
  - Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2 Ikenodai, Tsukuba, Ibaraki, 305-8602, Japan
  - Animal Genome Research Program, 2 Ikenodai, Tsukuba, Ibaraki, 305-8602, Japan
- Lauretta A Rund
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL, 61801, USA
- Lawrence B Schook
  - Institute for Genomic Biology, University of Illinois at Urbana-Champaign, 1206 West Gregory Drive, Urbana, IL, 61801, USA
10. Green B, Bouchier C, Fairhead C, Craig NL, Cormack BP. Insertion site preference of Mu, Tn5, and Tn7 transposons. Mob DNA 2012; 3:3. [PMID: 22313799 PMCID: PMC3292447 DOI: 10.1186/1759-8753-3-3] [Citation(s) in RCA: 103] [Impact Index Per Article: 8.6] [Received: 10/05/2011] [Accepted: 02/07/2012] [Indexed: 11/10/2022]
Abstract
Background: Transposons, segments of DNA that can mobilize to other locations in a genome, are often used for insertion mutagenesis or to generate priming sites for sequencing of large DNA molecules. For both of these uses, a transposon with minimal insertion bias is desired to allow complete coverage with minimal oversampling. Findings: Three transposons, Mu, Tn5, and Tn7, were used to generate insertions in the same set of fosmids containing Candida glabrata genomic DNA. Tn7 demonstrates markedly less insertion bias than either Mu or Tn5, with both Mu and Tn5 biased toward sequences containing guanosine (G) and cytidine (C). This preference of Mu and Tn5 yields less uniform spacing of insertions than for Tn7, in the adenosine (A) and thymidine (T) rich genome of C. glabrata (39% GC). Conclusions: In light of its more uniform distribution of insertions, Tn7 should be considered for applications in which insertion bias is deleterious.
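One simple way to look for the kind of GC preference described in this abstract is to compare the GC fraction in short windows around insertion sites with the GC fraction of the whole target sequence. The sketch below is a minimal illustration under that assumption; the window size, function names, and toy sequence are hypothetical, and the paper's own analysis may have been done differently.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def mean_gc_at_sites(target, sites, flank=5):
    """Mean GC fraction in windows of +/- flank bases around each insertion site."""
    if not sites:
        raise ValueError("no insertion sites given")
    # Clip windows at the start of the sequence; slicing clips the end automatically.
    windows = [target[max(0, s - flank): s + flank] for s in sites]
    windows = [w for w in windows if w]
    if not windows:
        raise ValueError("no insertion site falls inside the target")
    return sum(gc_fraction(w) for w in windows) / len(windows)


if __name__ == "__main__":
    target = "ATATATGCGCGCATATAT"      # toy AT-rich sequence with a GC island
    insertion_sites = [9]              # a single insertion inside the GC island
    print(gc_fraction(target))
    print(mean_gc_at_sites(target, insertion_sites, flank=3))
```

A mean site GC fraction well above the background value would be consistent with a GC-biased transposon, while values close to the background would be expected for an element with little sequence preference.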
Affiliation(s)
- Brian Green
  - Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Hunterian 617, 725 North Wolfe Street, Baltimore, MD 21205-2185, USA
11. A rapid and cost-effective method for sequencing pooled cDNA clones by using a combination of transposon insertion and Gateway technology. Biotechniques 2012; 51:195-7. [PMID: 21906043 DOI: 10.2144/000113737] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/10/2011] [Accepted: 07/14/2011] [Indexed: 11/23/2022]
Abstract
Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
12. Kharel MK, Nybo SE, Shepherd MD, Rohr J. Cloning and characterization of the ravidomycin and chrysomycin biosynthetic gene clusters. Chembiochem 2010; 11:523-32. [PMID: 20140934 PMCID: PMC2879346 DOI: 10.1002/cbic.200900673] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Received: 11/06/2009] [Indexed: 11/06/2022]
Abstract
The gene clusters responsible for the biosynthesis of two antitumor antibiotics, ravidomycin and chrysomycin, have been cloned from Streptomyces ravidus and Streptomyces albaduncus, respectively. Sequencing of the 33.28 kb DNA region of the cosmid cosRav32 and the 34.65 kb DNA region of cosChry1-1 and cosChryF2 revealed 36 and 35 open reading frames (ORFs), respectively, harboring tandem sets of type II polyketide synthase (PKS) genes, D-ravidosamine and D-virenose biosynthetic genes, post-PKS tailoring genes, regulatory genes, and genes of unknown function. The isolated ravidomycin gene cluster was confirmed to be involved in ravidomycin biosynthesis through the production of a new analogue of ravidomycin along with anticipated pathway intermediates and biosynthetic shunt products upon heterologous expression of the cosmid, cosRav32, in Streptomyces lividans TK24. The identity of the cluster was further verified through cross complementation of gilvocarcin V (GV) mutants. Similarly, the chrysomycin gene cluster was demonstrated to be indirectly involved in chrysomycin biosynthesis through cross-complementation of gilvocarcin mutants deficient in the oxygenases GilOII, GilOIII, and GilOIV with the respective chrysomycin monooxygenase homologues. The ravidomycin glycosyltransferase (RavGT) appears to be able to transfer both amino- and neutral sugars, exemplified through the structurally distinct 6-membered D-ravidosamine and 5-membered D-fucofuranose, to the coumarin-based polyketide derived backbone. These results expand the library of biosynthetic genes involved in the biosyntheses of gilvocarcin class compounds that can be used to generate novel analogues through combinatorial biosynthesis.
Affiliation(s)
- Madan K Kharel
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY 40536-0596, USA
13. Next generation tools for high-throughput promoter and expression analysis employing single-copy knock-ins at the Hprt1 locus. Genomics 2008; 93:196-204. [PMID: 18950699 DOI: 10.1016/j.ygeno.2008.09.014] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 05/01/2008] [Revised: 09/15/2008] [Accepted: 09/17/2008] [Indexed: 11/22/2022]
Abstract
We have engineered a set of useful tools that facilitate targeted single copy knock-in (KI) at the hypoxanthine guanine phosphoribosyl transferase 1 (Hprt1) locus. We employed fine scale mapping to delineate the precise breakpoint location at the Hprt1(b-m3) locus allowing allele specific PCR assays to be established. Our suite of tools contains four targeting expression vectors and a complementing series of embryonic stem cell lines. Two of these vectors encode enhanced green fluorescent protein (EGFP) driven by the human cytomegalovirus immediate-early enhancer/modified chicken beta-actin (CAG) promoter, whereas the other two permit flexible combinations of a chosen promoter combined with a reporter and/or gene of choice. We have validated our tools as part of the Pleiades Promoter Project (http://www.pleiades.org), with the generation of brain-specific EGFP positive germline mouse strains.
14. Whole-genome detection of conditionally essential and dispensable genes in Escherichia coli via genetic footprinting. Methods Mol Biol 2008; 416:83-102. [PMID: 18392962 DOI: 10.1007/978-1-59745-321-9_6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023]
Abstract
We present a whole-genome approach to genetic footprinting in Escherichia coli using Tn5-based transposons to determine gene essentiality. A population of cells is mutagenized and subjected to outgrowth under selective conditions. Transposon insertions in the surviving mutants are detected using nested polymerase chain reaction (PCR), agarose gel electrophoresis, and software-assisted PCR product size determination. Genomic addresses of these inserts are then mapped onto the E. coli genome sequence based on the PCR product lengths and the addresses of the corresponding genome-specific primers. Gene essentiality conclusions were drawn based on a semiautomatic analysis of the number and relative positions of inserts retained within each gene after selective outgrowth.
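The mapping step in this abstract, from PCR product length and primer address to an insert's genomic address, can be sketched as simple coordinate arithmetic followed by a per-gene tally. The example below is an illustrative assumption only: the strand convention, the transposon-side offset, the gene coordinates, and all names are hypothetical, and the published workflow relied on software-assisted size determination and semiautomatic analysis that are not reproduced here.

```python
def insert_address(primer_pos, product_len, strand=1, transposon_offset=0):
    """Estimate an insert's genomic address from one nested-PCR product.

    primer_pos: genomic address of the genome-specific primer.
    product_len: measured PCR product length in bp.
    strand: +1 if the primer extends toward higher coordinates, -1 otherwise.
    transposon_offset: bp of the product that lie inside the transposon end.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be +1 or -1")
    return primer_pos + strand * (product_len - transposon_offset)


def inserts_per_gene(addresses, genes):
    """Count inserts falling within each gene's (start, end) interval, inclusive."""
    counts = {name: 0 for name in genes}
    for pos in addresses:
        for name, (start, end) in genes.items():
            if start <= pos <= end:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    genes = {"geneA": (1300, 1500), "geneB": (600, 650)}   # made-up coordinates
    addresses = [
        insert_address(1000, 350),
        insert_address(1000, 350, strand=-1, transposon_offset=50),
        insert_address(1000, 400),
    ]
    print(inserts_per_gene(addresses, genes))
```

In a footprinting experiment, a gene that retains few or no inserts after selective outgrowth (geneB in this toy data) would be a candidate for essentiality under that condition, subject to gene length and insertion density.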
15. Morin RD, Chang E, Petrescu A, Liao N, Griffith M, Kirkpatrick R, Butterfield YS, Young AC, Stott J, Barber S, Babakaiff R, Dickson MC, Matsuo C, Wong D, Yang GS, Smailus DE, Wetherby KD, Kwong PN, Grimwood J, Brinkley CP, Brown-John M, Reddix-Dugue ND, Mayo M, Schmutz J, Beland J, Park M, Gibson S, Olson T, Bouffard GG, Tsai M, Featherstone R, Chand S, Siddiqui AS, Jang W, Lee E, Klein SL, Blakesley RW, Zeeberg BR, Narasimhan S, Weinstein JN, Pennacchio CP, Myers RM, Green ED, Wagner L, Gerhard DS, Marra MA, Jones SJ, Holt RA. Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling. Genome Res 2006; 16:796-803. [PMID: 16672307 PMCID: PMC1479861 DOI: 10.1101/gr.4871006] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Indexed: 01/29/2023]
Abstract
Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization.
Affiliation(s)
- Ryan D. Morin
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Elbert Chang
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Anca Petrescu
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Nancy Liao
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Malachi Griffith
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Robert Kirkpatrick
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Alice C. Young
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Jeffrey Stott
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Sarah Barber
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Ryan Babakaiff
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Mark C. Dickson
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Corey Matsuo
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- David Wong
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- George S. Yang
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Duane E. Smailus
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Keith D. Wetherby
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Peggy N. Kwong
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Jane Grimwood
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Mabel Brown-John
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Michael Mayo
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Jeremy Schmutz
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Jaclyn Beland
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Morgan Park
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Susan Gibson
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Teika Olson
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Gerard G. Bouffard
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Miranda Tsai
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Ruth Featherstone
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Steve Chand
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Asim S. Siddiqui
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Wonhee Jang
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894, USA
- Ed Lee
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894, USA
- Steven L. Klein
  - National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, USA
- Barry R. Zeeberg
  - Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology
- John N. Weinstein
  - Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology
- Christa Prange Pennacchio
  - The I.M.A.G.E Consortium, Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Richard M. Myers
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Eric D. Green
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Lukas Wagner
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894, USA
- Marco A. Marra
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Steven J.M. Jones
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Robert A. Holt
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
  - Corresponding author. Fax: (604) 877-6085
16. Yu JH, Schaffer DV. Selection of novel vesicular stomatitis virus glycoprotein variants from a peptide insertion library for enhanced purification of retroviral and lentiviral vectors. J Virol 2006; 80:3285-92. [PMID: 16537595 PMCID: PMC1440395 DOI: 10.1128/jvi.80.7.3285-3292.2006] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Received: 11/22/2005] [Accepted: 01/19/2006] [Indexed: 11/20/2022]
Abstract
The introduction of new features or functions that are not present in an original protein is a significant challenge in protein engineering. For example, modifications to vesicular stomatitis virus glycoprotein (VSV-G), which is commonly used to pseudotype retroviral and lentiviral vectors for gene delivery, have been hindered by a lack of structural knowledge of the protein. We have developed a transposon-based approach that randomly incorporates designed polypeptides throughout a protein to generate saturated insertion libraries and a subsequent high-throughput selection process in mammalian cells that enables the identification of optimal insertion sites for a novel designed functionality. This method was applied to VSV-G in order to construct a comprehensive library of mutants whose combined members have a His6 tag inserted at likely every site in the original protein sequence. Selecting the library via iterative retroviral infections of mammalian cells led to the identification of several VSV-G-His6 variants that were able to package high-titer viral vectors and could be purified by Ni-nitrilotriacetic acid affinity chromatography. Column purification of vectors reduced protein and DNA impurities more than 5,000-fold and 14,000-fold, respectively, from the viral supernatant. This substantially improved purity elicited a weaker immune response in the brain, without altering the infectivity or tropism from wild-type VSV-G-pseudotyped vectors. This work applies a powerful new tool for protein engineering to construct novel viral envelope variants that can greatly improve the safety and use of retroviral and lentiviral vectors for clinical gene therapy. Furthermore, this approach of library generation and selection can readily be extended to other challenges in protein engineering.
Affiliation(s)
- Julie H Yu
- Department of Chemical Engineering, University of California, Berkeley, Berkeley, CA 94720-1462, USA
|
17
|
Smailus DE, Marziali A, Dextras P, Marra MA, Holt RA. Simple, robust methods for high-throughput nanoliter-scale DNA sequencing. Genome Res 2005; 15:1447-50. [PMID: 16169928 PMCID: PMC1240088 DOI: 10.1101/gr.4221805] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022]
Abstract
We have developed high-throughput DNA sequencing methods that generate high quality data from reactions as small as 400 nL, providing an approximate order of magnitude reduction in reagent use relative to standard protocols. Sequencing of clones from plasmid, fosmid, and BAC libraries yielded read lengths (PHRED20 bases) of 765 +/- 172 (n = 10,272), 621 +/- 201 (n = 1824), and 647 +/- 189 (n = 568), respectively. Implementation of these procedures at high-throughput genome centers could have a substantial impact on the amount of data that can be generated per unit cost.
Affiliation(s)
- Duane E Smailus
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada V5Z 4S6
|
18
|
Zhang C, Kitsberg D, Chy H, Zhou Q, Morrison JR. Transposon-mediated generation of targeting vectors for the production of gene knockouts. Nucleic Acids Res 2005; 33:e24. [PMID: 15699181 PMCID: PMC549422 DOI: 10.1093/nar/gni014] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022] Open
Abstract
Vectors used for gene targeting experiments usually consist of a selectable marker flanked by two regions of homology to the targeted gene. In a homologous recombination event, the selectable marker replaces an essential element of the target gene rendering it inactive. Other applications of gene targeting technology include gene replacement (knockins) and conditional vectors which allow for the generation of inducible or tissue-specific gene-targeting events. The assembly of gene-targeting vectors is generally a laborious process requiring considerable technical skill. The procedures presented here report the application of transposons as tools for the construction of targeting vectors. Two mini-Mu transposons were sequentially inserted by in vitro transposition at each side of the region targeted for deletion. One such transposon carries an antibiotic resistance marker suitable for selection in mammalian cells. A deletion is then generated between the two transposons either by LoxP-induced recombination or by restriction digestion followed by ligation. This deletion removes part of both transposons plus the targeted region in between, leaving a transposon carrying the selectable marker flanked by two arms which are homologous to the targeted gene. Targeting vectors constructed using these transposons were electroporated into embryonic stem cells and shown to be effective in gene-targeting events.
Affiliation(s)
- Chunfang Zhang
- CopyRat Pty Ltd 27-31 Wright Street, Clayton, Victoria 3168, Australia.
|
19
|
Baross A, Butterfield YSN, Coughlin SM, Zeng T, Griffith M, Griffith OL, Petrescu AS, Smailus DE, Khattra J, McDonald HL, McKay SJ, Moksa M, Holt RA, Marra MA. Systematic recovery and analysis of full-ORF human cDNA clones. Genome Res 2004; 14:2083-92. [PMID: 15489330 PMCID: PMC528924 DOI: 10.1101/gr.2473704] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Indexed: 11/24/2022]
Abstract
The Mammalian Gene Collection (MGC) consortium (http://mgc.nci.nih.gov) seeks to establish publicly available collections of full-ORF cDNAs for several organisms of significance to biomedical research, including human. To date over 15,200 human cDNA clones containing full-length open reading frames (ORFs) have been identified via systematic expressed sequence tag (EST) analysis of a diverse set of cDNA libraries; however, further systematic EST analysis is no longer an efficient method for identifying new cDNAs. As part of our involvement in the MGC program, we have developed a scalable method for targeted recovery of cDNA clones to facilitate recovery of genes absent from the MGC collection. First, cDNA is synthesized from various RNAs, followed by polymerase chain reaction (PCR) amplification of transcripts in 96-well plates using gene-specific primer pairs flanking the ORFs. Amplicons are cloned into a sequencing vector, and full-length sequences are obtained. Sequences are processed and assembled using Phred and Phrap, and analyzed using Consed and a number of bioinformatics methods we have developed. Sequences are compared with the Reference Sequence (RefSeq) database, and validation of sequence discrepancies is attempted using other sequence databases including dbEST and dbSNP. Clones with identical sequence to RefSeq or containing only validated changes will become part of the MGC human gene collection. Clones containing novel splice variants or polymorphisms have also been identified. Our approach to clone recovery, applied at large scale, has the potential to recover many and possibly most of the genes absent from the MGC collection.
Affiliation(s)
- Agnes Baross
- Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 4E6, Canada
|
20
|
Gunaratne PH, Wu JQ, Garcia AM, Hulyk S, Worley KC, Margolin JF, Gibbs RA. Concatenation cDNA sequencing for transcriptome analysis. C R Biol 2004; 326:971-7. [PMID: 14744103 DOI: 10.1016/j.crvi.2003.09.032] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
Abstract
We describe a high-throughput cDNA sequencing pipeline (http://www.hgsc.bcm.tmc.edu/projects/cdna) built in response to the emerging need for rapid sequencing of large cDNA collections. Using this strategy cDNA inserts are purified and joined through concatenation into large molecules. These 'pseudo-BACs' are subjected to random shotgun sequencing whereby the majority of cDNA inserts in the pool are sequenced. Using this concatenation cDNA sequencing platform, we have contributed more than 13000 full-length cDNA sequences from human and mouse to the Mammalian Gene Collection (MGC).
Affiliation(s)
- Preethi H Gunaratne
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
|
21
|
Abstract
Transposons are mobile genetic elements that can relocate from one genomic location to another. As well as modulating gene expression and contributing to genome plasticity and evolution, transposons are remarkably diverse molecular tools for both whole-genome and single-gene studies in bacteria, yeast, and other microorganisms. Efficient but simple in vitro transposition reactions now allow the mutational analysis of previously recalcitrant microorganisms. Transposon-based signature-tagged mutagenesis and genetic footprinting strategies have pinpointed essential genes and genes that are crucial for the infectivity of a variety of human and other pathogens. Individual proteins and protein complexes can be dissected by transposon-mediated scanning linker mutagenesis. These and other transposon-based approaches have reaffirmed the usefulness of these elements as simple yet highly effective mutagens for both functional genomic and proteomic studies of microorganisms.
Affiliation(s)
- Finbarr Hayes
- Department of Biomolecular Sciences, University of Manchester Institute of Science and Technology, PO Box 88, Manchester M60 1QD, England.
|
22
|
Poussu E, Vihinen M, Paulin L, Savilahti H. Probing the α-complementing domain of E. coli β-galactosidase with use of an insertional pentapeptide mutagenesis strategy based on Mu in vitro DNA transposition. Proteins 2004; 54:681-92. [PMID: 14997564 DOI: 10.1002/prot.10467] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Indexed: 11/11/2022]
Abstract
Protein structure-function relationships can be studied by using linker insertion mutagenesis, which efficiently identifies essential regions in target proteins. Bacteriophage Mu in vitro DNA transposition was used to generate an extensive library of pentapeptide insertion mutants within the alpha-complementing domain 1 of Escherichia coli beta-galactosidase, yielding mutants at 100% efficiency. Each mutant contained an accurate 15-bp insertion that translated to five additional amino acids within the protein, and the insertions were distributed essentially randomly along the target sequence. Individual mutants (alpha-donors) were analyzed for their ability to restore (by alpha-complementation) beta-galactosidase activity of the M15 deletion mutant (alpha-acceptor), and the data were correlated to the structure of the beta-galactosidase tetramer. Most of the insertions were well tolerated, including many of those disrupting secondary structural elements even within the protein's interior. Nevertheless, certain sites were sensitive to mutations, indicating both known and previously unknown regions of functional importance. Inhibitory insertions within the N-terminus and loop regions most likely influenced protein tetramerization via direct local effects on protein-protein interactions. Within the domain 1 core, the insertions probably caused either lateral shifting of the polypeptide chain toward the protein's exterior or produced more pronounced structural distortions. Six percent of the mutant proteins exhibited temperature sensitivity, in general suggesting the method's usefulness for generation of conditional phenotypes. The method should be applicable to any cloned protein-encoding gene.
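The arithmetic behind the pentapeptide insertions described in this abstract can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the toy coding sequence, the 15-bp insert, and the function names are hypothetical and are not taken from the paper. Because 15 is a multiple of 3, an insertion at any position adds exactly five codons and leaves the downstream reading frame intact.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def insert_15bp(cds: str, position: int, insert: str) -> str:
    """Return the coding sequence with a 15-bp insert placed at `position`."""
    if len(insert) != 15:
        raise ValueError("insert must be exactly 15 bp")
    return cds[:position] + insert + cds[position:]


# Hypothetical toy gene and insert, for illustration only.
toy_cds = "ATGGCTAAATGA"            # translates to M A K stop
toy_insert = "TGCGGCCGCAGCTGC"      # arbitrary 15-bp sequence

in_frame = translate(insert_15bp(toy_cds, 3, toy_insert))
shifted = translate(insert_15bp(toy_cds, 4, toy_insert))
print(translate(toy_cds), in_frame, shifted)
```

In both cases the translated product is five residues longer than the original. An insertion that splits a codon can change the residues at the junction, and an unlucky junction could also create a stop codon, which this toy example does not explore.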
Affiliation(s)
- Eini Poussu
- Program in Cellular Biotechnology, Institute of Biotechnology, Viikki Biocenter, University of Helsinki, Finland
|
23
|
Goryshin IY, Naumann TA, Apodaca J, Reznikoff WS. Chromosomal deletion formation system based on Tn5 double transposition: use for making minimal genomes and essential gene analysis. Genome Res 2003; 13:644-53. [PMID: 12654720 PMCID: PMC430159 DOI: 10.1101/gr.611403] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.4] [Indexed: 11/25/2022]
Abstract
In this communication, we describe the use of specialized transposons (Tn5 derivatives) to create deletions in the Escherichia coli K-12 chromosome. These transposons are essentially rearranged composite transposons that have been assembled to promote the use of the internal transposon ends, resulting in intramolecular transposition events. Two similar transposons were developed. The first deletion transposon was utilized to create a consecutive set of deletions in the E. coli chromosome. The deletion procedure has been repeated 20 serial times to reduce the genome an average of 200 kb (averaging 10 kb per deletion). The second deletion transposon contains a conditional origin of replication that allows deleted chromosomal DNA to be captured as a complementary plasmid. By plating cells on media that do not support plasmid replication, the deleted chromosomal material is lost and if it is essential, the cells do not survive. This methodology was used to analyze 15 chromosomal regions and more than 100 open reading frames (ORFs). This provides a robust technology for identifying essential and dispensable genes.
Affiliation(s)
- Igor Y Goryshin
- Department of Biochemistry, University of Wisconsin, Madison, Wisconsin 53706, USA
|
24
|
Vilen H, Aalto JM, Kassinen A, Paulin L, Savilahti H. A direct transposon insertion tool for modification and functional analysis of viral genomes. J Virol 2003; 77:123-34. [PMID: 12477817 PMCID: PMC140628 DOI: 10.1128/jvi.77.1.123-134.2003] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022] Open
Abstract
Advances in DNA transposition technology have recently generated efficient tools for various types of functional genetic analyses. We demonstrate here the power of the bacteriophage Mu-derived in vitro DNA transposition system for modification and functional characterization of a complete bacterial virus genome. The linear double-stranded DNA genome of Escherichia coli bacteriophage PRD1 was studied by insertion mutagenesis with reporter mini-Mu transposons that were integrated in vitro into isolated genomic DNA. After introduction into bacterial cells by electroporation, recombinant transposon-containing virus clones were identified by autoradiography or visual blue-white screening employing alpha-complementation of E. coli beta-galactosidase. Additionally, a modified transposon with engineered NotI sites at both ends was used to introduce novel restriction sites into the phage genome. Analysis of the transposon integration sites in the genomes of viable recombinant phage generated a functional map, collectively indicating genes and genomic regions essential and nonessential for virus propagation. Moreover, promoterless transposons defined the direction of transcription within several insert-tolerant genomic regions. These strategies for the analysis of viral genomes are of a general nature and therefore may be applied to functional genomics studies in all prokaryotic and eukaryotic cell viruses.
Affiliation(s)
- Heikki Vilen
- Program in Cellular Biotechnology, Institute of Biotechnology, Viikki Biocenter, University of Helsinki, Finland
|
25
|
Strausberg RL, Feingold EA, Grouse LH, Derge JG, Klausner RD, Collins FS, Wagner L, Shenmen CM, Schuler GD, Altschul SF, Zeeberg B, Buetow KH, Schaefer CF, Bhat NK, Hopkins RF, Jordan H, Moore T, Max SI, Wang J, Hsieh F, Diatchenko L, Marusina K, Farmer AA, Rubin GM, Hong L, Stapleton M, Soares MB, Bonaldo MF, Casavant TL, Scheetz TE, Brownstein MJ, Usdin TB, Toshiyuki S, Carninci P, Prange C, Raha SS, Loquellano NA, Peters GJ, Abramson RD, Mullahy SJ, Bosak SA, McEwan PJ, McKernan KJ, Malek JA, Gunaratne PH, Richards S, Worley KC, Hale S, Garcia AM, Gay LJ, Hulyk SW, Villalon DK, Muzny DM, Sodergren EJ, Lu X, Gibbs RA, Fahey J, Helton E, Ketteman M, Madan A, Rodrigues S, Sanchez A, Whiting M, Madan A, Young AC, Shevchenko Y, Bouffard GG, Blakesley RW, Touchman JW, Green ED, Dickson MC, Rodriguez AC, Grimwood J, Schmutz J, Myers RM, Butterfield YSN, Krzywinski MI, Skalska U, Smailus DE, Schnerch A, Schein JE, Jones SJM, Marra MA. Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc Natl Acad Sci U S A 2002; 99:16899-903. [PMID: 12477932 PMCID: PMC139241 DOI: 10.1073/pnas.242603899] [Citation(s) in RCA: 1344] [Impact Index Per Article: 61.1] [Indexed: 02/06/2023] Open
Abstract
The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).
|
26
|
Shevchenko Y, Bouffard GG, Butterfield YSN, Blakesley RW, Hartley JL, Young AC, Marra MA, Jones SJM, Touchman JW, Green ED. Systematic sequencing of cDNA clones using the transposon Tn5. Nucleic Acids Res 2002; 30:2469-77. [PMID: 12034835 PMCID: PMC117195 DOI: 10.1093/nar/30.11.2469] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022] Open
Abstract
In parallel with the production of genomic sequence data, attention is being focused on the generation of comprehensive cDNA-sequence resources. Such efforts are increasingly emphasizing the production of high-accuracy sequence corresponding to the entire insert of cDNA clones, especially those presumed to reflect the full-length mRNA. The complete sequencing of cDNA clones on a large scale presents unique challenges because of the generally small, yet heterogeneous, sizes of the cloned inserts. We have developed a strategy for high-throughput sequencing of cDNA clones using the transposon Tn5. This approach has been tailored for implementation within an existing large-scale 'shotgun-style' sequencing program, although it could be readily adapted for use in virtually any sequencing environment. In addition, we have developed a modified version of our strategy that can be applied to cDNA clones with large cloning vectors, thereby overcoming a potential limitation of transposon-based approaches. Here we describe the details of our cDNA-sequencing pipeline, including a summary of the experience in sequencing more than 4200 cDNA clones to produce more than 8 million base pairs of high-accuracy cDNA sequence. These data provide both convincing evidence that the insertion of Tn5 into cDNA clones is sufficiently random for its effective use in large-scale cDNA sequencing as well as interesting insight about the sequence context preferred for insertion by Tn5.
Affiliation(s)
- Yuriy Shevchenko
- NIH Intramural Sequencing Center, National Institutes of Health, Gaithersburg, MD 20877, USA
|
27
|
Stapleton M, Carlson J, Brokstein P, Yu C, Champe M, George R, Guarin H, Kronmiller B, Pacleb J, Park S, Wan K, Rubin GM, Celniker SE. A Drosophila full-length cDNA resource. Genome Biol 2002; 3:RESEARCH0080. [PMID: 12537569 PMCID: PMC151182 DOI: 10.1186/gb-2002-3-12-research0080] [Citation(s) in RCA: 143] [Impact Index Per Article: 6.5] [Received: 10/21/2002] [Revised: 11/27/2002] [Accepted: 11/27/2002] [Indexed: 12/01/2022] Open
Abstract
BACKGROUND A collection of sequenced full-length cDNAs is an important resource both for functional genomics studies and for the determination of the intron-exon structure of genes. Providing this resource to the Drosophila melanogaster research community has been a long-term goal of the Berkeley Drosophila Genome Project. We have previously described the Drosophila Gene Collection (DGC), a set of putative full-length cDNAs that was produced by generating and analyzing over 250,000 expressed sequence tags (ESTs) derived from a variety of tissues and developmental stages. RESULTS We have generated high-quality full-insert sequence for 8,921 clones in the DGC. We compared the sequence of these clones to the annotated Release 3 genomic sequence, and identified more than 5,300 cDNAs that contain a complete and accurate protein-coding sequence. This corresponds to at least one splice form for 40% of the predicted D. melanogaster genes. We also identified potential new cases of RNA editing. CONCLUSIONS We show that comparison of cDNA sequences to a high-quality annotated genomic sequence is an effective approach to identifying and eliminating defective clones from a cDNA collection and ensure its utility for experimentation. Clones were eliminated either because they carry single nucleotide discrepancies, which most probably result from reverse transcriptase errors, or because they are truncated and contain only part of the protein-coding sequence.
Affiliation(s)
- Mark Stapleton
- Berkeley Drosophila Genome Project Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
|