1. Zhang X, Van Treeck B, Horton CA, McIntyre JJR, Palm SM, Shumate JL, Collins K. Harnessing eukaryotic retroelement proteins for transgene insertion into human safe-harbor loci. Nat Biotechnol 2025; 43:42-51. [PMID: 38379101 PMCID: PMC11371274 DOI: 10.1038/s41587-024-02137-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 12/14/2022] [Accepted: 01/10/2024] [Indexed: 02/22/2024]
Abstract
Current approaches for inserting autonomous transgenes into the genome, such as CRISPR-Cas9 or virus-based strategies, have limitations including low efficiency and high risk of untargeted genome mutagenesis. Here, we describe precise RNA-mediated insertion of transgenes (PRINT), an approach for site-specifically primed reverse transcription that directs transgene synthesis directly into the genome at a multicopy safe-harbor locus. PRINT uses delivery of two in vitro transcribed RNAs: messenger RNA encoding avian R2 retroelement-protein and template RNA encoding a transgene of length validated up to 4 kb. The R2 protein coordinately recognizes the target site, nicks one strand at a precise location and primes complementary DNA synthesis for stable transgene insertion. With a cultured human primary cell line, over 50% of cells can gain several 2 kb transgenes, of which more than 50% are full-length. PRINT advantages include no extragenomic DNA, limiting risk of deleterious mutagenesis and innate immune responses, and the relatively low cost, rapid production and scalability of RNA-only delivery.
Affiliation(s)
- Xiaozhu Zhang: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, USA
- Briana Van Treeck: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, USA
- Connor A Horton: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, USA
- Jeremy J R McIntyre: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, USA
- Sarah M Palm: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, USA
- Justin L Shumate: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, USA
- Kathleen Collins: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, USA

2. Lee RJ, Horton CA, Van Treeck B, McIntyre JJR, Collins K. Conserved and divergent DNA recognition specificities and functions of R2 retrotransposon N-terminal domains. Cell Rep 2024; 43:114239. [PMID: 38753487 PMCID: PMC11204384 DOI: 10.1016/j.celrep.2024.114239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 04/04/2024] [Accepted: 05/01/2024] [Indexed: 05/18/2024] Open
Abstract
R2 non-long terminal repeat (non-LTR) retrotransposons are among the most extensively distributed mobile genetic elements in multicellular eukaryotes and show promise for applications in transgene supplementation of the human genome. They insert new gene copies into a conserved site in 28S ribosomal DNA with exquisite specificity. R2 clades are defined by the number of zinc fingers (ZFs) at the N terminus of the retrotransposon-encoded protein, postulated to additively confer DNA site specificity. Here, we illuminate general principles of DNA recognition by R2 N-terminal domains across and between clades, with extensive, specific recognition requiring only one or two compact domains. DNA-binding and protection assays demonstrate broadly shared as well as clade-specific DNA interactions. Gene insertion assays in cells identify the N-terminal domains sufficient for target-site insertion and reveal roles in second-strand cleavage or synthesis for clade-specific ZFs. Our results have implications for understanding evolutionary diversification of non-LTR retrotransposon insertion mechanisms and the design of retrotransposon-based gene therapies.
Affiliation(s)
- Rosa Jooyoung Lee: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
- Connor A Horton: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
- Briana Van Treeck: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
- Jeremy J R McIntyre: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
- Kathleen Collins: Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA

3. A Survey of Transposon Landscapes in the Putative Ancient Asexual Ostracod Darwinula stevensoni. Genes (Basel) 2021; 12:genes12030401. [PMID: 33799706 PMCID: PMC7998251 DOI: 10.3390/genes12030401] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/10/2021] [Revised: 03/02/2021] [Accepted: 03/06/2021] [Indexed: 11/17/2022] Open
Abstract
How asexual reproduction shapes transposable element (TE) content and diversity in eukaryotic genomes remains debated. We performed an initial survey of TE load and diversity in the putative ancient asexual ostracod Darwinula stevensoni. We examined long contiguous stretches of DNA in clones from a genomic fosmid library, totaling about 2.5 Mb, and supplemented these data with results on TE abundance and diversity from an Illumina draft genome. In contrast to other TE studies in putatively ancient asexuals, which revealed relatively low TE content, we found that at least 19% of the fosmid dataset and 26% of the genome assembly corresponded to known transposons. We observed a high diversity of transposon families, including LINE, gypsy, PLE, mariner/Tc, hAT, CMC, Sola2, Ginger, Merlin, Harbinger, MITEs and helitrons, with DNA transposons predominating. The predominantly low levels of sequence diversity indicate that many TEs are or have recently been active. In the fosmid data, no correlation was found between telomeric repeats and non-LTR retrotransposons, which are present near telomeres in other taxa. Most TEs in the fosmid data were located outside of introns and almost none were found in exons. We also report an N-terminal Myb/SANT-like DNA-binding domain in site-specific R4/Dong non-LTR retrotransposons. Although these initial results on TE loads need to be verified with high-quality draft genomes, this study provides important first insights into TE dynamics in putative ancient asexual ostracods.

4. Pradhan M, Govindaraju A, Jagdish A, Christensen SM. The linker region of LINEs modulates DNA cleavage and DNA polymerization. Anal Biochem 2020; 603:113809. [PMID: 32511965 DOI: 10.1016/j.ab.2020.113809] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/15/2020] [Revised: 05/26/2020] [Accepted: 05/28/2020] [Indexed: 01/09/2023]
Abstract
Long interspersed elements (LINEs) replicate by target primed reverse transcription (TPRT). Insertion involves two half reactions. Each half reaction involves DNA cleavage followed by DNA synthesis. The linker region, located just beyond the reverse transcriptase in the LINE open reading frame, contains a set of predicted helices that may form an α-finger, followed by a gag-like zinc-knuckle. Point mutations of moderately conserved amino-acid residues in the presumptive α-finger severely impair the DNA endonuclease and reverse transcriptase activities of the integration reaction during both half reactions. Mutations in the gag-like zinc-knuckle also impair DNA cleavage and DNA synthesis in some instances. Mutations in core residues that presumably disrupt the protein structure of the presumptive α-finger and the gag-like zinc-knuckle lead to a promiscuous DNA endonuclease and protein-nucleic acid complexes that get stuck in the well during analysis. The linker region appears to function as a protein, DNA, and RNA conformational switching area. The linker is used to properly position nucleic acid substrates into the active sites of the reverse transcriptase and of the DNA endonuclease.
Affiliation(s)
- Monika Pradhan: Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Aruna Govindaraju: Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Athena Jagdish: Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA
- Shawn M Christensen: Department of Biology, University of Texas at Arlington, Arlington, TX, 76019, USA

5. Mahbub MM, Chowdhury SM, Christensen SM. Globular domain structure and function of restriction-like-endonuclease LINEs: similarities to eukaryotic splicing factor Prp8. Mob DNA 2017; 8:16. [PMID: 29151899 PMCID: PMC5678591 DOI: 10.1186/s13100-017-0097-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/09/2017] [Accepted: 10/17/2017] [Indexed: 12/16/2022] Open
Abstract
Background: R2 elements are a clade of early branching Long Interspersed Elements (LINEs). LINEs are retrotransposable elements whose replication can have profound effects on the genomes in which they reside. No crystal or EM structures exist for the reverse transcriptase (RT) and linker regions of LINEs. Results: Using limited proteolysis as a probe for globular domain structure, we show that the protein encoded by the Bombyx mori R2 element has two major globular domains: (1) a small globular domain consisting of the N-terminal zinc finger and Myb motifs, and (2) a large globular domain consisting of the RT, linker, and type II restriction-like endonuclease (RLE). Further digestion of the large globular domain occurred within the RT. Mapping these RT cleavages onto an updated model of the R2Bm RT indicated that the thumb of the RT was largely protected from proteolytic cleavage. The crystal structure of the large globular domain of Prp8, a eukaryotic splicing factor, was a major template used in building the R2Bm RT model, particularly the thumb region. The large fragment of Prp8 consists not only of an RT similar to that of R2Bm, but also an RLE and a linker connecting the two regions. The linker sequences adjacent to the RLE in LINEs and Prp8 share a set of two important α-helices and a (presumptive) knuckle/ββα structural motif that are closely associated with the thumb. The RLEs of LINEs and Prp8 share a unique catalytic core residue spacing as well as other key residues. Conclusions: The protein encoded by RLE LINEs consists of two major globular domains. The larger of the two globular domains contains the RT, linker, and RLE and is similar to the large fragment of the spliceosomal protein Prp8. The similarities are suggestive of possible common ancestry.
Affiliation(s)
- M Murshida Mahbub: Department of Biology, University of Texas at Arlington, 501 S. Nedderman Drive, Room 337, Arlington, TX 76010 USA
- Saiful M Chowdhury: Department of Chemistry and Biochemistry, University of Texas at Arlington, 700 Planetarium Place, Room 130, Arlington, TX 76010 USA
- Shawn M Christensen: Department of Biology, University of Texas at Arlington, 501 S. Nedderman Drive, Room 337, Arlington, TX 76010 USA

6. Integration site selection by retroviruses and transposable elements in eukaryotes. Nat Rev Genet 2017; 18:292-308. [PMID: 28286338 DOI: 10.1038/nrg.2017.7] [Citation(s) in RCA: 153] [Impact Index Per Article: 19.1] [Indexed: 12/12/2022]
Abstract
Transposable elements and retroviruses are found in most genomes, can be pathogenic and are widely used as gene-delivery and functional genomics tools. Exploring whether these genetic elements target specific genomic sites for integration and how this preference is achieved is crucial to our understanding of genome evolution, somatic genome plasticity in cancer and ageing, host-parasite interactions and genome engineering applications. High-throughput profiling of integration sites by next-generation sequencing, combined with large-scale genomic data mining and cellular or biochemical approaches, has revealed that the insertions are usually non-random. The DNA sequence, chromatin and nuclear context, and cellular proteins cooperate in guiding integration in eukaryotic genomes, leading to a remarkable diversity of insertion site distribution and evolutionary strategies.

7. Govindaraju A, Cortez JD, Reveal B, Christensen SM. Endonuclease domain of non-LTR retrotransposons: loss-of-function mutants and modeling of the R2Bm endonuclease. Nucleic Acids Res 2016; 44:3276-87. [PMID: 26961309 PMCID: PMC4838377 DOI: 10.1093/nar/gkw134] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/25/2015] [Revised: 02/22/2016] [Accepted: 02/23/2016] [Indexed: 01/07/2023] Open
Abstract
Non-LTR retrotransposons are an important class of mobile elements that insert into host DNA by target-primed reverse transcription (TPRT). Non-LTR retrotransposons must bind to their mRNA, recognize and cleave their target DNA, and perform TPRT at the site of DNA cleavage. As DNA binding and cleavage are such central parts of the integration reaction, a better understanding of the endonuclease encoded by non-LTR retrotransposons is needed. This paper explores the R2 endonuclease domain from Bombyx mori using in vitro studies and in silico modeling. Mutations in conserved sequences located across the putative PD-(D/E)XK endonuclease domain reduced DNA cleavage, DNA binding and TPRT. A mutation at the beginning of the first α-helix of the modeled endonuclease obliterated DNA cleavage and greatly reduced DNA binding. It also reduced TPRT when tested on pre-cleaved DNA substrates. The catalytic K was located to a non-canonical position within the second α-helix. A mutation located after the fourth β-strand reduced DNA binding and cleavage. The motifs that showed impaired activity form an extensive basic region. The R2 biochemical and structural data are compared and contrasted with that of two other well characterized PD-(D/E)XK endonucleases, restriction endonucleases and archaeal Holliday junction resolvases.
Affiliation(s)
- Aruna Govindaraju: Department of Biology, University of Texas at Arlington, Arlington, TX 76019-0498, USA
- Jeremy D. Cortez: Department of Biology, University of Texas at Arlington, Arlington, TX 76019-0498, USA
- Brad Reveal: Department of Biology, University of Texas at Arlington, Arlington, TX 76019-0498, USA
- Shawn M. Christensen: Department of Biology, University of Texas at Arlington, Arlington, TX 76019-0498, USA

8.
Abstract
Although most non-long terminal repeat (non-LTR) retrotransposons are incorporated into the host genome almost randomly, some are incorporated into specific sequences within a target site. On the basis of structural and phylogenetic features, non-LTR retrotransposons are classified into two large groups, restriction enzyme-like endonuclease (RLE)-encoding elements and apurinic/apyrimidinic endonuclease (APE)-encoding elements. All clades of RLE-encoding non-LTR retrotransposons include site-specific elements. However, only two of more than 20 APE-encoding clades, Tx1 and R1, contain site-specific non-LTR elements. Site-specific non-LTR retrotransposons usually target multi-copy RNA genes, such as rRNA gene (rDNA) clusters, or repetitive genomic sequences, such as telomeric repeats; this behavior may be a symbiotic strategy to reduce damage to the host genome. Site- and sequence-specificity are variable even among closely related non-LTR elements and appear to have changed during evolution. In the APE-encoding elements, the primary determinant of sequence-specific integration is APE itself, which nicks one strand of the target DNA during the initiation of target primed reverse transcription (TPRT). However, other factors, such as interaction between mRNA and the target DNA, and access to the target region in the nuclei, also affect the sequence-specificity. In contrast, in the RLE-encoding elements, DNA-binding motifs appear to affect sequence-specificity, rather than the RLE domain itself. The highly specific integration properties of these site-specific non-LTR elements make them ideal alternative tools for sequence-specific gene delivery, particularly for therapeutic purposes in human diseases.

9.
Abstract
R2 elements are sequence-specific non-LTR retrotransposons that exclusively insert in the 28S rRNA genes of animals. R2s encode an endonuclease that cleaves the insertion site and a reverse transcriptase that uses the cleaved DNA to prime reverse transcription of the R2 transcript, a process termed target primed reverse transcription. Additional unusual properties of the reverse transcriptase as well as DNA and RNA binding domains of the R2 encoded protein have been characterized. R2 expression is through co-transcription with the 28S gene and self-cleavage by a ribozyme encoded at the R2 5' end. Studies in laboratory stocks and natural populations of Drosophila suggest that R2 expression is tied to the distribution of R2-inserted units within the rDNA locus. Most individuals have no R2 expression because only a small fraction of their rRNA genes need to be active, and a contiguous region of the locus free of R2 insertions can be selected for activation. However, if the R2-free region is not large enough to produce sufficient rRNA, flanking units, including those inserted with R2, must be activated. Finally, R2 copies rapidly turn over within the rDNA locus, yet R2 has been vertically maintained in animal lineages for hundreds of millions of years. The key to this stability is R2's ability to remain dormant in rDNA units outside the transcribed regions for generations until the stochastic nature of the crossovers that drive the concerted evolution of the rDNA locus inevitably reshuffles the inserted and uninserted units, resulting in transcription of the R2-inserted units.