1. Liao X, Zhu W, Zhou J, Li H, Xu X, Zhang B, Gao X. Repetitive DNA sequence detection and its role in the human genome. Commun Biol 2023; 6:954. [PMID: 37726397 PMCID: PMC10509279 DOI: 10.1038/s42003-023-05322-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/29/2023] [Accepted: 09/04/2023] [Indexed: 09/21/2023]
Abstract
Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarize the definition, arrangement, and structural characteristics of repeats. In addition, we introduce the diverse biological functions of repeats and review existing methods for automatic repeat detection, classification, and masking. Finally, we analyze the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of their association with human diseases.
Affiliation(s)
- Xingyu Liao
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Wufei Zhu
  - Department of Endocrinology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, 443000, Yichang, P.R. China
- Juexiao Zhou
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Haoyang Li
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xiaopeng Xu
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Bin Zhang
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xin Gao
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
2. Zheng Y, Chen C, Wang M, Moawad AS, Wang X, Song C. SINE Insertion in the Pig Carbonic Anhydrase 5B (CA5B) Gene Is Associated with Changes in Gene Expression and Phenotypic Variation. Animals (Basel) 2023; 13:1942. [PMID: 37370452 DOI: 10.3390/ani13121942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 05/27/2023] [Accepted: 06/02/2023] [Indexed: 06/29/2023]
Abstract
Transposons are genetic elements that are present in mammalian genomes and occupy a large proportion of the pig genome, with retrotransposons being the most abundant. A previous study found that a SINE retrotransposon was inserted in the first intron of the CA5B gene in pigs, and the present study aimed to investigate the SINE insertion polymorphism in this gene in different pig breeds. Polymerase chain reaction (PCR) was used to confirm the polymorphism in 11 pig breeds and wild boars, and moderate polymorphism information content was found in 9 of the breeds. Further investigation in cell experiments revealed that the 330 bp SINE insertion in the RIP-CA5B site promoted expression activity in the weak promoter region of this site. Additionally, an enhancer verification vector experiment showed that the 330 bp SINE sequence acted as an enhancer on the core promoter region upstream of the CA5B gene region. The expression of CA5B in adipose tissue (back fat and leaf fat) in individuals with the (SINE+/+) genotype was significantly higher than in those with the (SINE+/-) and (SINE-/-) genotypes. The association analysis revealed that the (SINE+/+) genotype was significantly associated with a higher back fat thickness than the (SINE-/-) genotype. Moreover, the SINE insertion at the RIP-CA5B site carried ATTT repeats, and three types of (ATTT) repeats were identified among different individuals/breeds (i.e., (ATTT)4, (ATTT)6 and (ATTT)9). Overall, the study provides insights into the genetic basis of adipose tissue development in pigs and highlights the role of a SINE insertion in the CA5B gene in this process.
Affiliation(s)
- Yao Zheng
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Cai Chen
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
  - International Joint Research Laboratory, Universities of Jiangsu Province of China for Domestic Animal Germplasm Resources and Genetic Improvement, Yangzhou 225009, China
- Mengli Wang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Ali Shoaib Moawad
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
  - Department of Animal Production, Faculty of Agriculture, Kafrelsheikh University, Kafrelsheikh 33516, Egypt
- Xiaoyan Wang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Chengyi Song
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
3. Halabian R, Makałowski W. A Map of 3' DNA Transduction Variants Mediated by Non-LTR Retroelements on 3202 Human Genomes. Biology 2022; 11:1032. [PMID: 36101413 PMCID: PMC9311842 DOI: 10.3390/biology11071032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/07/2022] [Revised: 07/05/2022] [Accepted: 07/06/2022] [Indexed: 05/03/2023]
Abstract
As one of the major structural constituents, mobile elements comprise more than half of the human genome, among which Alu, L1, and SVA elements are still active and continue to generate new offspring. One of the major characteristics of L1 and SVA elements is their ability to co-mobilize adjacent downstream sequences to new loci in a process called 3' DNA transduction. Transductions influence the structure and content of the genome in different ways, such as increasing genome variation, exon shuffling, and gene duplication. Moreover, given their mutagenic capability, 3' transductions are often involved in tumorigenesis or in the development of some diseases. In this study, we analyzed 3202 genomes sequenced at high coverage by the New York Genome Center to catalog and characterize putative 3' transduced segments mediated by L1s and SVAs. Here, we present a genome-wide map of inter/intrachromosomal 3' transduction variants, including their genomic and functional location, length, progenitor location, and allelic frequency across 26 populations. In total, we identified 7103 polymorphic L1s and 3040 polymorphic SVAs. Of these, 268 and 162 variants were annotated as high-confidence L1 and SVA 3' transductions, respectively, with lengths that ranged from 7 to 997 nucleotides. We found specific loci within chromosomes X, 6, 7, and 6_GL000253v2_alt as master L1s and SVAs that had yielded more transductions, among others. Together, our results demonstrate the dynamic nature of transduction events within the genome and among individuals and their contribution to the structural variations of the human genome.
Affiliation(s)
- Wojciech Makałowski
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, 48149 Münster, Germany
4. Payer LM, Steranka JP, Kryatova MS, Grillo G, Lupien M, Rocha PP, Burns KH. Alu insertion variants alter gene transcript levels. Genome Res 2021; 31:2236-2248. [PMID: 34799402 PMCID: PMC8647820 DOI: 10.1101/gr.261305.120] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 01/17/2020] [Accepted: 09/23/2021] [Indexed: 12/23/2022]
Abstract
Alu are high copy number interspersed repeats that have accumulated near genes during primate and human evolution. They are a pervasive source of structural variation in modern humans. Impacts that Alu insertions may have on gene expression are not well understood, although some have been associated with expression quantitative trait loci (eQTLs). Here, we directly test regulatory effects of polymorphic Alu insertions in isolation from other variants on the same haplotype. To screen insertion variants for those with such effects, we used ectopic luciferase reporter assays and evaluated 110 Alu insertion variants, including more than 40 with a potential role in disease risk. We observed a continuum of effects with significant outliers that up- or down-regulate luciferase activity. Using a series of reporter constructs, which included genomic context surrounding the Alu, we can distinguish between instances in which the Alu disrupts another regulator and those in which the Alu introduces new regulatory sequence. We next focused on three polymorphic Alu loci associated with breast cancer that display significant effects in the reporter assay. We used CRISPR to modify the endogenous sequences, establishing cell lines varying in the Alu genotype. Our findings indicate that Alu genotype can alter expression of genes implicated in cancer risk, including PTHLH, RANBP9, and MYC. These data show that commonly occurring polymorphic Alu elements can alter transcript levels and potentially contribute to disease risk.
Affiliation(s)
- Lindsay M Payer
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Jared P Steranka
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Maria S Kryatova
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Giacomo Grillo
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Mathieu Lupien
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 1L7, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 1L7, Canada
  - Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada
- Pedro P Rocha
  - Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH, Bethesda, Maryland 20892-4340, USA
  - National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
- Kathleen H Burns
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
  - McKusick-Nathans Institute of Genetics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
5. Steely CJ, Russell KL, Feusier JE, Qiao Y, Tavtigian SV, Marth G, Jorde LB. Mobile element insertions and associated structural variants in longitudinal breast cancer samples. Sci Rep 2021; 11:13020. [PMID: 34158539 PMCID: PMC8219704 DOI: 10.1038/s41598-021-92444-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/02/2021] [Accepted: 06/07/2021] [Indexed: 02/05/2023]
Abstract
While mobile elements are largely inactive in healthy somatic tissues, increased activity has been found in cancer tissues, with significant variation among different cancer types. In addition to insertion events, mobile elements have also been found to mediate many structural variation events in the genome. Here, to better understand the timing and impact of mobile element insertions and associated structural variants in cancer, we examined their activity in longitudinal samples of four metastatic breast cancer patients. We identified 11 mobile element insertions or associated structural variants and found that the majority of these occurred early in tumor progression. Most of the variants impact intergenic regions; however, we identified a translocation interrupting MAP2K4 involving Alu elements and a deletion in YTHDF2 involving mobile elements that likely inactivate reported tumor suppressor genes. The high variant allele fraction of the translocation, the loss of the other copy of MAP2K4, the recurrent loss-of-function mutations found in this gene in other cancers, and the important function of MAP2K4 indicate that this translocation is potentially a driver mutation. Overall, using a unique longitudinal dataset, we find that most variants are likely passenger mutations in the four patients we examined, but some variants impact tumor progression.
Affiliation(s)
- Cody J Steely
  - Department of Human Genetics, University of Utah School of Medicine, 15 N. 2030 E. Rm 5100, Salt Lake City, UT, 84112, USA
- Kristi L Russell
  - Department of Human Genetics, University of Utah School of Medicine, 15 N. 2030 E. Rm 5100, Salt Lake City, UT, 84112, USA
- Julie E Feusier
  - Department of Human Genetics, University of Utah School of Medicine, 15 N. 2030 E. Rm 5100, Salt Lake City, UT, 84112, USA
- Yi Qiao
  - Department of Human Genetics, University of Utah School of Medicine, 15 N. 2030 E. Rm 5100, Salt Lake City, UT, 84112, USA
  - Utah Center for Genetic Discovery, Salt Lake City, UT, 84112, USA
- Sean V Tavtigian
  - Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT, 84112, USA
- Gabor Marth
  - Department of Human Genetics, University of Utah School of Medicine, 15 N. 2030 E. Rm 5100, Salt Lake City, UT, 84112, USA
  - Utah Center for Genetic Discovery, Salt Lake City, UT, 84112, USA
- Lynn B Jorde
  - Department of Human Genetics, University of Utah School of Medicine, 15 N. 2030 E. Rm 5100, Salt Lake City, UT, 84112, USA
  - Utah Center for Genetic Discovery, Salt Lake City, UT, 84112, USA
6. Genetic Diversity and Population Structures in Chinese Miniature Pigs Revealed by SINE Retrotransposon Insertion Polymorphisms, a New Type of Genetic Markers. Animals (Basel) 2021; 11:ani11041136. [PMID: 33921134 PMCID: PMC8071531 DOI: 10.3390/ani11041136] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/11/2021] [Revised: 04/08/2021] [Accepted: 04/14/2021] [Indexed: 12/13/2022]
Abstract
Simple Summary: Our previous studies suggested that short interspersed nuclear element (SINE) retrotransposon insertion polymorphisms (RIPs), a recently developed type of molecular marker, are ideal markers with the potential to be used for population genetic analysis and molecular breeding in pigs, and possibly in other livestock animals as well. However, no report is available on the application of SINE RIPs in population genetic analysis in livestock, including pigs. Here, we evaluated 30 SINE RIPs in several indigenous Chinese miniature pig breeds, including three subpopulations of Bama pigs (BM-cov, BM-clo, and BM-inb). BM-cov is a subpopulation conserved in the national conservation farm, and BM-clo is a closed population maintained for over 30 years with only 2 boars and 14 sows imported from its original area, while the BM-inb herd is an 18-generation continuous inbreeding line based on the BM-clo population. To our knowledge, this is the first report of the genetic diversity, breed differentiation, and population structures of these populations using SINE RIPs, which suggests the feasibility of SINE RIPs in pig genetic analysis.
Abstract: RIPs have been developed as effective genetic markers and widely applied for genetic analysis in plants, but few reports are available for domestic animals. Here, we established 30 new molecular markers based on SINE RIPs and applied them to population genetic analysis in seven Chinese miniature pig populations. The data revealed that the closed herd (BM-clo) and the inbreeding herd (BM-inb) of Bama miniature pigs were distinctly different from the BM-cov herd in the conservation farm and from the other miniature pigs (Wuzhishan, Congjiang Xiang, Tibetan, and Mingguang small ear). These latter five miniature pig breeds can be further classified into two clades based on a phylogenetic tree: one included BM-cov and Wuzhishan, and the other included Congjiang Xiang, Tibetan, and Mingguang small ear, which was well supported by structure analysis. The polymorphism information content estimated using SINE RIPs is lower than that predicted from microsatellites. Overall, the genetic distances and breed relationships between these populations revealed by the 30 SINE RIPs generally agree with their evolution and geographic distributions. We demonstrated the potential of SINE RIPs as new genetic markers for genetic monitoring and population structure analysis in pigs, which can even be extended to other livestock animals.
7. Cao X, Zhang Y, Payer LM, Lords H, Steranka JP, Burns KH, Xing J. Polymorphic mobile element insertions contribute to gene expression and alternative splicing in human tissues. Genome Biol 2020; 21:185. [PMID: 32718348 PMCID: PMC7385971 DOI: 10.1186/s13059-020-02101-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/18/2020] [Accepted: 07/14/2020] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Mobile elements are a major source of structural variants in the human genome, and some mobile elements can regulate gene expression and transcript splicing. However, the impact of polymorphic mobile element insertions (pMEIs) on gene expression and splicing in diverse human tissues has not been thoroughly studied. The multi-tissue gene expression and whole genome sequencing data generated by the Genotype-Tissue Expression (GTEx) project provide a great opportunity to systematically evaluate the role of pMEIs in regulating gene expression in human tissues.
RESULTS: Using the GTEx whole genome sequencing data, we identify 20,545 high-quality pMEIs from 639 individuals. Coupling pMEI genotypes with gene expression profiles, we identify pMEI-associated expression quantitative trait loci (eQTLs) and splicing quantitative trait loci (sQTLs) in 48 tissues. Using joint analyses of pMEIs and other genomic variants, pMEIs are predicted to be the potential causal variant for 3522 eQTLs and 3717 sQTLs. The pMEI-associated eQTLs and sQTLs show a high level of tissue specificity, and these pMEIs are enriched in the proximity of affected genes and in regulatory elements. Using reporter assays, we confirm that several pMEIs associated with eQTLs and sQTLs can alter gene expression levels and isoform proportions, respectively.
CONCLUSION: Overall, our study shows that pMEIs are associated with thousands of gene expression and splicing variations, indicating that pMEIs could have a significant role in regulating tissue-specific gene expression and transcript splicing. Detailed mechanisms for the role of pMEIs in gene regulation in different tissues will be an important direction for future studies.
Affiliation(s)
- Xiaolong Cao
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Yeting Zhang
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Lindsay M Payer
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
- Hannah Lords
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Jared P Steranka
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
- Kathleen H Burns
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
- Jinchuan Xing
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
8. Santagostino M, Piras FM, Cappelletti E, Del Giudice S, Semino O, Nergadze SG, Giulotto E. Insertion of Telomeric Repeats in the Human and Horse Genomes: An Evolutionary Perspective. Int J Mol Sci 2020; 21:E2838. [PMID: 32325780 PMCID: PMC7215372 DOI: 10.3390/ijms21082838] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/14/2020] [Revised: 04/15/2020] [Accepted: 04/16/2020] [Indexed: 01/06/2023]
Abstract
Interstitial telomeric sequences (ITSs) are short stretches of telomeric-like repeats (TTAGGG)n at nonterminal chromosomal sites. We previously demonstrated that, in the genomes of primates and rodents, ITSs were inserted during the repair of DNA double-strand breaks. These conclusions were derived from sequence comparisons of ITS-containing loci and ITS-less orthologous loci in different species. To our knowledge, insertion polymorphism of ITSs, i.e., the presence of an ITS-containing allele and an ITS-less allele in the same species, has not been described. In this work, we carried out a genome-wide analysis of 2504 human genomic sequences retrieved from the 1000 Genomes Project and a PCR-based analysis of 209 human DNA samples. In spite of the large number of individual genomes analyzed, we did not find any evidence of insertion polymorphism in the human population. In contrast, the analysis of ITS loci in the genome of a single horse individual, the reference genome, allowed us to identify five heterozygous ITS loci, suggesting that insertion polymorphism of ITSs is an important source of genetic variability in this species. Finally, following a comparative sequence analysis of horse ITSs and of their orthologous empty loci in other Perissodactyla, we propose models for the mechanism of ITS insertion during the evolution of this order.
Affiliation(s)
- Elena Giulotto
  - Department of Biology and Biotechnology, University of Pavia, 27100 Pavia, Italy; (M.S.); (F.M.P.); (E.C.); (S.D.G.); (O.S.); (S.G.N.)
9. Goubert C, Zevallos NA, Feschotte C. Contribution of unfixed transposable element insertions to human regulatory variation. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190331. [PMID: 32075552 PMCID: PMC7061991 DOI: 10.1098/rstb.2019.0331] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Accepted: 12/09/2019] [Indexed: 12/11/2022]
Abstract
Thousands of unfixed transposable element (TE) insertions segregate in the human population, but little is known about their impact on genome function. Recently, a few studies associated unfixed TE insertions with mRNA levels of adjacent genes, but the biological significance of these associations, their replicability across cell types and the mechanisms by which they may regulate genes remain largely unknown. Here, we performed a TE-expression QTL analysis of 444 lymphoblastoid cell lines (LCL) and 289 induced pluripotent stem cells using a newly developed set of genotypes for 2743 polymorphic TE insertions. We identified 211 and 176 TE-eQTL acting in cis in each respective cell type. Approximately 18% were shared across cell types with strongly correlated effects. Furthermore, analysis of chromatin accessibility QTL in a subset of the LCL suggests that unfixed TEs often modulate the activity of enhancers and other distal regulatory DNA elements, which tend to lose accessibility when a TE inserts within them. We also document a case of an unfixed TE likely influencing gene expression at the post-transcriptional level. Our study points to broad and diverse cis-regulatory effects of unfixed TEs in the human population and underscores their plausible contribution to phenotypic variation. This article is part of a discussion meeting issue 'Crossroads between transposons and gene regulation'.
Affiliation(s)
- Cédric Feschotte
  - Department of Molecular Biology and Genetics, Cornell University, 526 Campus Road, Ithaca, NY 14853, USA
10. Loh JW, Ha H, Lin T, Sun N, Burns KH, Xing J. Integrated Mobile Element Scanning (ME-Scan) method for identifying multiple types of polymorphic mobile element insertions. Mob DNA 2020; 11:12. [PMID: 32110248 PMCID: PMC7035633 DOI: 10.1186/s13100-020-00207-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/28/2019] [Accepted: 02/14/2020] [Indexed: 01/29/2023]
Abstract
Background: Mobile elements are ubiquitous components of mammalian genomes and constitute more than half of the human genome. Polymorphic mobile element insertions (pMEIs) are a major source of human genomic variation and are gaining research interest because of their involvement in gene expression regulation, genome integrity, and disease.
Results: Building on our previous Mobile Element Scanning (ME-Scan) protocols, we developed an integrated ME-Scan protocol to identify three major active families of human mobile elements, AluYb, L1HS, and SVA. This approach selectively amplifies insertion sites of currently active retrotransposons for Illumina sequencing. By pooling the libraries together, we can identify pMEIs from all three mobile element families in one sequencing run. To demonstrate the utility of the new ME-Scan protocol, we sequenced 12 human parent-offspring trios. Our results showed high sensitivity (> 90%) and accuracy (> 95%) of the protocol for identifying pMEIs in the human genome. In addition, we tested the feasibility of identifying somatic insertions using the protocol.
Conclusions: The integrated ME-Scan protocol is a cost-effective way to identify novel pMEIs in the human genome. In addition, by developing the protocol to detect three mobile element families, we demonstrate the flexibility of the ME-Scan protocol. We present instructions for the library design, a sequencing protocol, and a computational pipeline for downstream analyses as a complete framework that will allow researchers to easily adapt the ME-Scan protocol to their own projects in other genomes.
Affiliation(s)
- Jui Wan Loh
  - Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
- Hongseok Ha
  - Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
  - Human Genetic Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, 08854 NJ USA
- Timothy Lin
  - Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
- Nawei Sun
  - Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
  - Human Genetic Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, 08854 NJ USA
- Kathleen H Burns
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, 21205 MD USA
- Jinchuan Xing
  - Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
  - Human Genetic Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, 08854 NJ USA
11. Gardner EJ, Prigmore E, Gallone G, Danecek P, Samocha KE, Handsaker J, Gerety SS, Ironfield H, Short PJ, Sifrim A, Singh T, Chandler KE, Clement E, Lachlan KL, Prescott K, Rosser E, FitzPatrick DR, Firth HV, Hurles ME. Contribution of retrotransposition to developmental disorders. Nat Commun 2019; 10:4630. [PMID: 31604926 PMCID: PMC6789007 DOI: 10.1038/s41467-019-12520-y] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 11/28/2018] [Accepted: 09/11/2019] [Indexed: 02/08/2023]
Abstract
Mobile genetic elements (MEs) are segments of DNA which can copy themselves and other transcribed sequences through the process of retrotransposition (RT). In humans, several disorders have been attributed to RT, but the role of RT in severe developmental disorders (DD) has not yet been explored. Here we identify RT-derived events in 9738 exome-sequenced trios with DD-affected probands. We ascertain 9 de novo MEs, 4 of which are likely causative of the patient's symptoms (0.04%), as well as 2 de novo gene retroduplications. Beyond identifying likely diagnostic RT events, we estimate the genome-wide germline ME mutation rate and selective constraint and demonstrate that coding RT events have signatures of purifying selection equivalent to those of truncating mutations. Overall, our analysis represents a comprehensive interrogation of the impact of retrotransposition on protein-coding genes and a framework for future evolutionary and disease studies.
Collapse
Affiliation(s)
- Eugene J Gardner
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Elena Prigmore
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Giuseppe Gallone
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Petr Danecek
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Kaitlin E Samocha
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Juliet Handsaker
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Sebastian S Gerety
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Holly Ironfield
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Patrick J Short
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Alejandro Sifrim
- Department of Human Genetics, KU Leuven, Herestraat 49, Box 602, Leuven, B-3000, Belgium
| | - Tarjinder Singh
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK
| | - Kate E Chandler
- Manchester Centre for Genomic Medicine, Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, Greater, Manchester, M13 9WL, UK
| | - Emma Clement
- Department of Clinical Genetics, North East Thames Regional Genetics Service, Great Ormond Street Hospital for Children NHS Trust, Holborn, London, WC1N 3JH, UK
| | - Katherine L Lachlan
- Wessex Clinical Genetics Service, Southampton University Hospitals NHS Foundation Trust, Princess Anne Hospital, Southampton, SO16 5YA, UK; Faculty of Medicine, Human Development and Health, University of Southampton, Southampton, SO17 1BJ, UK
| | - Katrina Prescott
- Clinical Genetics Department, Yorkshire Regional Genetics Service, Leeds Teaching Hospitals NHS Trust, Chapel Allerton Hospital, Leeds, LS7 4SA, UK
| | - Elisabeth Rosser
- Department of Clinical Genetics, North East Thames Regional Genetics Service, Great Ormond Street Hospital for Children NHS Trust, Holborn, London, WC1N 3JH, UK
| | - David R FitzPatrick
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, WGH, Edinburgh, EH4 2SP, UK
| | - Helen V Firth
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK; East Anglian Medical Genetics Service, Box 134, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
| | - Matthew E Hurles
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, CB10 1SA, UK.
| |
|
12
|
Zhang L, Wang X, Chen C, Wang W, Yang K, Shen D, Wang S, Gao B, Guo Y, Mao J, Song C. Development of retrotransposons insertion polymorphic markers and application in the genetic variation evaluation of Chinese Bama miniature pigs. CANADIAN JOURNAL OF ANIMAL SCIENCE 2019. [DOI: 10.1139/cjas-2018-0138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Retrotransposons are genetic elements that can amplify themselves in a genome and are abundant in many eukaryotic organisms. In this study, we established new short interspersed nuclear element (SINE) and endogenous retrovirus (ERV) retrotransposon insertion polymorphism (RTIP) markers using a BLAT alignment-based strategy followed by PCR evaluation. We investigated the genetic variation among four subpopulations of Chinese Bama miniature pigs (BM): BM in the national conservation farm (BM-cov), the BM inbreeding population (BM-inb) and the BM closed herd (BM-clo) at Guangxi University, and BM in the experimental pig farm of Yangzhou University (BM-yzu). Genetic distance, polymorphism information content (PIC) and heterozygosity (He) of these markers were measured in the four BM subpopulations. Twelve SINE and twenty-eight ERV polymorphic molecular markers were identified in the four subpopulations. The BM-cov pigs showed the highest He and PIC, which indicates that BM-cov pigs maintain relatively high genetic diversity. BM-inb pigs showed the lowest He and PIC, indicating less variation and a high degree of inbreeding. Microsatellite polymorphism in the four BM populations also supported the results obtained with these RTIP markers. In summary, retrotransposon insertion polymorphic markers could be a useful tool for population genetic variation analysis. The current SINE and ERV variation data may also provide a reference guide for the conservation and utilization of the BM miniature pig resource.
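The He and PIC statistics named in this abstract can be illustrated with a short sketch. The abstract does not say which estimators the authors used, so the code below assumes the textbook definitions: expected heterozygosity He = 1 - sum(p_i^2), and the polymorphism information content formula of Botstein et al. The function names and the genotype counts are hypothetical and serve only as an illustration for a biallelic insertion marker.

```python
def allele_freqs(n_ins_ins, n_ins_abs, n_abs_abs):
    """Allele frequencies (insertion, no insertion) from genotype counts.

    Hypothetical helper: counts are individuals homozygous for the
    insertion, heterozygous, and homozygous for its absence.
    """
    n = n_ins_ins + n_ins_abs + n_abs_abs
    if n == 0:
        raise ValueError("no genotyped individuals")
    p_ins = (2 * n_ins_ins + n_ins_abs) / (2 * n)
    return [p_ins, 1.0 - p_ins]


def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies (assumed definition)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2 (assumed definition)."""
    homozygosity = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Made-up counts for one marker in one subpopulation.
    freqs = allele_freqs(n_ins_ins=5, n_ins_abs=10, n_abs_abs=5)
    print(freqs, expected_heterozygosity(freqs), pic(freqs))
```

Under these definitions a biallelic marker reaches its maximum He at equal allele frequencies, and a marker fixed for one allele gives He = PIC = 0, which is the direction of the BM-cov versus BM-inb contrast described above.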
|
13
|
Puurand T, Kukuškina V, Pajuste FD, Remm M. AluMine: alignment-free method for the discovery of polymorphic Alu element insertions. Mob DNA 2019; 10:31. [PMID: 31360240 PMCID: PMC6639938 DOI: 10.1186/s13100-019-0174-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/21/2019] [Accepted: 07/12/2019] [Indexed: 01/09/2023] Open
Abstract
Background: Recently, alignment-free sequence analysis methods have gained popularity in the field of personal genomics. These methods are based on counting frequencies of short k-mer sequences, thus allowing faster and more robust analysis compared to traditional alignment-based methods. Results: We have created a fast alignment-free method, AluMine, to analyze polymorphic insertions of Alu elements in the human genome. We tested the method on 2,241 individuals from the Estonian Genome Project and identified 28,962 potential polymorphic Alu element insertions. Each tested individual had on average 1,574 Alu element insertions that were different from those in the reference genome. In addition, we propose an alignment-free genotyping method that uses the frequency of insertion/deletion-specific 32-mer pairs to call the genotype directly from raw sequencing reads. Using this method, the concordance between the predicted and experimentally observed genotypes was 98.7%. The running time of the discovery pipeline is approximately 2 h per individual. The genotyping of potential polymorphic insertions takes between 0.4 and 4 h per individual, depending on the hardware configuration. Conclusions: AluMine provides tools that allow discovery of novel Alu element insertions and/or genotyping of known Alu element insertions from personal genomes within a few hours.
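The idea of genotyping from insertion/deletion-specific k-mer pairs can be sketched in a few lines. This is not the AluMine implementation: the function names, the count thresholds, and the allele-fraction cut-offs are assumptions chosen for illustration, and reverse-complement handling and sequencing-error modelling are deliberately left out.

```python
from collections import Counter

K = 32  # k-mer length mentioned in the abstract


def count_kmers(reads, targets, k=K):
    """Count how often each target k-mer occurs in raw reads (forward strand only)."""
    targets = set(targets)
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in targets:
                counts[kmer] += 1
    return counts


def call_genotype(ins_count, ref_count, min_total=4, het_low=0.2, het_high=0.8):
    """Call a genotype from the counts of one k-mer pair.

    ins_count: occurrences of the k-mer specific to the allele with the insertion.
    ref_count: occurrences of the k-mer specific to the allele without it.
    All thresholds are hypothetical defaults, not values from the paper.
    """
    total = ins_count + ref_count
    if total < min_total:
        return "./."  # too little evidence to call
    fraction = ins_count / total
    if fraction < het_low:
        return "0/0"
    if fraction > het_high:
        return "1/1"
    return "0/1"


if __name__ == "__main__":
    print(call_genotype(ins_count=11, ref_count=9))
```

Because only a fixed set of k-mers is looked up in each read, no read alignment is needed, which is the property the abstract credits for the speed of the approach.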
Affiliation(s)
- Tarmo Puurand
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Viktoria Kukuškina
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | | | - Maido Remm
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| |
|
14
|
Sanchez-Luque FJ, Kempen MJHC, Gerdes P, Vargas-Landin DB, Richardson SR, Troskie RL, Jesuadian JS, Cheetham SW, Carreira PE, Salvador-Palomeque C, García-Cañadas M, Muñoz-Lopez M, Sanchez L, Lundberg M, Macia A, Heras SR, Brennan PM, Lister R, Garcia-Perez JL, Ewing AD, Faulkner GJ. LINE-1 Evasion of Epigenetic Repression in Humans. Mol Cell 2019; 75:590-604.e12. [PMID: 31230816 DOI: 10.1016/j.molcel.2019.05.024] [Citation(s) in RCA: 97] [Impact Index Per Article: 16.2] [Received: 10/11/2018] [Revised: 04/08/2019] [Accepted: 05/15/2019] [Indexed: 02/07/2023]
Abstract
Epigenetic silencing defends against LINE-1 (L1) retrotransposition in mammalian cells. However, the mechanisms that repress young L1 families and how L1 escapes to cause somatic genome mosaicism in the brain remain unclear. Here we report that a conserved Yin Yang 1 (YY1) transcription factor binding site mediates L1 promoter DNA methylation in pluripotent and differentiated cells. By analyzing 24 hippocampal neurons with three distinct single-cell genomic approaches, we characterized and validated a somatic L1 insertion bearing a 3' transduction. The source (donor) L1 for this insertion was slightly 5' truncated, lacked the YY1 binding site, and was highly mobile when tested in vitro. Locus-specific bisulfite sequencing revealed that the donor L1 and other young L1s with mutated YY1 binding sites were hypomethylated in embryonic stem cells, during neurodifferentiation, and in liver and brain tissue. These results explain how L1 can evade repression and retrotranspose in the human body.
Affiliation(s)
- Francisco J Sanchez-Luque
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia; GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain.
| | - Marie-Jeanne H C Kempen
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia; MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine (IGMM), University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
| | - Patricia Gerdes
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Dulce B Vargas-Landin
- Australian Research Council Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, the University of Western Australia, Perth, WA 6009, Australia; Harry Perkins Institute of Medical Research, Perth, WA 6009, Australia
| | - Sandra R Richardson
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Robin-Lee Troskie
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - J Samuel Jesuadian
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Seth W Cheetham
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Patricia E Carreira
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Carmen Salvador-Palomeque
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Marta García-Cañadas
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain
| | - Martin Muñoz-Lopez
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain
| | - Laura Sanchez
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain
| | - Mischa Lundberg
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Angela Macia
- Department of Pediatrics/Rady Children's Hospital San Diego, School of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Sara R Heras
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain; Department of Biochemistry and Molecular Biology II, Faculty of Pharmacy, University of Granada, Campus Universitario de Cartuja, 18071 Granada, Spain
| | - Paul M Brennan
- Edinburgh Cancer Research Centre, Western General Hospital, Edinburgh, EH4 2XR, UK
| | - Ryan Lister
- Australian Research Council Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, the University of Western Australia, Perth, WA 6009, Australia; Harry Perkins Institute of Medical Research, Perth, WA 6009, Australia
| | - Jose L Garcia-Perez
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain; MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine (IGMM), University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
| | - Adam D Ewing
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Geoffrey J Faulkner
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia; Queensland Brain Institute, University of Queensland, Brisbane, QLD 4072, Australia.
| |
|
15
|
Tang Y, Ma X, Zhao S, Xue W, Zheng X, Sun H, Gu P, Zhu Z, Sun C, Liu F, Tan L. Identification of an active miniature inverted-repeat transposable element mJing in rice. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019; 98:639-653. [PMID: 30689248 PMCID: PMC6850418 DOI: 10.1111/tpj.14260] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/04/2018] [Revised: 01/01/2019] [Accepted: 01/18/2019] [Indexed: 05/27/2023]
Abstract
Miniature inverted-repeat transposable elements (MITEs) are structurally homogeneous non-autonomous DNA transposons with high copy numbers that play important roles in genome evolution and diversification. Here, we analyzed the rice high-tillering dwarf (htd) mutant in an advanced backcross population between cultivated and wild rice, and identified an active MITE named miniature Jing (mJing). The mJing element belongs to the PIF/Harbinger superfamily. The japonica rice variety Nipponbare and the indica variety 93-11 harbor 72 and 79 mJing family members, respectively, which have undergone multiple rounds of amplification bursts during the evolution of Asian cultivated rice (Oryza sativa L.). A heterologous transposition experiment in Arabidopsis thaliana indicated that the autonomous element Jing is likely to have provided the transposase needed for mJing mobilization. We identified 297 mJing insertion sites and their presence/absence polymorphism among 71 rice samples through targeted high-throughput sequencing. The results showed that the copy number of mJing varies dramatically among Asian cultivated rice (O. sativa), its wild ancestor (O. rufipogon), and African cultivated rice (O. glaberrima) and that some mJing insertions are subject to directional selection. These findings suggest that the amplification and removal of mJing elements have played an important role in rice genome evolution and species diversification.
Affiliation(s)
- Yanyan Tang
- State Key Laboratory of Plant Physiology and Biochemistry, China Agricultural University, Beijing 100193, China
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Xin Ma
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Shuangshuang Zhao
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Wei Xue
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Xu Zheng
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Hongying Sun
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Ping Gu
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Zuofeng Zhu
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Chuanqing Sun
- State Key Laboratory of Plant Physiology and Biochemistry, China Agricultural University, Beijing 100193, China
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Fengxia Liu
- State Key Laboratory of Plant Physiology and Biochemistry, China Agricultural University, Beijing 100193, China
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Lubin Tan
- National Center for Evaluation of Agricultural Wild Plants (Rice), MOE Laboratory of Crop Heterosis and Utilization, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| |
|
16
|
Dynamic Methylation of an L1 Transduction Family during Reprogramming and Neurodifferentiation. Mol Cell Biol 2019; 39:MCB.00499-18. [PMID: 30692270 PMCID: PMC6425141 DOI: 10.1128/mcb.00499-18] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 10/24/2018] [Accepted: 01/11/2019] [Indexed: 01/28/2023] Open
Abstract
The retrotransposon LINE-1 (L1) is a significant source of endogenous mutagenesis in humans. In each individual genome, a few retrotransposition-competent L1s (RC-L1s) can generate new heritable L1 insertions in the early embryo, primordial germ line, and germ cells. L1 retrotransposition can also occur in the neuronal lineage and cause somatic mosaicism. Although DNA methylation mediates L1 promoter repression, the temporal pattern of methylation applied to individual RC-L1s during neurogenesis is unclear. Here, we identified a de novo L1 insertion in a human induced pluripotent stem cell (hiPSC) line via retrotransposon capture sequencing (RC-seq). The L1 insertion was full-length and carried 5' and 3' transductions. The corresponding donor RC-L1 was part of a large and recently active L1 transduction family and was highly mobile in a cultured-cell L1 retrotransposition reporter assay. Notably, we observed distinct and dynamic DNA methylation profiles for the de novo L1 and members of its extended transduction family during neuronal differentiation. These experiments reveal how a de novo L1 insertion in a pluripotent stem cell is rapidly recognized and repressed, albeit incompletely, by the host genome during neurodifferentiation, while retaining potential for further retrotransposition.
|
17
|
Steranka JP, Tang Z, Grivainis M, Huang CRL, Payer LM, Rego FOR, Miller TLA, Galante PAF, Ramaswami S, Heguy A, Fenyö D, Boeke JD, Burns KH. Transposon insertion profiling by sequencing (TIPseq) for mapping LINE-1 insertions in the human genome. Mob DNA 2019; 10:8. [PMID: 30899333 PMCID: PMC6407172 DOI: 10.1186/s13100-019-0148-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 10/23/2018] [Accepted: 01/14/2019] [Indexed: 12/14/2022] Open
Abstract
Background: Transposable elements make up a significant portion of the human genome. Accurately locating these mobile DNAs is vital to understand their role as a source of structural variation and somatic mutation. To this end, laboratories have developed strategies to selectively amplify or otherwise enrich transposable element insertion sites in genomic DNA. Results: Here we describe a technique, Transposon Insertion Profiling by sequencing (TIPseq), to map Long INterspersed Element 1 (LINE-1, L1) retrotransposon insertions in the human genome. This method uses vectorette PCR to amplify species-specific L1 (L1PA1) insertion sites followed by paired-end Illumina sequencing. In addition to providing a step-by-step molecular biology protocol, we offer users a guide to our pipeline for data analysis, TIPseqHunter. Our recent studies in pancreatic and ovarian cancer demonstrate the ability of TIPseq to identify invariant (fixed), polymorphic (inherited variants), as well as somatically-acquired L1 insertions that distinguish cancer genomes from a patient’s constitutional make-up. Conclusions: TIPseq provides an approach for amplifying evolutionarily young, active transposable element insertion sites from genomic DNA. Our rationale and variations on this protocol may be useful to those mapping L1 and other mobile elements in complex genomes.
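The three insertion classes named in this abstract (fixed, polymorphic, somatically acquired) can be separated with simple set logic once insertion loci have been called. The sketch below is not TIPseqHunter: the function name, the locus labels, and the idea of a catalogue of known fixed loci are hypothetical, and it assumes that calls from a tumor sample and its matched normal sample have already been reduced to comparable locus identifiers.

```python
def classify_insertions(tumor_loci, normal_loci, fixed_loci):
    """Split called L1 insertion loci into three illustrative classes.

    tumor_loci:  loci called in the tumor sample.
    normal_loci: loci called in the matched normal (constitutional) sample.
    fixed_loci:  hypothetical catalogue of insertions present in all individuals.
    """
    tumor_loci, normal_loci, fixed_loci = set(tumor_loci), set(normal_loci), set(fixed_loci)
    inherited = normal_loci                  # present in the constitutional genome
    return {
        "fixed": inherited & fixed_loci,        # inherited and invariant
        "polymorphic": inherited - fixed_loci,  # inherited but variable between people
        "somatic": tumor_loci - normal_loci,    # seen only in the tumor
    }


if __name__ == "__main__":
    # Made-up locus labels for illustration.
    result = classify_insertions(
        tumor_loci={"chr1:1000", "chr2:500", "chr7:42"},
        normal_loci={"chr1:1000", "chr2:500"},
        fixed_loci={"chr1:1000"},
    )
    for label, loci in result.items():
        print(label, sorted(loci))
```

In practice, loci called in different samples rarely share exact coordinates, so a real comparison would match calls within a positional tolerance rather than by identical labels.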
Affiliation(s)
- Jared P Steranka
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| | - Zuojian Tang
- Department for Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016 USA; Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Mark Grivainis
- Department for Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016 USA; Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Cheng Ran Lisa Huang
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| | - Lindsay M Payer
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| | - Fernanda O R Rego
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil
| | - Thiago Luiz Araujo Miller
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil; Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, Brazil
| | - Pedro A F Galante
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil
| | - Sitharam Ramaswami
- Genome Technology Center, Division of Advanced Research Technologies, NYU Langone Health, New York, NY USA
| | - Adriana Heguy
- Genome Technology Center, Division of Advanced Research Technologies, NYU Langone Health, New York, NY USA
| | - David Fenyö
- Department for Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016 USA; Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Jef D Boeke
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Kathleen H Burns
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| |
|
18
|
Tang W, Mun S, Joshi A, Han K, Liang P. Mobile elements contribute to the uniqueness of human genome with 15,000 human-specific insertions and 14 Mbp sequence increase. DNA Res 2019; 25:521-533. [PMID: 30052927 PMCID: PMC6191304 DOI: 10.1093/dnares/dsy022] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 03/28/2018] [Accepted: 06/20/2018] [Indexed: 02/02/2023] Open
Abstract
Mobile elements (MEs) collectively contribute to at least 50% of the human genome. Due to their past incremental accumulation and ongoing DNA transposition, MEs serve as a significant source of both inter- and intra-species genetic and phenotypic diversity during primate and human evolution. By making use of the most recent genome sequences for human and many other closely related primates and a robust multi-way comparative genomic approach, we identified a total of 14,870 human-specific MEs (HS-MEs), more than 8,000 of which are newly identified. Collectively, these HS-MEs contribute a total net genome sequence increase of 14.2 Mbp. Several new observations were made based on these HS-MEs, including the finding that the Y chromosome is a strikingly hot target for HS-MEs and a strong mutual preference for SINE-R/VNTR/Alu (SVAs). Furthermore, ∼8,000 of these HS-MEs were found to be located in the vicinity of ∼4,900 genes, and collectively they contribute ∼84 kb of sequence to the human reference transcriptome in association with over 300 genes, including protein-coding sequences for 40 genes. In conclusion, our results demonstrate that MEs made a significant contribution to the evolution of the human genome by participating in gene function in a human-specific fashion.
Affiliation(s)
- Wanxiangfu Tang
- Department of Biological Sciences, Brock University, St. Catharines, ON, Canada
| | - Seyoung Mun
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research, Center for Regenerative Medicine, Dankook University, Cheonan, Republic of Korea
| | - Aditya Joshi
- Department of Biological Sciences, Brock University, St. Catharines, ON, Canada
| | - Kyudong Han
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research, Center for Regenerative Medicine, Dankook University, Cheonan, Republic of Korea
| | - Ping Liang
- Department of Biological Sciences, Brock University, St. Catharines, ON, Canada
| |
|
19
|
Komkov AY, Minervina AA, Nugmanov GA, Saliutina MV, Khodosevich KV, Lebedev YB, Mamedov IZ. An advanced enrichment method for rare somatic retroelement insertions sequencing. Mob DNA 2018; 9:31. [PMID: 30450130 PMCID: PMC6208084 DOI: 10.1186/s13100-018-0136-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/09/2018] [Accepted: 10/15/2018] [Indexed: 12/18/2022] Open
Abstract
Background: There is increasing evidence that the transpositional activity of retroelements (REs) is not limited to germ line cells, but often occurs in tumor and normal somatic cells. Somatic transpositions have been found in several human tissues and are especially typical for the brain. Several computational and experimental approaches for the detection of somatic retroelement insertions were developed in the past few years. These approaches were successfully applied to detect somatic insertions in clonally expanded tumor cells. At the same time, identification of somatic insertions present in a small proportion of cells, such as neurons, remains a considerable challenge. Results: In this study, we developed a normalization procedure for library enrichment by DNA sequences corresponding to rare somatic RE insertions. Two rounds of normalization increased the number of fragments adjacent to somatic REs in the sequenced sample by more than 26-fold, and the number of identified somatic REs was increased by 8-fold. Conclusions: The developed technique can be used in combination with the vast majority of modern RE identification approaches and can dramatically increase their capacity to detect rare somatic RE insertions in different types of cells.
Affiliation(s)
- Alexander Y Komkov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya str. 16/10, Moscow, 117997 Russia; Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology, Samory Mashela str. 1, Moscow, 117997 Russia
| | - Anastasia A Minervina
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya str. 16/10, Moscow, 117997 Russia
| | - Gaiaz A Nugmanov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya str. 16/10, Moscow, 117997 Russia
| | - Mariia V Saliutina
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya str. 16/10, Moscow, 117997 Russia
| | - Konstantin V Khodosevich
- Biotech Research and Innovation Centre, Copenhagen University, Ole Maaløes Vej 5, Copenhagen, 2200 Denmark
| | - Yuri B Lebedev
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya str. 16/10, Moscow, 117997 Russia
| | - Ilgar Z Mamedov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya str. 16/10, Moscow, 117997 Russia; Pirogov Russian National Research Medical University, Ostrovitianov str. 1, Moscow, 117997 Russia
| |
|
20
|
Kent TV, Uzunović J, Wright SI. Coevolution between transposable elements and recombination. Philos Trans R Soc Lond B Biol Sci 2018; 372:rstb.2016.0458. [PMID: 29109221 DOI: 10.1098/rstb.2016.0458] [Citation(s) in RCA: 181] [Impact Index Per Article: 25.9] [Accepted: 04/18/2017] [Indexed: 12/24/2022] Open
Abstract
One of the most striking patterns of genome structure is the tight, typically negative, association between transposable elements (TEs) and meiotic recombination rates. While this is a highly recurring feature of eukaryotic genomes, the mechanisms driving correlations between TEs and recombination remain poorly understood, and distinguishing cause versus effect is challenging. Here, we review the evidence for a relation between TEs and recombination, and discuss the underlying evolutionary forces. Evidence to date suggests that overall TE densities correlate negatively with recombination, but the strength of this correlation varies across element types, and the pattern can be reversed. Results suggest that heterogeneity in the strength of selection against ectopic recombination and gene disruption can drive TE accumulation in regions of low recombination, but there is also strong evidence that the regulation of TEs can influence local recombination rates. We hypothesize that TE insertion polymorphism may be important in driving within-species variation in recombination rates in surrounding genomic regions. Furthermore, the interaction between TEs and recombination may create positive feedback, whereby TE accumulation in non-recombining regions contributes to the spread of recombination suppression. Further investigation of the coevolution between recombination and TEs has important implications for our understanding of the evolution of recombination rates and genome structure. This article is part of the themed issue 'Evolutionary causes and consequences of recombination rate variation in sexual organisms'.
Affiliation(s)
- Tyler V Kent
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks St, Toronto, Ontario, Canada M5S3B2
| | - Jasmina Uzunović
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks St, Toronto, Ontario, Canada M5S3B2
| | - Stephen I Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks St, Toronto, Ontario, Canada M5S3B2
| |
|
21
|
Identification of mutations in HEXA and HEXB in Sandhoff and Tay-Sachs diseases: a new large deletion caused by Alu elements in HEXA. Hum Genome Var 2018; 5:18003. [PMID: 31428437 PMCID: PMC6694291 DOI: 10.1038/hgv.2018.3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/31/2017] [Revised: 12/21/2017] [Accepted: 12/21/2017] [Indexed: 12/12/2022] Open
Abstract
GM2 gangliosidoses are a group of lysosomal lipid storage disorders that are due to mutations in HEXA, HEXB and GM2A. In our study, 10 patients with these diseases were enrolled, and Sanger sequencing was performed for the HEXA and HEXB genes. The results revealed one known splice site mutation (c.346+1G>A, IVS2+1G>A) and three novel mutations (a large deletion involving exons 6–10; one nucleotide deletion, c.622delG [p.D208Ifsx15]; and a missense mutation, c.919G>A [p.E307K]) in HEXA. In HEXB, one known mutation (c.1597C>T [p.R533C]) and one variant of uncertain significance (c.619A>G [p.I207V]) were identified. Five patients had c.1597C>T in HEXB, indicating a common mutation in south Iran. In this study, a unique large deletion in HEXA was identified in a homozygous state. To predict the cause of the large deletion in HEXA, RepeatMasker was used to investigate the Alu elements. In addition, to identify the breakpoint of this deletion, PCR was performed around these elements. Using RepeatMasker, different Alu elements were identified across HEXA, mainly in intron 5 and intron 10 adjacent to the deleted exons. PCR around the Alu elements and Sanger sequencing revealed the start point of the large deletion in AluSz6 in intron 6 and the end of its breakpoint 73 nucleotides downstream of AluJo in intron 10. Our study showed that HEXA is an Alu-rich gene that predisposes individuals to disease-associated large deletions due to these elements.
|
22
|
Kvikstad EM, Piazza P, Taylor JC, Lunter G. A high throughput screen for active human transposable elements. BMC Genomics 2018; 19:115. [PMID: 29390960 PMCID: PMC5796560 DOI: 10.1186/s12864-018-4485-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 09/25/2017] [Accepted: 01/16/2018] [Indexed: 11/30/2022] Open
Abstract
Background Transposable elements (TEs) are mobile genetic sequences that randomly propagate within their host’s genome. This mobility has the potential to affect gene transcription and cause disease. However, TEs are technically challenging to identify, which complicates efforts to assess the impact of TE insertions on disease. Here we present a targeted sequencing protocol and computational pipeline to identify polymorphic and novel TE insertions using next-generation sequencing: TE-NGS. The method simultaneously targets the three subfamilies that are responsible for the majority of recent TE activity (L1HS, AluYa5/8, and AluYb8/9) thereby obviating the need for multiple experiments and reducing the amount of input material required. Results Here we describe the laboratory protocol and detection algorithm, and a benchmark experiment for the reference genome NA12878. We demonstrate a substantial enrichment for on-target fragments, and high sensitivity and precision to both reference and NA12878-specific insertions. We report 17 previously unreported loci for this individual which are supported by orthogonal long-read evidence, and we identify 1470 polymorphic and novel TEs in 12 additional samples that were previously undocumented in databases of insertion polymorphisms. Conclusions We anticipate that future applications of TE-NGS alongside exome sequencing of patients with sporadic disease will reduce the number of unresolved cases, and improve estimates of the contribution of TEs to human genetic disease.
Affiliation(s)
- Erika M Kvikstad
- Wellcome Trust Centre for Human Genetics, Oxford, UK; National Institute for Health Research Comprehensive Biomedical Research Centre, Oxford, UK
| | - Paolo Piazza
- Wellcome Trust Centre for Human Genetics, Oxford, UK; Department of Medicine, Imperial College London, London, UK
| | - Jenny C Taylor
- Wellcome Trust Centre for Human Genetics, Oxford, UK; National Institute for Health Research Comprehensive Biomedical Research Centre, Oxford, UK
| | - Gerton Lunter
- Wellcome Trust Centre for Human Genetics, Oxford, UK
| |
|
23
|
Feusier J, Witherspoon DJ, Scott Watkins W, Goubert C, Sasani TA, Jorde LB. Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations. Mob DNA 2017; 8:9. [PMID: 28770012 PMCID: PMC5531096 DOI: 10.1186/s13100-017-0093-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/10/2017] [Accepted: 07/17/2017] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Polymorphic human Alu elements are excellent tools for assessing population structure, and new retrotransposition events can contribute to disease. Next-generation sequencing has greatly increased the potential to discover Alu elements in human populations, and various sequencing and bioinformatics methods have been designed to tackle the problem of detecting these highly repetitive elements. However, current techniques for Alu discovery may miss rare, polymorphic Alu elements. Combining multiple discovery approaches may provide a better profile of the polymorphic Alu mobilome. AluYb8/9 elements have been a focus of our recent studies as they are young subfamilies (~2.3 million years old) that contribute ~30% of recent polymorphic Alu retrotransposition events. Here, we update our ME-Scan methods for detecting Alu elements and apply these methods to discover new insertions in a large set of individuals with diverse ancestral backgrounds. RESULTS We identified 5,288 putative Alu insertion events, including several hundred novel AluYb8/9 elements from 213 individuals from 18 diverse human populations. Hundreds of these loci were specific to continental populations, and 23 non-reference population-specific loci were validated by PCR. We provide high-quality sequence information for 68 rare AluYb8/9 elements, of which 11 have hallmarks of an active source element. Our subfamily distribution of rare AluYb8/9 elements is consistent with previous datasets, and may be representative of rare loci. We also find that while ME-Scan and low-coverage, whole-genome sequencing (WGS) detect different Alu elements in 41 1000 Genomes individuals, the two methods yield similar population structure results. CONCLUSION Current in-silico methods for Alu discovery may miss rare, polymorphic Alu elements. Therefore, using multiple techniques can provide a more accurate profile of Alu elements in individuals and populations. 
We improved our false-negative rate as an indicator of sample quality for future ME-Scan experiments. In conclusion, we demonstrate that ME-Scan is a good supplement for next-generation sequencing methods and is well-suited for population-level analyses.
Affiliation(s)
- Julie Feusier
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - David J. Witherspoon
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - W. Scott Watkins
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - Clément Goubert
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - Thomas A. Sasani
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - Lynn B. Jorde
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| |
|
24
|
Abstract
Transposable elements give rise to interspersed repeats, sequences that comprise most of our genomes. These mobile DNAs have been historically underappreciated - both because they have been presumed to be unimportant, and because their high copy number and variability pose unique technical challenges. Neither impediment now seems steadfast. Interest in the human mobilome has never been greater, and methods enabling its study are maturing at a fast pace. This Review describes the activity of transposable elements in human cancers, particularly long interspersed element-1 (LINE-1). LINE-1 sequences are self-propagating, protein-coding retrotransposons, and their activity results in somatically acquired insertions in cancer genomes. Altered expression of transposable elements and animation of genomic LINE-1 sequences appear to be hallmarks of cancer, and can be responsible for driving mutations in tumorigenesis.
Affiliation(s)
- Kathleen H Burns
- Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
| |
|
25
|
Goubert C, Henri H, Minard G, Valiente Moro C, Mavingui P, Vieira C, Boulesteix M. High-throughput sequencing of transposable element insertions suggests adaptive evolution of the invasive Asian tiger mosquito towards temperate environments. Mol Ecol 2017; 26:3968-3981. [DOI: 10.1111/mec.14184] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 05/26/2016] [Revised: 05/06/2017] [Accepted: 05/08/2017] [Indexed: 12/21/2022]
Affiliation(s)
- Clement Goubert
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
- Department of Human Genetics; University of Utah; Salt Lake City UT USA
| | - Helene Henri
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
| | - Guillaume Minard
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Ecologie Microbienne; UMR CNRS 5557; UMR INRA 1418; Villeurbanne France
- Department of Biosciences; Metapopulation Research Center; University of Helsinki; Helsinki Finland
| | - Claire Valiente Moro
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Ecologie Microbienne; UMR CNRS 5557; UMR INRA 1418; Villeurbanne France
| | - Patrick Mavingui
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Ecologie Microbienne; UMR CNRS 5557; UMR INRA 1418; Villeurbanne France
- UMR PIMIT; INSERM 1187, CNRS 9192, IRD 249, Plateforme Technologique CYROI; Universite de La Reunion; Sainte-Clotilde Reunion
| | - Cristina Vieira
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
| | - Matthieu Boulesteix
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
| |
|
26
|
Transposable elements in cancer. Nat Rev Cancer 2017. [PMID: 28642606 DOI: 10.1038/nrc.2017.35] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Transposable elements give rise to interspersed repeats, sequences that comprise most of our genomes. These mobile DNAs have been historically underappreciated - both because they have been presumed to be unimportant, and because their high copy number and variability pose unique technical challenges. Neither impediment now seems steadfast. Interest in the human mobilome has never been greater, and methods enabling its study are maturing at a fast pace. This Review describes the activity of transposable elements in human cancers, particularly long interspersed element-1 (LINE-1). LINE-1 sequences are self-propagating, protein-coding retrotransposons, and their activity results in somatically acquired insertions in cancer genomes. Altered expression of transposable elements and animation of genomic LINE-1 sequences appear to be hallmarks of cancer, and can be responsible for driving mutations in tumorigenesis.
|
27
|
Structural variants caused by Alu insertions are associated with risks for many human diseases. Proc Natl Acad Sci U S A 2017; 114:E3984-E3992. [PMID: 28465436 DOI: 10.1073/pnas.1704117114] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Indexed: 01/19/2023] Open
Abstract
Interspersed repeat sequences comprise much of our DNA, although their functional effects are poorly understood. The most commonly occurring repeat is the Alu short interspersed element. New Alu insertions occur in human populations, and have been responsible for several instances of genetic disease. In this study, we sought to determine if there are instances of polymorphic Alu insertion variants that function in a common variant, common disease paradigm. We cataloged 809 polymorphic Alu elements mapping to 1,159 loci implicated in disease risk by genome-wide association study (GWAS) (P < 10⁻⁸). We found that Alu insertion variants occur disproportionately at GWAS loci (P = 0.013). Moreover, we identified 44 of these Alu elements in linkage disequilibrium (r² > 0.7) with the trait-associated SNP. This figure represents a >20-fold increase in the number of polymorphic Alu elements associated with human phenotypes. This work provides a broader perspective on how structural variants in repetitive DNAs may contribute to human disease.
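As an illustration of the linkage disequilibrium threshold mentioned in this abstract, the following is a minimal sketch of how an r² value between an Alu insertion and a SNP could be estimated. It assumes unphased genotypes coded as allele dosages (0, 1, 2) and uses the squared Pearson correlation of those dosages, which is a common approximation and not necessarily the procedure the authors used. The function name `ld_r2` and the sample data are hypothetical.

```python
def ld_r2(alu_dosage, snp_dosage):
    """Squared Pearson correlation between two genotype dosage vectors.

    Each vector holds one value per individual: 0, 1 or 2 copies of the
    insertion allele (Alu) or of the alternate allele (SNP).
    This genotype-based r^2 only approximates haplotype-based LD.
    """
    n = len(alu_dosage)
    if n == 0 or n != len(snp_dosage):
        raise ValueError("dosage vectors must be non-empty and of equal length")
    mean_a = sum(alu_dosage) / n
    mean_s = sum(snp_dosage) / n
    cov = sum((a - mean_a) * (s - mean_s) for a, s in zip(alu_dosage, snp_dosage))
    var_a = sum((a - mean_a) ** 2 for a in alu_dosage)
    var_s = sum((s - mean_s) ** 2 for s in snp_dosage)
    if var_a == 0 or var_s == 0:
        return 0.0  # a monomorphic site carries no LD information
    return (cov * cov) / (var_a * var_s)


# Hypothetical dosages for eight individuals (illustrative only).
alu = [0, 1, 2, 1, 0, 0, 2, 1]
snp = [0, 1, 2, 1, 0, 1, 2, 1]
in_ld = ld_r2(alu, snp) > 0.7  # threshold taken from the abstract
```

The 0.7 cutoff is the only value taken from the abstract; everything else in the sketch is an assumption made for illustration.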
|
28
|
Kryatova MS, Steranka JP, Burns KH, Payer LM. Insertion and deletion polymorphisms of the ancient AluS family in the human genome. Mob DNA 2017; 8:6. [PMID: 28450901 PMCID: PMC5402677 DOI: 10.1186/s13100-017-0089-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 01/18/2017] [Accepted: 04/04/2017] [Indexed: 01/09/2023] Open
Abstract
Background Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Results Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3′ intact with 3′ poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. 
Conclusions Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion, thus suggesting that some AluS elements have been more active recently than previously thought, or that fixation of AluS insertion alleles remains incomplete. These data expand the potential significance of polymorphic AluS elements in contributing to structural variation in the human genome. Future discovery efforts focusing on polymorphic AluS elements are likely to identify more such polymorphisms, and approaches tailored to identify deletion alleles may be warranted.
Affiliation(s)
- Maria S Kryatova
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| | - Jared P Steranka
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| | - Kathleen H Burns
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| | - Lindsay M Payer
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| |
|
29
|
Human transposon insertion profiling: Analysis, visualization and identification of somatic LINE-1 insertions in ovarian cancer. Proc Natl Acad Sci U S A 2017; 114:E733-E740. [PMID: 28096347 DOI: 10.1073/pnas.1619797114] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Indexed: 11/18/2022] Open
Abstract
Mammalian genomes are replete with interspersed repeats reflecting the activity of transposable elements. These mobile DNAs are self-propagating, and their continued transposition is a source of both heritable structural variation as well as somatic mutation in human genomes. Tailored approaches to map these sequences are useful to identify insertion alleles. Here, we describe in detail a strategy to amplify and sequence long interspersed element-1 (LINE-1, L1) retrotransposon insertions selectively in the human genome, transposon insertion profiling by next-generation sequencing (TIPseq). We also report the development of a machine-learning-based computational pipeline, TIPseqHunter, to identify insertion sites with high precision and reliability. We demonstrate the utility of this approach to detect somatic retrotransposition events in high-grade ovarian serous carcinoma.
|
30
|
Ha H, Loh JW, Xing J. Identification of polymorphic SVA retrotransposons using a mobile element scanning method for SVA (ME-Scan-SVA). Mob DNA 2016; 7:15. [PMID: 27478512 PMCID: PMC4967303 DOI: 10.1186/s13100-016-0072-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/09/2016] [Accepted: 07/21/2016] [Indexed: 12/28/2022] Open
Abstract
Background Mobile element insertions are a major source of human genomic variation. SVA (SINE-R/VNTR/Alu) is the youngest retrotransposon family in the human genome and a number of diseases are known to be caused by SVA insertions. However, inter-individual genomic variations generated by SVA insertions and their impacts have not been studied extensively due to the difficulty in identifying polymorphic SVA insertions. Results To systematically identify SVA insertions at the population level and assess their genomic impact, we developed a mobile element scanning (ME-Scan) protocol we called ME-Scan-SVA. Using a nested SVA-specific PCR enrichment method, ME-Scan-SVA selectively amplifies the 5′ end of SVA elements and their flanking genomic regions. To demonstrate the utility of the protocol, we constructed and sequenced an ME-Scan-SVA library of 21 individuals and analyzed the data using a new analysis pipeline designed for the protocol. Overall, the method achieved high SVA-specificity and over 90% of the sequenced reads are from SVA insertions. The method also had high sensitivity (>90%) for fixed SVA insertions that contain the SVA-specific primer-binding sites in the reference genome. Using candidate locus selection criteria that are expected to have a 90% sensitivity, we identified 151 and 29 novel polymorphic SVA candidates under relaxed and stringent cutoffs, respectively (average 12 and 2 per individual). For six polymorphic SVAs that we were able to validate by PCR, the average individual genotype accuracy is 92%, demonstrating a high accuracy of the computational genotype calling pipeline. Conclusions The new approach allows identifying novel SVA insertions using high-throughput sequencing. It is cost-effective and can be applied in large-scale population studies. It can also be applied for detecting potential active SVA elements, and somatic SVA retrotransposition events in different tissues or developmental stages.
Affiliation(s)
- Hongseok Ha
- Department of Genetics, The State University of New Jersey, Piscataway, 08854 NJ USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, 08854 NJ USA
| | - Jui Wan Loh
- Department of Genetics, The State University of New Jersey, Piscataway, 08854 NJ USA
| | - Jinchuan Xing
- Department of Genetics, The State University of New Jersey, Piscataway, 08854 NJ USA; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, 08854 NJ USA
| |
|
31
|
Abstract
Transposable elements have had a profound impact on the structure and function of mammalian genomes. The retrotransposon Long INterspersed Element-1 (LINE-1 or L1), by virtue of its replicative mobilization mechanism, comprises ∼17% of the human genome. Although the vast majority of human LINE-1 sequences are inactive molecular fossils, an estimated 80-100 copies per individual retain the ability to mobilize by a process termed retrotransposition. Indeed, LINE-1 is the only active, autonomous retrotransposon in humans and its retrotransposition continues to generate both intra-individual and inter-individual genetic diversity. Here, we briefly review the types of transposable elements that reside in mammalian genomes. We will focus our discussion on LINE-1 retrotransposons and the non-autonomous Short INterspersed Elements (SINEs) that rely on the proteins encoded by LINE-1 for their mobilization. We review cases where LINE-1-mediated retrotransposition events have resulted in genetic disease and discuss how the characterization of these mutagenic insertions led to the identification of retrotransposition-competent LINE-1s in the human and mouse genomes. We then discuss how the integration of molecular genetic, biochemical, and modern genomic technologies have yielded insight into the mechanism of LINE-1 retrotransposition, the impact of LINE-1-mediated retrotransposition events on mammalian genomes, and the host cellular mechanisms that protect the genome from unabated LINE-1-mediated retrotransposition events. Throughout this review, we highlight unanswered questions in LINE-1 biology that provide exciting opportunities for future research. Clearly, much has been learned about LINE-1 and SINE biology since the publication of Mobile DNA II thirteen years ago. Future studies should continue to yield exciting discoveries about how these retrotransposons contribute to genetic diversity in mammalian genomes.
|
32
|
REPdenovo: Inferring De Novo Repeat Motifs from Short Sequence Reads. PLoS One 2016; 11:e0150719. [PMID: 26977803 PMCID: PMC4792456 DOI: 10.1371/journal.pone.0150719] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 11/11/2015] [Accepted: 02/18/2016] [Indexed: 11/22/2022] Open
Abstract
Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeat sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host-derived repeat sequences. REPdenovo is a powerful new computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo.
|
33
|
Ahl V, Keller H, Schmidt S, Weichenrieder O. Retrotransposition and Crystal Structure of an Alu RNP in the Ribosome-Stalling Conformation. Mol Cell 2015; 60:715-727. [PMID: 26585389 DOI: 10.1016/j.molcel.2015.10.003] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 07/15/2015] [Revised: 09/14/2015] [Accepted: 10/01/2015] [Indexed: 10/22/2022]
Abstract
The Alu element is the most successful human genomic parasite affecting development and causing disease. It originated as a retrotransposon during early primate evolution of the gene encoding the signal recognition particle (SRP) RNA. We defined a minimal Alu RNA sufficient for effective retrotransposition and determined a high-resolution structure of its complex with the SRP9/14 proteins. The RNA adopts a compact, closed conformation that matches the envelope of the SRP Alu domain in the ribosomal translation elongation factor-binding site. Conserved structural elements in SRP RNAs support an ancient function of the closed conformation that predates SRP9/14. Structure-based mutagenesis shows that retrotransposition requires the closed conformation of the Alu ribonucleoprotein particle and is consistent with the recognition of stalled ribosomes. We propose that ribosome stalling is a common cause for the cis-preference of the mammalian L1 retrotransposon and for the efficiency of the Alu RNA in hijacking nascent L1 reverse transcriptase.
Affiliation(s)
- Valentina Ahl
- Department of Biochemistry, Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076 Tübingen, Germany
| | - Heiko Keller
- Department of Biochemistry, Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076 Tübingen, Germany
| | - Steffen Schmidt
- Department of Biochemistry, Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076 Tübingen, Germany
| | - Oliver Weichenrieder
- Department of Biochemistry, Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076 Tübingen, Germany.
| |
|
34
|
White TB, Morales ME, Deininger PL. Alu elements and DNA double-strand break repair. Mob Genet Elements 2015; 5:81-85. [PMID: 26942043 DOI: 10.1080/2159256x.2015.1093067] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 07/16/2015] [Revised: 08/31/2015] [Accepted: 09/04/2015] [Indexed: 12/30/2022] Open
Abstract
Alu elements represent one of the most common sources of homology and homeology in the human genome. Homeologous recombination between Alu elements represents a major form of genetic instability leading to deletions and duplications. Although these types of events have been studied extensively through genomic sequencing to assess the impact of Alu elements on disease mutations and genome evolution, the overall abundance of Alu elements in the genome often makes it difficult to assess the relevance of the Alu elements to specific recombination events. We recently reported a powerful new reporter gene system that allows the assessment of various cis and trans factors on the contribution of Alu elements to various forms of genetic instability. This allowed a quantitative measurement of the influence of mismatches on Alu elements and instability. It also confirmed that homeologous Alu elements are able to stimulate non-homologous end joining events in their vicinity. This appears to be dependent on portions of the mismatch repair pathway. We are now in a position to begin to unravel the complex influences of Alu density, mismatch and location with alterations of DNA repair processes in various tissues and tumors.
Affiliation(s)
- Travis B White
- Tulane Cancer Center; Tulane University Health Sciences Center; New Orleans, LA USA
| | - Maria E Morales
- Tulane Cancer Center; Tulane University Health Sciences Center; New Orleans, LA USA
| | - Prescott L Deininger
- Tulane Cancer Center; Tulane University Health Sciences Center; New Orleans, LA USA
| |
|
35
|
Wildschutte JH, Baron A, Diroff NM, Kidd JM. Discovery and characterization of Alu repeat sequences via precise local read assembly. Nucleic Acids Res 2015; 43:10292-307. [PMID: 26503250 PMCID: PMC4666360 DOI: 10.1093/nar/gkv1089] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 05/18/2015] [Accepted: 10/08/2015] [Indexed: 12/03/2022] Open
Abstract
Alu insertions have contributed to >11% of the human genome and ∼30–35 Alu subfamilies remain actively mobile, yet the characterization of polymorphic Alu insertions from short-read data remains a challenge. We build on existing computational methods to combine Alu detection and de novo assembly of WGS data as a means to reconstruct the full sequence of insertion events from Illumina paired end reads. Comparison with published calls obtained using PacBio long-reads indicates a false discovery rate below 5%, at the cost of reduced sensitivity due to the colocation of reference and non-reference repeats. We generate a highly accurate call set of 1614 completely assembled Alu variants from 53 samples from the Human Genome Diversity Project (HGDP) panel. We utilize the reconstructed alternative insertion haplotypes to genotype 1010 fully assembled insertions, obtaining >99% agreement with genotypes obtained by PCR. In our assembled sequences, we find evidence of premature insertion mechanisms and observe 5′ truncation in 16% of AluYa5 and AluYb8 insertions. The sites of truncation coincide with stem-loop structures and SRP9/14 binding sites in the Alu RNA, implicating L1 ORF2p pausing in the generation of 5′ truncations. Additionally, we identified variable AluJ and AluS elements that likely arose due to non-retrotransposition mechanisms.
Affiliation(s)
- Julia H Wildschutte
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - Alayna Baron
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - Nicolette M Diroff
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - Jeffrey M Kidd
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| |
|
36
|
Qian Y, Kehr B, Halldórsson BV. PopAlu: population-scale detection of Alu polymorphisms. PeerJ 2015; 3:e1269. [PMID: 26417547 PMCID: PMC4582951 DOI: 10.7717/peerj.1269] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/12/2015] [Accepted: 09/04/2015] [Indexed: 11/20/2022] Open
Abstract
Alu elements are sequences of approximately 300 basepairs that together comprise more than 10% of the human genome. Due to their recent origin in primate evolution some Alu elements are polymorphic in humans, present in some individuals while absent in others. We present PopAlu, a tool to detect polymorphic Alu elements on a population scale from paired-end sequencing data. PopAlu uses read pair distance and orientation as well as split reads to identify the location and precise breakpoints of polymorphic Alus. Genotype calling enables us to differentiate between homozygous and heterozygous carriers, making the output of PopAlu suitable for use in downstream analyses such as genome-wide association studies (GWAS). We show on a simulated dataset that PopAlu calls Alu elements inserted and deleted with respect to a reference genome with high accuracy and high precision. Our analysis of real data of a human trio from the 1000 Genomes Project confirms that PopAlu is able to produce highly accurate genotype calls. To our knowledge, PopAlu is the first tool that identifies polymorphic Alu elements from multiple individuals simultaneously, pinpoints the precise breakpoints and calls genotypes with high accuracy.
Affiliation(s)
- Yu Qian
- Bioinformatics Research Center, Aarhus University, Aarhus, Denmark
| | - Birte Kehr
- deCODE genetics/Amgen, Reykjavík, Iceland
| | - Bjarni V Halldórsson
- deCODE genetics/Amgen, Reykjavík, Iceland; Institute of Biomedical and Neural Engineering, School of Science and Engineering, Reykjavik University, Reykjavík, Iceland
|
37
|
Kuhn A, Ong YM, Quake SR, Burkholder WF. Read count-based method for high-throughput allelic genotyping of transposable elements and structural variants. BMC Genomics 2015; 16:508. [PMID: 26153459 PMCID: PMC4494700 DOI: 10.1186/s12864-015-1700-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2015] [Accepted: 06/15/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Like other structural variants, transposable element insertions can be highly polymorphic across individuals. Their functional impact, however, remains poorly understood. Current genome-wide approaches for genotyping insertion-site polymorphisms based on targeted or whole-genome sequencing remain very expensive and can lack accuracy, hence new large-scale genotyping methods are needed. RESULTS We describe a high-throughput method for genotyping transposable element insertions and other types of structural variants that can be assayed by breakpoint PCR. The method relies on next-generation sequencing of multiplex, site-specific PCR amplification products and read count-based genotype calls. We show that this method is flexible, efficient (it does not require rounds of optimization), cost-effective and highly accurate. CONCLUSIONS This method can benefit a wide range of applications from the routine genotyping of animal and plant populations to the functional study of structural variants in humans.
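The idea of a read count-based genotype call can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name `call_genotype`, the minimum read depth, and the allele-fraction thresholds are all assumptions made here, since the abstract does not specify them.

```python
def call_genotype(ins_reads, empty_reads, min_total=10, het_low=0.2, het_high=0.8):
    """Call a biallelic insertion genotype from two read counts.

    ins_reads   -- reads supporting the insertion (filled-site) amplicon
    empty_reads -- reads supporting the empty-site amplicon
    The thresholds are hypothetical defaults chosen for illustration.
    """
    total = ins_reads + empty_reads
    if total < min_total:
        return "./."  # too few reads to make a call
    fraction = ins_reads / total  # fraction of reads carrying the insertion
    if fraction < het_low:
        return "0/0"  # homozygous for the empty site
    if fraction > het_high:
        return "1/1"  # homozygous for the insertion
    return "0/1"      # heterozygous


if __name__ == "__main__":
    for counts in [(0, 50), (25, 30), (48, 2), (3, 2)]:
        print(counts, call_genotype(*counts))
```

In practice the thresholds would need calibrating against amplification bias between the two amplicons, which this sketch ignores.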
Affiliation(s)
- Alexandre Kuhn
- Microfluidics Systems Biology Lab, Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), Proteos Building, Room #03-04, 61 Biopolis Drive, Singapore, 138673, Singapore.
| | - Yao Min Ong
- Microfluidics Systems Biology Lab, Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), Proteos Building, Room #03-04, 61 Biopolis Drive, Singapore, 138673, Singapore.
| | - Stephen R Quake
- Depts. of Bioengineering and Applied Physics and Howard Hughes Medical Institute, Stanford University, Clark Center, Room E300, 318 Campus Drive, Stanford, CA, 94305, USA; Visiting Investigator, Institute of Molecular and Cell Biology, A*STAR, Singapore, 138673, Singapore.
| | - William F Burkholder
- Microfluidics Systems Biology Lab, Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), Proteos Building, Room #03-04, 61 Biopolis Drive, Singapore, 138673, Singapore.
|
38
|
Platt RN, Zhang Y, Witherspoon DJ, Xing J, Suh A, Keith MS, Jorde LB, Stevens RD, Ray DA. Targeted Capture of Phylogenetically Informative Ves SINE Insertions in Genus Myotis. Genome Biol Evol 2015; 7:1664-75. [PMID: 26014613 PMCID: PMC4494050 DOI: 10.1093/gbe/evv099] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 01/05/2023] Open
Abstract
Identification of retrotransposon insertions in nonmodel taxa can be technically challenging and costly. This has inhibited progress in understanding retrotransposon insertion dynamics outside of a few well-studied species. To address this problem, we have extended a retrotransposon-based capture and sequence method (ME-Scan [mobile element scanning]) to identify insertions belonging to the Ves family of short interspersed elements (SINEs) across seven species of the bat genus Myotis. We identified between 120,000 and 143,000 SINE insertions in six taxa lacking a draft genome by comparing to the M. lucifugus reference genome. On average, each Ves insertion was sequenced to 129.6× coverage. When mapped back to the M. lucifugus reference genome, all insertions were confidently assigned within a 10-bp window. Polymorphic Ves insertions were identified in each taxon based on their mapped locations. Using cross-species comparisons and the identified insertion positions, a presence–absence matrix was created for approximately 796,000 insertions. Dollo parsimony analysis of more than 85,000 phylogenetically informative insertions recovered strongly supported, monophyletic clades that correspond with the biogeography of each taxon. This phylogeny is similar to previously published mitochondrial phylogenies, with the exception of the placement of M. vivesi. These results support the utility of our variation on ME-Scan to identify polymorphic retrotransposon insertions in taxa without a reference genome and for large-scale retrotransposon-based phylogenetics.
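A presence–absence matrix of the kind described above, and the selection of phylogenetically informative insertions from it, might be sketched like this. The code is an illustrative assumption rather than the authors' pipeline: the function names, the `(scaffold, position)` locus keys, and the toy taxa are hypothetical, loci are assumed to be already merged into shared windows, and "informative" is taken here in the usual parsimony sense (present in at least two taxa and absent in at least two).

```python
def presence_absence_matrix(calls):
    """Build a locus -> 0/1 row mapping from per-taxon sets of insertion loci.

    calls -- dict mapping taxon name to a set of (scaffold, position) loci.
    Returns the sorted taxon order and a dict of rows in that order.
    """
    taxa = sorted(calls)
    loci = sorted(set().union(*calls.values()))
    matrix = {locus: [int(locus in calls[t]) for t in taxa] for locus in loci}
    return taxa, matrix


def informative_loci(matrix):
    """Keep loci present in at least two taxa and absent in at least two."""
    return [
        locus
        for locus, row in matrix.items()
        if sum(row) >= 2 and len(row) - sum(row) >= 2
    ]


if __name__ == "__main__":
    # Toy data: four hypothetical taxa and three hypothetical loci.
    calls = {
        "A": {("s1", 100), ("s1", 500)},
        "B": {("s1", 100), ("s1", 500)},
        "C": {("s1", 500), ("s2", 40)},
        "D": {("s1", 500)},
    }
    taxa, matrix = presence_absence_matrix(calls)
    print(taxa)
    print(informative_loci(matrix))
```

Loci shared by all taxa or found in only one carry no grouping signal, which is why they are filtered out before a parsimony analysis.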
Affiliation(s)
- Roy N Platt
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University; Department of Biological Sciences, Texas Tech University
| | - Yuhua Zhang
- Bionomics Research & Technology Center, Environmental and Occupational Health Science Institute, Rutgers, The State University of New Jersey
| | - Jinchuan Xing
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey
| | - Alexander Suh
- Department of Evolutionary Biology, Uppsala University, Sweden
| | - Megan S Keith
- Department of Biological Sciences, Texas Tech University
| | - Lynn B Jorde
- Department of Human Genetics, University of Utah Health Sciences Center
| | - Richard D Stevens
- Department of Natural Resources Management and the Museum of Texas Tech University
| | - David A Ray
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University; Department of Biological Sciences, Texas Tech University
|
39
|
Gu S, Yuan B, Campbell IM, Beck CR, Carvalho CMB, Nagamani SCS, Erez A, Patel A, Bacino CA, Shaw CA, Stankiewicz P, Cheung SW, Bi W, Lupski JR. Alu-mediated diverse and complex pathogenic copy-number variants within human chromosome 17 at p13.3. Hum Mol Genet 2015; 24:4061-77. [PMID: 25908615 DOI: 10.1093/hmg/ddv146] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Received: 01/13/2015] [Accepted: 04/20/2015] [Indexed: 01/05/2023] Open
Abstract
Alu repetitive elements are known to be major contributors to genome instability by generating Alu-mediated copy-number variants (CNVs). Most of the reported Alu-mediated CNVs are simple deletions and duplications, and the mechanism underlying Alu-Alu-mediated rearrangement has been attributed to non-allelic homologous recombination (NAHR). Chromosome 17 at the p13.3 genomic region lacks extensive low-copy repeat architecture; however, it is highly enriched for Alu repetitive elements, which make up 30% of the total sequence annotated in the human reference genome, compared with 10% genome-wide and 18% on chromosome 17. We conducted mechanistic studies of the 17p13.3 CNVs by performing high-density oligonucleotide array comparative genomic hybridization, specifically interrogating the 17p13.3 region with ∼150 bp per probe density; CNV breakpoint junctions were mapped to nucleotide resolution by polymerase chain reaction and Sanger sequencing. Studied rearrangements include 5 interstitial deletions, 14 tandem duplications, 7 terminal deletions and 13 complex genomic rearrangements (CGRs). Within the 17p13.3 region, Alu-Alu-mediated rearrangements were identified in 80% of the interstitial deletions, 46% of the tandem duplications and 50% of the CGRs, indicating that this mechanism was a major contributor to the formation of breakpoint junctions. Our studies suggest that Alu repetitive elements facilitate formation of non-recurrent CNVs, CGRs and other structural aberrations of chromosome 17 at p13.3. The common observation of Alu-mediated rearrangement in CGRs, together with breakpoint junction sequence analysis, further demonstrates that this type of mechanism is unlikely to be attributable to NAHR, but rather may be due to a recombination-coupled DNA replicative repair process.
Affiliation(s)
- Shen Gu
- Department of Molecular & Human Genetics
| | - Bo Yuan
- Department of Molecular & Human Genetics
| | - Sandesh C S Nagamani
- Department of Molecular & Human Genetics, Texas Children's Hospital, Houston, TX 77030, USA
| | - Ayelet Erez
- Department of Molecular & Human Genetics, Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
| | - Carlos A Bacino
- Department of Molecular & Human Genetics, Texas Children's Hospital, Houston, TX 77030, USA
| | - Weimin Bi
- Department of Molecular & Human Genetics
| | - James R Lupski
- Department of Molecular & Human Genetics, Department of Pediatrics and Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA, Texas Children's Hospital, Houston, TX 77030, USA
|
40
|
Library Construction for High-Throughput Mobile Element Identification and Genotyping. Methods Mol Biol 2015; 1589:1-15. [PMID: 26025622 DOI: 10.1007/7651_2015_265] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/03/2023]
Abstract
Mobile genetic elements are discrete DNA elements that can move around and copy themselves in a genome. As a ubiquitous component of the genome, mobile elements contribute to both genetic and epigenetic variation. Therefore, it is important to determine the genome-wide distribution of mobile elements. Here we present a targeted high-throughput sequencing protocol called Mobile Element Scanning (ME-Scan) for genome-wide mobile element detection. We will describe oligonucleotides design, sequencing library construction, and computational analysis for the ME-Scan protocol.
|
41
|
Guffanti G, Gaudi S, Fallon JH, Sobell J, Potkin SG, Pato C, Macciardi F. Transposable elements and psychiatric disorders. Am J Med Genet B Neuropsychiatr Genet 2014; 165B:201-16. [PMID: 24585726 DOI: 10.1002/ajmg.b.32225] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 07/03/2013] [Accepted: 01/21/2014] [Indexed: 12/15/2022]
Abstract
Transposable Elements (TEs) or transposons are low-complexity elements (e.g., LINEs, SINEs, SVAs, and HERVs) that make up to two-thirds of the human genome. There is mounting evidence that TEs play an essential role in genomic architecture and regulation related to both normal function and disease states. Recently, the identification of active TEs in several different human brain regions suggests that TEs play a role in normal brain development and adult physiology and quite possibly in psychiatric disorders. TEs have been implicated in hemophilia, neurofibromatosis, and cancer. With the advent of next-generation whole-genome sequencing approaches, our understanding of the relationship between TEs and psychiatric disorders will greatly improve. We will review the biology of TEs and early evidence for TE involvement in psychiatric disorders.
Affiliation(s)
- Guia Guffanti
- Department of Psychiatry, Columbia University, New York, New York
|
42
|
Kamath PL, Elleder D, Bao L, Cross PC, Powell JH, Poss M. The population history of endogenous retroviruses in mule deer (Odocoileus hemionus). J Hered 2014; 105:173-87. [PMID: 24336966 PMCID: PMC3920814 DOI: 10.1093/jhered/est088] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 03/31/2013] [Revised: 09/27/2013] [Accepted: 10/24/2013] [Indexed: 11/13/2022] Open
Abstract
Mobile elements are powerful agents of genomic evolution and can be exceptionally informative markers for investigating species and population-level evolutionary history. While several studies have utilized retrotransposon-based insertional polymorphisms to resolve phylogenies, few population studies exist outside of humans. Endogenous retroviruses are LTR-retrotransposons derived from retroviruses that have become stably integrated in the host genome during past infections and transmitted vertically to subsequent generations. They offer valuable insight into host-virus co-evolution and a unique perspective on host evolutionary history because they integrate into the genome at a discrete point in time. We examined the evolutionary history of a cervid endogenous gammaretrovirus (CrERVγ) in mule deer (Odocoileus hemionus). We sequenced 14 CrERV proviruses (CrERV-in1 to -in14), and examined the prevalence and distribution of 13 proviruses in 262 deer among 15 populations from Montana, Wyoming, and Utah. CrERV absence in white-tailed deer (O. virginianus), identical 5' and 3' long terminal repeat (LTR) sequences, insertional polymorphism, and CrERV divergence time estimates indicated that most endogenization events occurred within the last 200,000 years. Population structure inferred from CrERVs (F_ST = 0.008) and microsatellites (θ = 0.01) was low, but significant, with Utah, northwestern Montana, and a Helena herd being particularly differentiated. Clustering analyses indicated regional structuring, and non-contiguous clustering could often be explained by known translocations. Cluster ensemble results indicated spatial localization of viruses, specifically in deer from northeastern and western Montana. This study demonstrates the utility of endogenous retroviruses to elucidate and provide novel insight into both ERV evolutionary history and the history of contemporary host populations.
Affiliation(s)
- Pauline L Kamath
- the US Geological Survey, Northern Rocky Mountain Science Center, Bozeman, MT 59715
|
43
|
Bazak L, Haviv A, Barak M, Jacob-Hirsch J, Deng P, Zhang R, Isaacs FJ, Rechavi G, Li JB, Eisenberg E, Levanon EY. A-to-I RNA editing occurs at over a hundred million genomic sites, located in a majority of human genes. Genome Res 2013; 24:365-76. [PMID: 24347612 PMCID: PMC3941102 DOI: 10.1101/gr.164749.113] [Citation(s) in RCA: 464] [Impact Index Per Article: 38.7] [Indexed: 11/24/2022]
Abstract
RNA molecules transmit the information encoded in the genome and generally reflect its content. Adenosine-to-inosine (A-to-I) RNA editing by ADAR proteins converts a genomically encoded adenosine into inosine. It is known that most RNA editing in human takes place in the primate-specific Alu sequences, but the extent of this phenomenon and its effect on transcriptome diversity are not yet clear. Here, we analyzed large-scale RNA-seq data and detected ∼1.6 million editing sites. As detection sensitivity increases with sequencing coverage, we performed ultradeep sequencing of selected Alu sequences and showed that the scope of editing is much larger than anticipated. We found that virtually all adenosines within Alu repeats that form double-stranded RNA undergo A-to-I editing, although most sites exhibit editing at only low levels (<1%). Moreover, using high coverage sequencing, we observed editing of transcripts resulting from residual antisense expression, doubling the number of edited sites in the human genome. Based on bioinformatic analyses and deep targeted sequencing, we estimate that there are over 100 million human Alu RNA editing sites, located in the majority of human genes. These findings set the stage for exploring how this primate-specific massive diversification of the transcriptome is utilized.
Affiliation(s)
- Lily Bazak
- Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900, Israel
|
44
|
Ade C, Roy-Engel AM, Deininger PL. Alu elements: an intrinsic source of human genome instability. Curr Opin Virol 2013; 3:639-45. [PMID: 24080407 PMCID: PMC3982648 DOI: 10.1016/j.coviro.2013.09.002] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Received: 09/06/2013] [Accepted: 09/09/2013] [Indexed: 11/29/2022]
Abstract
Alu elements are ∼300 bp sequences that have amplified via an RNA intermediate, leading to the accumulation of over 1 million copies in the human genome. Although only a few of the copies are active, Alu germline activity is the highest of all human retrotransposons and does significantly contribute to genetic disease and population diversity. There are two basic mechanisms by which Alu elements contribute to disease: through insertional mutagenesis and as a large source of repetitive sequences that contribute to nonallelic homologous recombination (NAHR), which causes genetic deletions and duplications.
Affiliation(s)
- Catherine Ade
- Tulane University, Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane Cancer Center, Consortium Of Mobile Elements at Tulane
| | - Astrid M. Roy-Engel
- Tulane University, Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane Cancer Center, Consortium Of Mobile Elements at Tulane
| | - Prescott L. Deininger
- Tulane University, Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane Cancer Center, Consortium Of Mobile Elements at Tulane
|
45
|
Carreira PE, Richardson SR, Faulkner GJ. L1 retrotransposons, cancer stem cells and oncogenesis. FEBS J 2013; 281:63-73. [PMID: 24286172 PMCID: PMC4160015 DOI: 10.1111/febs.12601] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.0] [Received: 09/16/2013] [Revised: 10/28/2013] [Accepted: 11/11/2013] [Indexed: 12/17/2022]
Abstract
Retrotransposons have played a central role in human genome evolution. The accumulation of heritable L1, Alu and SVA retrotransposon insertions continues to generate structural variation within and between populations, and can result in spontaneous genetic disease. Recent works have reported somatic L1 retrotransposition in tumours, which in some cases may contribute to oncogenesis. Intriguingly, L1 mobilization appears to occur almost exclusively in cancers of epithelial cell origin. In this review, we discuss how L1 retrotransposition could potentially trigger neoplastic transformation, based on the established correlation between L1 activity and cellular plasticity, and the proven capacity of L1-mediated insertional mutagenesis to decisively alter gene expression and functional output.
Affiliation(s)
- Patricia E Carreira
- Cancer Biology Program, Mater Medical Research Institute, South Brisbane, Australia
|
46
|
Abstract
We analyzed 83 fully sequenced great ape genomes for mobile element insertions, predicting a total of 49,452 fixed and polymorphic Alu and long interspersed element 1 (L1) insertions not present in the human reference assembly and assigning each retrotransposition event to a different time point during great ape evolution. We used these homoplasy-free markers to construct a mobile element insertions-based phylogeny of humans and great apes and demonstrate their differential power to discern ape subspecies and populations. Within this context, we find a good correlation between L1 diversity and single-nucleotide polymorphism heterozygosity (r² = 0.65) in contrast to Alu repeats, which show little correlation (r² = 0.07). We estimate that the "rate" of Alu retrotransposition has differed by a factor of 15-fold in these lineages. Humans, chimpanzees, and bonobos show the highest rates of Alu accumulation, the latter two since their divergence 1.5 Mya. The L1 insertion rate, in contrast, has remained relatively constant, with rates differing by less than a factor of three. We conclude that Alu retrotransposition has been the most variable form of genetic variation during recent human-great ape evolution, with increases and decreases occurring over very short periods of evolutionary time.
|
47
|
Grandi FC, An W. Non-LTR retrotransposons and microsatellites: Partners in genomic variation. Mob Genet Elements 2013; 3:e25674. [PMID: 24195012 PMCID: PMC3812793 DOI: 10.4161/mge.25674] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 04/22/2013] [Revised: 07/07/2013] [Accepted: 07/09/2013] [Indexed: 01/10/2023] Open
Abstract
The human genome is laden with both non-LTR (long-terminal repeat) retrotransposons and microsatellite repeats. Both types of sequences are able to, either actively or passively, mutagenize the genomes of human individuals and are therefore poised to dynamically alter the human genomic landscape across generations. Non-LTR retrotransposons, such as L1 and Alu, are a major source of new microsatellites, which are born both concurrently and subsequently to L1 and Alu integration into the genome. Likewise, the mutation dynamics of microsatellite repeats have a direct impact on the fitness of their non-LTR retrotransposon parent owing to microsatellite expansion and contraction. This review explores the interactions and dynamics between non-LTR retrotransposons and microsatellites in the context of genomic variation and evolution.
Affiliation(s)
- Fiorella C Grandi
- School of Molecular Biosciences and Center for Reproductive Biology; Washington State University; Pullman, WA USA
|
48
|
Burgess DJ. Population genetics: Mobile elements across human populations. Nat Rev Genet 2013; 14:370. [PMID: 23657482 DOI: 10.1038/nrg3497] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
|