1
Wang C, Zhang Y, Guan F, He YZ, Wu Y. Genome-wide identification and phylogenetic analysis of the tetraspanin gene family in lepidopteran insects and expression profiling analysis in Helicoverpa armigera. INSECT SCIENCE 2025; 32:471-486. [PMID: 38880966 DOI: 10.1111/1744-7917.13402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 05/01/2024] [Accepted: 05/16/2024] [Indexed: 06/18/2024]
Abstract
The tetraspanin gene family encodes cell-surface proteins that span the membrane 4 times and play critical roles in a wide range of biological processes across numerous organisms. Recent findings highlight the involvement of a tetraspanin of the lepidopteran pest Helicoverpa armigera in resistance to Bacillus thuringiensis Cry insecticidal proteins, which are extensively used in transgenic crops. Thus, a better understanding of lepidopteran tetraspanins is urgently needed. In the current study, genome scanning in 10 lepidopteran species identified a total of 283 sequences encoding potential tetraspanins. Based on conserved cysteine patterns in the large extracellular loop and their phylogenetic relationships, these tetraspanins were classified into 8 subfamilies (TspA to TspH). Six ancestral introns were identified within lepidopteran tetraspanin genes. Tetraspanins in the TspA, TspB, TspC, and TspD subfamilies exhibit highly similar gene organization, while tetraspanins in the remaining 4 subfamilies exhibit variation in intron loss and/or gain during evolution. Analysis of chromosomal distribution revealed a lepidopteran-specific cluster of 10 to 11 tetraspanins, likely formed by tandem duplication events. Selective pressure analysis indicated negative selection across all orthologous groups, with ω values ranging between 0.004 and 0.362. However, positive selection was identified at 18 sites within TspB5, TspC5, TspE3, and TspF10. Furthermore, spatiotemporal expression analysis of H. armigera tetraspanins demonstrated variable expression levels across different developmental stages and tissues, suggesting diverse functions of tetraspanin members in this globally important insect pest. Our findings establish a solid foundation for subsequent functional investigations of tetraspanins in lepidopteran species.
Affiliation(s)
- Chenyang Wang
  - College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China
- Yinuo Zhang
  - College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China
- Fang Guan
  - College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China
- Ya-Zhou He
  - College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China
- Yidong Wu
  - College of Plant Protection, Nanjing Agricultural University, Nanjing, 210095, China
2
Conceição HB, Mercuri RLV, de Castro MPM, Ohara DT, Guardia GDA, Galante PAF. RCPedia: a global resource for studying and exploring retrocopies in diverse species. BIOINFORMATICS (OXFORD, ENGLAND) 2024; 40:btae530. [PMID: 39240653 PMCID: PMC11387616 DOI: 10.1093/bioinformatics/btae530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 07/27/2024] [Accepted: 09/05/2024] [Indexed: 09/07/2024]
Abstract
MOTIVATION Gene retrocopies arise from the reverse transcription and genomic insertion of processed mRNA transcripts. These elements have significantly contributed to genetic diversity and novelties throughout the evolution of many species. However, the study of retrocopies has been challenging, owing to the absence of comprehensive, complete, and user-friendly databases for diverse species. RESULTS Here, we introduce an improved version of RCPedia, an integrative database meticulously designed for the study of retrocopies. RCPedia offers an extensive catalog of retrocopies identified across 44 species, which includes 13 primates, 4 rodents, 6 chiropterans, 12 other mammals, 4 birds, turtles, lizards, frogs, zebrafish, and Drosophila. The database offers the most complete compilation of retrocopies per species, accompanied by detailed genomic annotations, expression data, and links to other data portals. Furthermore, RCPedia features a streamlined representation of data and an efficient querying system, establishing it as an invaluable tool for researchers in the fields of genomics, evolutionary biology, and transposable elements (TEs). In summary, RCPedia aims to enhance the investigation of retrocopies and their pivotal roles in shaping the genomic landscapes of diverse species. AVAILABILITY AND IMPLEMENTATION RCPedia is available at https://www.rcpediadb.org.
Affiliation(s)
- Helena B Conceição
  - Hospital Sirio-Libanes, São Paulo 01308-060, Brazil
  - Interunidades em Bioinformática, Universidade de São Paulo, São Paulo 05508-000, Brazil
- Rafael L V Mercuri
  - Hospital Sirio-Libanes, São Paulo 01308-060, Brazil
  - Interunidades em Bioinformática, Universidade de São Paulo, São Paulo 05508-000, Brazil
- Matheus P M de Castro
  - Hospital Sirio-Libanes, São Paulo 01308-060, Brazil
  - Department of Biochemistry, University of São Paulo, São Paulo, 05508-000, Brazil
3
Becker GM, Thorne JW, Burke JM, Lewis RM, Notter DR, Morgan JLM, Schauer CS, Stewart WC, Redden RR, Murdoch BM. Genetic diversity of United States Rambouillet, Katahdin and Dorper sheep. Genet Sel Evol 2024; 56:56. [PMID: 39080565 PMCID: PMC11290166 DOI: 10.1186/s12711-024-00905-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 04/23/2024] [Indexed: 08/02/2024]
Abstract
BACKGROUND Managing genetic diversity is critically important for maintaining species fitness. Excessive homozygosity caused by the loss of genetic diversity can have detrimental effects on the reproduction and production performance of a breed. Analysis of genetic diversity can facilitate the identification of signatures of selection which may contribute to the specific characteristics regarding the health, production and physical appearance of a breed or population. In this study, breeds with well-characterized traits such as fine wool production (Rambouillet, N = 745), parasite resistance (Katahdin, N = 581) and environmental hardiness (Dorper, N = 265) were evaluated for inbreeding, effective population size (Ne), runs of homozygosity (ROH) and Wright's fixation index (FST) outlier approach to identify differential signatures of selection at 36,113 autosomal single nucleotide polymorphisms (SNPs). RESULTS Katahdin sheep had the largest current Ne at the most recent generation estimated with both the GONe and NeEstimator software. The most highly conserved ROH Island was identified in Rambouillet with a signature of selection on chromosome 6 containing 202 SNPs called in an ROH in 50 to 94% of the individuals. This region contained the DCAF16, LCORL and NCAPG genes that have been previously reported to be under selection and have biological roles related to milk production and growth traits. The outlier regions identified through the FST comparisons of Katahdin with Rambouillet and Dorper contained genes with known roles in milk production and mastitis resistance or susceptibility, and the FST comparisons of Rambouillet with Katahdin and Dorper identified genes related to wool growth, suggesting these traits have been under natural or artificial selection pressure in these populations. 
Genes involved in the cytokine-cytokine receptor interaction pathways were identified in all FST breed comparisons, which indicates the presence of allelic diversity between these breeds in genomic regions controlling cytokine signaling mechanisms. CONCLUSIONS In this paper, we describe signatures of selection within diverse and economically important U.S. sheep breeds. The genes contained within these signatures are proposed for further study to understand their relevance to biological traits and improve understanding of breed diversity.
Affiliation(s)
- Gabrielle M Becker
  - Department of Animal, Veterinary and Food Science, University of Idaho, Moscow, ID, USA
- Jacob W Thorne
  - Department of Animal, Veterinary and Food Science, University of Idaho, Moscow, ID, USA
  - Texas A&M AgriLife Extension, Texas A&M University, San Angelo, TX, USA
- Joan M Burke
  - USDA, ARS, Dale Bumpers Small Farms Research Center, Booneville, AR, USA
- Ronald M Lewis
  - Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE, USA
- David R Notter
  - School of Animal Sciences, Virginia Tech, Blacksburg, VA, USA
- Christopher S Schauer
  - Hettinger Research Extension Center, North Dakota State University, Hettinger, ND, USA
- Whit C Stewart
  - Department of Animal Science, University of Wyoming, Laramie, WY, USA
- R R Redden
  - Texas A&M AgriLife Extension, Texas A&M University, San Angelo, TX, USA
- Brenda M Murdoch
  - Department of Animal, Veterinary and Food Science, University of Idaho, Moscow, ID, USA
4
Castellanos MDP, Wickramasinghe CD, Betrán E. The roles of gene duplications in the dynamics of evolutionary conflicts. Proc Biol Sci 2024; 291:20240555. [PMID: 38865605 DOI: 10.1098/rspb.2024.0555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 04/02/2024] [Indexed: 06/14/2024]
Abstract
Evolutionary conflicts occur when there is antagonistic selection between different individuals of the same or different species, life stages or between levels of biological organization. Remarkably, conflicts can occur within species or within genomes. In the dynamics of evolutionary conflicts, gene duplications can play a major role because they can bring very specific changes to the genome: changes in protein dose, the generation of novel paralogues with different functions or expression patterns or the evolution of small antisense RNAs. As we describe here, by having those effects, gene duplication might spark evolutionary conflict or fuel arms race dynamics that takes place during conflicts. Interestingly, gene duplication can also contribute to the resolution of a within-locus evolutionary conflict by partitioning the functions of the gene that is under an evolutionary trade-off. In this review, we focus on intraspecific conflicts, including sexual conflict and illustrate the various roles of gene duplications with a compilation of examples. These examples reveal the level of complexity and the differences in the patterns of gene duplications within genomes under different conflicts. These examples also reveal the gene ontologies involved in conflict and the genomic location of the elements of the conflict. The examples provide a blueprint for the direct study of these conflicts or the exploration of the presence of similar conflicts in other lineages.
Affiliation(s)
- Esther Betrán
  - Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
5
Chu C, Ljungström V, Tran A, Jin H, Park PJ. Contribution of de novo retroelements to birth defects and childhood cancers. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.15.24305733. [PMID: 38699361 PMCID: PMC11065029 DOI: 10.1101/2024.04.15.24305733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Insertion of active retroelements (L1s, Alus, and SVAs) can disrupt proper genome function and lead to various disorders including cancer. However, the role of de novo retroelements (DNRTs) in birth defects and childhood cancers has not been well characterized due to the lack of adequate data and efficient computational tools. Here, we examine whole-genome sequencing data of 3,244 trios from 12 birth defect and childhood cancer cohorts in the Gabriella Miller Kids First Pediatric Research Program. Using an improved version of our tool xTea (x-Transposable element analyzer) that incorporates a deep-learning module, we identified 162 DNRTs, as well as 2 pseudogene insertions. Several variants are likely to be causal, such as a de novo Alu insertion that led to the ablation of a whole exon in the NF1 gene in a proband with a brain tumor. We observe a high de novo SVA insertion burden in both high-intolerance loss-of-function genes and exons as well as more frequent de novo Alu insertions of paternal origin. We also identify potential mosaic DNRTs from embryonic stages. Our study reveals the important roles of DNRTs in causing birth defects and predisposition to childhood cancers.
Affiliation(s)
- Chong Chu
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Viktor Ljungström
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Antuan Tran
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Hu Jin
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Peter J. Park
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
6
Tuncel T, Ak G, Güneş HV, Metintaş M. Complex Genomic Rearrangement Patterns in Malignant Pleural Mesothelioma due to Environmental Asbestos Exposure. J Environ Pathol Toxicol Oncol 2024; 43:13-27. [PMID: 38505910 DOI: 10.1615/jenvironpatholtoxicoloncol.2023046200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/21/2024]
Abstract
Malignant pleural mesothelioma (MPM) is a rare type of cancer, and its main risk factor is exposure to asbestos. Accordingly, our knowledge of the genomic structure of an MPM tumor is limited when compared to other cancers. In this study, we aimed to characterize complex genomic rearrangement patterns and variations to better understand the genomics of MPM tumors. We comparatively scanned 3 MPM tumor genomes by whole-genome sequencing and high-resolution SNP array. We also used various computational algorithms to detect both copy number alterations (CNAs) and complex chromosomal rearrangements. Genomic data obtained from each bioinformatics tool were interpreted comparatively to better understand CNAs and cancer-related nucleotide variations in MPM tumors. In patients 1 and 2, we found pathogenic nucleotide variants of BAP1, RB1, and TP53. These two MPM genomes exhibited a highly rearranged chromosomal pattern resembling chromoanagenesis, particularly in the form of chromoanasynthesis. In patient 3, we found nucleotide variants of important cancer-related genes, including TGFBR1, KMT2C, and PALLD, and lower chromosomal rearrangement complexity than in patients 1 and 2. We also detected several actionable nucleotide variants, including variants of XRCC1 and ERCC2. We also discovered the SKA3-DDX10 fusion in two MPM genomes, which is a novel finding for MPM. We found that MPM genomes are very complex, suggesting that this highly rearranged pattern is strongly related to the mutational status of driver genes such as BAP1, TP53 and RB1.
Affiliation(s)
- Tunç Tuncel
  - Health Institutes of Turkey, Turkish Biotechnology Institute, Ankara, Turkey
- Güntülü Ak
  - Eskisehir Osmangazi University Medical Faculty, Department of Chest Diseases, Lung and Pleural Cancers Research and Clinical Center, Eskisehir, Turkey
- Hasan Veysi Güneş
  - Eskisehir Osmangazi University Medical Faculty, Department of Medical Biology, Eskisehir, Turkey
- Muzaffer Metintaş
  - Eskisehir Osmangazi University Medical Faculty, Department of Chest Diseases, Lung and Pleural Cancers Research and Clinical Center, Eskisehir, Turkey
7
Yan Y, Tian Y, Wu Z, Zhang K, Yang R. Interchromosomal Colocalization with Parental Genes Is Linked to the Function and Evolution of Mammalian Retrocopies. Mol Biol Evol 2023; 40:msad265. [PMID: 38060983 PMCID: PMC10733166 DOI: 10.1093/molbev/msad265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 10/25/2023] [Accepted: 11/29/2023] [Indexed: 12/22/2023]
Abstract
Retrocopies are gene duplicates arising from reverse transcription of mature mRNA transcripts and their insertion back into the genome. Although retrocopies were long regarded as processed pseudogenes, more and more functional retrocopies have been discovered. How the stripped-down retrocopies recover expression capability and become functional paralogs continually intrigues evolutionary biologists. Here, we investigated the function and evolution of retrocopies in the context of 3D genome organization. By mapping retrocopy-parent pairs onto sequencing-based and imaging-based chromatin contact maps in human and mouse cell lines and onto Hi-C interaction maps in 5 other mammals, we found that retrocopies and their parental genes show a higher-than-expected interchromosomal colocalization frequency. The spatial interactions between retrocopies and parental genes occur frequently at loci in active subcompartments and near nuclear speckles. Accordingly, colocalized retrocopies are more actively transcribed and translated and are more evolutionarily conserved than noncolocalized ones. The active transcription of colocalized retrocopies may result from their permissive epigenetic environment and shared regulatory elements with parental genes. Population genetic analysis of retroposed gene copy number variants in human populations revealed that retrocopy insertions are not entirely random in regard to interchromosomal interactions and that colocalized retroposed gene copy number variants are more likely to reach high frequencies, suggesting that both insertion bias and natural selection contribute to the colocalization of retrocopy-parent pairs. Further dissection implies that reduced selection efficacy, rather than positive selection, contributes to the elevated allele frequency of colocalized retroposed gene copy number variants. Overall, our results hint at a role of interchromosomal colocalization in the "resurrection" of initially neutral retrocopies.
Affiliation(s)
- Yubin Yan
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Yuhan Tian
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zefeng Wu
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Kunling Zhang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ruolin Yang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
8
Corradi C, Vilar JB, Buzatto VC, de Souza TA, Castro LP, Munford V, De Vecchi R, Galante PAF, Orpinelli F, Miller TLA, Buzzo JL, Sotto MN, Saldiva P, de Oliveira JW, Chaibub SCW, Sarasin A, Menck CFM. Mutational signatures and increased retrotransposon insertions in xeroderma pigmentosum variant skin tumors. Carcinogenesis 2023; 44:511-524. [PMID: 37195263 DOI: 10.1093/carcin/bgad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 04/06/2023] [Accepted: 05/05/2023] [Indexed: 05/18/2023]
Abstract
Xeroderma pigmentosum variant (XP-V) is an autosomal recessive disease with an increased risk of developing cutaneous neoplasms in sunlight-exposed regions. These cells are deficient in the translesion synthesis (TLS) DNA polymerase eta, responsible for bypassing different types of DNA lesions. From the exome sequencing of 11 skin tumors of a genetic XP-V patients' cluster, classical mutational signatures related to sunlight exposure, such as C>T transitions targeted to pyrimidine dimers, were identified. However, basal cell carcinomas also showed distinct C>A mutation spectra reflecting a mutational signature possibly related to sunlight-induced oxidative stress. Moreover, four samples carry different mutational signatures, with C>A mutations associated with tobacco chewing or smoking usage. Thus, XP-V patients should be warned of the risk of these habits. Surprisingly, higher levels of retrotransposon somatic insertions were also detected when the tumors were compared with non-XP skin tumors, revealing other possible causes for XP-V tumors and novel functions for the TLS polymerase eta in suppressing retrotransposition. Finally, the expected high mutation burden found in most of these tumors renders these XP patients good candidates for checkpoint blockade immunotherapy.
Affiliation(s)
- Camila Corradi
  - Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
- Juliana B Vilar
  - Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
- Vanessa C Buzatto
  - Molecular Oncology Center, Bioinformatics Laboratory, Hospital Sírio-Libanês, São Paulo, SP 01308-060, Brazil
- Tiago A de Souza
  - Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
  - Tau GC Bioinformatics, Cotia, SP 06711-020, Brazil
- Ligia P Castro
  - Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
- Veridiana Munford
  - Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
- Pedro A F Galante
  - Molecular Oncology Center, Bioinformatics Laboratory, Hospital Sírio-Libanês, São Paulo, SP 01308-060, Brazil
- Fernanda Orpinelli
  - Molecular Oncology Center, Bioinformatics Laboratory, Hospital Sírio-Libanês, São Paulo, SP 01308-060, Brazil
- Thiago L A Miller
  - Molecular Oncology Center, Bioinformatics Laboratory, Hospital Sírio-Libanês, São Paulo, SP 01308-060, Brazil
  - Department of Biochemistry, Institute of Chemistry, University of Sao Paulo, Sao Paulo, SP 05508-000, Brazil
- José L Buzzo
  - Molecular Oncology Center, Bioinformatics Laboratory, Hospital Sírio-Libanês, São Paulo, SP 01308-060, Brazil
- Mirian N Sotto
  - Medical School, University of Sao Paulo, Sao Paulo, SP 01246-903, Brazil
- Paulo Saldiva
  - Medical School, University of Sao Paulo, Sao Paulo, SP 01246-903, Brazil
- Jocelânio W de Oliveira
  - Institute of Mathematics and Statistics, University of São Paulo, São Paulo, SP 05508-090, Brazil
- Alain Sarasin
  - Laboratory of Genetic Instability and Oncogenesis, UMR8200 CNRS, Gustave Roussy, Université Paris-Sud, Villejuif, France
- Carlos F M Menck
  - Department of Microbiology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
9
Batcher K, Varney S, Raudsepp T, Jevit M, Dickinson P, Jagannathan V, Leeb T, Bannasch D. Ancient segmentally duplicated LCORL retrocopies in equids. PLoS One 2023; 18:e0286861. [PMID: 37289743 PMCID: PMC10249811 DOI: 10.1371/journal.pone.0286861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 05/25/2023] [Indexed: 06/10/2023]
Abstract
LINE-1 is an active transposable element encoding proteins capable of inserting host gene retrocopies, resulting in retro-copy number variants (retroCNVs) between individuals. Here, we performed retroCNV discovery using 86 equids and identified 437 retrocopy insertions. Only 5 retroCNVs were shared between horses and other equids, indicating that the majority of retroCNVs inserted after the species diverged. A large number (17-35 copies) of segmentally duplicated Ligand Dependent Nuclear Receptor Corepressor Like (LCORL) retrocopies were present in all equids but absent from other extant perissodactyls. The majority of LCORL transcripts in horses and donkeys originate from the retrocopies. The initial LCORL retrotransposition occurred 18 million years ago (17-19 95% CI), which is coincident with the increase in body size, reduction in digit number, and changes in dentition that characterized equid evolution. Evolutionary conservation of the LCORL retrocopy segmental amplification in the Equidae family, high expression levels and the ancient timeline for LCORL retrotransposition support a functional role for this structural variant.
Affiliation(s)
- Kevin Batcher
  - Department of Population Health and Reproduction, University of California Davis, Davis, CA, United States of America
- Scarlett Varney
  - Department of Population Health and Reproduction, University of California Davis, Davis, CA, United States of America
- Terje Raudsepp
  - Veterinary Integrative Biosciences, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, United States of America
- Matthew Jevit
  - Veterinary Integrative Biosciences, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, United States of America
- Peter Dickinson
  - Department of Surgical and Radiological Sciences, University of California Davis, Davis, CA, United States of America
- Vidhya Jagannathan
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Tosso Leeb
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Danika Bannasch
  - Department of Population Health and Reproduction, University of California Davis, Davis, CA, United States of America
10
Ten Berk de Boer E, Bilgrav Saether K, Eisfeldt J. Discovery of non-reference processed pseudogenes in the Swedish population. Front Genet 2023; 14:1176626. [PMID: 37323659 PMCID: PMC10267823 DOI: 10.3389/fgene.2023.1176626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 05/19/2023] [Indexed: 06/17/2023]
Abstract
The vast majority of the human genome is non-coding. There is a diversity of non-coding features, some of which have functional importance. Although the non-coding regions constitute the majority of the genome, they remain understudied, and for a long time, these regions have been referred to as junk DNA. Pseudogenes are one of these features. A pseudogene is a non-functional copy of a protein-coding gene. Pseudogenes may arise through a variety of genetic mechanisms. Processed pseudogenes are formed through reverse transcription of mRNA by LINE elements, after which the cDNA is integrated into the genome. Processed pseudogenes are known to be variable across populations; however, their variability and distribution remain unknown. Herein, we apply a custom-designed processed pseudogene pipeline on the whole genome sequencing data of 3,500 individuals: 2,500 individuals from the thousand genomes dataset, as well as 1,000 Swedish individuals. Through these analyses, we discover over 3,000 pseudogenes missing from the GRCh38 reference. Utilising our pipeline, we position 74% of the detected processed pseudogenes, allowing for analyses of their formation. Notably, we find that common structural variant callers, such as Delly, classify the processed pseudogenes as deletion events, which are later predicted to be truncating variants. By compiling lists of non-reference processed pseudogenes and their frequencies, we find a great variability of pseudogenes, indicating that non-reference processed pseudogenes may be useful for DNA testing and as population-specific markers. In summary, our findings highlight a great diversity of processed pseudogenes, show that processed pseudogenes are actively formed in the human genome, and indicate that our pipeline may be used to reduce false positive structural variation caused by the misalignment and subsequent misclassification of non-reference processed pseudogenes.
Affiliation(s)
- Esmee Ten Berk de Boer
  - Department of Molecular Medicine and Surgery, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Kristine Bilgrav Saether
  - Department of Molecular Medicine and Surgery, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Karolinska Institutet Science Park, Solna, Sweden
- Jesper Eisfeldt
  - Department of Molecular Medicine and Surgery, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Karolinska Institutet Science Park, Solna, Sweden
  - Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
11
Pokrovac I, Pezer Ž. Recent advances and current challenges in population genomics of structural variation in animals and plants. Front Genet 2022; 13:1060898. [PMID: 36523759 PMCID: PMC9745067 DOI: 10.3389/fgene.2022.1060898] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/03/2022] [Accepted: 11/15/2022] [Indexed: 05/02/2024]
Abstract
The field of population genomics has seen a surge of studies on genomic structural variation over the past two decades. These studies have shown that structural variation is taxonomically ubiquitous and represents a dominant form of genetic variation within species. Recent advances in technology, especially the development of long-read sequencing platforms, have enabled the discovery of structural variants (SVs) in previously inaccessible genomic regions, which unlocked additional structural variation for population studies and revealed that more SVs contribute to evolution than previously perceived. An increasing number of studies suggest that SVs of all types and sizes may have a large effect on phenotype and consequently a major impact on rapid adaptation, population divergence, and speciation. However, the functional effect of the vast majority of SVs is unknown and the field generally lacks evidence on the phenotypic consequences of most SVs that are suggested to have adaptive potential. Non-human genomes are heavily under-represented in population-scale studies of SVs. We argue that more research on other species is needed to objectively estimate the contribution of SVs to evolution. We discuss technical challenges associated with SV detection and outline the most recent advances towards more representative reference genomes, which open a new era in population-scale studies of structural variation.
Affiliation(s)
- Željka Pezer
  - Laboratory for Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
12
Batcher K, Varney S, York D, Blacksmith M, Kidd JM, Rebhun R, Dickinson P, Bannasch D. Recent, full-length gene retrocopies are common in canids. Genome Res 2022; 32:1602-1611. [PMID: 35961775 PMCID: PMC9435743 DOI: 10.1101/gr.276828.122] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/08/2022] [Accepted: 07/19/2022] [Indexed: 02/03/2023]
Abstract
Gene retrocopies arise from the reverse transcription and insertion into the genome of processed mRNA transcripts. Although many retrocopies have acquired mutations that render them functionally inactive, most mammals retain active LINE-1 sequences capable of producing new retrocopies. New retrocopies, referred to as retro copy number variants (retroCNVs), may not be identified by standard variant calling techniques in high-throughput sequencing data. Although multiple functional FGF4 retroCNVs have been associated with skeletal dysplasias in dogs, the full landscape of canid retroCNVs has not been characterized. Here, retroCNV discovery was performed on a whole-genome sequencing data set of 293 canids from 76 breeds. We identified retroCNV parent genes via the presence of mRNA-specific 30-mers, and then identified retroCNV insertion sites through discordant read analysis. In total, we resolved insertion sites for 1911 retroCNVs from 1179 parent genes, 1236 of which appeared identical to their parent genes. Dogs had on average 54.1 total retroCNVs and 1.4 private retroCNVs. We found evidence of expression in testes for 12% (14/113) of the retroCNVs identified in six Golden Retrievers, including four chimeric transcripts, and 97 retroCNVs also had significantly elevated FST across dog breeds, possibly indicating selection. We applied our approach to a subset of human genomes and detected an average of 4.2 retroCNVs per sample, highlighting a 13-fold relative increase of retroCNV frequency in dogs. Particularly in canids, retroCNVs are a largely unexplored source of genetic variation that can contribute to genome plasticity and that should be considered when investigating traits and diseases.
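The idea of detecting parent genes through mRNA-specific 30-mers can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `mrna_specific_kmers` and `count_supporting_reads` are hypothetical, only the forward strand is considered, and the genomic comparison is restricted to a single locus rather than a whole reference genome. The sketch assumes that a k-mer spanning an exon-exon junction of the spliced transcript, and absent from the intron-containing genomic sequence, can only come from a processed (intronless) copy.

```python
def mrna_specific_kmers(exons, genomic_locus, k=30):
    """Return k-mers that span an exon-exon junction of the spliced mRNA
    and do not occur in the intron-containing genomic locus.

    exons         -- list of exon sequences in transcript order (hypothetical input)
    genomic_locus -- genomic sequence of the parent gene, introns included
    """
    mrna = "".join(exons)
    genomic_kmers = {
        genomic_locus[i:i + k] for i in range(len(genomic_locus) - k + 1)
    }
    specific = set()
    boundary = 0  # mRNA coordinate of the first base after each junction
    for exon in exons[:-1]:
        boundary += len(exon)
        # A k-mer starting at `start` spans the junction when
        # start <= boundary - 1 and start + k - 1 >= boundary.
        first = max(0, boundary - k + 1)
        last = min(boundary - 1, len(mrna) - k)
        for start in range(first, last + 1):
            kmer = mrna[start:start + k]
            if kmer not in genomic_kmers:
                specific.add(kmer)
    return specific


def count_supporting_reads(reads, kmers, k=30):
    """Count reads that contain at least one mRNA-specific k-mer."""
    supporting = 0
    for read in reads:
        if any(read[i:i + k] in kmers for i in range(len(read) - k + 1)):
            supporting += 1
    return supporting
```

In practice a gene would be flagged as a candidate retroCNV parent only when enough reads support its junction k-mers; any such threshold is a design choice that this sketch leaves open.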
Affiliation(s)
- Kevin Batcher
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
- Scarlett Varney
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
- Daniel York
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Matthew Blacksmith
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Jeffrey M Kidd
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Robert Rebhun
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Peter Dickinson
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Danika Bannasch
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
|
13
|
Domazet-Lošo T. mRNA Vaccines: Why Is the Biology of Retroposition Ignored? Genes (Basel) 2022; 13:719. [PMID: 35627104 PMCID: PMC9141755 DOI: 10.3390/genes13050719] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/25/2022] [Revised: 04/14/2022] [Accepted: 04/15/2022] [Indexed: 02/07/2023]
Abstract
The major advantage of mRNA vaccines over more conventional approaches is their potential for rapid development and large-scale deployment in pandemic situations. In the current COVID-19 crisis, two mRNA COVID-19 vaccines have been conditionally approved and broadly applied, while others are still in clinical trials. However, there is no previous experience with the use of mRNA vaccines on a large scale in the general population. This warrants a careful evaluation of mRNA vaccine safety properties by considering all available knowledge about mRNA molecular biology and evolution. Here, I discuss the pervasive claim that mRNA-based vaccines cannot alter genomes. Surprisingly, this notion is widely stated in the mRNA vaccine literature but never supported by referencing any primary scientific papers that would specifically address this question. This discrepancy becomes even more puzzling if one considers previous work on the molecular and evolutionary aspects of retroposition in murine and human populations that clearly documents the frequent integration of mRNA molecules into genomes, including in clinical contexts. By performing basic comparisons, I show that the sequence features of mRNA vaccines meet all known requirements for retroposition using L1 elements, the most abundant autonomously active retrotransposons in the human genome. In fact, many factors associated with mRNA vaccines increase the possibility of their L1-mediated retroposition. I conclude that it is unfounded to assume a priori that mRNA-based therapeutics do not impact genomes and that the route to genome integration of vaccine mRNAs via endogenous L1 retroelements is easily conceivable. This implies that we urgently need experimental studies that would rigorously test for the potential retroposition of vaccine mRNAs. At present, the insertional mutagenesis safety of mRNA-based vaccines should be considered unresolved.
Affiliation(s)
- Tomislav Domazet-Lošo
  - Laboratory of Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Bijenička Cesta 54, HR-10000 Zagreb, Croatia
  - School of Medicine, Catholic University of Croatia, Ilica 242, HR-10000 Zagreb, Croatia
|
14
|
Saitou M, Masuda N, Gokcumen O. Similarity-Based Analysis of Allele Frequency Distribution among Multiple Populations Identifies Adaptive Genomic Structural Variants. Mol Biol Evol 2022; 39:msab313. [PMID: 34718708 PMCID: PMC8896759 DOI: 10.1093/molbev/msab313] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Structural variants have a considerable impact on human genomic diversity. However, their evolutionary history remains mostly unexplored. Here, we developed a new method to identify potentially adaptive structural variants based on a similarity-based analysis that incorporates genotype frequency data from 26 populations simultaneously. Using this method, we analyzed 57,629 structural variants and identified 576 structural variants that show unusual population differentiation. Of these putatively adaptive structural variants, we further showed that 24 variants are multiallelic and overlap with coding sequences, and 20 variants are significantly associated with GWAS traits. Closer inspection of the haplotypic variation associated with these putatively adaptive and functional structural variants reveals deviations from neutral expectations due to: 1) population differentiation of rapidly evolving multiallelic variants, 2) incomplete sweeps, and 3) recent population-specific negative selection. Overall, our study provides new methodological insights, documents hundreds of putatively adaptive variants, and introduces evolutionary models that may better explain the complex evolution of structural variants.
Affiliation(s)
- Marie Saitou
  - Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
  - Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, USA
- Naoki Masuda
  - Department of Mathematics, University at Buffalo, State University of New York, Buffalo, NY, USA
  - Computational and Data-Enabled Science and Engineering Program, University at Buffalo, State University of New York, Buffalo, NY, USA
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
|
15
|
Abrahamsson S, Eiengård F, Rohlin A, Dávila López M. PΨFinder: a practical tool for the identification and visualization of novel pseudogenes in DNA sequencing data. BMC Bioinformatics 2022; 23:59. [PMID: 35114952 PMCID: PMC8812246 DOI: 10.1186/s12859-022-04583-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2021] [Accepted: 01/24/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Processed pseudogenes (PΨgs) are disabled gene copies that are transcribed and may affect expression of paralogous genes. Moreover, their insertion in the genome can disrupt the structure or the regulatory region of a gene, affecting its expression level. These events have been identified as occurring mutations during cancer development, thus being able to identify PΨgs and their location will improve their impact on diagnostic testing, not only in cancer but also in inherited disorders. RESULTS: We have implemented PΨFinder (P-psy-finder), a tool that identifies PΨgs, annotates known ones and predicts their insertion site(s) in the genome. The tool screens alignment files and provides user-friendly summary reports and visualizations. To demonstrate its applicability, we scanned 218 DNA samples from patients screened for hereditary colorectal cancer. We detected 423 PΨgs distributed in 96% of the samples, comprising 7 different parent genes. Among these, we confirmed the well-known insertion site of the SMAD4-PΨg within the last intron of the SCAI gene in one sample. While for the ubiquitous CBX3-PΨg, present in 82.6% of the samples, we found it reversed inserted in the second intron of the C15ORF57 gene. CONCLUSIONS: PΨFinder is a tool that can automatically identify novel PΨgs from DNA sequencing data and determine their location in the genome with high sensitivity (95.92%). It generates high quality figures and tables that facilitate the interpretation of the results and can guide the experimental validation. PΨFinder is a complementary analysis to any mutational screening in the identification of disease-causing mutations within cancer and other diseases.
Affiliation(s)
- Sanna Abrahamsson
  - Bioinformatics Core Facility, Sahlgrenska Academy, University of Gothenburg, Box 115, 405 30, Gothenburg, Sweden
- Frida Eiengård
  - Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Anna Rohlin
  - Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Unit of Genetic Analysis and Bioinformatics, Department of Clinical Genetics and Genomics, Sahlgrenska University Hospital, Gothenburg, Sweden
- Marcela Dávila López
  - Bioinformatics Core Facility, Sahlgrenska Academy, University of Gothenburg, Box 115, 405 30, Gothenburg, Sweden
|
16
|
Piotrowski A, Koczkowska M, Poplawski AB, Bartoszewski R, Króliczewski J, Mieczkowska A, Gomes A, Crowley MR, Crossman DK, Chen Y, Lao P, Serra E, Llach MC, Castellanos E, Messiaen LM. Targeted massively parallel sequencing of candidate regions on chromosome 22q predisposing to multiple schwannomas: An analysis of 51 individuals in a single-center experience. Hum Mutat 2022; 43:74-84. [PMID: 34747535 DOI: 10.1002/humu.24294] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/09/2021] [Revised: 10/15/2021] [Accepted: 10/28/2021] [Indexed: 01/07/2023]
Abstract
Constitutional LZTR1 or SMARCB1 pathogenic variants (PVs) have been found in ∼86% of familial and ∼40% of sporadic schwannomatosis cases. Hence, we performed massively parallel sequencing of the entire LZTR1, SMARCB1, and NF2 genomic loci in 35 individuals with schwannomas negative for constitutional first-hit PVs in the LZTR1/SMARCB1/NF2 coding sequences; however, with 22q deletion and/or a different NF2 PV in each tumor, including six cases with only one tumor available. Furthermore, we verified whether any other LZTR1/SMARCB1/NF2 (likely) PVs could be found in 16 cases carrying a SMARCB1 constitutional variant in the 3'-untranslated region (3'-UTR) c.*17C>T, c.*70C>T, or c.*82C>T. As no additional variants were found, functional studies were performed to clarify the effect of these 3'-UTR variants on the transcript. The 3'-UTR variants c.*17C>T and c.*82C>T showed pathogenicity by negatively affecting the SMARCB1 transcript level. Two novel deep intronic SMARCB1 variants, c.500+883T>G and c.500+887G>A, resulting in out-of-frame missplicing of intron 4, were identified in two unrelated individuals. Further resequencing of the entire repeat-masked genomics sequences of chromosome 22q in individuals negative for PVs in the SMARCB1/LZTR1/NF2 coding- and noncoding regions revealed five potential schwannomatosis-predisposing candidate genes, that is, MYO18B, NEFH, SGSM1, SGSM3, and SBF1, pending further verification.
Affiliation(s)
- Arkadiusz Piotrowski
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
  - 3P-Medicine Laboratory, Medical University of Gdansk, Gdansk, Poland
  - Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Gdansk, Poland
- Magdalena Koczkowska
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
  - 3P-Medicine Laboratory, Medical University of Gdansk, Gdansk, Poland
  - Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Gdansk, Poland
- Andrzej B Poplawski
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Rafał Bartoszewski
  - Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Gdansk, Poland
- Jarosław Króliczewski
  - Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Gdansk, Poland
- Alina Mieczkowska
  - Department of Biology and Pharmaceutical Botany, Medical University of Gdansk, Gdansk, Poland
- Alicia Gomes
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Michael R Crowley
  - Genomic Core Facility, University of Alabama at Birmingham, Birmingham, Alabama, USA
- David K Crossman
  - Genomic Core Facility, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Yunjia Chen
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Ping Lao
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Eduard Serra
  - Hereditary Cancer Group, Program of Predictive and Personalized Medicine of Cancer (PMPPC), Germans Trias i Pujol Research Institute (IGTP), Barcelona, Spain
- Meritxell C Llach
  - Hereditary Cancer Group, Program of Predictive and Personalized Medicine of Cancer (PMPPC), Germans Trias i Pujol Research Institute (IGTP), Barcelona, Spain
- Elisabeth Castellanos
  - Clinical Genomics Research Group, Program of Predictive and Personalized Medicine of Cancer (PMPPC), Germans Trias i Pujol Research Institute (IGTP), Barcelona, Spain
  - Clinical Genomics Unit, Clinical Genetics Service, Northern Metropolitan Clinical Laboratory, Germans Trias i Pujol University Hospital (HUGTiP), Barcelona, Spain
- Ludwine M Messiaen
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
|
17
|
Abstract
The genetic basis for the emergence of creativity in modern humans remains a mystery despite sequencing the genomes of chimpanzees and Neanderthals, our closest hominid relatives. Data-driven methods allowed us to uncover networks of genes distinguishing the three major systems of modern human personality and adaptability: emotional reactivity, self-control, and self-awareness. Now we have identified which of these genes are present in chimpanzees and Neanderthals. We replicated our findings in separate analyses of three high-coverage genomes of Neanderthals. We found that Neanderthals had nearly the same genes for emotional reactivity as chimpanzees, and they were intermediate between modern humans and chimpanzees in their numbers of genes for both self-control and self-awareness. 95% of the 267 genes we found only in modern humans were not protein-coding, including many long-non-coding RNAs in the self-awareness network. These genes may have arisen by positive selection for the characteristics of human well-being and behavioral modernity, including creativity, prosocial behavior, and healthy longevity. The genes that cluster in association with those found only in modern humans are over-expressed in brain regions involved in human self-awareness and creativity, including late-myelinating and phylogenetically recent regions of neocortex for autobiographical memory in frontal, parietal, and temporal regions, as well as related components of cortico-thalamo-ponto-cerebellar-cortical and cortico-striato-cortical loops. We conclude that modern humans have more than 200 unique non-protein-coding genes regulating co-expression of many more protein-coding genes in coordinated networks that underlie their capacities for self-awareness, creativity, prosocial behavior, and healthy longevity, which are not found in chimpanzees or Neanderthals.
|
18
|
Zhang W, Tautz D. Tracing the origin and evolutionary fate of recent gene retrocopies in natural populations of the house mouse. Mol Biol Evol 2021; 39:6481550. [PMID: 34940842 PMCID: PMC8826619 DOI: 10.1093/molbev/msab360] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/03/2022]
Abstract
Although the contribution of retrogenes to the evolution of genes and genomes has long been recognized, the evolutionary patterns of very recently derived retrocopies that are still polymorphic within natural populations have not been much studied so far. We use here a set of 2,025 such retrocopies in nine house mouse populations from three subspecies (Mus musculus domesticus, M. m. musculus, and M. m. castaneus) to trace their origin and evolutionary fate. We find that ancient house-keeping genes are significantly more likely to generate retrocopies than younger genes and that the propensity to generate a retrocopy depends on its level of expression in the germline. Although most retrocopies are detrimental and quickly purged, we focus here on the subset that appears to be neutral or even adaptive. We show that retrocopies from X-chromosomal parental genes have a higher likelihood to reach elevated frequencies in the populations, confirming the notion of adaptive effects for “out-of-X” retrogenes. Also, retrocopies in intergenic regions are more likely to reach higher population frequencies than those in introns of genes, implying a more detrimental effect when they land within transcribed regions. For a small subset of retrocopies, we find signatures of positive selection, indicating they were involved in a recent adaptation process. We show that the population-specific distribution pattern of retrocopies is phylogenetically informative and can be used to infer population history with a better resolution than with SNP markers.
Affiliation(s)
- Wenyu Zhang
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön, D-24306, Germany
- Diethard Tautz
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön, D-24306, Germany
|
19
|
Troskie RL, Faulkner GJ, Cheetham SW. Processed pseudogenes: A substrate for evolutionary innovation: Retrotransposition contributes to genome evolution by propagating pseudogene sequences with rich regulatory potential throughout the genome. Bioessays 2021; 43:e2100186. [PMID: 34569081 DOI: 10.1002/bies.202100186] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/02/2021] [Revised: 09/09/2021] [Accepted: 09/13/2021] [Indexed: 11/08/2022]
Abstract
Processed pseudogenes may serve as a genetic reservoir for evolutionary innovation. Here, we argue that through the activity of long interspersed element-1 retrotransposons, processed pseudogenes disperse coding and noncoding sequences rich with regulatory potential throughout the human genome. While these sequences may appear to be non-functional, a lack of contemporary function does not prohibit future development of biological activity. Here, we discuss the dynamic evolution of certain processed pseudogenes into coding and noncoding genes and regulatory elements, and their implication in wide-ranging biological and pathological processes. Also see the video abstract here: https://youtu.be/iUY_mteVoPI.
Affiliation(s)
- Robin-Lee Troskie
  - Mater Research Institute, University of Queensland, Woolloongabba, Australia
- Geoffrey J Faulkner
  - Mater Research Institute, University of Queensland, Woolloongabba, Australia
  - Queensland Brain Institute, University of Queensland, Brisbane, Australia
- Seth W Cheetham
  - Mater Research Institute, University of Queensland, Woolloongabba, Australia
|
20
|
Wei Z, Sun J, Li Q, Yao T, Zeng H, Wang Y. RetroScan: An Easy-to-Use Pipeline for Retrocopy Annotation and Visualization. Front Genet 2021; 12:719204. [PMID: 34484306 PMCID: PMC8415311 DOI: 10.3389/fgene.2021.719204] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2021] [Accepted: 07/26/2021] [Indexed: 11/13/2022]
Abstract
Retrocopies, which are considered “junk genes,” are occasionally formed via the insertion of reverse-transcribed mRNAs at new positions in the genome. However, an increasing number of recent studies have shown that some retrocopies exhibit new biological functions and may contribute to genome evolution. Hence, the identification of retrocopies has become very meaningful for studying gene duplication and new gene generation. Current pipelines identify retrocopies through complex operations using alignment programs and filter scripts in a step-by-step manner. Therefore, there is an urgent need for a simple and convenient retrocopy annotation tool. Here, we report the development of RetroScan, a publicly available and easy-to-use tool for scanning, annotating and displaying retrocopies, consisting of two components: an analysis pipeline and a visual interface. The pipeline integrates a series of bioinformatics software programs and scripts for identifying retrocopies in just one line of command. Compared with previous methods, RetroScan increases accuracy and reduces false-positive results. We also provide a Shiny app for visualization. It displays information on retrocopies and their parental genes that can be used for the study of retrocopy structure and evolution. RetroScan is available at https://github.com/Vicky123wzy/RetroScan.
Affiliation(s)
- Zhaoyuan Wei
  - State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
  - Biological Science Research Center, Southwest University, Chongqing, China
- Jiahe Sun
  - Biological Science Research Center, Southwest University, Chongqing, China
- Qinhui Li
  - State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
- Ting Yao
  - State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
- Haiyue Zeng
  - Biological Science Research Center, Southwest University, Chongqing, China
- Yi Wang
  - State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
  - Biological Science Research Center, Southwest University, Chongqing, China
|
21
|
Miller TLA, Orpinelli Rego F, Buzzo JLL, Galante PAF. sideRETRO: a pipeline for identifying somatic and polymorphic insertions of processed pseudogenes or retrocopies. Bioinformatics 2021; 37:419-421. [PMID: 32717039 DOI: 10.1093/bioinformatics/btaa689] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/22/2020] [Revised: 06/29/2020] [Accepted: 07/23/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Retrocopies or processed pseudogenes are gene copies resulting from mRNA retrotransposition. These gene duplicates can be fixed, somatically inserted or polymorphic in the genome. However, knowledge regarding unfixed retrocopies (retroCNVs) is still limited, and the development of computational tools for effectively identifying and genotyping them is an urgent need. RESULTS: Here, we present sideRETRO, a pipeline dedicated not only to detecting retroCNVs in whole-genome or whole-exome sequencing data but also to revealing their insertion sites, zygosity and genomic context and classifying them as somatic or polymorphic events. We show that sideRETRO can identify novel retroCNVs and genotype them, in addition to finding polymorphic retroCNVs in whole-genome and whole-exome data. Therefore, sideRETRO fills a gap in the literature and presents an efficient and straightforward algorithm to accelerate the study of bona fide retroCNVs. AVAILABILITY AND IMPLEMENTATION: sideRETRO is available at https://github.com/galantelab/sideRETRO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Thiago L A Miller
  - Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo 01308-060, Brazil
  - Departmento de Bioquímica, Universidade de São Paulo, São Paulo 05508-000, Brazil
- José Leonel L Buzzo
  - Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo 01308-060, Brazil
  - Departmento de Bioquímica, Universidade de São Paulo, São Paulo 05508-000, Brazil
- Pedro A F Galante
  - Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo 01308-060, Brazil
|
22
|
Feng X, Li H. Higher Rates of Processed Pseudogene Acquisition in Humans and Three Great Apes Revealed by Long-Read Assemblies. Mol Biol Evol 2021; 38:2958-2966. [PMID: 33681998 DOI: 10.1093/molbev/msab062] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022]
Abstract
LINE-1-mediated retrotransposition of protein-coding mRNAs is an active process in modern humans for both germline and somatic genomes. Prior works that surveyed human data mostly relied on detecting discordant mappings of paired-end short reads, or exon junctions contained in short reads. Moreover, there have been few genome-wide comparisons between gene retrocopies in great apes and humans. In this study, we introduced a more sensitive and accurate method to identify processed pseudogenes. Our method utilizes long-read assemblies, and more importantly, is able to provide full-length retrocopy sequences as well as flanking regions which are missed by short-read based methods. From 22 human individuals, we pinpointed 40 processed pseudogenes that are not present in the human reference genome GRCh38 and identified 17 pseudogenes that are in GRCh38 but absent from some input individuals. This represents a significantly higher discovery rate than previous reports (39 pseudogenes not in the reference genome out of 939 individuals). We also provided an overview of lineage-specific retrocopies in chimpanzee, gorilla, and orangutan genomes.
Affiliation(s)
- Xiaowen Feng
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Heng Li
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
|
23
|
The mutational load in natural populations is significantly affected by high primary rates of retroposition. Proc Natl Acad Sci U S A 2021; 118:2013043118. [PMID: 33526666 PMCID: PMC8017666 DOI: 10.1073/pnas.2013043118] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022]
Abstract
The phenomenon of retroposition (the reintegration of reverse-transcribed RNA into the genome) has been well studied in comparisons between species and has been identified as a source of evolutionary innovation. However, less attention has been paid to possible negative effects of retroposition. To trace the evolutionary dynamics of these negative effects, our study uses a unique genomic dataset of house mouse populations. It reveals that the initial retroposition rate is very high and that most of these newly transposed retrocopies have a deleterious impact, apparently through modifying the expression of their parental genes. In humans, this effect is expected to cause disease alleles, and we propose that genetic screening should include the search for newly transposed retrocopies. Gene retroposition is known to contribute to patterns of gene evolution and adaptations. However, possible negative effects of gene retroposition remain largely unexplored since most previous studies have focused on between-species comparisons where negatively selected copies are mostly not observed, as they are quickly lost from populations. Here, we show for natural house mouse populations that the primary rate of retroposition is orders of magnitude higher than the long-term rate. Comparisons with single-nucleotide polymorphism distribution patterns in the same populations show that most retroposition events are deleterious. Transcriptomic profiling analysis shows that new retroposed copies become easily subject to transcription and have an influence on the expression levels of their parental genes, especially when transcribed in the antisense direction. Our results imply that the impact of retroposition on the mutational load has been highly underestimated in natural populations. This has additional implications for strategies of disease allele detection in humans.
|
24
|
Zeng H, Chen X, Li H, Zhang J, Wei Z, Wang Y. Interpopulation differences of retroduplication variations (RDVs) in rice retrogenes and their phenotypic correlations. Comput Struct Biotechnol J 2021; 19:600-611. [PMID: 33510865 PMCID: PMC7811064 DOI: 10.1016/j.csbj.2020.12.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2020] [Revised: 12/29/2020] [Accepted: 12/31/2020] [Indexed: 11/21/2022]
Abstract
Retroduplication variation (RDV), a type of retrocopy polymorphism, is considered to have essential biological significance, but its effect on gene function and species phenotype is still poorly understood. To this end, we analyzed the retrocopies and RDVs in 3,010 rice genomes. We calculated the RDV frequencies in the genome of each rice population; detected the mutated, ancestral and expressed retrogenes in rice genomes; and analyzed their RDV influence on rice phenotypic traits. Collectively, 73 RDVs were identified, and 14 RDVs in ancestral retrogenes can significantly affect rice phenotypes. Our research reveals that RDV plays an important role in rice migration, domestication and evolution. We think that RDV is a good molecular breeding marker candidate. To our knowledge, this is the first study on the relationship between retrogene function, expression, RDV and species phenotype.
Affiliation(s)
- Haiyue Zeng
  - State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
  - Biological Science Research Center, Southwest University, Chongqing 400715, China
  - Shennong Class, Southwest University, Chongqing 400715, China
- Xingyu Chen
  - Shennong Class, Southwest University, Chongqing 400715, China
- Hongbo Li
  - College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
- Jun Zhang
  - College of Computer & Information Science, Southwest University, Chongqing 400715, China
- Zhaoyuan Wei
  - State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
  - Biological Science Research Center, Southwest University, Chongqing 400715, China
- Yi Wang
  - State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
  - Biological Science Research Center, Southwest University, Chongqing 400715, China
|
25
|
Tuo Y, Chu W, Zhang J, Cheng J, Chen L, Bao L, Xiao T. Analysis of Natural Selection of Immune Genes in Spinibarbus caldwelli by Transcriptome Sequencing. Front Genet 2020; 11:714. [PMID: 32793279 PMCID: PMC7393255 DOI: 10.3389/fgene.2020.00714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2019] [Accepted: 06/11/2020] [Indexed: 12/03/2022]
Abstract
Spinibarbus caldwelli is an omnivorous cyprinid fish that is distributed widely in China. To investigate the adaptive evolution of S. caldwelli, the muscle transcriptome was sequenced on the Illumina HiSeq 4000 platform. A total of 80,447,367 reads were generated by next-generation sequencing, and 211,386 unigenes were obtained by de novo assembly. Additionally, we estimated that S. caldwelli and Sinocyclocheilus grahami diverged 23.14 million years ago (Mya), and that both diverged from Ctenopharyngodon idellus 46.95 Mya. Furthermore, 38 positively selected genes were identified by calculating Ka/Ks ratios from 9225 orthologs. Among them, several immune-related genes were identified as positively selected, such as POLR3B, PIK3C3, TOPORS, FASTKD3, CYPLP1A1, and UACA. Our results shed light on the nature of natural selection in S. caldwelli and contribute to future immunological and transcriptome studies.
Affiliation(s)
- Yun Tuo
  - Hunan Engineering Technology Research Center of Featured Aquatic Resources Utilization, Hunan Agricultural University, Changsha, China
  - College of Life Science and Resources Environment, Yichun University, Yichun, China
- Wuying Chu
  - Department of Biological and Environmental Engineering, Changsha University, Changsha, China
- Jianshe Zhang
  - Department of Biological and Environmental Engineering, Changsha University, Changsha, China
- Jia Cheng
  - Department of Biological and Environmental Engineering, Changsha University, Changsha, China
- Lin Chen
  - Department of Biological and Environmental Engineering, Changsha University, Changsha, China
- Lingsheng Bao
  - Department of Biological and Environmental Engineering, Changsha University, Changsha, China
- Tiaoyi Xiao
  - Hunan Engineering Technology Research Center of Featured Aquatic Resources Utilization, Hunan Agricultural University, Changsha, China
|
26
|
Batcher K, Dickinson P, Maciejczyk K, Brzeski K, Rasouliha SH, Letko A, Drögemüller C, Leeb T, Bannasch D. Multiple FGF4 Retrocopies Recently Derived within Canids. Genes (Basel) 2020; 11:genes11080839. [PMID: 32717834 PMCID: PMC7465015 DOI: 10.3390/genes11080839] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/30/2020] [Revised: 07/21/2020] [Accepted: 07/21/2020] [Indexed: 12/17/2022]
Abstract
Two transcribed retrocopies of the fibroblast growth factor 4 (FGF4) gene have previously been described in the domestic dog. An FGF4 retrocopy on chr18 is associated with disproportionate dwarfism, while an FGF4 retrocopy on chr12 is associated with both disproportionate dwarfism and intervertebral disc disease (IVDD). In this study, whole-genome sequencing data were queried to identify other FGF4 retrocopies that could be contributing to phenotypic diversity in canids. Additionally, dogs with surgically confirmed IVDD were assayed for novel FGF4 retrocopies. Five additional and distinct FGF4 retrocopies were identified in canids including a copy unique to red wolves (Canis rufus). The FGF4 retrocopies identified in domestic dogs were identical to domestic dog FGF4 haplotypes, which are distinct from modern wolf FGF4 haplotypes, indicating that these retrotransposition events likely occurred after domestication. The identification of multiple, full length FGF4 retrocopies with open reading frames in canids indicates that gene retrotransposition events occur much more frequently than previously thought and provide a mechanism for continued genetic and phenotypic diversity in canids.
Affiliation(s)
- Kevin Batcher
  - Department of Population Health and Reproduction, University of California-Davis, Davis, CA 95616, USA
- Peter Dickinson
  - Department of Surgical and Radiological Sciences, University of California-Davis, Davis, CA 95616, USA
- Kimberly Maciejczyk
  - Department of Population Health and Reproduction, University of California-Davis, Davis, CA 95616, USA
- Kristin Brzeski
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA
- Sheida Hadji Rasouliha
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Anna Letko
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Cord Drögemüller
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Tosso Leeb
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Danika Bannasch
  - Department of Population Health and Reproduction, University of California-Davis, Davis, CA 95616, USA
|
27
|
Abel HJ, Larson DE, Regier AA, Chiang C, Das I, Kanchi KL, Layer RM, Neale BM, Salerno WJ, Reeves C, Buyske S, Matise TC, Muzny DM, Zody MC, Lander ES, Dutcher SK, Stitziel NO, Hall IM. Mapping and characterization of structural variation in 17,795 human genomes. Nature 2020; 583:83-89. [PMID: 32460305 PMCID: PMC7547914 DOI: 10.1038/s41586-020-2371-0] [Citation(s) in RCA: 181] [Impact Index Per Article: 36.2] [Received: 12/29/2018] [Accepted: 05/18/2020] [Indexed: 12/18/2022]
Abstract
A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline (ref. 1) to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.
Affiliation(s)
- Haley J Abel
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- David E Larson
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Allison A Regier
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
  - Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
- Colby Chiang
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Indraniel Das
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Krishna L Kanchi
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Ryan M Layer
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA
  - Department of Computer Science, University of Colorado, Boulder, CO, USA
- Benjamin M Neale
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- William J Salerno
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Steven Buyske
  - Department of Statistics, Rutgers University, Piscataway, NJ, USA
- Tara C Matise
  - Department of Genetics, Rutgers University, Piscataway, NJ, USA
- Donna M Muzny
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Eric S Lander
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Susan K Dutcher
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Nathan O Stitziel
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
  - Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
- Ira M Hall
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
  - Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
|
29
|
Zheng B, Zhang S, Cai W, Wang J, Wang T, Tang N, Shi Y, Luo X, Yan W. Identification of Novel Fusion Transcripts in Undifferentiated Pleomorphic Sarcomas by Transcriptome Sequencing. Cancer Genomics Proteomics 2020; 16:399-408. [PMID: 31467233 DOI: 10.21873/cgp.20144] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/30/2019] [Revised: 07/03/2019] [Accepted: 07/04/2019] [Indexed: 01/16/2023]
Abstract
BACKGROUND/AIM: Undifferentiated pleomorphic sarcoma (UPS) is an aggressive mesenchymal neoplasm characterized by chromosomal instability. The aim of this study was to identify fusion events involved in UPS. MATERIALS AND METHODS: Transcriptome sequencing was performed to search for new fusion genes in 19 UPS samples, including two paired recurrent (R) and re-recurrent (RR) samples. RESULTS: A total of 66 fusion genes were detected. Among them, 10 novel fusion genes were further confirmed by reverse transcription polymerase chain reaction (RT-PCR) and Sanger sequencing. Retinoblastoma (RB1) fusions (2 cases) were the most recurrent fusion genes. The gene fusions RB1-RNASEH2B, RB1-FGF14-AS1, and E2F6-FKBP4 were correlated with the Rb/E2F pathway. Pseudogenes were involved in the formation of the gene fusions CIC-DUX4L8 and EIF2AK4-ANXA2P2. Importantly, targetable gene fusions (PDGFRA-MACROD2 and NCOR1-MAP2K1) were detected in UPS. CONCLUSION: Screening for the presence of fusion transcripts will provide vital clues to the understanding of genetic alterations and the finding of new targeted therapies for UPS.
Affiliation(s)
- Biqiang Zheng
  - Department of Musculoskeletal Cancer Surgery, Fudan University Shanghai Cancer Center, Shanghai, P.R. China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, P.R. China
- Weiluo Cai
  - Department of Musculoskeletal Cancer Surgery, Fudan University Shanghai Cancer Center, Shanghai, P.R. China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, P.R. China
- Jian Wang
  - Department of Musculoskeletal Cancer Surgery, Fudan University Shanghai Cancer Center, Shanghai, P.R. China
  - Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai, P.R. China
- Ting Wang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, P.R. China
- Ning Tang
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, P.R. China
- Yingqiang Shi
  - Department of Musculoskeletal Cancer Surgery, Fudan University Shanghai Cancer Center, Shanghai, P.R. China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, P.R. China
- Xiaoying Luo
  - State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, P.R. China
- Wangjun Yan
  - Department of Musculoskeletal Cancer Surgery, Fudan University Shanghai Cancer Center, Shanghai, P.R. China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, P.R. China
|
30
|
Overcoming challenges and dogmas to understand the functions of pseudogenes. Nat Rev Genet 2019; 21:191-201. [DOI: 10.1038/s41576-019-0196-1] [Citation(s) in RCA: 92] [Impact Index Per Article: 15.3] [Accepted: 11/05/2019] [Indexed: 01/08/2023]
|
31
|
Integrated exome and RNA sequencing of dedifferentiated liposarcoma. Nat Commun 2019; 10:5683. [PMID: 31831742 PMCID: PMC6908635 DOI: 10.1038/s41467-019-13286-z] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 09/05/2018] [Accepted: 10/28/2019] [Indexed: 01/06/2023]
Abstract
The genomic characteristics of dedifferentiated liposarcoma (DDLPS) that are associated with clinical features remain to be identified. Here, we conduct integrated whole exome and RNA sequencing analysis in 115 DDLPS tumors and perform comparative genomic analysis of well-differentiated and dedifferentiated components from eight DDLPS samples. Several somatic copy-number alterations (SCNAs), including the gain of 12q15, are identified as frequent genomic alterations. CTDSP1/2-DNM3OS fusion genes are identified in a subset of DDLPS tumors. Based on the association of SCNAs with clinical features, the DDLPS tumors are clustered into three groups. This clustering can predict the clinical outcome independently. The comparative analysis between well-differentiated and dedifferentiated components identifies two categories of genomic alterations: shared alterations, associated with tumorigenesis, and dedifferentiated-specific alterations, associated with malignant transformation. This large-scale genomic analysis reveals the mechanisms underlying the development and progression of DDLPS and provides insights that could contribute to the refinement of DDLPS management. Understanding the genomic features of dedifferentiated liposarcoma (DDLPS) is likely to uncover new options for management. Here, the authors reveal three prognostic groups, and highlight molecular markers associated with malignant transformation.
|
32
|
Feusier J, Watkins WS, Thomas J, Farrell A, Witherspoon DJ, Baird L, Ha H, Xing J, Jorde LB. Pedigree-based estimation of human mobile element retrotransposition rates. Genome Res 2019; 29:1567-1577. [PMID: 31575651 PMCID: PMC6771411 DOI: 10.1101/gr.247965.118] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Received: 12/26/2018] [Accepted: 08/14/2019] [Indexed: 12/26/2022]
Abstract
Germline mutation rates in humans have been estimated for a variety of mutation types, including single-nucleotide and large structural variants. Here, we directly measure the germline retrotransposition rate for the three active retrotransposon elements: L1, Alu, and SVA. We used three tools for calling mobile element insertions (MEIs) (MELT, RUFUS, and TranSurVeyor) on blood-derived whole-genome sequence (WGS) data from 599 CEPH individuals, comprising 33 three-generation pedigrees. We identified 26 de novo MEIs in 437 births. The retrotransposition rate estimate for Alu elements, one in 40 births, is roughly half the rate estimated using phylogenetic analyses, a difference in magnitude similar to that observed for single-nucleotide variants. The L1 retrotransposition rate is one in 63 births and is within the range of previous estimates (1:20-1:200 births). The SVA retrotransposition rate, one in 63 births, is much higher than the previous estimate of one in 900 births. Our large, three-generation pedigrees allowed us to assess parent-of-origin effects and the timing of insertion events in either gametogenesis or early embryonic development. We find a statistically significant paternal bias in Alu retrotransposition. Our study represents the first in-depth analysis of the rate and dynamics of human retrotransposition from WGS data in three-generation human pedigrees.
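The "one in N births" figures in this abstract are event counts divided by births. As a minimal sketch of that arithmetic, using only numbers quoted in the abstract: the pooled rate for all 26 de novo MEIs and the fold change in the SVA estimate are not stated in the abstract and are derived here, and the function name is hypothetical.

```python
def births_per_event(events: int, births: int) -> float:
    """Average number of births per observed de novo insertion."""
    if events <= 0:
        raise ValueError("need at least one observed event")
    return births / events


# Quoted in the abstract: 26 de novo MEIs in 437 births.
pooled = births_per_event(26, 437)

# Quoted in the abstract: SVA rate of one in 63 births versus a
# previous estimate of one in 900 births.
sva_fold_increase = 900 / 63

print(f"pooled MEI rate: about one in {pooled:.1f} births")
print(f"SVA estimate is about {sva_fold_increase:.1f}-fold higher than before")
```

The per-element counts are not given in the abstract, so the sketch does not try to reproduce the one-in-40 and one-in-63 figures from raw counts.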
Affiliation(s)
- Julie Feusier
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
- W Scott Watkins
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
- Jainy Thomas
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
- Andrew Farrell
  - USTAR Center for Genetic Discovery, Salt Lake City, Utah 84112, USA
- David J Witherspoon
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
- Lisa Baird
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
- Hongseok Ha
  - Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA
- Jinchuan Xing
  - Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA
- Lynn B Jorde
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
|
33
|
Chatron N, Cassinari K, Quenez O, Baert-Desurmont S, Bardel C, Buisine MP, Calpena E, Capri Y, Corominas Galbany J, Diguet F, Edery P, Isidor B, Labalme A, Le Caignec C, Lévy J, Lecoquierre F, Lindenbaum P, Pichon O, Rollat-Farnier PA, Simonet T, Saugier-Veber P, Tabet AC, Toutain A, Wilkie AOM, Lesca G, Sanlaville D, Nicolas G, Schluth-Bolard C. Identification of mobile retrocopies during genetic testing: Consequences for routine diagnosis. Hum Mutat 2019; 40:1993-2000. [PMID: 31230393 DOI: 10.1002/humu.23845] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/13/2019] [Revised: 05/29/2019] [Accepted: 06/17/2019] [Indexed: 12/24/2022]
Abstract
Human retrocopies, that is, messenger RNA transcripts benefitting from the long interspersed element 1 machinery for retrotransposition, may have specific consequences for genomic testing. Next-generation sequencing (NGS) techniques allow the detection of such mobile elements, but they may be misinterpreted as genomic duplications or be totally overlooked. We report eight observations of retrocopies detected during diagnostic NGS analyses of targeted gene panels, exome, or genome sequencing. For seven cases, while an exons-only copy number gain was called, read alignment inspection revealed a depth of coverage shift at every exon-intron junction where indels were also systematically called. Moreover, aberrant chimeric read pairs spanned entire introns or were paired with another locus for terminal exons. The 8th retrocopy was present in the reference genome and thus showed a normal NGS profile. We emphasize the existence of retrocopies and strategies to accurately detect them at a glance during genetic testing and discuss pitfalls for genetic testing.
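The coverage signature described in this abstract (a copy-number gain confined to exons, with depth shifting at every exon-intron junction) can be illustrated with a small check. This is a hedged sketch, not the authors' method: the function name, the input representation (depth ratios relative to a control, where 1.0 is the expected diploid depth) and both thresholds are assumptions chosen for illustration.

```python
def looks_like_retrocopy(exon_ratios, intron_ratios, gain=1.3, tolerance=0.15):
    """Flag an exons-only copy-number gain.

    exon_ratios / intron_ratios: depth of coverage relative to a control
    sample, per exon and per intron (1.0 = expected depth).
    gain and tolerance are illustrative thresholds, not published values.
    """
    if not exon_ratios or not intron_ratios:
        return False  # the pattern needs both exonic and intronic coverage
    exons_gained = all(r >= gain for r in exon_ratios)
    introns_flat = all(abs(r - 1.0) <= tolerance for r in intron_ratios)
    return exons_gained and introns_flat


# One extra intronless copy on a diploid background gives roughly 3/2 depth
# over exons while introns stay at the expected depth.
retrocopy_like = looks_like_retrocopy([1.5, 1.45, 1.6], [1.0, 0.95])

# A genomic duplication raises exons and introns together.
duplication_like = looks_like_retrocopy([1.5, 1.5], [1.5])
```

The other signals named in the abstract (indels called at each junction and chimeric read pairs spanning introns) are not modeled here; intron depth is only available when the assay covers introns, as in genome sequencing.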
Affiliation(s)
- Nicolas Chatron
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
  - GENDEV Team, CRNL, INSERM U1028, CNRS UMR5292, UCBL1, Lyon, France
- Kevin Cassinari
  - Department of Genetics and CNR-MAJ, Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, F 76000, Normandy Center for Genomic and Personalized Medicine, Rouen, France
- Olivier Quenez
  - Department of Genetics and CNR-MAJ, Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, F 76000, Normandy Center for Genomic and Personalized Medicine, Rouen, France
- Stéphanie Baert-Desurmont
  - Department of Genetics, Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, F 76000, Normandy Center for Genomic and Personalized Medicine, Rouen, France
- Claire Bardel
  - Bioinformatics group of the Lyon University Hospital NGS facility, Groupement Hospitalier Est, Lyon, France
  - Biostatistics and Bioinformatics Department, HCL, Lyon, France
- Marie-Pierre Buisine
  - Department of Biochemistry and Molecular Biology, JPA Research Center, Inserm UMR-S 1172, Lille University, Lille University Hospital, Lille, France
- Eduardo Calpena
  - Clinical Genetics Group, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Yline Capri
  - Genetics Department, Clinical Genetics Unit, Hôpital Universitaire Robert Debré, Paris, France
- Flavie Diguet
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
  - GENDEV Team, CRNL, INSERM U1028, CNRS UMR5292, UCBL1, Lyon, France
- Patrick Edery
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
  - GENDEV Team, CRNL, INSERM U1028, CNRS UMR5292, UCBL1, Lyon, France
- Audrey Labalme
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
- Cedric Le Caignec
  - Genetics Department, CHU Nantes, Nantes, France
  - INSERM UMR_S915, Institut du thorax, Nantes University, Nantes, France
- Jonathan Lévy
  - Genetics Department, Cytogenetics Unit, Hôpital Universitaire Robert Debré, Paris, France
- François Lecoquierre
  - Department of Genetics, Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, F 76000, Normandy Center for Genomic and Personalized Medicine, Rouen, France
- Pierre Lindenbaum
  - INSERM, UMR_S1087, Institut du thorax, Nantes, France
  - CNRS, UMR 6291, Nantes, France
- Pierre-Antoine Rollat-Farnier
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
  - Bioinformatics group of the Lyon University Hospital NGS facility, Groupement Hospitalier Est, Lyon, France
- Thomas Simonet
  - Cellular Biotechnology Center, Hospices Civils de Lyon, Lyon, France
  - Nerve-Muscle Interactions Team, Institut NeuroMyoGène CNRS UMR 5310-INSERM U1217-Université Claude Bernard Lyon 1, Lyon, France
- Pascale Saugier-Veber
  - Department of Genetics, Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, F 76000, Normandy Center for Genomic and Personalized Medicine, Rouen, France
- Anne-Claude Tabet
  - Genetics Department, Cytogenetics Unit, Hôpital Universitaire Robert Debré, Paris, France
  - Neuroscience Department, Human Genetics and Cognitive Function Unit, Institut Pasteur, Paris, France
- Annick Toutain
  - Genetics Department, Hôpital Bretonneau, CHU, Tours, France
  - UMR 1253, iBrain, Tours University, Inserm, Tours, France
- Andrew O M Wilkie
  - Clinical Genetics Group, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Gaetan Lesca
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
  - GENDEV Team, CRNL, INSERM U1028, CNRS UMR5292, UCBL1, Lyon, France
- Damien Sanlaville
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
  - GENDEV Team, CRNL, INSERM U1028, CNRS UMR5292, UCBL1, Lyon, France
- Gaël Nicolas
  - Department of Genetics and CNR-MAJ, Normandie Univ, UNIROUEN, Inserm U1245 and Rouen University Hospital, F 76000, Normandy Center for Genomic and Personalized Medicine, Rouen, France
- Caroline Schluth-Bolard
  - Genetics Department, Hospices Civils de Lyon, Lyon, France
  - GENDEV Team, CRNL, INSERM U1028, CNRS UMR5292, UCBL1, Lyon, France
|
34
|
Bim LV, Navarro FCP, Valente FOF, Lima-Junior JV, Delcelo R, Dias-da-Silva MR, Maciel RMB, Galante PAF, Cerutti JM. Retroposed copies of RET gene: a somatically acquired event in medullary thyroid carcinoma. BMC Med Genomics 2019; 12:104. [PMID: 31288802 PMCID: PMC6617568 DOI: 10.1186/s12920-019-0552-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/07/2019] [Accepted: 06/17/2019] [Indexed: 12/21/2022]
Abstract
BACKGROUND: Different pathogenic germline mutations in the RET oncogene are identified in MEN 2, a hereditary syndrome characterized by medullary thyroid carcinoma (MTC) and other endocrine tumors. Although genetic predisposition is recognized, not all RET mutation carriers will develop the disease during their lifetime and, likewise, RET mutation carriers belonging to the same family may present clinical heterogeneity. It has been suggested that a single germline mutation might not be sufficient for development of MEN 2-associated tumors and a somatic bi-allelic alteration might be required. Here we investigated the presence of a somatic second-hit mutation in the RET gene in MTC. METHODS: We integrated Multiplex Ligation-dependent Probe Amplification (MLPA) and whole exome sequencing (WES) to search for copy number alteration (CNA) in the RET gene in MTC samples and medullary thyroid cell lines (TT and MZ-CR-1). We next found reads spanning exon-exon boundaries on RET, indicative of a retrocopy. We subsequently searched for RET retrocopies in the human reference genome (GRCh37) and in the 1000 Genomes Project data, by looking for reads reporting joined exons in the RET locus or distinct genomic regions. To determine RET retrocopy specificity and recurrence, DNA isolated from sporadic and MEN 2-associated MTC (n = 37), peripheral blood (n = 3) and papillary thyroid carcinomas with RET fusion (n = 10) samples were tested using PCR-sequencing methodology. RESULTS: Through MLPA we found evidence of CNA in the RET gene in MTC samples and MTC cell lines. WES analysis reinforced the presence of the CNA and hinted at a retroposed copy of RET not found in the human reference genome or the 1000 Genomes Project. Extended analysis confirmed the presence of a somatic MTC-related retrocopy of RET in both sporadic and hereditary tumors. We further unveiled a recurrent (28%) novel point mutation (p.G548V) found exclusively in the retrocopy of RET. The mutation was also found in cDNA of mutated samples, suggesting it might be functional. CONCLUSION: We here report a somatic-specific RET retroposed copy in MTC samples and cell lines. Our results support the idea that generation of retrocopies in somatic cells is likely to contribute to MTC genesis and progression.
Collapse
Affiliation(s)
- Larissa V Bim
- Laboratório As Bases Genéticas dos Tumores da Tiroide, Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - Fábio C P Navarro
- Centro de Oncologia Molecular, Hospital Sírio-libanês, São Paulo, SP, Brazil.,Departamento de Bioquímica, Universidade de São Paulo, São Paulo, SP, Brazil
| | - Flávia O F Valente
- Laboratório de Endocrinologia Molecular e Translacional, Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - José V Lima-Junior
- Laboratório As Bases Genéticas dos Tumores da Tiroide, Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - Rosana Delcelo
- Departamento de Patologia, Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - Magnus R Dias-da-Silva
- Laboratório de Endocrinologia Molecular e Translacional, Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - Rui M B Maciel
- Laboratório de Endocrinologia Molecular e Translacional, Universidade Federal de São Paulo, São Paulo, SP, Brazil
| | - Pedro A F Galante
- Centro de Oncologia Molecular, Hospital Sírio-libanês, São Paulo, SP, Brazil
| | - Janete M Cerutti
- Laboratório As Bases Genéticas dos Tumores da Tiroide, Universidade Federal de São Paulo, São Paulo, SP, Brazil.
35
Tang D, Wu S, Luo K, Yuan H, Gao W, Zhu D, Zhang W, Xu Q. Sequence characterization and expression pattern analysis of six kinds of IL-17 family genes in the Asian swamp eel (Monopterus albus). FISH & SHELLFISH IMMUNOLOGY 2019; 89:257-270. [PMID: 30922887 DOI: 10.1016/j.fsi.2019.03.050] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/18/2018] [Revised: 03/20/2019] [Accepted: 03/22/2019] [Indexed: 06/09/2023]
Abstract
Interleukin-17 (IL-17) is an important cytokine that plays a critical role in the inflammatory response and host defense against extracellular pathogens. In the present study, six novel IL-17 family genes (MaIL-17) were identified by analyzing the Asian swamp eel (Monopterus albus) genome. Sequence analysis revealed that the MaIL-17 family genes shared similar features, comprising a signal peptide, an IL-17 superfamily region, and four conserved cysteines. Phylogenetic analysis showed that the MaIL-17 genes clustered together with their corresponding IL-17 genes from other species. The similarity and identity of all IL-17 family genes indicated that the MaIL-17 genes are conserved among teleosts, while MaIL-17D is more conserved than the other MaIL-17s. Except for MaIL-17A/F3 and MaIL-17D, all MaIL-17s shared the same genomic structure as the genes from other fish, namely three exons and two introns. The MaIL-17s showed conserved synteny among fish, and we found that the MaIL-17D locus has a more conserved syntenic relationship with the loci from other fish and humans. These results demonstrated that MaIL-17D and human IL-17D might have evolved from a common ancestral gene and subsequently diverged. The analysis of swamp eel reference genes revealed that EEF1A1 (encoding eukaryotic translation elongation factor 1 alpha 1) was an ideal reference gene for accurate real-time qRT-PCR normalization in the swamp eel. The MaIL-17 genes are widely distributed throughout tissues, suggesting that MaIL-17s carry out their biological functions in immune and non-immune tissue compartments. The transcripts of the MaIL-17s exhibited different fold changes in head kidney cells in response to Aeromonas veronii, phorbol 12-myristate 13-acetate (PMA) and polyinosinic:polycytidylic acid (poly I:C) challenge, showing that MaIL-17A/F1 has stronger antiviral activities compared with other MaIL-17 family genes, and that MaIL-17A/F3 and MaIL-17A/F2 possess stronger effects against extracellular pathogens compared with the others; however, MaIL-17C2 and MaIL-17D may play vital roles during pathogen infection. The differential immune responses of these genes to Aeromonas veronii, PMA and poly I:C implied distinct mechanisms of host defense against extracellular pathogens.
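Fold changes normalized to a reference gene such as EEF1A1 are commonly computed with the 2^(-ΔΔCt) method. The sketch below illustrates that general calculation only: the abstract does not state which formula the authors used, the Ct values are invented, and amplification efficiency is assumed to be close to 100%.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^(-ddCt) method (assumes ~100% PCR efficiency)."""
    delta_treated = ct_target_treated - ct_ref_treated   # dCt in challenged cells
    delta_control = ct_target_control - ct_ref_control   # dCt in unchallenged cells
    return 2 ** -(delta_treated - delta_control)


# Invented Ct values: the target crosses threshold 3 cycles earlier after
# challenge, while the reference gene is unchanged.
fc = fold_change(ct_target_treated=22.0, ct_ref_treated=18.0,
                 ct_target_control=25.0, ct_ref_control=18.0)
print(fc)  # 8.0
```

A stable reference gene matters because any shift in its Ct between conditions would be carried directly into the reported fold change.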
Affiliation(s)
- Dongdong Tang
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China; Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, China
- Shipei Wu
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China
- Kai Luo
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China
- Hanwen Yuan
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China
- Weihua Gao
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China
- Dashi Zhu
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China
- Wenbing Zhang
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China
- Qiaoqing Xu
  - School of Animal Science, Yangtze University, Jingzhou, 434020, China; Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, China
36
Lauer S, Gresham D. An evolving view of copy number variants. Curr Genet 2019; 65:1287-1295. [PMID: 31076843 DOI: 10.1007/s00294-019-00980-0] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 03/01/2019] [Revised: 04/17/2019] [Accepted: 04/20/2019] [Indexed: 01/08/2023]
Abstract
Copy number variants (CNVs) are regions of the genome that vary in integer copy number. CNVs, which comprise both amplifications and deletions of DNA sequence, have been identified across all domains of life, from bacteria and archaea to plants and animals. CNVs are an important source of genetic diversity, and can drive rapid adaptive evolution and progression of heritable and somatic human diseases, such as cancer. However, despite their evolutionary importance and clinical relevance, CNVs remain understudied compared to single-nucleotide variants (SNVs). This is a consequence of the inherent difficulties in detecting CNVs at low-to-intermediate frequencies in heterogeneous populations of cells. Here, we discuss molecular methods used to detect CNVs, the limitations associated with using these techniques, and the application of new and emerging technologies that present solutions to these challenges. The goal of this short review and perspective is to highlight aspects of CNV biology that are understudied and define avenues for further research that address specific gaps in our knowledge of these complex alleles. We describe our recently developed method for CNV detection in which a fluorescent gene functions as a single-cell CNV reporter and present key findings from our evolution experiments in Saccharomyces cerevisiae. Using a CNV reporter, we found that CNVs are generated at a high rate and undergo selection with predictable dynamics across independently evolving replicate populations. Many CNVs appear to be generated through DNA replication-based processes that are mediated by the presence of short, interrupted, inverted-repeat sequences. Our results have important implications for the role of CNVs in evolutionary processes and the molecular mechanisms that underlie CNV formation. We discuss the possible extension of our method to other applications, including tracking the dynamics of CNVs in models of human tumors.
Affiliation(s)
- Stephanie Lauer
  - Institute for Systems Genetics, New York University Langone Health, New York, NY, USA
- David Gresham
  - Center for Genomics and System Biology, Department of Biology, New York University, New York, NY, USA
37
Abstract
In this perspective, we evaluate the explanatory power of the neutral theory of molecular evolution, 50 years after its introduction by Kimura. We argue that the neutral theory was supported by unreliable theoretical and empirical evidence from the beginning, and that in light of modern, genome-scale data, we can firmly reject its universality. The ubiquity of adaptive variation both within and between species means that a more comprehensive theory of molecular evolution must be sought.
Affiliation(s)
- Andrew D Kern
  - Department of Genetics, Rutgers University, Piscataway, NJ
- Matthew W Hahn
  - Department of Biology and Department of Computer Science, Indiana University Bloomington, IN
38
Matsumura K, Imai H, Go Y, Kusuhara M, Yamaguchi K, Shirai T, Ohshima K. Transcriptional activation of a chimeric retrogene PIPSL in a hominoid ancestor. Gene 2018; 678:318-323. [PMID: 30096459 DOI: 10.1016/j.gene.2018.08.033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/02/2018] [Revised: 08/05/2018] [Accepted: 08/07/2018] [Indexed: 01/09/2023]
Abstract
Retrogenes are a class of functional genes derived from the mRNA of various intron-containing genes. PIPSL was created through a unique mechanism, whereby distinct genes were assembled at the RNA level, and the resulting chimera was then reverse transcribed and integrated into the genome by the L1 retrotransposon. Expression of PIPSL RNA via its transcription start sites (TSSs) has been confirmed in the testes of humans and chimpanzees. Here, we demonstrated that PIPSL RNA is expressed in the testis of the white-handed gibbon. The 5'-end positions of gibbon RNAs were confined to a narrow range upstream of the PIPSL start codon and overlapped with those of orangutan and human, suggesting that PIPSL TSSs are similar among hominoid species. Reporter assays using a luciferase gene and the flanking sequences of human PIPSL showed that an upstream sequence exhibits weak promoter activity in human cells. Our findings suggest that PIPSL might have acquired a promoter at an early stage of hominoid evolution before the divergence of gibbons and ultimately retained similar TSSs in all of the lineages. Moreover, the upstream sequence derived from the phosphatidylinositol-4-phosphate 5-kinase, type I, alpha 5' untranslated region and/or neighboring repetitive sequences in the genome possibly exhibits promoter activity. Furthermore, we observed that a TATA-box-like sequence has emerged by nucleotide substitution in a lineage leading to humans, which is possibly responsible for a broader distribution of the human PIPSL TSSs.
Affiliation(s)
- Kenya Matsumura
  - Graduate School of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan; Shizuoka Cancer Center Research Institute, Sunto, Shizuoka, Japan
- Hiroo Imai
  - Department of Cellular and Molecular Biology, Primate Research Institute, Kyoto University, Inuyama, Aichi, Japan
- Yasuhiro Go
  - Cognitive Genomics Research Group, Exploratory Research Center on Life and Living Systems, National Institutes of Natural Sciences, Okazaki, Aichi, Japan; Department of Physiological Sciences, National Institute for Physiological Sciences, Okazaki, Aichi, Japan; School of Life Science, SOKENDAI (The Graduate University for Advanced Studies), Okazaki, Aichi, Japan
- Ken Yamaguchi
  - Shizuoka Cancer Center Research Institute, Sunto, Shizuoka, Japan
- Tsuyoshi Shirai
  - Graduate School of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan
- Kazuhiko Ohshima
  - Graduate School of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan
39
Algady W, Louzada S, Carpenter D, Brajer P, Färnert A, Rooth I, Ngasala B, Yang F, Shaw MA, Hollox EJ. The Malaria-Protective Human Glycophorin Structural Variant DUP4 Shows Somatic Mosaicism and Association with Hemoglobin Levels. Am J Hum Genet 2018; 103:769-776. [PMID: 30388403 PMCID: PMC6218809 DOI: 10.1016/j.ajhg.2018.10.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 07/27/2018] [Accepted: 10/04/2018] [Indexed: 01/23/2023] [Open Access]
Abstract
Glycophorin A and glycophorin B are red blood cell surface proteins and are both receptors for the parasite Plasmodium falciparum, which is the principal cause of malaria in sub-Saharan Africa. DUP4 is a complex structural genomic variant that carries extra copies of a glycophorin A-glycophorin B fusion gene and has a dramatic effect on malaria risk by reducing the risk of severe malaria by up to 40%. Using fiber-FISH and Illumina sequencing, we validate the structural arrangement of the glycophorin locus in the DUP4 variant and reveal somatic variation in copy number of the glycophorin B-glycophorin A fusion gene. By developing a simple, specific, PCR-based assay for DUP4, we show that the DUP4 variant reaches a frequency of 13% in the population of a malaria-endemic village in south-eastern Tanzania. We genotype a substantial proportion of that village and demonstrate an association of DUP4 genotype with hemoglobin levels, a phenotype related to malaria, using a family-based association test. Taken together, we show that DUP4 is a complex structural variant that may be susceptible to somatic variation and show that DUP4 is associated with a malaria-related phenotype in a longitudinally followed population.
Affiliation(s)
- Walid Algady
  - Department of Genetics and Genome Biology, University of Leicester, Leicester LE1 7RH, UK
- Sandra Louzada
  - Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Danielle Carpenter
  - Department of Genetics and Genome Biology, University of Leicester, Leicester LE1 7RH, UK
- Paulina Brajer
  - Department of Genetics and Genome Biology, University of Leicester, Leicester LE1 7RH, UK
- Anna Färnert
  - Division of Infectious Diseases, Department of Medicine Solna, Karolinska Institutet, 17176 Stockholm, Sweden; Department of Infectious Diseases, Karolinska University Hospital, Stockholm 17176, Sweden
- Ingegerd Rooth
  - Nyamisati Malaria Research, Rufiji, National Institute for Medical Research, Dar-es-Salaam, Tanzania
- Billy Ngasala
  - Department of Parasitology and Medical Entomology, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania; Department of Women's and Children's Health, International Maternal and Child Health (IMCH), Uppsala Universitet, 75185 Uppsala, Sweden
- Fengtang Yang
  - Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Marie-Anne Shaw
  - Leeds Institute of Medical Research at St James's, University of Leeds, Leeds LS9 7TF, UK
- Edward J Hollox
  - Department of Genetics and Genome Biology, University of Leicester, Leicester LE1 7RH, UK
40
Lilue J, Doran AG, Fiddes IT, Abrudan M, Armstrong J, Bennett R, Chow W, Collins J, Collins S, Czechanski A, Danecek P, Diekhans M, Dolle DD, Dunn M, Durbin R, Earl D, Ferguson-Smith A, Flicek P, Flint J, Frankish A, Fu B, Gerstein M, Gilbert J, Goodstadt L, Harrow J, Howe K, Ibarra-Soria X, Kolmogorov M, Lelliott C, Logan DW, Loveland J, Mathews CE, Mott R, Muir P, Nachtweide S, Navarro FC, Odom DT, Park N, Pelan S, Pham SK, Quail M, Reinholdt L, Romoth L, Shirley L, Sisu C, Sjoberg-Herrera M, Stanke M, Steward C, Thomas M, Threadgold G, Thybert D, Torrance J, Wong K, Wood J, Yalcin B, Yang F, Adams DJ, Paten B, Keane TM. Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci. Nat Genet 2018; 50:1574-1583. [PMID: 30275530 PMCID: PMC6205630 DOI: 10.1038/s41588-018-0223-8] [Citation(s) in RCA: 159] [Impact Index Per Article: 22.7] [Received: 02/19/2018] [Accepted: 08/02/2018] [Indexed: 12/11/2022]
Abstract
We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.
MESH Headings
- Animals
- Animals, Laboratory
- Chromosome Mapping/veterinary
- Genetic Loci
- Genome
- Haplotypes/genetics
- Mice
- Mice, Inbred BALB C/genetics
- Mice, Inbred C3H/genetics
- Mice, Inbred C57BL/genetics
- Mice, Inbred CBA/genetics
- Mice, Inbred DBA/genetics
- Mice, Inbred NOD/genetics
- Mice, Inbred Strains/classification
- Mice, Inbred Strains/genetics
- Molecular Sequence Annotation
- Phylogeny
- Polymorphism, Single Nucleotide
- Species Specificity
Affiliation(s)
- Jingtao Lilue
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Anthony G. Doran
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Ian T. Fiddes
  - Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Monica Abrudan
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Joel Armstrong
  - Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Ruth Bennett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- William Chow
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Joanna Collins
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Stephan Collins
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire, Centre National de la Recherche Scientifique UMR7104, Institut National de la Santé et de la Recherche Médicale U964, Université de Strasbourg, 67404 Illkirch, France
  - Centre des Sciences du Goût et de l’Alimentation, University of Bourgogne Franche-Comté, 21000 Dijon, France
- Anne Czechanski
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Petr Danecek
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Mark Diekhans
  - Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Dirk-Dominik Dolle
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Matt Dunn
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Richard Durbin
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge CB2 3EH, UK
- Dent Earl
  - Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Anne Ferguson-Smith
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge CB2 3EH, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Jonathan Flint
  - Brain Research Institute, University of California, 695 Charles E Young Dr S, Los Angeles, CA 90095, USA
- Adam Frankish
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Beiyuan Fu
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Mark Gerstein
  - Yale Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- James Gilbert
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Leo Goodstadt
  - OxFORD Asset Management, OxAM House, 6 George Street, Oxford OX1 2BW
- Jennifer Harrow
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Kerstin Howe
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Mikhail Kolmogorov
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA
- Chris Lelliott
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Darren W. Logan
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Jane Loveland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Clayton E. Mathews
  - Department of Pathology, Immunology, and Laboratory Medicine, University of Florida, Gainesville, FL, USA
- Richard Mott
  - Genetics Institute, University College London, Gower Street, London WC1E 6BT, UK
- Paul Muir
  - Yale Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Stefanie Nachtweide
  - Institute of Mathematics and Computer Science, University of Greifswald, Domstraße 11, 17489 Greifswald, Germany
- Fabio C.P. Navarro
  - Yale Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Duncan T. Odom
  - Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, UK
  - German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics, 69120 Heidelberg, Germany
- Naomi Park
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Sarah Pelan
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Son K Pham
  - BioTuring Inc., San Diego, California, CA92121
- Mike Quail
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Laura Reinholdt
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Lars Romoth
  - Institute of Mathematics and Computer Science, University of Greifswald, Domstraße 11, 17489 Greifswald, Germany
- Lesley Shirley
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Cristina Sisu
  - Yale Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Bioscience, Brunel University London, Uxbridge UB8 3PH, UK
- Marcela Sjoberg-Herrera
  - Departamento de Biología Celular y Molecular, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago 8331150, Chile
- Mario Stanke
  - Institute of Mathematics and Computer Science, University of Greifswald, Domstraße 11, 17489 Greifswald, Germany
- Charles Steward
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Mark Thomas
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Glen Threadgold
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- David Thybert
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- James Torrance
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Kim Wong
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Jonathan Wood
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Binnaz Yalcin
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire, Centre National de la Recherche Scientifique UMR7104, Institut National de la Santé et de la Recherche Médicale U964, Université de Strasbourg, 67404 Illkirch, France
- Fengtang Yang
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- David J. Adams
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Benedict Paten
  - Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Thomas M. Keane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
  - School of Life Sciences, University of Nottingham, Nottingham, UK
41
Cerbin S, Jiang N. Duplication of host genes by transposable elements. Curr Opin Genet Dev 2018; 49:63-69. [PMID: 29571044 DOI: 10.1016/j.gde.2018.03.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 11/17/2017] [Revised: 02/07/2018] [Accepted: 03/08/2018] [Indexed: 12/12/2022]
Abstract
The availability of large amounts of genomic and transcriptome sequences has allowed systematic surveys of the host gene sequences that have been duplicated by transposable elements. It is now clear that all super-families of transposons are capable of duplicating genes or gene fragments, and such incidents have been detected in a wide spectrum of organisms. Emerging evidence suggests that a considerable portion of them function as coding or non-coding sequences, driving innovations at molecular and phenotypic levels. Interestingly, the duplication events not only have to occur in the reproductive tissues to become heritable, but the duplicated copies are also preferentially expressed in those tissues. As a result, reproductive tissues may serve as the 'incubator' for genes generated by transposable elements.
Affiliation(s)
- Stefan Cerbin
  - Department of Horticulture, 1066 Bogue Street, Michigan State University, East Lansing, MI 48824, USA
- Ning Jiang
  - Department of Horticulture, 1066 Bogue Street, Michigan State University, East Lansing, MI 48824, USA
42
da Silva VH, Laine VN, Bosse M, van Oers K, Dibbits B, Visser ME, Crooijmans RPMA, Groenen MAM. CNVs are associated with genomic architecture in a songbird. BMC Genomics 2018; 19:195. [PMID: 29703149 PMCID: PMC6389189 DOI: 10.1186/s12864-018-4577-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/20/2017] [Accepted: 03/02/2018] [Indexed: 12/11/2022] [Open Access]
Abstract
Background: Understanding variation in genome structure is essential to understand phenotypic differences within populations and the evolutionary history of species. A promising form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. These structural variants can therefore be more abundant at particular genes, ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs can therefore reveal evolutionary footprints of selection and provide insight into their contribution to phenotypic variation in wild populations. Results: Here we use genotypic data from a long-term population of great tits (Parus major), a widely studied passerine bird in ecology and evolution, to detect CNVs and identify genomic features prevailing within these regions. We used allele intensities and frequencies from high-density SNP array data from 2,175 birds. We detected 41,029 CNVs concatenated into 8,008 distinct CNV regions (CNVRs). We successfully validated 93.75% of the CNVs tested by qPCR, which were sampled at different frequencies and sizes. A mother-daughter family structure allowed for the evaluation of the inheritance of a number of these CNVs. Only CNVs with 40 probes or more displayed segregation in accordance with Mendelian inheritance, suggesting a high rate of false negative calls for smaller CNVs. As CNVRs are a coarse-grained map of CNV loci, we also inferred the frequency of coincident CNV start and end breakpoints. We observed frequency-dependent enrichment of these breakpoints at homologous regions, CpG sites and AT-rich intervals. A gene ontology enrichment analysis showed that CNVs are enriched in genes underpinning neural, cardiac and ion transport pathways.
Conclusion: Great tit CNVs are present in almost half of the genes and prominent at repetitive-homologous and regulatory regions. Although the CNVs overlap genes under selection, the high number of false negatives makes neutrality or association tests on the CNVs detected here difficult. Therefore, CNVs should be further addressed in the light of their false negative rate and architecture to improve the comprehension of their association with phenotypes and evolutionary history.
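The step in which individual CNV calls are concatenated into CNV regions (CNVRs) is commonly implemented as an interval merge. The sketch below shows that general idea under stated assumptions: the function `merge_cnvs`, the closed-interval convention, and the toy coordinates are illustrative, and the authors' exact overlap criterion is not given in the abstract.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Call = Tuple[str, int, int]  # (chromosome, start, end), closed interval


def merge_cnvs(calls: List[Call]) -> List[Call]:
    """Merge overlapping or touching CNV calls on each chromosome into CNVRs."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in calls:
        by_chrom[chrom].append((start, end))

    regions: List[Call] = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_end is None or start > current_end:
                # No overlap with the open region: close it and start a new one.
                if current_end is not None:
                    regions.append((chrom, current_start, current_end))
                current_start, current_end = start, end
            else:
                # Overlap: extend the open region.
                current_end = max(current_end, end)
        regions.append((chrom, current_start, current_end))
    return regions


# Toy calls pooled across individuals (coordinates are invented).
calls = [
    ("chr1", 100, 500), ("chr1", 400, 900), ("chr1", 2000, 2500),
    ("chr2", 50, 80), ("chr1", 850, 1200),
]
cnvrs = merge_cnvs(calls)
print(cnvrs)
# [('chr1', 100, 1200), ('chr1', 2000, 2500), ('chr2', 50, 80)]
```

Because chained overlaps grow a region without regard to how many individuals share each breakpoint, CNVRs are a coarse map, which is consistent with the abstract's separate analysis of coincident start and end breakpoints.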
Affiliation(s)
- Vinicius H da Silva
  - Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands; Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Veronika N Laine
  - Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands; Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands; Swedish University of Agricultural Sciences (SLU), Ulls väg 26, Uppsala, 750 07, Sweden
- Mirte Bosse
  - Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Kees van Oers
  - Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands
- Bert Dibbits
  - Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Marcel E Visser
  - Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands
- Richard P M A Crooijmans
  - Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands; Netherlands Institute of Ecology (NIOO-KNAW), Droevendaalsesteeg 10, Wageningen, 6708PB, The Netherlands
- Martien A M Groenen
  - Animal Breeding and Genomics Centre, Wageningen University & Research, Droevendaalsesteeg 1, Wageningen, 6708PB, The Netherlands
43
Spliced integrated retrotransposed element (SpIRE) formation in the human genome. PLoS Biol 2018; 16:e2003067. [PMID: 29505568 PMCID: PMC5860796 DOI: 10.1371/journal.pbio.2003067] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/18/2017] [Revised: 03/20/2018] [Accepted: 02/14/2018] [Indexed: 12/20/2022] [Open Access]
Abstract
Human long interspersed element-1 (L1) retrotransposons contain an internal RNA polymerase II promoter within their 5′ untranslated region (UTR) and encode two proteins (ORF1p and ORF2p) required for their mobilization (i.e., retrotransposition). The evolutionary success of L1 relies on the continuous retrotransposition of full-length L1 mRNAs. Previous studies identified functional splice donor (SD), splice acceptor (SA), and polyadenylation sequences in L1 mRNA and provided evidence that a small number of spliced L1 mRNAs retrotransposed in the human genome. Here, we demonstrate that the retrotransposition of intra-5′UTR or 5′UTR/ORF1 spliced L1 mRNAs leads to the generation of spliced integrated retrotransposed elements (SpIREs). We identified a new intra-5′UTR SpIRE that is ten times more abundant than previously identified SpIREs. Functional analyses demonstrated that both intra-5′UTR and 5′UTR/ORF1 SpIREs lack cis-acting transcription factor binding sites and exhibit reduced promoter activity. The 5′UTR/ORF1 SpIREs also produce nonfunctional ORF1p variants. Finally, we demonstrate that sequence changes within the L1 5′UTR over evolutionary time, which permitted L1 to evade the repressive effects of a host protein, can lead to the generation of new L1 splicing events, which, upon retrotransposition, generate a new SpIRE subfamily. We conclude that splicing inhibits L1 retrotransposition, that SpIREs generally represent evolutionary “dead-ends” in the L1 retrotransposition process, that mutations within the L1 5′UTR alter L1 splicing dynamics, and that retrotransposition of the resultant spliced transcripts can generate interindividual genomic variation.
Long interspersed element-1 (L1) sequences comprise about 17% of the human genome reference sequence. The average human genome contains about 100 active L1s that mobilize throughout the genome by a “copy and paste” process termed retrotransposition. Active L1s encode two proteins (ORF1p and ORF2p). ORF1p and ORF2p preferentially bind to their encoding RNA, forming a ribonucleoprotein particle (RNP). During retrotransposition, the L1 RNP translocates to the nucleus, where the ORF2p endonuclease makes a single-strand nick in target site DNA that exposes a 3′ hydroxyl group in genomic DNA. The 3′ hydroxyl group is then used as a primer by the ORF2p reverse transcriptase to copy the L1 RNA into cDNA, leading to the integration of an L1 copy at a new genomic location. The evolutionary success of L1 requires the faithful retrotransposition of full-length L1 mRNAs; thus, it was surprising to find that a small number of L1 retrotransposition events are derived from spliced L1 mRNAs. By using genetic, biochemical, and computational approaches, we demonstrate that spliced L1 mRNAs can undergo an initial round of retrotransposition, leading to the generation of spliced integrated retrotransposed elements (SpIREs). SpIREs represent about 2% of previously annotated full-length primate-specific L1s in the human genome reference sequence. However, because splicing leads to intra-L1 deletions that remove critical sequences required for L1 expression, SpIREs generally cannot undergo subsequent rounds of retrotransposition and can be considered “dead on arrival” insertions. Our data further highlight how genetic conflict between L1 and its host has influenced L1 expression, L1 retrotransposition, and L1 splicing dynamics over evolutionary time.
|
44
|
Yu M, Chen J, Bao Y, Li J. Genomic analysis of NF-κB signaling pathway reveals its complexity in Crassostrea gigas. FISH & SHELLFISH IMMUNOLOGY 2018; 72:510-518. [PMID: 29162540 DOI: 10.1016/j.fsi.2017.11.034] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 08/15/2017] [Revised: 11/09/2017] [Accepted: 11/17/2017] [Indexed: 06/07/2023]
Abstract
The NF-κB signaling pathway is an evolutionarily conserved pathway that plays highly important roles in several developmental, cellular and immune response processes. With the recent release of the draft Pacific oyster (Crassostrea gigas) genome sequence, we have sought to identify the various components of the NF-κB signaling pathway in these mollusks and investigate their gene structure. We further constructed phylogenetic trees to establish the evolutionary relationship of the oyster proteins with their homologues in vertebrates and invertebrates using BLASTX and the neighbor-joining method. We report the presence of two classic NF-κB/Rel homologues in the Pacific oyster, namely Cgp100 and CgRel, which possess a characteristic RHD domain and a consensus nuclear localization signal, similar to mammalian homologues, and an additional CgRel-like protein, unique to C. gigas. Further, in addition to two classical IκB homologues, CgIκB1 and CgIκB2, we have identified three atypical IκB family members, namely CgIκB3, CgIκB4 and CgBCL3, which lack the IκB degradation motif and consist of only one exon that might have arisen by retrotransposition from CgIκB1. Finally, we report the presence of three IKK genes and one NEMO gene in the oyster genome, named CgIKK1, CgIKK2, CgIKK3 and CgNEMO, respectively. While the domain structures of CgIKK1 and CgIKK3 are similar to those of their mammalian homologues, CgIKK2 was found to lack the HLH and NBD domains. Overall, the high conservation of the NF-κB/Rel, IκB and IKK family components in the Pacific oyster and their structural similarity to the vertebrate and invertebrate homologues underline the functional importance of this pathway in regulation of critical cellular processes across species.
Affiliation(s)
- Mingjia Yu
- State Key Laboratory Breeding Base of Marine Genetic Resources, Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, State Oceanic Administration, Xiamen 361005, China
- Jianming Chen
- State Key Laboratory Breeding Base of Marine Genetic Resources, Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, State Oceanic Administration, Xiamen 361005, China
- Yongbo Bao
- Zhejiang Key Laboratory of Aquatic Germplasm Resources, College of Biological & Environmental Sciences, Zhejiang Wanli University, Ningbo, China
- Jun Li
- Key Laboratory of Tropical Marine Bio-Resources and Ecology, Guangdong Provincial Key Laboratory of Applied Marine Biology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, China
|
45
|
Watson CM, Camm N, Crinnion LA, Antanaviciute A, Adlard J, Markham AF, Carr IM, Charlton R, Bonthron DT. Characterization and Genomic Localization of a SMAD4 Processed Pseudogene. J Mol Diagn 2017; 19:933-940. [PMID: 28867604 DOI: 10.1016/j.jmoldx.2017.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/13/2017] [Accepted: 08/16/2017] [Indexed: 12/30/2022] Open
Abstract
Like many clinical diagnostic laboratories, the Yorkshire Regional Genetics Service undertakes routine investigation of cancer-predisposed individuals by high-throughput sequencing of patient DNA that has been target-enriched for genes associated with hereditary cancer. Accurate diagnosis using such reagents requires alertness regarding rare nonpathogenic variants that may interfere with variant calling. In a cohort of 2042 such cases, we identified 5 that initially appeared to be carriers of a 95-bp deletion of SMAD4 intron 6. More detailed analysis indicated that these individuals all carried one copy of a SMAD4 processed gene. Because of its interference with diagnostic analysis, we characterized this processed gene in detail. Whole-genome sequencing and confirmatory Sanger sequencing of junction PCR products were used to show that in each of the 5 cases, the SMAD4 processed gene was integrated at the same position on chromosome 9, located within the last intron of the SCAI gene. This rare polymorphic processed gene therefore reflects the occurrence of a single ancestral retrotransposition event. Compared to the reference SMAD4 mRNA sequence NM_005359.5 (https://www.ncbi.nlm.nih.gov/nucleotide), the 5' and 3' untranslated regions of the processed gene are both truncated, but its open reading frame is unaltered. Our experience leads us to advocate the use of an RNA-seq aligner as part of diagnostic assay quality assurance, since this allows recognition of processed pseudogenes in a comparatively facile automated fashion.
Affiliation(s)
- Christopher M Watson
- Yorkshire Regional Genetics Service, St. James's University Hospital, Leeds, United Kingdom; MRC Medical Bioinformatics Centre, Leeds Institute for Data Analytics, St. James's University Hospital, Leeds, United Kingdom; MRC Single Cell Functional Genomics Centre, University of Leeds, St. James's University Hospital, Leeds, United Kingdom
- Nick Camm
- Yorkshire Regional Genetics Service, St. James's University Hospital, Leeds, United Kingdom
- Laura A Crinnion
- Yorkshire Regional Genetics Service, St. James's University Hospital, Leeds, United Kingdom; MRC Medical Bioinformatics Centre, Leeds Institute for Data Analytics, St. James's University Hospital, Leeds, United Kingdom; MRC Single Cell Functional Genomics Centre, University of Leeds, St. James's University Hospital, Leeds, United Kingdom
- Agne Antanaviciute
- MRC Medical Bioinformatics Centre, Leeds Institute for Data Analytics, St. James's University Hospital, Leeds, United Kingdom
- Julian Adlard
- Yorkshire Regional Genetics Service, St. James's University Hospital, Leeds, United Kingdom
- Alexander F Markham
- MRC Medical Bioinformatics Centre, Leeds Institute for Data Analytics, St. James's University Hospital, Leeds, United Kingdom
- Ian M Carr
- MRC Medical Bioinformatics Centre, Leeds Institute for Data Analytics, St. James's University Hospital, Leeds, United Kingdom; MRC Single Cell Functional Genomics Centre, University of Leeds, St. James's University Hospital, Leeds, United Kingdom
- Ruth Charlton
- Yorkshire Regional Genetics Service, St. James's University Hospital, Leeds, United Kingdom
- David T Bonthron
- Yorkshire Regional Genetics Service, St. James's University Hospital, Leeds, United Kingdom; MRC Medical Bioinformatics Centre, Leeds Institute for Data Analytics, St. James's University Hospital, Leeds, United Kingdom; MRC Single Cell Functional Genomics Centre, University of Leeds, St. James's University Hospital, Leeds, United Kingdom
|
46
|
Zhang Y, Li S, Abyzov A, Gerstein MB. Landscape and variation of novel retroduplications in 26 human populations. PLoS Comput Biol 2017; 13:e1005567. [PMID: 28662076 PMCID: PMC5510864 DOI: 10.1371/journal.pcbi.1005567] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 10/26/2016] [Revised: 07/14/2017] [Accepted: 05/12/2017] [Indexed: 01/10/2023] Open
Abstract
Retroduplications come from reverse transcription of mRNAs and their insertion back into the genome. Here, we performed comprehensive discovery and analysis of retroduplications in a large cohort of 2,535 individuals from 26 human populations, as part of 1000 Genomes Phase 3. We developed an integrated approach to discover novel retroduplications combining high-coverage exome and low-coverage whole-genome sequencing data, utilizing information from both exon-exon junctions and discordant paired-end reads. We found 503 parent genes having novel retroduplications absent from the reference genome. Based solely on retroduplication variation, we built phylogenetic trees of human populations; these represent superpopulation structure well and indicate that variable retroduplications are effective population markers. We further identified 43 retroduplication parent genes differentiating superpopulations. This group contains several interesting insertion events, including a SLMO2 retroduplication and insertion into CAV3, which has a potential disease association. We also found retroduplications to be associated with a variety of genomic features: (1) Insertion sites were correlated with regular nucleosome positioning. (2) They, predictably, tend to avoid conserved functional regions, such as exons, but, somewhat surprisingly, also avoid introns. (3) Retroduplications tend to be co-inserted with young L1 elements, indicating recent retrotranspositional activity, and (4) they have a weak tendency to originate from highly expressed parent genes. Our investigation provides insight into the functional impact and association with genomic elements of retroduplications. We anticipate our approach and analytical methodology to have application in a more clinical context, where exome sequencing data is abundant and the discovery of retroduplications can potentially improve the accuracy of SNP calling.
Affiliation(s)
- Yan Zhang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Shantao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Alexej Abyzov
- Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, United States of America
- Mark B. Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Department of Computer Science, Yale University, New Haven, Connecticut, United States of America
|
47
|
Casola C, Betrán E. The Genomic Impact of Gene Retrocopies: What Have We Learned from Comparative Genomics, Population Genomics, and Transcriptomic Analyses? Genome Biol Evol 2017; 9:1351-1373. [PMID: 28605529 PMCID: PMC5470649 DOI: 10.1093/gbe/evx081] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Accepted: 05/18/2017] [Indexed: 02/07/2023] Open
Abstract
Gene duplication is a major driver of organismal evolution. Gene retroposition is a mechanism of gene duplication whereby a gene's transcript is used as a template to generate retroposed gene copies, or retrocopies. Intriguingly, the formation of retrocopies depends upon the enzymatic machinery encoded by retrotransposable elements, genomic parasites occurring in the majority of eukaryotes. Most retrocopies are depleted of the regulatory regions found upstream of their parental genes; therefore, they were initially considered transcriptionally incompetent gene copies, or retropseudogenes. However, examples of functional retrocopies, or retrogenes, have accumulated since the 1980s. Here, we review what we have learned about retrocopies in animals, plants and other eukaryotic organisms, with a particular emphasis on comparative and population genomic analyses complemented with transcriptomic datasets. In addition, these data have provided information about the dynamics of the different "life cycle" stages of retrocopies (i.e., polymorphic retrocopy number variants, fixed retropseudogenes and retrogenes) and have provided key insights into the retroduplication mechanisms, the patterns and evolutionary forces at work during the fixation process and the biological function of retrogenes. Functional genomic and transcriptomic data have also revealed that many retropseudogenes are transcriptionally active and a biological role has been experimentally determined for many. Finally, we have learned that not only non-long terminal repeat retroelements but also long terminal repeat retroelements play a role in the emergence of retrocopies across eukaryotes. This body of work has shown that mRNA-mediated duplication represents a widespread phenomenon that produces an array of new genes that contribute to organismal diversity and adaptation.
Affiliation(s)
- Claudio Casola
- Department of Ecosystem Science and Management, Texas A&M University, TX
- Esther Betrán
- Department of Biology, University of Texas at Arlington, Arlington, TX
|
48
|
Woodward EL, Biloglav A, Ravi N, Yang M, Ekblad L, Wennerberg J, Paulsson K. Genomic complexity and targeted genes in anaplastic thyroid cancer cell lines. Endocr Relat Cancer 2017; 24:209-220. [PMID: 28235956 DOI: 10.1530/erc-16-0522] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/23/2017] [Accepted: 02/24/2017] [Indexed: 12/14/2022]
Abstract
Anaplastic thyroid cancer (ATC) is a highly malignant disease with a very short median survival time. Few studies have addressed the underlying somatic mutations, and the genomic landscape of ATC thus remains largely unknown. In the present study, we have ascertained copy number aberrations, gene fusions, gene expression patterns, and mutations in early-passage cells from ten newly established ATC cell lines using single nucleotide polymorphism (SNP) array analysis, RNA sequencing and whole exome sequencing. The ATC cell line genomes were highly complex and displayed signs of replicative stress and genomic instability, including massive aneuploidy and frequent breakpoints in the centromeric regions and in fragile sites. Loss of heterozygosity involving whole chromosomes was common, but there were no signs of previous near-haploidisation events or chromothripsis. A total of 21 fusion genes were detected, including six predicted in-frame fusions; none were recurrent. Global gene expression analysis showed 661 genes to be differentially expressed between ATC and papillary thyroid cancer cell lines, with pathway enrichment analyses showing downregulation of TP53 signalling as well as cell adhesion molecules in ATC. Besides previously known driver events, such as mutations in BRAF, NRAS, TP53 and the TERT promoter, we identified PTPRD and NEGR1 as putative novel target genes in ATC, based on deletions in six and four cell lines, respectively; the latter gene also carried a somatic mutation in one cell line. Taken together, our data provide novel insights into the tumourigenesis of ATC and may be used to identify new therapeutic targets.
Affiliation(s)
- Eleanor L Woodward
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Andrea Biloglav
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Naveen Ravi
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Minjun Yang
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Lars Ekblad
- Division of Oncology and Pathology, Clinical Sciences, Lund University and Skåne University Hospital, Lund, Sweden
- Johan Wennerberg
- Division of Otorhinolaryngology/Head and Neck Surgery, Clinical Sciences, Lund University and Skåne University Hospital, Lund, Sweden
- Kajsa Paulsson
- Division of Clinical Genetics, Department of Laboratory Medicine, Lund University, Lund, Sweden
|
49
|
Protein-Coding Genes' Retrocopies and Their Functions. Viruses 2017; 9:v9040080. [PMID: 28406439 PMCID: PMC5408686 DOI: 10.3390/v9040080] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 02/25/2017] [Revised: 04/07/2017] [Accepted: 04/11/2017] [Indexed: 12/11/2022] Open
Abstract
Transposable elements, often considered unimportant for survival, significantly contribute to the evolution of transcriptomes, promoters, and proteomes. Reverse transcriptase, encoded by some transposable elements, can be used in trans to produce a DNA copy of any RNA molecule in the cell. The retrotransposition of protein-coding genes requires the presence of reverse transcriptase, which could be delivered by either non-long terminal repeat (non-LTR) or LTR transposons. The majority of these copies are in a state of “relaxed” selection and remain “dormant” because they lack regulatory regions; however, many become functional. In the course of evolution, they may undergo subfunctionalization or neofunctionalization, or replace their progenitors. Functional retrocopies (retrogenes) can encode proteins, novel or similar to those encoded by their progenitors, can be used as alternative exons or create chimeric transcripts, and can also be involved in transcriptional interference and participate in the epigenetic regulation of parental gene expression. They can also act in trans as natural antisense transcripts, microRNA (miRNA) sponges, or a source of various small RNAs. Moreover, many retrocopies of protein-coding genes are linked to human diseases, especially various types of cancer.
|
50
|
França GS, Hinske LC, Galante PAF, Vibranovski MD. Unveiling the Impact of the Genomic Architecture on the Evolution of Vertebrate microRNAs. Front Genet 2017; 8:34. [PMID: 28377786 PMCID: PMC5359303 DOI: 10.3389/fgene.2017.00034] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/09/2016] [Accepted: 03/09/2017] [Indexed: 12/12/2022] Open
Abstract
Eukaryotic genomes frequently exhibit interdependency between transcriptional units, as evidenced by regions of high gene density. It is well recognized that vertebrate microRNAs (miRNAs) are usually embedded in those regions. Recent work has shown that the genomic context is of utmost importance in determining miRNA expression in time and space, thus affecting their evolutionary fates over long and short terms. Consequently, understanding the inter- and intraspecific changes in miRNA genomic architecture may bring novel insights into the basic cellular processes regulated by miRNAs, as well as phenotypic evolution and disease-related mechanisms.
Affiliation(s)
- Gustavo S França
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, Brazil
- Ludwig C Hinske
- Department of Anesthesiology, Clinic of the University of Munich, Ludwig Maximilian University of Munich, Munich, Germany
- Pedro A F Galante
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil
- Maria D Vibranovski
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, Brazil
|