1
Stavropoulou A, Tassios E, Kalyva M, Georgoulopoulos M, Vakirlis N, Iliopoulos I, Nikolaou C. Distinct chromosomal “niches” in the genome of Saccharomyces cerevisiae provide the background for genomic innovation and shape the fate of gene duplicates. NAR Genom Bioinform 2022; 4:lqac086. [PMID: 36381424 PMCID: PMC9661399 DOI: 10.1093/nargab/lqac086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2022] [Revised: 10/20/2022] [Accepted: 10/25/2022] [Indexed: 11/15/2022] Open
Abstract
Nearly one third of Saccharomyces cerevisiae protein coding sequences correspond to duplicate genes, equally split between small-scale duplicates (SSD) and whole-genome duplicates (WGD). While duplicate genes have distinct properties compared to singletons, to date, there has been no systematic analysis of their positional preferences. In this work, we show that SSD and WGD genes are organized in distinct gene clusters that occupy different genomic regions, with SSD being more peripheral and WGD more centrally positioned close to centromeric chromatin. Duplicate gene clusters differ from the rest of the genome in terms of gene size and spacing, gene expression variability and regulatory complexity, properties that are also shared by singleton genes residing within them. Singletons within duplicate gene clusters have longer promoters, more complex structure and a higher number of protein–protein interactions. Particular chromatin architectures appear to be important for gene evolution, as we find SSD gene-pair co-expression to be strongly associated with the similarity of nucleosome positioning patterns. We propose that specific regions of the yeast genome provide a favourable environment for the generation and maintenance of small-scale gene duplicates, segregating them from WGD-enriched genomic domains. Our findings provide a valuable framework linking genomic innovation with positional genomic preferences.
Affiliation(s)
- Athanasia Stavropoulou
  - Medical School, University of Crete, Heraklion 70013, Greece
  - Computational Genomics Group, Biomedical Sciences Research Center “Alexander Fleming”, Athens 16672, Greece
- Emilios Tassios
  - Medical School, University of Crete, Heraklion 70013, Greece
  - Computational Genomics Group, Biomedical Sciences Research Center “Alexander Fleming”, Athens 16672, Greece
- Maria Kalyva
  - European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Nikolaos Vakirlis
  - Computational Genomics Group, Biomedical Sciences Research Center “Alexander Fleming”, Athens 16672, Greece
- Christoforos Nikolaou
  - Computational Genomics Group, Biomedical Sciences Research Center “Alexander Fleming”, Athens 16672, Greece
  - Hellenic Open University, Patras 26335, Greece
2
Kołomański M, Szyda J, Frąszczak M, Mielczarek M. DNA sequence features underlying large-scale duplications and deletions in human. J Appl Genet 2022; 63:527-533. [PMID: 35590085 PMCID: PMC9365719 DOI: 10.1007/s13353-022-00704-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Revised: 03/22/2022] [Accepted: 05/05/2022] [Indexed: 11/25/2022]
Abstract
Copy number variants (CNVs) may cover up to 12% of the whole genome and have a substantial impact on phenotypes. We used 5867 duplications and 33,181 deletions available from the 1000 Genomes Project to characterise genomic regions vulnerable to CNV formation and to identify sequence features characteristic of those regions. The GC content of deletions was lower, and that of duplications higher, than in randomly selected regions. In regions flanking deletions and downstream of duplications, GC content was higher than in the random sequences, but upstream of duplications it was lower. In duplications and downstream of deletions, the percentage of low-complexity sequences did not differ from the randomised data. In deletions and upstream of CNVs, it was higher, while downstream of duplications it was lower than in random sequences. The majority of CNVs intersected with genic regions, mainly with introns. GC content may be associated with CNV formation, and CNVs, especially duplications, are initiated in low-complexity regions. Moreover, CNVs located in or overlapping introns indicate their role in shaping intron variability. Genic CNV regions were enriched in many essential biological processes such as cell adhesion, synaptic transmission, transport, cytoskeleton organization, immune response and metabolic mechanisms, which indicates that these large-scale variants play important biological roles.
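The GC-content comparison described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `gc_content`, the choice to ignore ambiguous bases such as N, and both example sequences are assumptions made purely for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no unambiguous bases to measure
    return (seq.count("G") + seq.count("C")) / counted


# Hypothetical sequences, not taken from the 1000 Genomes data.
cnv_region = "GGCCGCATGC"  # stands in for a duplicated segment
background = "ATTATGCAAT"  # stands in for a randomly selected segment

print(gc_content(cnv_region), gc_content(background))
```

In a real analysis the same function would be applied to each CNV and to many randomly sampled regions of matching length, and the two distributions compared statistically.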
Affiliation(s)
- Mateusz Kołomański
  - Biostatistics Group, Department of Genetics, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
- Joanna Szyda
  - Biostatistics Group, Department of Genetics, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
- Magdalena Frąszczak
  - Biostatistics Group, Department of Genetics, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
- Magda Mielczarek
  - Biostatistics Group, Department of Genetics, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland
3
Gu C, Gao H, Li K, Dai X, Yang Z, Li R, Wen C, He Y. Copy Number Variation Analysis of Euploid Pregnancy Loss. Front Genet 2022; 13:766492. [PMID: 35401693 PMCID: PMC8984164 DOI: 10.3389/fgene.2022.766492] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/29/2021] [Accepted: 02/24/2022] [Indexed: 12/30/2022] Open
Abstract
Objectives: Copy number variants (CNVs) are believed to be a potential genetic cause of pregnancy loss. However, CNVs less than 3 Mb in euploid products of conception (POCs) remain largely unexplored. The aim of this study was to investigate the features of CNVs less than 3 Mb in POCs and their potential clinical significance in pregnancy loss/fetal death. Methods: CNV data were extracted from a cohort in our institution and 19 peer-reviewed publications, and only those CNVs less than 3 Mb detected in euploid pregnancy loss/fetal death were included. We constructed a CNV map to analyze the distribution of CNVs across chromosomes using the R package karyoploteR 1.10.5. Gene names and annotated gene types covered by those CNVs were mined from the human Release 19 reference genome file and the GENCODE database. We assessed the expression patterns and the consequences of murine knock-out of those genes using the TiGER and Mouse Genome Informatics (MGI) databases. Functional enrichment and pathway analysis for genes in CNVs were performed using clusterProfiler V3.12.0. Results: Breakpoints of 564 CNVs less than 3 Mb were obtained from 442 euploid POCs, with 349 gains and 185 losses. The CNV map showed that CNVs were distributed across all chromosomes, with the highest frequency detected in chromosome 22 and the lowest frequency in chromosome Y, and CNVs showed a higher density in the pericentromeric and sub-telomeric regions. A total of 5,414 genes were mined from the CNV regions (CNVRs); Gene Ontology (GO) and pathway analysis showed that the genes were significantly enriched in multiple terms, especially in sensory perception, membrane region, and tight junction. A total of 995 protein-coding genes have been reported to present mammalian phenotypes in MGI, and 276 of them lead to embryonic lethality or abnormal embryo/placenta in knock-out mouse models. The CNV located at 19p13.3 was the most common CNV of all POCs.
Conclusion: CNVs less than 3 Mb in euploid POCs are distributed unevenly across all chromosomes, with a higher density in the pericentromeric and sub-telomeric regions. The genes in those CNVRs are significantly enriched in biological processes and pathways that are important to embryonic/fetal development. CNV in 19p13.3 and the variations of ARID3A and FSTL3 might contribute to pregnancy loss.
Affiliation(s)
- Chongjuan Gu
  - Department of Obstetrics and Gynecology, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China
- Huan Gao
  - Department of Toxicology, School of Public Health, Sun Yat-sen University, Guangzhou, China
- Kuanrong Li
  - Institute of Pediatrics, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China
- Xinyu Dai
  - School of Life Sciences, South China Normal University, Guangzhou, China
- Zhao Yang
  - West China Hospital, Sichuan University, Chengdu, China
- Ru Li
  - Prenatal Diagnostic Center, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China
- Canliang Wen
  - Department of Obstetrics and Gynecology, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China
- Yaojuan He
  - Department of Obstetrics and Gynecology, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China
4
Liu S, Hu G, Luo S, Wu W, Zhou Q, Jin R, Zhang Y, Ruan H, Huang H, Li H. Insights into the evolution of the ISG15 and UBA7 system. Genomics 2022; 114:110302. [DOI: 10.1016/j.ygeno.2022.110302] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2021] [Revised: 01/08/2022] [Accepted: 02/01/2022] [Indexed: 11/04/2022]
5
Pagni S, Mills JD, Frankish A, Mudge JM, Sisodiya SM. Non-coding regulatory elements: Potential roles in disease and the case of epilepsy. Neuropathol Appl Neurobiol 2021; 48:e12775. [PMID: 34820881 DOI: 10.1111/nan.12775] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/21/2021] [Revised: 10/04/2021] [Accepted: 11/16/2021] [Indexed: 12/27/2022]
Abstract
Non-coding DNA (ncDNA) refers to the portion of the genome that does not code for proteins and accounts for the greatest physical proportion of the human genome. ncDNA includes sequences that are transcribed into RNA molecules, such as ribosomal RNAs (rRNAs), microRNAs (miRNAs), long non-coding RNAs (lncRNAs) and un-transcribed sequences that have regulatory functions, including gene promoters and enhancers. Variation in non-coding regions of the genome has an established role in human disease, with growing evidence from many areas, including several cancers, Parkinson's disease and autism. Here, we review the features and functions of the regulatory elements that are present in the non-coding genome and the role that these regions have in human disease. We then review the existing research in epilepsy and emphasise the potential value of further exploring non-coding regulatory elements in epilepsy. In addition, we outline the most widely used techniques for recognising regulatory elements throughout the genome, current methodologies for investigating variation and the main challenges associated with research in the field of non-coding DNA.
Affiliation(s)
- Susanna Pagni
  - Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK
  - Chalfont Centre for Epilepsy, Chalfont St Peter, UK
- James D Mills
  - Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK
  - Chalfont Centre for Epilepsy, Chalfont St Peter, UK
  - Amsterdam UMC, Department of (Neuro)Pathology, Amsterdam Neuroscience, University of Amsterdam, Amsterdam, Netherlands
- Adam Frankish
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Sanjay M Sisodiya
  - Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK
  - Chalfont Centre for Epilepsy, Chalfont St Peter, UK
6
Reconstruction of proto-vertebrate, proto-cyclostome and proto-gnathostome genomes provides new insights into early vertebrate evolution. Nat Commun 2021; 12:4489. [PMID: 34301952 PMCID: PMC8302630 DOI: 10.1038/s41467-021-24573-z] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 11/08/2019] [Accepted: 06/25/2021] [Indexed: 02/07/2023] Open
Abstract
Ancient polyploidization events have had a lasting impact on vertebrate genome structure, organization and function. Some key questions regarding the number of ancient polyploidization events and their timing in relation to the cyclostome-gnathostome divergence have remained contentious. Here we generate de novo long-read-based chromosome-scale genome assemblies for the Japanese lamprey and elephant shark. Using these and other representative genomes and developing algorithms for the probabilistic macrosynteny model, we reconstruct high-resolution proto-vertebrate, proto-cyclostome and proto-gnathostome genomes. Our reconstructions resolve key questions regarding the early evolutionary history of vertebrates. First, cyclostomes diverged from the lineage leading to gnathostomes after a shared tetraploidization (1R) but before a gnathostome-specific tetraploidization (2R). Second, the cyclostome lineage experienced an additional hexaploidization. Third, 2R in the gnathostome lineage was an allotetraploidization event, and biased gene loss from one of the subgenomes shaped the gnathostome genome by giving rise to remarkably conserved microchromosomes. Thus, our reconstructions reveal the major evolutionary events and offer new insights into the origin and evolution of vertebrate genomes.
7
Dermauw W, Van Leeuwen T, Feyereisen R. Diversity and evolution of the P450 family in arthropods. Insect Biochemistry and Molecular Biology 2020; 127:103490. [PMID: 33169702 DOI: 10.1016/j.ibmb.2020.103490] [Citation(s) in RCA: 117] [Impact Index Per Article: 23.4] [Received: 09/15/2020] [Revised: 10/09/2020] [Accepted: 10/09/2020] [Indexed: 05/13/2023]
Abstract
The P450 family (CYP genes) of arthropods encodes diverse enzymes involved in the metabolism of foreign compounds and in essential endocrine or ecophysiological functions. The P450 sequences (CYPome) from 40 arthropod species were manually curated, including 31 complete CYPomes, and a maximum likelihood phylogeny of nearly 3000 sequences is presented. Arthropod CYPomes are assembled from members of six CYP clans of variable size, the CYP2, CYP3, CYP4 and mitochondrial clans, as well as the CYP20 and CYP16 clans that are not found in Neoptera. CYPome sizes vary from two dozen genes in some parasitic species to over 200 in species as diverse as collembolans or ticks. CYPomes are comprised of few CYP families with many genes and many CYP families with few genes, and this distribution is the result of dynamic birth and death processes. Lineage-specific expansions or blooms are found throughout the phylogeny and often result in genomic clusters that appear to form a reservoir of catalytic diversity maintained as heritable units. Among the many P450s with physiological functions, six CYP families are involved in ecdysteroid metabolism. However, five so-called Halloween genes are not universally represented and do not constitute the unique pathway of ecdysteroid biosynthesis. The diversity of arthropod CYPomes has only partially been uncovered to date and many P450s with physiological functions regulating the synthesis and degradation of endogenous signal molecules (including ecdysteroids) and semiochemicals (including pheromones and defense chemicals) remain to be discovered. Sequence diversity of arthropod P450s is extreme, and P450 sequences lacking the universally conserved Cys ligand to the heme have evolved several times. A better understanding of P450 evolution is needed to discern the relative contributions of stochastic processes and adaptive processes in shaping the size and diversity of CYPomes.
Affiliation(s)
- Wannes Dermauw
  - Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
- Thomas Van Leeuwen
  - Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
- René Feyereisen
  - Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
  - Department of Plant and Environmental Sciences, University of Copenhagen, 40 Thorvaldsensvej, DK-1871, Frederiksberg C, Copenhagen, Denmark
8
Yamasaki M, Makino T, Khor SS, Toyoda H, Miyagawa T, Liu X, Kuwabara H, Kano Y, Shimada T, Sugiyama T, Nishida H, Sugaya N, Tochigi M, Otowa T, Okazaki Y, Kaiya H, Kawamura Y, Miyashita A, Kuwano R, Kasai K, Tanii H, Sasaki T, Honda M, Tokunaga K. Sensitivity to gene dosage and gene expression affects genes with copy number variants observed among neuropsychiatric diseases. BMC Med Genomics 2020; 13:55. [PMID: 32223758 PMCID: PMC7104509 DOI: 10.1186/s12920-020-0699-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/25/2018] [Accepted: 02/24/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Copy number variants (CNVs) have been reported to be associated with diseases, traits, and evolution. However, it is hard to determine which gene should have priority as a target for further functional experiments if a CNV is rare or a singleton. In this study, we attempted to overcome this issue by using two approaches: assessing the influence of gene dosage sensitivity and of gene expression sensitivity. Dosage-sensitive genes were those derived from the two rounds of whole-genome duplication, as identified in previous studies. In addition, we proposed a cross-sectional omics approach that utilizes open data from GTEx to assess the effect of whole-genome CNVs on gene expression. METHODS Affymetrix Genome-Wide SNP Array 6.0 was used to detect CNVs by PennCNV and CNV Workshop. After quality controls for population stratification, family relationship and CNV detection, 287 patients with narcolepsy, 133 patients with essential hypersomnia, 380 patients with panic disorders, 164 patients with autism, 784 patients with Alzheimer disease and 1280 healthy individuals remained for the enrichment analysis. RESULTS Overall, significant enrichment of dosage-sensitive genes was found across patients with narcolepsy, panic disorders and autism. In particular, significant enrichment of dosage-sensitive genes in duplications was observed across all diseases except for Alzheimer disease. For deletions, little or no enrichment of dosage-sensitive genes was seen in the patients when compared to the healthy individuals. Interestingly, significant enrichment of genes with expression sensitivity in brain was observed in patients with panic disorder and autism. While duplications presented a higher burden, deletions did not cause significant differences when compared to the healthy individuals.
When we assessed the effect of sensitivity to genome dosage and gene expression at the same time, the highest ratio of enrichment was observed in the group including dosage-sensitive genes and genes with expression sensitivity only in brain. In addition, shared CNV regions among the five neuropsychiatric diseases were also investigated. CONCLUSIONS This study contributes evidence that dosage-sensitive genes are associated with CNVs among neuropsychiatric diseases. In addition, we utilized open data from GTEx to assess the effect of whole-genome CNVs on gene expression. We also investigated shared CNV regions among neuropsychiatric diseases.
Affiliation(s)
- Maria Yamasaki
  - Department of Health Data Science Research, Healthy Aging Innovation Center, Tokyo Metropolitan Geriatric Medical Center, Tokyo, Japan
- Takashi Makino
  - Laboratory of Evolutionary Genomics, Graduate School of Life Sciences, Tohoku University, Sendai, Japan
- Seik-Soon Khor
  - Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
- Hiromi Toyoda
  - Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
  - Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Taku Miyagawa
  - Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Xiaoxi Liu
  - RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
- Hitoshi Kuwabara
  - Department of Psychiatry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- Yukiko Kano
  - Department of Child and Adolescent Psychiatry, Hamamatsu University School of Medicine, Shizuoka, Japan
  - Department of Child Psychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Takafumi Shimada
  - Division for Counseling and Support, The University of Tokyo, Tokyo, Japan
- Toshiro Sugiyama
  - Department of Child and Adolescent Psychiatry, Hamamatsu University School of Medicine, Shizuoka, Japan
- Hisami Nishida
  - Asunaro Hospital for Child and Adolescent Psychiatry, Mie, Japan
- Nagisa Sugaya
  - Unit of Public Health and Preventive Medicine, School of Medicine, Yokohama City University, Kanagawa, Japan
- Mamoru Tochigi
  - Department of Neuropsychiatry, Teikyo University Hospital, Tokyo, Japan
- Takeshi Otowa
  - Department of Neuropsychiatry, NTT Medical Center Tokyo, Tokyo, Japan
- Yuji Okazaki
  - Department of Psychiatry, Koseikai Michinoo Hospital, Nagasaki, Japan
- Hisanobu Kaiya
  - Panic Disorder Research Center, Warakukai Med Corp, Tokyo, Japan
- Yoshiya Kawamura
  - Department of Psychiatry, Shonan Kamakura General Hospital, Kanagawa, Japan
- Akinori Miyashita
  - Department of Molecular Genetics, Bioresource Science Branch, Center for Bioresources, Brain Research Institute, Niigata University, Niigata, Japan
- Ryozo Kuwano
  - Department of Molecular Genetics, Bioresource Science Branch, Center for Bioresources, Brain Research Institute, Niigata University, Niigata, Japan
  - Asahigawaso Research Institute, Asahigawaso Medical-Welfare Center, Okayama, Japan
- Kiyoto Kasai
  - Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Hisashi Tanii
  - Center for Physical and Mental Health, Mie University, Tsu, Mie, Japan
- Tsukasa Sasaki
  - Division of Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan
- Makoto Honda
  - Department of Psychiatry and Behavioral Sciences, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- Katsushi Tokunaga
  - Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
  - Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
9
Ni Z, Zhou XY, Aslam S, Niu DK. Characterization of Human Dosage-Sensitive Transcription Factor Genes. Front Genet 2019; 10:1208. [PMID: 31867040 PMCID: PMC6904359 DOI: 10.3389/fgene.2019.01208] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/19/2019] [Accepted: 11/01/2019] [Indexed: 11/13/2022] Open
Abstract
Copy number changes in protein-coding genes are detrimental if the consequent changes in protein concentrations disrupt essential cellular functions. The dosage sensitivity of transcription factor (TF) genes is particularly interesting because their products are essential in regulating the expression of genetic information. From four recently curated data sets of dosage-sensitive genes (genes with conserved copy numbers across mammals, ohnologs, and two data sets of haploinsufficient genes), we compiled a data set of the most reliable dosage-sensitive (MRDS) genes and a data set of the most reliable dosage-insensitive (MRDIS) genes. The MRDS genes were those present in all four data sets, while the MRDIS genes were those absent from any one of the four data sets and with the probability of being loss of function-intolerant (pLI) values < 0.5 in both of the haploinsufficient gene data sets. Enrichment analysis of TF genes among the MRDS and MRDIS gene data sets showed that TF genes are more likely to be dosage-sensitive than other genes in the human genome. The nuclear receptor family was the most enriched TF family among the dosage-sensitive genes. TF families with very few members were also deemed more likely to be dosage-sensitive than TF families with more members. In addition, we found a certain number of dosage-insensitive TFs. The most typical were the Krüppel-associated box domain-containing zinc-finger proteins (KZFPs). Gene ontology (GO) enrichment analysis showed that the MRDS TFs were enriched for many more terms than the MRDIS TFs; however, the proteins interacting with these two groups of TFs did not show such sharp differences. Furthermore, we found that the MRDIS KZFPs were not significantly enriched for any GO terms, whereas their interacting proteins were significantly enriched for thousands of GO terms. 
Further characterizations revealed significant differences between MRDS TFs and MRDIS TFs in the lengths and nucleotide compositions of DNA-binding sites as well as in expression level, protein size, and selective force.
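The construction of the MRDS and MRDIS gene sets described in this abstract amounts to a few set operations, sketched below in Python. This is an illustration under stated assumptions, not the authors' code: the function name `classify_genes`, the reading of "absent from any one of the four data sets" as "absent from all four", the exclusion of genes without a pLI value in both haploinsufficiency data sets, and all gene labels are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def classify_genes(
    datasets: List[Set[str]],
    pli_a: Dict[str, float],
    pli_b: Dict[str, float],
    all_genes: Iterable[str],
    threshold: float = 0.5,
) -> Tuple[Set[str], Set[str]]:
    """Return (MRDS, MRDIS) gene sets; `datasets` must be non-empty.

    MRDS: genes present in every dosage-sensitivity data set.
    MRDIS: genes in none of the data sets whose pLI is below `threshold`
    in both haploinsufficiency data sets (assumed interpretation).
    """
    mrds = set.intersection(*datasets)
    in_any = set.union(*datasets)
    mrdis = {
        gene
        for gene in set(all_genes) - in_any
        # Genes lacking a pLI value in either data set are skipped (assumption).
        if gene in pli_a and gene in pli_b
        and pli_a[gene] < threshold and pli_b[gene] < threshold
    }
    return mrds, mrdis


# Toy input with placeholder gene labels.
datasets = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"A", "D"}]
pli_a = {"E": 0.1, "F": 0.9, "G": 0.2}
pli_b = {"E": 0.3, "F": 0.1}
mrds, mrdis = classify_genes(datasets, pli_a, pli_b, "ABCDEFG")
print(mrds, mrdis)
```

Requiring membership in all four data sets keeps the MRDS set small but reliable, which matches the abstract's stated aim of using only the "most reliable" genes.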
Affiliation(s)
- Zhihua Ni
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
  - College of Life Sciences, Hebei University, Baoding, China
- Xiao-Yu Zhou
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
- Sidra Aslam
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
- Deng-Ke Niu
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
10
Fotiou E, Williams S, Martin-Geary A, Robertson DL, Tenin G, Hentges KE, Keavney B. Integration of Large-Scale Genomic Data Sources With Evolutionary History Reveals Novel Genetic Loci for Congenital Heart Disease. Circulation: Genomic and Precision Medicine 2019; 12:442-451. [PMID: 31613678 PMCID: PMC6798745 DOI: 10.1161/circgen.119.002694] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 12/20/2022]
Abstract
Most cases of congenital heart disease (CHD) are sporadic and nonsyndromic, with poorly understood etiology. Rare genetic variants have been found to affect the risk of sporadic, nonsyndromic CHD, but individual studies to date are of only moderate sizes, and none to date has incorporated the ohnolog status of candidate genes in the analysis. Ohnologs are genes retained from ancestral whole-genome duplications during evolution; multiple lines of evidence suggest ohnologs are overrepresented among dosage-sensitive genes. We integrated large-scale data on rare variants with evolutionary information on ohnolog status to identify novel genetic loci predisposing to CHD.
Affiliation(s)
- Elisavet Fotiou
  - Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, Manchester Academic Health Science Centre (E.F., S.W., G.T., B.K.), University of Manchester
- Simon Williams
  - Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, Manchester Academic Health Science Centre (E.F., S.W., G.T., B.K.), University of Manchester
- Alexandra Martin-Geary
  - Division of Evolution and Genomic science (A.M.-G., D.L.R., K.E.H.), University of Manchester
- David L Robertson
  - Division of Evolution and Genomic science (A.M.-G., D.L.R., K.E.H.), University of Manchester
  - MRC-University of Glasgow Centre for Virus Research (D.L.R.)
- Gennadiy Tenin
  - Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, Manchester Academic Health Science Centre (E.F., S.W., G.T., B.K.), University of Manchester
- Kathryn E Hentges
  - Division of Evolution and Genomic science (A.M.-G., D.L.R., K.E.H.), University of Manchester
- Bernard Keavney
  - Division of Cardiovascular Sciences, School of Medical Sciences, Faculty of Biology, Medicine, and Health, Manchester Academic Health Science Centre (E.F., S.W., G.T., B.K.), University of Manchester
  - Manchester Heart Centre, Manchester University NHS Foundation Trust, Manchester (B.K.)
11
Mourikis TP, Benedetti L, Foxall E, Temelkovski D, Nulsen J, Perner J, Cereda M, Lagergren J, Howell M, Yau C, Fitzgerald RC, Scaffidi P, Ciccarelli FD. Patient-specific cancer genes contribute to recurrently perturbed pathways and establish therapeutic vulnerabilities in esophageal adenocarcinoma. Nat Commun 2019; 10:3101. [PMID: 31308377 PMCID: PMC6629660 DOI: 10.1038/s41467-019-10898-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 11/09/2018] [Accepted: 06/04/2019] [Indexed: 12/25/2022] Open
Abstract
The identification of cancer-promoting genetic alterations is challenging particularly in highly unstable and heterogeneous cancers, such as esophageal adenocarcinoma (EAC). Here we describe a machine learning algorithm to identify cancer genes in individual patients considering all types of damaging alterations simultaneously. Analysing 261 EACs from the OCCAMS Consortium, we discover helper genes that, alongside well-known drivers, promote cancer. We confirm the robustness of our approach in 107 additional EACs. Unlike recurrent alterations of known drivers, these cancer helper genes are rare or patient-specific. However, they converge towards perturbations of well-known cancer processes. Recurrence of the same process perturbations, rather than individual genes, divides EACs into six clusters differing in their molecular and clinical features. Experimentally mimicking the alterations of predicted helper genes in cancer and pre-cancer cells validates their contribution to disease progression, while reverting their alterations reveals EAC acquired dependencies that can be exploited in therapy.
Affiliation(s)
- Thanos P Mourikis
- Cancer Systems Biology Laboratory, The Francis Crick Institute, London, NW1 1AT, UK
- School of Cancer and Pharmaceutical Sciences, King's College London, London, SE1 1UL, UK
- Lorena Benedetti
- Cancer Systems Biology Laboratory, The Francis Crick Institute, London, NW1 1AT, UK
- School of Cancer and Pharmaceutical Sciences, King's College London, London, SE1 1UL, UK
- Elizabeth Foxall
- Cancer Systems Biology Laboratory, The Francis Crick Institute, London, NW1 1AT, UK
- School of Cancer and Pharmaceutical Sciences, King's College London, London, SE1 1UL, UK
- Damjan Temelkovski
- Cancer Systems Biology Laboratory, The Francis Crick Institute, London, NW1 1AT, UK
- School of Cancer and Pharmaceutical Sciences, King's College London, London, SE1 1UL, UK
- Joel Nulsen
- Cancer Systems Biology Laboratory, The Francis Crick Institute, London, NW1 1AT, UK
- School of Cancer and Pharmaceutical Sciences, King's College London, London, SE1 1UL, UK
- Juliane Perner
- MRC Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, CB2 0XZ, UK
- Matteo Cereda
- Italian Institute for Genomic Medicine (IIGM), Turin, 10126, Italy
- Jesper Lagergren
- School of Cancer and Pharmaceutical Sciences, King's College London, London, SE1 1UL, UK
- Michael Howell
- High Throughput Screening Laboratory, The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- Rebecca C Fitzgerald
- MRC Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, CB2 0XZ, UK
- Paola Scaffidi
- Cancer Epigenetics Laboratory, The Francis Crick Institute, London, NW1 1AT, UK
- UCL Cancer Institute, University College London, London, WC1E 6DD, UK
- Francesca D Ciccarelli
- Cancer Systems Biology Laboratory, The Francis Crick Institute, London, NW1 1AT, UK
- School of Cancer and Pharmaceutical Sciences, King's College London, London, SE1 1UL, UK
12
|
Lin YL, Gokcumen O. Fine-Scale Characterization of Genomic Structural Variation in the Human Genome Reveals Adaptive and Biomedically Relevant Hotspots. Genome Biol Evol 2019; 11:1136-1151. [PMID: 30887040 PMCID: PMC6475128 DOI: 10.1093/gbe/evz058] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Accepted: 03/16/2019] [Indexed: 12/25/2022]
Abstract
Genomic structural variants (SVs) are distributed nonrandomly across the human genome. The "hotspots" of SVs have been implicated in evolutionary innovations, as well as medical conditions. However, the evolutionary and biomedical features of these hotspots remain incompletely understood. Here, we analyzed data from 2,504 genomes to construct a refined map of 1,148 SV hotspots in human genomes. We confirmed that segmental duplication-related nonallelic homologous recombination is an important mechanistic driver of SV hotspot formation. However, to our surprise, we also found that a majority of SVs in hotspots do not form through such recombination-based mechanisms, suggesting diverse mechanistic and selective forces shaping hotspots. Indeed, our evolutionary analyses showed that the majority of SV hotspots are within gene-poor regions and evolve under relaxed negative selection or neutrality. However, we still found a small subset of SV hotspots harboring genes that are enriched for anthropologically crucial functions and evolve under geography-specific and balancing adaptive forces. These include two independent hotspots on different chromosomes affecting alpha and beta hemoglobin gene clusters. Biomedically, we found that the SV hotspots coincide with breakpoints of clinically relevant, large de novo SVs, significantly more often than genome-wide expectations. For example, we showed that the breakpoints of multiple large SVs, which lead to idiopathic short stature, coincide with SV hotspots. Therefore, the mutational instability in SV hotspots likely enables chromosomal breaks that lead to pathogenic structural variation formations. Overall, our study contributes to a better understanding of the mutational and adaptive landscape of the genome.
Affiliation(s)
- Yen-Lung Lin
- Department of Biological Sciences, University at Buffalo
- Omer Gokcumen
- Department of Biological Sciences, University at Buffalo
13
|
Abstract
An attractive and long-standing hypothesis regarding the evolution of genes after duplication posits that the duplication event creates new evolutionary possibilities by releasing a copy of the gene from constraint. Apparent support was found in numerous analyses, particularly, the observation of higher rates of evolution in duplicated as compared with singleton genes. Could it, instead, be that more duplicable genes (owing to mutation, fixation, or retention biases) are intrinsically faster evolving? To uncouple the measurement of rates of evolution from the determination of duplicate or singleton status, we measure the rates of evolution in singleton genes in outgroup primate lineages but classify these genes as to whether they have duplicated or not in a crown group of great apes. We find that rates of evolution are higher in duplicable genes prior to the duplication event. In part this is owing to a negative correlation between coding sequence length and rate of evolution, coupled with a bias toward smaller genes being more duplicable. The effect is masked by difference in expression rate between duplicable genes and singletons. Additionally, in contradiction to the classical assumption, we find no convincing evidence for an increase in dN/dS after duplication, nor for rate asymmetry between duplicates. We conclude that high rates of evolution of duplicated genes are not solely a consequence of the duplication event, but are rather a predictor of duplicability. These results are consistent with a model in which successful gene duplication events in mammals are skewed toward events of minimal phenotypic impact.
Affiliation(s)
- Áine N O'Toole
- Department of Genetics, Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
- Aoife McLysaght
- Department of Genetics, Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
14
|
Sacerdot C, Louis A, Bon C, Berthelot C, Roest Crollius H. Chromosome evolution at the origin of the ancestral vertebrate genome. Genome Biol 2018; 19:166. [PMID: 30333059 PMCID: PMC6193309 DOI: 10.1186/s13059-018-1559-1] [Citation(s) in RCA: 121] [Impact Index Per Article: 17.3] [Received: 02/09/2018] [Accepted: 10/04/2018] [Indexed: 11/30/2022]
Abstract
BACKGROUND: It has been proposed that more than 450 million years ago, two successive whole genome duplications took place in a marine chordate lineage before leading to the common ancestor of vertebrates. A precise reconstruction of these founding events would provide a framework to better understand the impact of these early whole genome duplications on extant vertebrates.
RESULTS: We reconstruct the evolution of chromosomes at the beginning of vertebrate evolution. We first compare 61 extant animal genomes to reconstruct the highly contiguous order of genes in a 326-million-year-old ancestral Amniota genome. In this genome, we establish a well-supported list of duplicated genes originating from the two whole genome duplications to identify tetrads of duplicated chromosomes. From this, we reconstruct a chronology in which a pre-vertebrate genome composed of 17 chromosomes duplicated to 34 chromosomes and was subject to seven chromosome fusions before duplicating again into 54 chromosomes. After the separation of the lineage of Gnathostomata (jawed vertebrates) from Cyclostomata (extant jawless fish), four more fusions took place to form the ancestral Euteleostomi (bony vertebrates) genome of 50 chromosomes.
CONCLUSIONS: These results firmly establish the occurrence of two whole genome duplications in the lineage that precedes the ancestor of vertebrates, resolving in particular the ambiguity raised by the analysis of the lamprey genome. This work provides a foundation for studying the evolution of vertebrate chromosomes from the standpoint of a common ancestor and particularly the pattern of duplicate gene retention and loss that resulted in the gene composition of extant vertebrate genomes.
Affiliation(s)
- Christine Sacerdot
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, PSL Research University, 75005, Paris, France
- Alexandra Louis
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, PSL Research University, 75005, Paris, France
- Céline Bon
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, PSL Research University, 75005, Paris, France
- Present Address: Laboratoire Éco-Anthropologie et Ethnobiologie, UMR 7206 CNRS - Muséum National d'Histoire Naturelle, Université Paris Diderot, Sorbonne Paris Cité, F-75016, Paris, France
- Hugues Roest Crollius
- Institut de Biologie de l'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, PSL Research University, 75005, Paris, France
15
|
Xu Y, Shi W, Song R, Long W, Guo H, Yuan S, Zhang T. Divergent patterns of genic copy number variation in KCNIP1 gene reveal risk locus of type 2 diabetes in Chinese population. Endocr J 2018; 65:537-545. [PMID: 29491224 DOI: 10.1507/endocrj.ej17-0496] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022]
Abstract
Copy number variation (CNV) has emerged as another important genetic marker in addition to SNP for understanding etiology of complex disease. Kv channel interacting protein 1 (KCNIP1) is a Ca2+-dependent transcriptional modulator that contributes to the regulation of insulin secretion. Previous genome-wide CNV assay identified the KCNIP1 gene encompassing a CNV region, however, its further effect and risk rate on type 2 diabetes (T2D) have rarely been addressed, especially in Chinese population. The current study aims to detect and excavate genetic distribution profile of KCNIP1 CNV in Chinese T2D and control populations, and further to investigate the associations with clinical characteristics. Divergent patterns of the KCNIP1 CNV were identified (p < 0.01), in which the copy number gain was predominant in T2D, while the copy number normal accounted for the most in control group. Consistently, the individuals with copy number gain showed significant risk on T2D (OR = 4.550, p < 0.01). The KCNIP1 copy numbers presented significantly positive correlations with fasting plasma glucose and glycated hemoglobin in T2D. For OGTT test, the T2D patients with copy number gain had remarkably elevated glucose contents (60, 120, 180-min, p < 0.05 or p < 0.01) and diminished insulin levels (60, 120-min, p < 0.05) than those with copy number loss and normal, which suggested that the KCNIP1 CNV was correlated with the glucose and insulin action. This is the first CNV association study of the KCNIP1 gene in Chinese population, and these data indicated that KCNIP1 might function as a T2D-susceptibility gene whose dysregulation alters insulin production.
Affiliation(s)
- Yao Xu
- Institute of Biology and Medicine, College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
- Weilin Shi
- Institute of Biology and Medicine, College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
- Ruhui Song
- Institute of Biology and Medicine, College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
- Wenlin Long
- Institute of Biology and Medicine, College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
- Hui Guo
- Institute of Biology and Medicine, College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
- Shiliang Yuan
- Tianyou Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, Hubei 430064, China
- Tongcun Zhang
- Institute of Biology and Medicine, College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
16
|
Nakatani Y, McLysaght A. Genomes as documents of evolutionary history: a probabilistic macrosynteny model for the reconstruction of ancestral genomes. Bioinformatics 2018; 33:i369-i378. [PMID: 28881993 PMCID: PMC5870716 DOI: 10.1093/bioinformatics/btx259] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 11/30/2022]
Abstract
Motivation: It has been argued that whole-genome duplication (WGD) exerted a profound influence on the course of evolution. For the purpose of fully understanding the impact of WGD, several formal algorithms have been developed for reconstructing pre-WGD gene order in yeast and plant. However, to the best of our knowledge, those algorithms have never been successfully applied to WGD events in teleost and vertebrate, impeded by extensive gene shuffling and gene losses.
Results: Here, we present a probabilistic model of macrosynteny (i.e. conserved linkage or chromosome-scale distribution of orthologs), develop a variational Bayes algorithm for inferring the structure of pre-WGD genomes, and study estimation accuracy by simulation. Then, by applying the method to the teleost WGD, we demonstrate effectiveness of the algorithm in a situation where gene-order reconstruction algorithms perform relatively poorly due to a high rate of rearrangement and extensive gene losses. Our high-resolution reconstruction reveals previously overlooked small-scale rearrangements, necessitating a revision to previous views on genome structure evolution in teleost and vertebrate.
Conclusions: We have reconstructed the structure of a pre-WGD genome by employing a variational Bayes approach that was originally developed for inferring topics from millions of text documents. Interestingly, comparison of the macrosynteny and topic model algorithms suggests that macrosynteny can be regarded as documents on ancestral genome structure. From this perspective, the present study would seem to provide a textbook example of the prevalent metaphor that genomes are documents of evolutionary history.
Availability and implementation: The analysis data are available for download at http://www.gen.tcd.ie/molevol/supp_data/MacrosyntenyTGD.zip, and the software written in Java is available upon request.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yoichiro Nakatani
- Department of Genetics, Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin 2, Ireland
- Aoife McLysaght
- Department of Genetics, Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin 2, Ireland
17
|
Sekine M, Makino T. Inference of Causative Genes for Alzheimer's Disease Due to Dosage Imbalance. Mol Biol Evol 2017; 34:2396-2407. [PMID: 28666362 DOI: 10.1093/molbev/msx183] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/20/2022]
Abstract
Copy number variations (CNVs) have recently drawn attention as an important genetic factor for diseases, especially common neuropsychiatric disorders including Alzheimer's disease (AD). Because most of the pathogenic CNV regions overlap with multiple genes, it has been challenging to identify the true disease-causing genes amongst them. Notably, a recent study reported that CNV regions containing ohnologs, which are dosage-sensitive genes, are likely to be deleterious. Utilizing the unique feature of ohnologs could be useful for identifying causative genes with pathogenic CNVs, however its effectiveness is still unclear. Although it has been reported that AD is strongly affected by CNVs, most of AD-causing genes with pathogenic CNVs have not been identified yet. Here, we show that dosage-sensitive ohnologs within CNV regions reported in patients with AD are related to the nervous system and are highly expressed in the brain, similar to other known susceptible genes for AD. We found that CNV regions in patients with AD contained dosage-sensitive genes, which are ohnologs not overlapping with control CNV regions, frequently. Furthermore, these dosage-sensitive genes in pathogenic CNV regions had a strong enrichment in the nervous system for mouse knockout phenotype and high expression in the brain similar to the known susceptible genes for AD. Our results demonstrated that selecting dosage-sensitive ohnologs out of multiple genes with pathogenic CNVs is effective in identifying the causative genes for AD. This methodology can be applied to other diseases caused by dosage imbalance and might help to establish the medical diagnosis by analysis of CNVs.
Affiliation(s)
- Mizuka Sekine
- Department of Biology, Faculty of Science, Tohoku University, Sendai, Japan
- Takashi Makino
- Department of Ecology and Evolutionary Biology, Graduate School of Life Sciences, Tohoku University, Sendai, Japan
18
|
Fares MA, Sabater-Muñoz B, Toft C. Genome Mutational and Transcriptional Hotspots Are Traps for Duplicated Genes and Sources of Adaptations. Genome Biol Evol 2017; 9:1229-1240. [PMID: 28459980 PMCID: PMC5433386 DOI: 10.1093/gbe/evx085] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Accepted: 04/26/2017] [Indexed: 12/23/2022]
Abstract
Gene duplication generates new genetic material, which has been shown to lead to major innovations in unicellular and multicellular organisms. A whole-genome duplication occurred in the ancestor of Saccharomyces yeast species but 92% of duplicates returned to single-copy genes shortly after duplication. The persisting duplicated genes in Saccharomyces led to the origin of major metabolic innovations, which have been the source of the unique biotechnological capabilities in the Baker’s yeast Saccharomyces cerevisiae. What factors have determined the fate of duplicated genes remains unknown. Here, we report the first demonstration that the local genome mutation and transcription rates determine the fate of duplicates. We show, for the first time, a preferential location of duplicated genes in the mutational and transcriptional hotspots of S. cerevisiae genome. The mechanism of duplication matters, with whole-genome duplicates exhibiting different preservation trends compared to small-scale duplicates. Genome mutational and transcriptional hotspots are rich in duplicates with large repetitive promoter elements. Saccharomyces cerevisiae shows more tolerance to deleterious mutations in duplicates with repetitive promoter elements, which in turn exhibit higher transcriptional plasticity against environmental perturbations. Our data demonstrate that the genome traps duplicates through the accelerated regulatory and functional divergence of their gene copies providing a source of novel adaptations in yeast.
Affiliation(s)
- Mario A Fares
- Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC), Universidad Politécnica de Valencia, Valencia, Spain
- Institute for Integrative Systems Biology, Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Valencia, Paterna, Spain
- Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin 2, Dublin, Ireland
- Beatriz Sabater-Muñoz
- Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC), Universidad Politécnica de Valencia, Valencia, Spain
- Institute for Integrative Systems Biology, Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Valencia, Paterna, Spain
- Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin 2, Dublin, Ireland
- Christina Toft
- Institute for Integrative Systems Biology, Consejo Superior de Investigaciones Científicas (CSIC), Universidad de Valencia, Paterna, Spain
- Department of Genetics, University of Valencia, Burjasot, Spain
- Instituto de Agroquímica y Tecnología de los Alimentos, Consejo Superior de Investigaciones Científicas (CSIC), Burjasot, Valencia, Spain
19
|
Abstract
For a subset of genes in our genome a change in gene dosage, by duplication or deletion, causes a phenotypic effect. These dosage-sensitive genes may confer an advantage upon copy number change, but more typically they are associated with disease, including heart disease, cancers and neuropsychiatric disorders. This gene copy number sensitivity creates characteristic evolutionary constraints that can serve as a diagnostic to identify dosage-sensitive genes. Though the link between copy number change and disease is well-established, the mechanism of pathogenicity is usually opaque. We propose that gene expression level may provide a common basis for the pathogenic effects of many copy number variants.
Affiliation(s)
- Alan M Rice
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin 2, Ireland
- Aoife McLysaght
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin 2, Ireland
20
|
Chen ECH, Morin A, Chauchat JH, Sankoff D. Statistical analysis of fractionation resistance by functional category and expression. BMC Genomics 2017; 18:366. [PMID: 28589858 PMCID: PMC5461532 DOI: 10.1186/s12864-017-3736-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/31/2023]
Abstract
Background: The current literature establishes the importance of gene functional category and expression in promoting or suppressing duplicate gene loss after whole genome doubling in plants, a process known as fractionation. Inspired by studies that have reported gene expression to be the dominating factor in preventing duplicate gene loss, we analyzed the relative effect of functional category and expression.
Methods: We use multivariate methods to study data sets on gene retention, function and expression in rosids and asterids to estimate effects and assess their interaction.
Results: Our results suggest that the effect on duplicate gene retention fractionation by functional category and expression are independent and have no statistical interaction.
Conclusion: In plants, functional category is the more dominant factor in explaining duplicate gene loss.
Affiliation(s)
- Eric C H Chen
- Department of Biology, University of Ottawa, 30 Marie Curie, Ottawa, K1N 6N5, Canada
- Annie Morin
- Department of Computer Science, Université de Rennes 1, Rennes Cedex, 69676, France
- David Sankoff
- Department of Mathematics and Statistics, 585 King Edward, Ottawa, K1N 6N5, Canada
21
|
Dosage sensitivity is a major determinant of human copy number variant pathogenicity. Nat Commun 2017; 8:14366. [PMID: 28176757 PMCID: PMC5309798 DOI: 10.1038/ncomms14366] [Citation(s) in RCA: 96] [Impact Index Per Article: 12.0] [Received: 06/30/2016] [Accepted: 12/20/2016] [Indexed: 01/22/2023]
Abstract
Human copy number variants (CNVs) account for genome variation an order of magnitude larger than single-nucleotide polymorphisms. Although much of this variation has no phenotypic consequences, some variants have been associated with disease, in particular neurodevelopmental disorders. Pathogenic CNVs are typically very large and contain multiple genes, and understanding the cause of the pathogenicity remains a major challenge. Here we show that pathogenic CNVs are significantly enriched for genes involved in development and genes that have greater evolutionary copy number conservation across mammals, indicative of functional constraints. Conversely, genes found in benign CNV regions have more variable copy number. These evolutionary constraints are characteristic of genes in pathogenic CNVs and can only be explained by dosage sensitivity of those genes. These results implicate dosage sensitivity of individual genes as a common cause of CNV pathogenicity. These evolutionary metrics suggest a path to identifying disease genes in pathogenic CNVs. Copy number variants (CNVs) cause significant genomic variation in humans and may be benign or may cause disease. Here, the authors show that pathogenic CNVs are evolutionarily constrained compared with benign, pointing to dosage sensitivity as a potential cause of disease.
22
|
Srinivasan S, Bettella F, Hassani S, Wang Y, Witoelar A, Schork AJ, Thompson WK, Collier DA, Desikan RS, Melle I, Dale AM, Djurovic S, Andreassen OA. Probing the Association between Early Evolutionary Markers and Schizophrenia. PLoS One 2017; 12:e0169227. [PMID: 28081145 PMCID: PMC5231388 DOI: 10.1371/journal.pone.0169227] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/09/2016] [Accepted: 12/13/2016] [Indexed: 12/31/2022]
Abstract
Schizophrenia is suggested to be a by-product of the evolution in humans, a compromise for our language, creative thinking and cognitive abilities, and thus, essentially, a human disorder. The time of its origin during the course of human evolution remains unclear. Here we investigate several markers of early human evolution and their relationship to the genetic risk of schizophrenia. We tested the schizophrenia evolutionary hypothesis by analyzing genome-wide association studies of schizophrenia and other human phenotypes in a statistical framework suited for polygenic architectures. We analyzed evolutionary proxy measures: human accelerated regions, segmental duplications, and ohnologs, representing various time periods of human evolution for overlap with the human genomic loci associated with schizophrenia. Polygenic enrichment plots suggest a higher prevalence of schizophrenia associations in human accelerated regions, segmental duplications and ohnologs. However, the enrichment is mostly accounted for by linkage disequilibrium, especially with functional elements like introns and untranslated regions. Our results did not provide clear evidence that markers of early human evolution are more likely associated with schizophrenia. While SNPs associated with schizophrenia are enriched in HAR, Ohno and SD regions, the enrichment seems to be mediated by affiliation to known genomic enrichment categories. Taken together with previous results, these findings suggest that schizophrenia risk may have mainly developed more recently in human evolution.
Affiliation(s)
- Saurabh Srinivasan
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Francesco Bettella
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Sahar Hassani
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Yunpeng Wang
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Aree Witoelar
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Andrew J. Schork
- Multimodal Imaging Laboratory, University of California at San Diego, La Jolla, CA, United States of America
- Cognitive Sciences Graduate Program, University of California, San Diego, La Jolla, CA, United States of America
- Center for Human Development, University of California at San Diego, La Jolla, CA, United States of America
- Wesley K. Thompson
- Institute of Biological Psychiatry, Mental Health Center St. Hans, Mental Health Services Copenhagen, Roskilde, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- David A. Collier
- Eli Lilly & Co, Erl Wood Manor, Windlesham, Surrey, United Kingdom
- Rahul S. Desikan
- Neuroradiology Section, Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, United States of America
- Ingrid Melle
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Anders M. Dale
- Multimodal Imaging Laboratory, University of California at San Diego, La Jolla, CA, United States of America
- Center for Human Development, University of California at San Diego, La Jolla, CA, United States of America
- Department of Neuroscience, University of California at San Diego, La Jolla, CA, United States of America
- Neuroradiology Section, Department of Radiology and Biomedical Imaging, University of California at San Francisco, San Francisco, CA, United States of America
- Srdjan Djurovic
- Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
- NORMENT, KG Jebsen Centre for Psychosis Research, Department of Clinical Science, University of Bergen, Bergen, Norway
- Ole A. Andreassen
- NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Institute of Biological Psychiatry, Mental Health Center St. Hans, Mental Health Services Copenhagen, Roskilde, Denmark
23
|
Xie T, Yang QY, Wang XT, McLysaght A, Zhang HY. Spatial Colocalization of Human Ohnolog Pairs Acts to Maintain Dosage-Balance. Mol Biol Evol 2016; 33:2368-75. [PMID: 27297469 PMCID: PMC4989111 DOI: 10.1093/molbev/msw108] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 12/22/2022]
Abstract
Ohnologs (paralogous gene pairs generated by whole genome duplication) are enriched for dosage-sensitive genes, that is, genes that have a phenotype due to copy number changes. Dosage-sensitive genes frequently occur in the same metabolic pathway and in physically interacting proteins. Accumulating evidence reveals that functionally related genes tend to co-localize in the three-dimensional (3D) arrangement of chromosomes. We query whether the spatial distribution of ohnologs has implications for their dosage balance. We analyzed the colocalization frequency of ohnologs based on chromatin interaction datasets of seven human cell lines and found that ohnolog pairs exhibit higher spatial proximity in 3D nuclear organization than other paralog pairs and than randomly chosen ohnologs in the genome. We also found that colocalized ohnologs are more resistant to copy number variations and more likely to be disease-associated genes, which indicates a stronger dosage balance in ohnologs with high spatial proximity. This phenomenon is further supported by the stronger similarity of gene co-expression and of gene ontology terms of colocalized ohnologs. In addition, for a large fraction of ohnologs, the spatial colocalization is conserved in mouse cells, suggestive of functional constraint on their 3D positioning in the nucleus.
Affiliation(s)
- Ting Xie
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
| | - Qing-Yong Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
| | - Xiao-Tao Wang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
| | - Aoife McLysaght
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
| | - Hong-Yu Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
|
24
|
Roquis D, Rognon A, Chaparro C, Boissier J, Arancibia N, Cosseau C, Parrinello H, Grunau C. Frequency and mitotic heritability of epimutations in Schistosoma mansoni. Mol Ecol 2016; 25:1741-58. [DOI: 10.1111/mec.13555] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 09/28/2015] [Revised: 01/22/2016] [Accepted: 01/23/2016] [Indexed: 12/28/2022]
Affiliation(s)
- David Roquis
- Université de Perpignan Via Domitia; Perpignan F-66860 France
- CNRS; UMR 5244; Interactions Hôtes-Pathogènes-Environnements (IHPE); Perpignan F-66860 France
| | - Anne Rognon
- Université de Perpignan Via Domitia; Perpignan F-66860 France
- CNRS; UMR 5244; Interactions Hôtes-Pathogènes-Environnements (IHPE); Perpignan F-66860 France
| | - Cristian Chaparro
- Université de Perpignan Via Domitia; Perpignan F-66860 France
- CNRS; UMR 5244; Interactions Hôtes-Pathogènes-Environnements (IHPE); Perpignan F-66860 France
| | - Jerome Boissier
- Université de Perpignan Via Domitia; Perpignan F-66860 France
- CNRS; UMR 5244; Interactions Hôtes-Pathogènes-Environnements (IHPE); Perpignan F-66860 France
| | - Nathalie Arancibia
- Université de Perpignan Via Domitia; Perpignan F-66860 France
- CNRS; UMR 5244; Interactions Hôtes-Pathogènes-Environnements (IHPE); Perpignan F-66860 France
| | - Celine Cosseau
- Université de Perpignan Via Domitia; Perpignan F-66860 France
- CNRS; UMR 5244; Interactions Hôtes-Pathogènes-Environnements (IHPE); Perpignan F-66860 France
| | - Hugues Parrinello
- MGX - Montpellier GenomiX IBiSA, Institut de Génomique Fonctionnelle; 141, rue de la Cardonille F-34094 Montpellier Cedex 05 France
| | - Christoph Grunau
- Université de Perpignan Via Domitia; Perpignan F-66860 France
- CNRS; UMR 5244; Interactions Hôtes-Pathogènes-Environnements (IHPE); Perpignan F-66860 France
|
25
|
Zielezinski A, Karlowski WM. Early origin and adaptive evolution of the GW182 protein family, the key component of RNA silencing in animals. RNA Biol 2016; 12:761-70. [PMID: 26106978 PMCID: PMC4615383 DOI: 10.1080/15476286.2015.1051302] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022]
Abstract
The GW182 proteins are a key component of the miRNA-dependent post-transcriptional silencing pathway in animals. They function as scaffold proteins to mediate the interaction of Argonaute (AGO)-containing complexes with cytoplasmic poly(A)-binding proteins (PABP) and PAN2-PAN3 and CCR4-NOT deadenylases. The AGO-GW182 complexes mediate silencing of the target mRNA through induction of translational repression and/or mRNA degradation. Although the GW182 proteins have been the subject of extensive experimental research in recent years, very little is known about their origin and evolution. Here, based on complex functional annotation and phylogenetic analyses, we reveal 448 members of the GW182 protein family from the earliest animals to humans. Our results indicate that a single-copy GW182/TNRC6C progenitor gene arose with the emergence of multicellularity and that it multiplied in the last common ancestor of vertebrates in 2 rounds of whole genome duplication (WGD), resulting in 3 genes. Before the divergence of vertebrates, both the AGO- and CCR4-NOT-binding regions of GW182s showed significant acceleration in the accumulation of amino acid changes, suggesting functional adaptation toward higher specificity to the molecules of the silencing complex. We conclude that the silencing ability of the GW182 proteins improves with higher position in the taxonomic classification and increasing complexity of the organism. The first reconstruction of the molecular journey of GW182 proteins from the ancestral metazoan protein to the current mammalian configuration provides new insight into development of the miRNA-dependent post-transcriptional silencing pathway in animals.
Affiliation(s)
- Andrzej Zielezinski
- Department of Computational Biology; Institute of Molecular Biology and Biotechnology; Adam Mickiewicz University; Poznan, Poland
|
26
|
Zarrei M, MacDonald JR, Merico D, Scherer SW. A copy number variation map of the human genome. Nat Rev Genet 2015; 16:172-83. [DOI: 10.1038/nrg3871] [Citation(s) in RCA: 565] [Impact Index Per Article: 56.5] [Indexed: 02/06/2023]
|
27
|
Popadin K, Gutierrez-Arcelus M, Lappalainen T, Buil A, Steinberg J, Nikolaev S, Lukowski S, Bazykin G, Seplyarskiy V, Ioannidis P, Zdobnov E, Dermitzakis E, Antonarakis S. Gene age predicts the strength of purifying selection acting on gene expression variation in humans. Am J Hum Genet 2014; 95:660-74. [PMID: 25480033 DOI: 10.1016/j.ajhg.2014.11.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 07/10/2014] [Accepted: 11/10/2014] [Indexed: 10/24/2022]
Abstract
Gene expression levels can be subject to selection. We hypothesized that the age of gene origin is associated with expression constraints, given that it affects the level of gene integration into the functional cellular environment. By studying the genetic variation affecting gene expression levels (cis expression quantitative trait loci [cis-eQTLs]) and protein levels (cis protein QTLs [cis-pQTLs]), we determined that young, primate-specific genes are enriched in cis-eQTLs and cis-pQTLs. Compared to cis-eQTLs of old genes originating before the zebrafish divergence, cis-eQTLs of young genes have a higher effect size, are located closer to the transcription start site, are more significant, and tend to influence genes in multiple tissues and populations. These results suggest that the expression constraint of each gene increases throughout its lifespan. We also detected a positive correlation between expression constraints (approximated by cis-eQTL properties) and coding constraints (approximated by Ka/Ks) and observed that this correlation might be driven by gene age. To uncover factors associated with the increase in gene-age-related expression constraints, we demonstrated that gene connectivity, gene involvement in complex regulatory networks, gene haploinsufficiency, and the strength of posttranscriptional regulation increase with gene age. We also observed an increase in heritability of gene expression levels with age, implying a reduction of the environmental component. In summary, we show that gene age shapes key gene properties during evolution and is therefore an important component of genome function.
|
28
|
Abstract
BACKGROUND: Previous work on whole genome doubling in plants established the importance of gene functional category in provoking or suppressing duplicate gene loss, or fractionation. Other studies, particularly in Paramecium, have correlated levels of gene expression with vulnerability or resistance to duplicate loss. RESULTS: Here we analyze the simultaneous effect of function category and expression in two plant data sets, rosids and asterids. CONCLUSION: We demonstrate that function category and expression level have independent effects, though expression does not play the dominant role it does in Paramecium.
|
29
|
Anderson JE, Kantar MB, Kono TY, Fu F, Stec AO, Song Q, Cregan PB, Specht JE, Diers BW, Cannon SB, McHale LK, Stupar RM. A roadmap for functional structural variants in the soybean genome. G3 (Bethesda, Md.) 2014; 4:1307-18. [PMID: 24855315 PMCID: PMC4455779 DOI: 10.1534/g3.114.011551] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 04/10/2014] [Accepted: 05/20/2014] [Indexed: 01/25/2023]
Abstract
Gene structural variation (SV) has recently emerged as a key genetic mechanism underlying several important phenotypic traits in crop species. We screened a panel of 41 soybean (Glycine max) accessions serving as parents in a soybean nested association mapping population for deletions and duplications in more than 53,000 gene models. Array hybridization and whole genome resequencing methods were used as complementary technologies to identify SV in 1528 genes, or approximately 2.8%, of the soybean gene models. Although SV occurs throughout the genome, SV enrichment was noted in families of biotic defense response genes. Among accessions, SV was nearly eightfold less frequent for gene models that have retained paralogs since the last whole genome duplication event, compared with genes that have not retained paralogs. Increases in gene copy number, similar to that described at the Rhg1 resistance locus, account for approximately one-fourth of the genic SV events. This assessment of soybean SV occurrence presents a target list of genes potentially responsible for rapidly evolving and/or adaptive traits.
Affiliation(s)
- Justin E Anderson
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Michael B Kantar
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada, V6T 1Z4
| | - Thomas Y Kono
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Fengli Fu
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Adrian O Stec
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Qijian Song
- United States Department of Agriculture, Agricultural Research Service, Soybean Genomics and Improvement Lab, Beltsville, Maryland 20705
| | - Perry B Cregan
- United States Department of Agriculture, Agricultural Research Service, Soybean Genomics and Improvement Lab, Beltsville, Maryland 20705
| | - James E Specht
- Agronomy and Horticulture Department, University of Nebraska, Lincoln, Nebraska 68583
| | - Brian W Diers
- Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801
| | - Steven B Cannon
- United States Department of Agriculture, Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, Iowa 50011
| | - Leah K McHale
- Department of Horticulture and Crop Science, The Ohio State University, Columbus, Ohio 43210
| | - Robert M Stupar
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
|
30
|
Tinti M, Dissanayake K, Synowsky S, Albergante L, MacKintosh C. Identification of 2R-ohnologue gene families displaying the same mutation-load skew in multiple cancers. Open Biol 2014; 4:140029. [PMID: 24806839 PMCID: PMC4042849 DOI: 10.1098/rsob.140029] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 02/19/2014] [Accepted: 04/09/2014] [Indexed: 12/12/2022]
Abstract
The complexity of signalling pathways was boosted at the origin of the vertebrates, when two rounds of whole genome duplication (2R-WGD) occurred. Those genes and proteins that have survived from the 2R-WGD, termed 2R-ohnologues, belong to families of two to four members, and are enriched in signalling components relevant to cancer. Here, we find that while only approximately 30% of human transcript-coding genes are 2R-ohnologues, they carry 42-60% of the gene mutations in 30 different cancer types. Across a subset of cancer datasets, including melanoma, breast, lung adenocarcinoma, liver and medulloblastoma, we identified 673 2R-ohnologue families in which one gene carries mutations at multiple positions, while sister genes in the same family are relatively mutation free. Strikingly, in 315 of the 322 2R-ohnologue families displaying such a skew in multiple cancers, the same gene carries the heaviest mutation load in each cancer, and usually the second-ranked gene is also the same in each cancer. Our findings inspire the hypothesis that in certain cancers, heterogeneous combinations of genetic changes impair parts of the 2R-WGD signalling networks and force information flow through a limited set of oncogenic pathways in which specific non-mutated 2R-ohnologues serve as effectors. The non-mutated 2R-ohnologues are therefore potential therapeutic targets. These include proteins linked to growth factor signalling, neurotransmission and ion channels.
Affiliation(s)
- Michele Tinti
- Division of Cell and Developmental Biology, College of Life Sciences, University of Dundee, Dundee DD1 5EH, UK
| | - Kumara Dissanayake
- Division of Cell and Developmental Biology, College of Life Sciences, University of Dundee, Dundee DD1 5EH, UK
| | - Silvia Synowsky
- MRC Protein Phosphorylation and Ubiquitylation Unit, University of Dundee, Dundee DD1 5EH, UK
| | - Luca Albergante
- Division of Cell and Developmental Biology, College of Life Sciences, University of Dundee, Dundee DD1 5EH, UK
- Division of Computational Biology, College of Life Sciences, University of Dundee, Dundee DD1 5EH, UK
| | - Carol MacKintosh
- Division of Cell and Developmental Biology, College of Life Sciences, University of Dundee, Dundee DD1 5EH, UK
|
31
|
Tamate SC, Kawata M, Makino T. Contribution of nonohnologous duplicated genes to high habitat variability in mammals. Mol Biol Evol 2014; 31:1779-86. [PMID: 24714078 DOI: 10.1093/molbev/msu128] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
The mechanism by which genetic systems affect environmental adaptation is a focus of considerable attention in the fields of ecology, evolution, and conservation. However, the genomic characteristics that constrain adaptive evolution have remained unknown. A recent study showed that the proportion of duplicated genes in whole Drosophila genomes correlated with environmental variability within habitat, but it remains unclear whether the correlation is also observed in vertebrates, whose genomes include a large number of duplicated genes generated by whole-genome duplication (WGD). Here, we focus on fully sequenced mammalian genomes that experienced WGD in early vertebrate lineages and show that the proportion of small-scale duplication (SSD) genes in the genome, but not that of WGD genes, is significantly correlated with habitat variability. Moreover, species with low habitat variability have a higher proportion of lost duplicated genes, particularly SSD genes, than those with high habitat variability. These results indicate that species that inhabit variable environments may maintain more SSD genes in their genomes and suggest that SSD genes are important for adapting to novel environments and surviving environmental changes. These insights may be applied to predicting invasive and endangered species.
Affiliation(s)
- Satoshi C Tamate
- Department of Ecology and Evolutionary Biology, Graduate School of Life Sciences, Tohoku University, Aoba-ku, Sendai, Japan
| | - Masakado Kawata
- Department of Ecology and Evolutionary Biology, Graduate School of Life Sciences, Tohoku University, Aoba-ku, Sendai, Japan
| | - Takashi Makino
- Department of Ecology and Evolutionary Biology, Graduate School of Life Sciences, Tohoku University, Aoba-ku, Sendai, Japan
|
32
|
Liu G, Zou Y, Cheng Q, Zeng Y, Gu X, Su Z. Age distribution patterns of human gene families: divergent for Gene Ontology categories and concordant between different subcellular localizations. Mol Genet Genomics 2013; 289:137-47. [PMID: 24322347 DOI: 10.1007/s00438-013-0799-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/30/2013] [Accepted: 12/03/2013] [Indexed: 12/13/2022]
Abstract
The age distribution of gene duplication events within the human genome exhibits two waves of duplications along with an ancient component. However, because of functional constraint differences, genes in different functional categories might show dissimilar retention patterns after duplication. It is known that genes in some functional categories are highly duplicated in the early stage of vertebrate evolution. However, the correlations of the age distribution pattern of gene duplication between the different functional categories are still unknown. To investigate this issue, we developed a robust pipeline to date the gene duplication events in the human genome. We successfully estimated about three-quarters of the duplication events within the human genome, along with the age distribution pattern in each Gene Ontology (GO) slim category. We found that some GO slim categories show different distribution patterns when compared to the whole genome. Further hierarchical clustering of the GO slim functional categories enabled grouping into two main clusters. We found that human genes located in the duplicated copy number variant regions, whose duplicate genes have not been fixed in the human population, were mainly enriched in the groups with a high proportion of recently duplicated genes. Moreover, we used a phylogenetic tree-based method to date the age of duplications in three signaling-related gene superfamilies: transcription factors, protein kinases and G-protein coupled receptors. These superfamilies were expressed in different subcellular localizations. They showed a similar age distribution as the signaling-related GO slim categories. We also compared the differences between the age distributions of gene duplications in multiple subcellular localizations. We found that the distribution patterns of the major subcellular localizations were similar to that of the whole genome. 
This study revealed an overall picture of the evolutionary patterns of gene functional categories in the human genome.
Affiliation(s)
- Gangbiao Liu
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Biology Building II 113, Shanghai, 200433, China
|