1
Zhao F, Yan Y, Wang Y, Liu Y, Yang R. Splicing complexity as a pivotal feature of alternative exons in mammalian species. BMC Genomics 2023; 24:198. [PMID: 37046221 PMCID: PMC10099729 DOI: 10.1186/s12864-023-09247-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 03/14/2023] [Indexed: 04/14/2023] Open
Abstract
BACKGROUND As a significant process of post-transcriptional gene expression regulation in eukaryotic cells, alternative splicing (AS) of exons greatly contributes to the complexity of the transcriptome and indirectly enriches the protein repertoires. A large number of studies have focused on the splicing inclusion of alternative exons and have revealed the roles of AS in organ development and maturation. Notably, AS takes place through a change in the relative abundance of the transcript isoforms produced by a single gene, meaning that exons can have complex splicing patterns. However, the commonly used percent spliced-in (Ψ) values only define the usage rate of exons, but lose information about the complexity of exons' linkage pattern. To date, the extent and functional consequence of splicing complexity of alternative exons in development and evolution is poorly understood.
RESULTS By comparing splicing complexity of exons in six tissues (brain, cerebellum, heart, liver, kidney, and testis) from six mammalian species (human, chimpanzee, gorilla, macaque, mouse, opossum) and an outgroup species (chicken), we revealed that exons with high splicing complexity are prevalent in mammals and are closely related to features of genes. Using traditional machine learning and deep learning methods, we found that the splicing complexity of exons can be moderately predicted with features derived from exons, among which length of flanking exons and splicing strength of downstream/upstream splice sites are top predictors. Comparative analysis among human, chimpanzee, gorilla, macaque, and mouse revealed that, alternative exons tend to evolve to an increased level of splicing complexity and higher tissue specificity in splicing complexity. During organ development, not only developmentally regulated exons, but also 10-15% of non-developmentally regulated exons show dynamic splicing complexity.
CONCLUSIONS Our analysis revealed that splicing complexity is an important metric to characterize the splicing dynamics of alternative exons during the development and evolution of mammals.
Affiliation(s)
- Feiyang Zhao
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Yubin Yan
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Yaxi Wang
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Yuan Liu
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ruolin Yang
- College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
2
3'untranslated regions of tumor suppressor genes evolved specific features to favor cancer resistance. Oncogene 2022; 41:3278-3288. [PMID: 35523946 DOI: 10.1038/s41388-022-02343-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Revised: 04/26/2022] [Accepted: 04/28/2022] [Indexed: 11/08/2022]
Abstract
Cancer-related genes have evolved specific genetic and genomic features to favor tumor suppression. Previously we reported that tumor suppressor genes (TSGs) acquired high promoter CpG dinucleotide frequencies during evolution to maintain high expression in normal tissues and resist cancer-specific downregulation. In this study, we investigated whether 3'untranslated regions (3'UTRs) of TSGs have evolved specific features to carry out similar functions. We found that 3'UTRs of TSGs, especially those involved in multiple histological types and pediatric cancers, are longer than those of non-cancer genes. 3'UTRs of TSGs also exhibit higher density of binding sites for RNA-binding proteins (RBPs), particularly those having high affinities to C-rich motifs. Both longer 3'UTR length and RBP binding sites enrichment are correlated with higher gene expression in normal tissues across tissue types. Moreover, both features together with the correlated N6-methyladenosine modification and the extent of protein-protein interactions are positively associated with the ability of TSGs to resist cancer-specific downregulation. These results were successfully validated with independent datasets. Collectively, these findings indicate that TSGs have evolved longer 3'UTR with increased propensity to RBP binding, N6-methyladenosine modification and protein-protein interactions for optimizing their tumor-suppressing functions.
3
Wang X, Hu W, Li X, Huang D, Li Q, Chan H, Zeng J, Xie C, Chen H, Liu X, Gin T, Wang MH, Cheng ASL, Kang W, To KF, Plewczynski D, Zhang Q, Chen X, Chan DCW, Ko H, Wong SH, Yu J, Chan MTV, Zhang L, Wu WKK. Single-Hit Inactivation Drove Tumor Suppressor Genes Out of the X Chromosome during Evolution. Cancer Res 2022; 82:1482-1491. [PMID: 35247889 DOI: 10.1158/0008-5472.can-21-3458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Revised: 01/24/2022] [Accepted: 03/01/2022] [Indexed: 11/16/2022]
Abstract
Cancer-related genes are under intense evolutionary pressure. In this study, we conjecture that X-linked tumor suppressor genes (TSG) are not protected by the Knudson's two-hit mechanism and are therefore subject to negative selection. Accordingly, nearly all mammalian species exhibited lower TSG-to-noncancer gene ratios on their X chromosomes compared with nonmammalian species. Synteny analysis revealed that mammalian X-linked TSGs were depleted shortly after the emergence of the XY sex-determination system. A phylogeny-based model unveiled a higher X chromosome-to-autosome relocation flux for human TSGs. This was verified in other mammals by assessing the concordance/discordance of chromosomal locations of mammalian TSGs and their orthologs in Xenopus tropicalis. In humans, X-linked TSGs are younger or larger in size. Consistently, pan-cancer analysis revealed more frequent nonsynonymous somatic mutations of X-linked TSGs. These findings suggest that relocation of TSGs out of the X chromosome could confer a survival advantage by facilitating evasion of single-hit inactivation.
SIGNIFICANCE This work unveils extensive trafficking of TSGs from the X chromosome to autosomes during evolution, thus identifying X-linked TSGs as a genetic Achilles' heel in tumor suppression.
Affiliation(s)
- Xiansong Wang
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- CUHK Shenzhen Research Institute, Shenzhen, Guangdong, People's Republic of China
- Wei Hu
- Department of Gastroenterology, Shenzhen Hospital, Southern Medical University, Shenzhen, Guangdong, People's Republic of China
- Xiangchun Li
- Public Laboratory, Tianjin Medical University Cancer Institute and Hospital, Tianjin, People's Republic of China
- Dan Huang
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Qing Li
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Hung Chan
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Judeng Zeng
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Chuan Xie
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Peter Hung Pain Research Institute, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Huarong Chen
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Peter Hung Pain Research Institute, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Xiaodong Liu
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Peter Hung Pain Research Institute, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Tony Gin
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Maggie Haitian Wang
- CUHK Shenzhen Research Institute, Shenzhen, Guangdong, People's Republic of China
- Division of Biostatistics, Center for Clinical Research and Biostatistics, JC School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Wei Kang
- Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Ka-Fai To
- Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Dariusz Plewczynski
- Center of New Technologies, University of Warsaw, Banacha 2c, Warsaw, Poland
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Qingpeng Zhang
- School of Data Science, City University of Hong Kong, Hong Kong, People's Republic of China
- Xiaoting Chen
- Department of Oncology, Zhujiang Hospital, Southern Medical University, Guangzhou, People's Republic of China
- Danny Cheuk Wing Chan
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Peter Hung Pain Research Institute, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Gerald Choa Neuroscience Center, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Ho Ko
- Peter Hung Pain Research Institute, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Gerald Choa Neuroscience Center, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Margaret K. L. Cheung Research Center for Management of Parkinsonism, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Sunny Hei Wong
- CUHK Shenzhen Research Institute, Shenzhen, Guangdong, People's Republic of China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- Jun Yu
- CUHK Shenzhen Research Institute, Shenzhen, Guangdong, People's Republic of China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- State Key Laboratory of Digestive Diseases, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Matthew Tak Vai Chan
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- CUHK Shenzhen Research Institute, Shenzhen, Guangdong, People's Republic of China
- Peter Hung Pain Research Institute, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Lin Zhang
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- CUHK Shenzhen Research Institute, Shenzhen, Guangdong, People's Republic of China
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- William Ka Kei Wu
- Department of Anesthesia and Intensive Care, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- CUHK Shenzhen Research Institute, Shenzhen, Guangdong, People's Republic of China
- Peter Hung Pain Research Institute, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- State Key Laboratory of Digestive Diseases, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
4
Jiang S, Du Q, Feng C, Ma L, Zhang Z. CompoDynamics: a comprehensive database for characterizing sequence composition dynamics. Nucleic Acids Res 2022; 50:D962-D969. [PMID: 34718745 PMCID: PMC8728180 DOI: 10.1093/nar/gkab979] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/15/2021] [Revised: 10/02/2021] [Accepted: 10/06/2021] [Indexed: 11/15/2022] Open
Abstract
Sequence compositions of nucleic acids and proteins have significant impact on gene expression, RNA stability, translation efficiency, RNA/protein structure and molecular function, and are associated with genome evolution and adaptation across all kingdoms of life. Therefore, a devoted resource of sequence compositions and associated features is fundamentally crucial for a wide range of biological research. Here, we present CompoDynamics (https://ngdc.cncb.ac.cn/compodynamics/), a comprehensive database of sequence compositions of coding sequences (CDSs) and genomes for all kinds of species. Taking advantage of the exponential growth of RefSeq data, CompoDynamics presents a wealth of sequence compositions (nucleotide content, codon usage, amino acid usage) and derived features (coding potential, physicochemical property and phase separation) for 118 689 747 high-quality CDSs and 34 562 genomes across 24 995 species. Additionally, interactive analytical tools are provided to enable comparative analyses of sequence compositions and molecular features across different species and gene groups. Collectively, CompoDynamics bears the great potential to better understand the underlying roles of sequence composition dynamics across genes and genomes, providing a fundamental resource in support of a broad spectrum of biological studies.
Affiliation(s)
- Shuai Jiang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- Qiang Du
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Changrui Feng
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Lina Ma
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhang Zhang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
5
Huang D, Wang X, Liu Y, Huang Z, Hu X, Hu W, Li Q, Chan H, Zou Y, Ho IHT, Wang Y, Cheng ASL, Kang W, To KF, Wang MHT, Wong SH, Yu J, Gin T, Zhang Q, Li Z, Shen J, Zhang L, Chan MTV, Liu X, Wu WKK. Multi-omic analysis suggests tumor suppressor genes evolved specific promoter features to optimize cancer resistance. Brief Bioinform 2021; 22:6200210. [PMID: 33783485 DOI: 10.1093/bib/bbab040] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/23/2020] [Revised: 12/23/2020] [Accepted: 01/28/2021] [Indexed: 12/31/2022] Open
Abstract
Tumor suppressor genes (TSGs) exhibit distinct evolutionary features. We speculated that TSG promoters could have evolved specific features that facilitate their tumor-suppressing functions. We found that the promoter CpG dinucleotide frequencies of TSGs are significantly higher than that of non-cancer genes across vertebrate genomes, and positively correlated with gene expression across tissue types. The promoter CpG dinucleotide frequencies of all genes gradually increase with gene age, for which young TSGs have been subject to a stronger evolutionary pressure. Transcription-related features, namely chromatin accessibility, methylation and ZNF263-, SP1-, E2F4- and SP2-binding elements, are associated with gene expression. Moreover, higher promoter CpG dinucleotide frequencies and chromatin accessibility are positively associated with the ability of TSGs to resist downregulation during tumorigenesis. These results were successfully validated with independent datasets. In conclusion, TSGs evolved specific promoter features that optimized cancer resistance through achieving high expression in normal tissues and resistance to downregulation during tumorigenesis.
Affiliation(s)
- Dan Huang
- Chinese University of Hong Kong and the CUHK-Shenzhen Research Institute, China
- Xiansong Wang
- Chinese University of Hong Kong and the CUHK-Shenzhen Research Institute, China
- Yingzhi Liu
- Chinese University of Hong Kong and the CUHK-Shenzhen Research Institute, China
- Ziheng Huang
- Chinese University of Hong Kong and the CUHK-Shenzhen Research Institute, China
- Xiaoxu Hu
- Chinese University of Hong Kong and the CUHK-Shenzhen Research Institute, China
- Wei Hu
- Chinese University of Hong Kong, China
- Qing Li
- Chinese University of Hong Kong, China
- Hung Chan
- Chinese University of Hong Kong, China
- Yidan Zou
- Chinese University of Hong Kong, China
- Yan Wang
- Chinese University of Hong Kong, China
- Wei Kang
- Chinese University of Hong Kong, China
- Ka F To
- Chinese University of Hong Kong, China
- Maggie H T Wang
- Chinese University of Hong Kong and the CUHK-Shenzhen Research Institute, China
- Jun Yu
- Chinese University of Hong Kong, China
- Tony Gin
- Chinese University of Hong Kong, China
- Zheng Li
- Peking Union Medical College Hospital, China
- Lin Zhang
- Chinese University of Hong Kong, China
- William K K Wu
- Chinese University of Hong Kong and a researcher at the CUHK-Shenzhen Research Institute, China
6
Auboeuf D. Physicochemical Foundations of Life that Direct Evolution: Chance and Natural Selection are not Evolutionary Driving Forces. Life (Basel) 2020; 10:life10020007. [PMID: 31973071 PMCID: PMC7175370 DOI: 10.3390/life10020007] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/22/2019] [Revised: 01/15/2020] [Accepted: 01/16/2020] [Indexed: 12/11/2022] Open
Abstract
The current framework of evolutionary theory postulates that evolution relies on random mutations generating a diversity of phenotypes on which natural selection acts. This framework was established using a top-down approach as it originated from Darwinism, which is based on observations made of complex multicellular organisms and, then, modified to fit a DNA-centric view. In this article, it is argued that based on a bottom-up approach starting from the physicochemical properties of nucleic and amino acid polymers, we should reject the facts that (i) natural selection plays a dominant role in evolution and (ii) the probability of mutations is independent of the generated phenotype. It is shown that the adaptation of a phenotype to an environment does not correspond to organism fitness, but rather corresponds to maintaining the genome stability and integrity. In a stable environment, the phenotype maintains the stability of its originating genome and both (genome and phenotype) are reproduced identically. In an unstable environment (i.e., corresponding to variations in physicochemical parameters above a physiological range), the phenotype no longer maintains the stability of its originating genome, but instead influences its variations. Indeed, environment- and cellular-dependent physicochemical parameters define the probability of mutations in terms of frequency, nature, and location in a genome. Evolution is non-deterministic because it relies on probabilistic physicochemical rules, and evolution is driven by a bidirectional interplay between genome and phenotype in which the phenotype ensures the stability of its originating genome in a cellular and environmental physicochemical parameter-depending manner.
Affiliation(s)
- Didier Auboeuf
- Laboratory of Biology and Modelling of the Cell, Univ Lyon, ENS de Lyon, Univ Claude Bernard, CNRS UMR 5239, INSERM U1210, 46 Allée d'Italie, Site Jacques Monod, F-69007, Lyon, France
7
Oldfield CJ, Peng Z, Uversky VN, Kurgan L. Codon selection reduces GC content bias in nucleic acids encoding for intrinsically disordered proteins. Cell Mol Life Sci 2020; 77:149-160. [PMID: 31175370 PMCID: PMC11104855 DOI: 10.1007/s00018-019-03166-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/03/2019] [Revised: 05/14/2019] [Accepted: 05/28/2019] [Indexed: 02/06/2023]
Abstract
Protein-coding nucleic acids exhibit composition and codon biases between sequences coding for intrinsically disordered regions (IDRs) and those coding for structured regions. IDRs are regions of proteins that are folding self-insufficient and which function without the prerequisite of folded structure. Several authors have investigated composition bias or codon selection in regions encoding for IDRs, primarily in Eukaryota, and concluded that elevated GC content is the result of the biased amino acid composition of IDRs. We substantively extend previous work by examining GC content in regions encoding IDRs, from 44 species in Eukaryota, Archaea, and Bacteria, spanning a wide range of GC content. We confirm that regions coding for IDRs show a significantly elevated GC content, even across all domains of life. Although this is largely attributable to the amino acid composition bias of IDRs, we show that this bias is independent of the overall GC content and, most importantly, we are the first to observe that GC content bias in IDRs is significantly different than expected from IDR amino acid composition alone. We empirically find compensatory codon selection that reduces the observed GC content bias in IDRs. This selection is dependent on the overall GC content of the organism. The codon selection bias manifests as use of infrequent, AT-rich codons in encoding IDRs. Further, we find these relationships to be independent of the intrinsic disorder prediction method used, and independent of estimated translation efficiency. These observations are consistent with the previous work, and we speculate on whether the observed biases are causal or symptomatic of other driving forces.
Affiliation(s)
- Christopher J Oldfield
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
- Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
- Institute for Biological Instrumentation, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
8
Yin H, Li M, Xia L, He C, Zhang Z. Computational determination of gene age and characterization of evolutionary dynamics in human. Brief Bioinform 2019; 20:2141-2149. [PMID: 30184145 DOI: 10.1093/bib/bby074] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/14/2018] [Revised: 08/01/2018] [Accepted: 08/02/2018] [Indexed: 12/23/2022] Open
Abstract
Genes originate at different evolutionary time scales and possess different ages, accordingly presenting diverse functional characteristics and reflecting distinct adaptive evolutionary innovations. In the past decades, progresses have been made in gene age identification by a variety of methods that are principally based on comparative genomics. Here we summarize methods for computational determination of gene age and evaluate the effectiveness of different computational methods for age identification. Our results show that improved age determination can be achieved by combining homolog clustering with phylogeny inference, which enables more accurate age identification in human genes. Accordingly, we characterize evolutionary dynamics of human genes based on an extremely long evolutionary time scale spanning ~4,000 million years from archaea/bacteria to human, revealing that young genes are clustered on certain chromosomes and that Mendelian disease genes (including monogenic disease and polygenic disease genes) and cancer genes exhibit divergent evolutionary origins. Taken together, deciphering genes' ages as well as their evolutionary dynamics is of fundamental significance in unveiling the underlying mechanisms during evolution and better understanding how young or new genes become indispensable integrants coupled with novel phenotypes and biological diversity.
Affiliation(s)
- Hongyan Yin
- Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, Institute of Tropical Agriculture and Forestry, Hainan University, China
- Mengwei Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Lin Xia
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Chaozu He
- Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, Institute of Tropical Agriculture and Forestry, Hainan University, China
- Zhang Zhang
- BIG Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
9
Huang JH, Kwan RSY, Tsai ZTY, Lin TC, Tsai HK. Borders of Cis-Regulatory DNA Sequences Preferentially Harbor the Divergent Transcription Factor Binding Motifs in the Human Genome. Front Genet 2018; 9:571. [PMID: 30524473 PMCID: PMC6261980 DOI: 10.3389/fgene.2018.00571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/06/2018] [Accepted: 11/06/2018] [Indexed: 11/17/2022] Open
Abstract
Changes in cis-regulatory DNA sequences and transcription factor (TF) repertoires provide major sources of phenotypic diversity that shape the evolution of gene regulation in eukaryotes. The DNA-binding specificities of TFs may be diversified or produce new variants in different eukaryotic species. However, it is currently unclear how various levels of divergence in TF DNA-binding specificities or motifs became introduced into the cis-regulatory DNA regions of the genome over evolutionary time. Here, we first estimated the evolutionary divergence levels of TF binding motifs and quantified their occurrence at DNase I-hypersensitive sites. Results from our in silico motif scan and experimentally derived chromatin immunoprecipitation (TF-ChIP) show that the divergent motifs tend to be introduced in the edges of cis-regulatory regions, which is probably accompanied by the expansion of the accessible core of promoter-associated regulatory elements during evolution. We also find that the genes neighboring the expanded cis-regulatory regions with the most divergent motifs are associated with functions like development and morphogenesis. Accordingly, we propose that the accumulation of divergent motifs in the edges of cis-regulatory regions provides a functional mechanism for the evolution of divergent regulatory circuits.
Affiliation(s)
- Jia-Hsin Huang
- Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan
- Zing Tsung-Yeh Tsai
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Tzu-Chieh Lin
- Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan
- Huai-Kuang Tsai
- Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan
10
Du MZ, Zhang C, Wang H, Liu S, Wei W, Guo FB. The GC Content as a Main Factor Shaping the Amino Acid Usage During Bacterial Evolution Process. Front Microbiol 2018; 9:2948. [PMID: 30581420 PMCID: PMC6292993 DOI: 10.3389/fmicb.2018.02948] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/08/2018] [Accepted: 11/16/2018] [Indexed: 11/13/2022] Open
Abstract
Understanding how proteins evolve is important, and the order of amino acids being recruited into the genetic codons was found to be an important factor shaping the amino acid composition of proteins. The latest work about the last universal common ancestor (LUCA) makes it possible to determine the potential factors shaping amino acid compositions during evolution. Those LUCA genes/proteins from Methanococcus maripaludis S2, which is one of the possible LUCA, were investigated. The evolutionary rates of these genes positively correlate with GC contents with P-value significantly lower than 0.05 for 94% homologous genes. Linear regression results showed that compositions of amino acids coded by GC-rich codons positively contribute to the evolutionary rates, while these amino acids tend to be gained in GC-rich organisms according to our results. The first principal component correlates with the GC content very well. The ratios of amino acids of the LUCA proteins coded by GC rich codons positively correlate with the GC content of different bacteria genomes, while the ratios of amino acids coded by AT rich codons negatively correlate with the increase of GC content of genomes. Next, we found that the recruitment order does correlate with the amino acid compositions, but gain and loss in codons showed newly recruited amino acids are not significantly increased along with the evolution. Thus, we conclude that GC content is a primary factor shaping amino acid compositions. GC content shapes amino acid composition to trade off the cost of amino acids with bases, which could be caused by the energy efficiency.
Affiliation(s)
- Meng-Ze Du
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Huan Wang
  - School of Life Sciences, Chongqing University, Chongqing, China
- Shuo Liu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Wen Wei
  - School of Life Sciences, Chongqing University, Chongqing, China
- Feng-Biao Guo
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
  - Centre for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
11
|
Wang X, Li X, Zhang L, Wong S, Wang M, Tse G, Dai R, Nakatsu G, Coker O, Chen Z, Ko H, Chan J, Liu T, Cheng C, Cheng A, To K, Plewczynski D, Sung J, Yu J, Gin T, Chan M, Wu W. Oncogenes expand during evolution to withstand somatic amplification. Ann Oncol 2018; 29:2254-2260. [PMID: 30204835 DOI: 10.1093/annonc/mdy397] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 08/30/2023]
|
12
|
Castellana S, Mazza T, Capocefalo D, Genov N, Biagini T, Fusilli C, Scholkmann F, Relógio A, Hogenesch JB, Mazzoccoli G. Systematic Analysis of Mouse Genome Reveals Distinct Evolutionary and Functional Properties Among Circadian and Ultradian Genes. Front Physiol 2018; 9:1178. [PMID: 30190679 PMCID: PMC6115496 DOI: 10.3389/fphys.2018.01178] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 05/21/2018] [Accepted: 08/06/2018] [Indexed: 02/02/2023]
Abstract
In living organisms, biological clocks regulate 24 h (circadian) molecular, physiological, and behavioral rhythms to maintain homeostasis and synchrony with predictable environmental changes, in particular with those induced by Earth’s rotation on its axis. Harmonics of these circadian rhythms having periods of 8 and 12 h (ultradian) have been documented in several species. In mouse liver, harmonics of the 24-h period of gene transcription hallmarked genes oscillating with a frequency two or three times faster than circadian periodicity. Many of these harmonic transcripts enriched pathways regulating responses to environmental stress and coinciding preferentially with subjective dawn and dusk. At this time, the evolutionary history of genes with rhythmic expression is still poorly known and the role of length-of-day changes due to Earth’s rotation speed decrease over the last four billion years is totally ignored. We hypothesized that ultradian and stress anticipatory genes would be more evolutionarily conserved than circadian genes and background non-oscillating genes. To investigate this issue, we performed broad computational analyses of genes/proteins oscillating at different frequency ranges across several species and showed that ultradian genes/proteins, especially those oscillating with a 12-h periodicity, are more likely to be of ancient origin and essential in mice. In summary, our results show that genes with ultradian transcriptional patterns are more likely to be phylogenetically conserved and associated with the primeval and inevitable dawn/dusk transitions.
Affiliation(s)
- Stefano Castellana
  - Bioinformatics Unit, IRCCS "Casa Sollievo della Sofferenza", San Giovanni Rotondo, Italy
- Tommaso Mazza
  - Bioinformatics Unit, IRCCS "Casa Sollievo della Sofferenza", San Giovanni Rotondo, Italy
- Daniele Capocefalo
  - Bioinformatics Unit, IRCCS "Casa Sollievo della Sofferenza", San Giovanni Rotondo, Italy
- Nikolai Genov
  - Institute for Theoretical Biology (ITB), Charité - Universitätsmedizin Berlin and Humboldt University of Berlin, Berlin, Germany
  - Molekulares Krebsforschungszentrum (MKFZ), Charité - Universitätsmedizin Berlin, Berlin, Germany
- Tommaso Biagini
  - Bioinformatics Unit, IRCCS "Casa Sollievo della Sofferenza", San Giovanni Rotondo, Italy
- Caterina Fusilli
  - Bioinformatics Unit, IRCCS "Casa Sollievo della Sofferenza", San Giovanni Rotondo, Italy
- Felix Scholkmann
  - Research Office for Complex Physical and Biological Systems (ROCoS), Zürich, Switzerland
  - Department of Neonatology, University Hospital Zürich, University of Zürich, Zürich, Switzerland
- Angela Relógio
  - Institute for Theoretical Biology (ITB), Charité - Universitätsmedizin Berlin and Humboldt University of Berlin, Berlin, Germany
  - Molekulares Krebsforschungszentrum (MKFZ), Charité - Universitätsmedizin Berlin, Berlin, Germany
- John B Hogenesch
  - Divisions of Human Genetics and Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Gianluigi Mazzoccoli
  - Division of Internal Medicine and Chronobiology Unit, IRCCS "Casa Sollievo della Sofferenza", San Giovanni Rotondo, Italy
|
13
|
Banerjee S, Chakraborty S. Protein intrinsic disorder negatively associates with gene age in different eukaryotic lineages. Mol Biosyst 2018; 13:2044-2055. [PMID: 28783193 DOI: 10.1039/c7mb00230k] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
The emergence of new protein-coding genes in a specific lineage or species provides raw materials for evolutionary adaptations. Until recently, the biology of new genes emerging particularly from non-genic sequences remained unexplored. Although the new genes are subjected to variable selection pressure and face rapid deletion, some of them become functional and are retained in the gene pool. To acquire functional novelties, new genes often get integrated into the pre-existing ancestral networks. However, the mechanism by which young proteins acquire novel interactions remains unanswered till date. Since structural orientation contributes hugely to the mode of proteins' physical interactions, in this regard, we put forward an interesting question - Do new genes encode proteins with stable folds? Addressing the question, we demonstrated that the intrinsic disorder inversely correlates with the evolutionary gene ages - i.e. young proteins are richer in intrinsic disorder than the ancient ones. We further noted that young proteins, which are initially poorly connected hubs, prefer to be structurally more disordered than well-connected ancient proteins. The phenomenon strikingly defies the usual trend of well-connected proteins being highly disordered in structure. We justified that structural disorder might help poorly connected young proteins to undergo promiscuous interactions, which provides the foundation for novel protein interactions. The study focuses on the evolutionary perspectives of young proteins in the light of structural adaptations.
Affiliation(s)
- Sanghita Banerjee
  - Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
|