1
Gąsiorowski L. Evidence for Multiple Independent Expansions of Fox Gene Families Within Flatworms. J Mol Evol 2025; 93:124-135. [PMID: 39825915 DOI: 10.1007/s00239-024-10226-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Accepted: 12/06/2024] [Indexed: 01/20/2025]
Abstract
Expansions and losses of gene families are important drivers of molecular evolution. A recent survey of Fox genes in flatworms revealed that this superfamily of multifunctional transcription factors, present in all animals, underwent extensive losses and expansions during platyhelminth evolution. In this paper, I analyzed the Fox gene complement in four additional species of platyhelminths that represent early-branching lineages in the flatworm phylogeny: catenulids (Stenostomum brevipharyngium and Stenostomum leucops) and macrostomorphs (Macrostomum hystrix and Macrostomum cliftonense). Phylogenetic analysis of Fox genes from this expanded set of species provided evidence for multiple independent expansions of Fox gene families within flatworms. Notably, FoxG, a panbilaterian brain-patterning gene, appears to be the least susceptible to duplication, while FoxJ1, a conserved ciliogenesis factor, has undergone extensive expansion in various flatworm lineages. Analysis of the single-cell atlas of S. brevipharyngium, combined with RNA in situ hybridization, elucidated the tissue-specific expression of the selected Fox genes: FoxG is expressed in the brain, three of the Fox genes (FoxN2/3-2, FoxO4 and FoxP1) are expressed in the pharyngeal cells of likely glandular function, while one of the FoxQD paralogs is specifically expressed in the protonephridium. Overall, the evolution of Fox genes in flatworms appears to be characterized by an early contraction of the gene complement, followed by lineage-specific expansions that have enabled the co-option of newly evolved paralogs into novel physiological and developmental functions.
Affiliation(s)
- Ludwik Gąsiorowski
- Faculty of Biology, Institute of Evolutionary Biology, University of Warsaw, ul. Żwirki i Wigury 101, 02-089, Warsaw, Poland.
- Department of Tissue Dynamics and Regeneration, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, 37077, Göttingen, Germany.
2
Dewar AE, Belcher LJ, West SA. A phylogenetic approach to comparative genomics. Nat Rev Genet 2025:10.1038/s41576-024-00803-0. [PMID: 39779997 PMCID: PMC7617348 DOI: 10.1038/s41576-024-00803-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/05/2024] [Indexed: 01/11/2025]
Abstract
Comparative genomics, whereby the genomes of different species are compared, has the potential to address broad and fundamental questions at the intersection of genetics and evolution. However, species, genomes and genes cannot be considered as independent data points within statistical tests. Closely related species tend to be similar because they share genes by common descent, which must be accounted for in analyses. This problem of non-independence may be exacerbated when examining genomes or genes but can be addressed by applying phylogeny-based methods to comparative genomic analyses. Here, we review how controlling for phylogeny can change the conclusions of comparative genomics studies. We address common questions on how to apply these methods and illustrate how they can be used to test causal hypotheses. The combination of rapidly expanding genomic datasets and phylogenetic comparative methods is set to revolutionize the biological insights possible from comparative genomic studies.
Affiliation(s)
- Anna E Dewar
- Department of Biology, University of Oxford, Oxford, UK.
- St John's College, Oxford, UK.
- Stuart A West
- Department of Biology, University of Oxford, Oxford, UK
3
Buenaventura T, Bagci H, Patrascan I, Graham JJ, Hipwell KD, Oldenkamp R, King JWD, Urtasun J, Young G, Mouzo D, Gomez-Cabrero D, Rowland BD, Panne D, Fisher AG, Merkenschlager M. Competition shapes the landscape of X-chromosome-linked genetic diversity. Nat Genet 2024; 56:1678-1688. [PMID: 39060501 PMCID: PMC11319201 DOI: 10.1038/s41588-024-01840-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 06/21/2024] [Indexed: 07/28/2024]
Abstract
X chromosome inactivation (XCI) generates clonal heterogeneity within XX individuals. Combined with sequence variation between human X chromosomes, XCI gives rise to intra-individual clonal diversity, whereby two sets of clones express mutually exclusive sequence variants present on one or the other X chromosome. Here we ask whether such clones merely co-exist or potentially interact with each other to modulate the contribution of X-linked diversity to organismal development. Focusing on X-linked coding variation in the human STAG2 gene, we show that Stag2^variant clones contribute to most tissues at the expected frequencies but fail to form lymphocytes in Stag2^WT Stag2^variant mouse models. Unexpectedly, the absence of Stag2^variant clones from the lymphoid compartment is due not solely to cell-intrinsic defects but requires continuous competition by Stag2^WT clones. These findings show that interactions between epigenetically diverse clones can operate in an XX individual to shape the contribution of X-linked genetic diversity in a cell-type-specific manner.
Affiliation(s)
- Teresa Buenaventura
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Hakan Bagci
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Ilinca Patrascan
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Joshua J Graham
- Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
- Kelsey D Hipwell
- Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
- Roel Oldenkamp
- Division of Cell Biology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- James W D King
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Jesus Urtasun
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- George Young
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Daniel Mouzo
- Translational Bioinformatics Unit, Navarrabiomed, Universidad Pública de Navarra (UPNA), Instituto de Investigación Sanitaria de Navarra (IdiSNA), Pamplona, Spain
- David Gomez-Cabrero
- Translational Bioinformatics Unit, Navarrabiomed, Universidad Pública de Navarra (UPNA), Instituto de Investigación Sanitaria de Navarra (IdiSNA), Pamplona, Spain
- Bioscience Program, Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology KAUST, Thuwal, Saudi Arabia
- Benjamin D Rowland
- Division of Cell Biology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Daniel Panne
- Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
- Amanda G Fisher
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Department of Biochemistry, University of Oxford, Oxford, UK
- Matthias Merkenschlager
- MRC LMS, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK.
4
Mantica F, Iñiguez LP, Marquez Y, Permanyer J, Torres-Mendez A, Cruz J, Franch-Marro X, Tulenko F, Burguera D, Bertrand S, Doyle T, Nouzova M, Currie PD, Noriega FG, Escriva H, Arnone MI, Albertin CB, Wotton KR, Almudi I, Martin D, Irimia M. Evolution of tissue-specific expression of ancestral genes across vertebrates and insects. Nat Ecol Evol 2024; 8:1140-1153. [PMID: 38622362 DOI: 10.1038/s41559-024-02398-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 08/02/2023] [Accepted: 03/08/2024] [Indexed: 04/17/2024]
Abstract
Regulation of gene expression is arguably the main mechanism underlying the phenotypic diversity of tissues within and between species. Here we assembled an extensive transcriptomic dataset covering 8 tissues across 20 bilaterian species and performed analyses using a symmetric phylogeny that allowed the combined and parallel investigation of gene expression evolution between vertebrates and insects. We specifically focused on widely conserved ancestral genes, identifying strong cores of pan-bilaterian tissue-specific genes and even larger groups that diverged to define vertebrate and insect tissues. Systematic inferences of tissue-specificity gains and losses show that nearly half of all ancestral genes have been recruited into tissue-specific transcriptomes. This occurred during both ancient and, especially, recent bilaterian evolution, with several gains being associated with the emergence of unique phenotypes (for example, novel cell types). Such pervasive evolution of tissue specificity was linked to gene duplication coupled with expression specialization of one of the copies, revealing an unappreciated prolonged effect of whole-genome duplications on recent vertebrate evolution.
Affiliation(s)
- Federica Mantica
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- Luis P Iñiguez
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Yamile Marquez
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Jon Permanyer
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Antonio Torres-Mendez
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Josefa Cruz
- Institute of Evolutionary Biology (IBE, CSIC-Universitat Pompeu Fabra), Barcelona, Catalonia, Spain
- Xavier Franch-Marro
- Institute of Evolutionary Biology (IBE, CSIC-Universitat Pompeu Fabra), Barcelona, Catalonia, Spain
- Frank Tulenko
- Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria, Australia
- Demian Burguera
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Stephanie Bertrand
- Sorbonne Université, CNRS, Biologie Intégrative des Organismes Marins; BIOM, Banyuls-sur-Mer, France
- Toby Doyle
- Centre for Ecology and Conservation, University of Exeter, Penryn, UK
- Marcela Nouzova
- Institute of Parasitology, CAS, České Budějovice, Czech Republic
- Peter D Currie
- Australian Regenerative Medicine Institute, Monash University, Clayton, Victoria, Australia
- EMBL Australia; Victorian Node, Monash University, Clayton, Victoria, Australia
- Fernando G Noriega
- Biology and BSI, Florida International University, Miami, FL, USA
- Department of Parasitology, University of South Bohemia, České Budějovice, Czech Republic
- Hector Escriva
- Sorbonne Université, CNRS, Biologie Intégrative des Organismes Marins; BIOM, Banyuls-sur-Mer, France
- Caroline B Albertin
- Eugene Bell Center for Regenerative Biology and Tissue Engineering, Marine Biological Laboratory, Woods Hole, MA, USA
- Karl R Wotton
- Centre for Ecology and Conservation, University of Exeter, Penryn, UK
- Isabel Almudi
- Department of Genetics, Microbiology and Statistics and IRBio, Universitat de Barcelona, Barcelona, Spain
- David Martin
- Institute of Evolutionary Biology (IBE, CSIC-Universitat Pompeu Fabra), Barcelona, Catalonia, Spain
- Manuel Irimia
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain.
- Universitat Pompeu Fabra, Barcelona, Spain.
- ICREA, Barcelona, Spain.
5
Calamari ZT, Flynn JJ. Gene expression supports a single origin of horns and antlers in hoofed mammals. Commun Biol 2024; 7:509. [PMID: 38769090 PMCID: PMC11106249 DOI: 10.1038/s42003-024-06134-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2020] [Accepted: 04/02/2024] [Indexed: 05/22/2024] Open
Abstract
Horns, antlers, and other bony cranial appendages of even-toed hoofed mammals (ruminant artiodactyls) challenge traditional morphological homology assessments. Cranial appendages all share a permanent bone portion with family-specific integument coverings, but homology determination depends on whether the integument covering is an essential component or a secondary elaboration of each structure. To enhance morphological homology assessments, we tested whether juvenile cattle horn bud transcriptomes share homologous gene expression patterns with deer antlers relative to pig outgroup tissues, treating the integument covering as a secondary elaboration. We uncovered differentially expressed genes that support horn and antler homology, potentially distinguish them from non-cranial-appendage bone and other tissues, and highlight the importance of phylogenetic outgroups in homology assessments. Furthermore, we found differentially expressed genes that could support a shared cranial neural crest origin for horns and antlers and expression patterns that refine our understanding of the timing of horn and antler differentiation.
Affiliation(s)
- Zachary T Calamari
- Division of Paleontology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA.
- Richard Gilder Graduate School, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA.
- Department of Natural Sciences, Baruch College, City University of New York, 17 Lexington Avenue, Box A-920, New York, NY, 10010, USA.
- John J Flynn
- Division of Paleontology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA
- Richard Gilder Graduate School, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA
6
Islam M, Behura SK. Role of paralogs in the sex-bias transcriptional and metabolic regulation of the brain-placental axis in mice. Placenta 2024; 145:143-150. [PMID: 38134547 DOI: 10.1016/j.placenta.2023.12.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 12/12/2023] [Accepted: 12/14/2023] [Indexed: 12/24/2023]
Abstract
INTRODUCTION: Duplicated genes or paralogs play important roles in the adaptive function of eukaryotic genomes. Animal studies have shown evidence for the functional role of paralogs in pregnancy, but our knowledge about the role of paralogs in fetoplacental regulation remains limited. In particular, whether fetoplacental metabolic regulation is modulated by differential expression of paralogs remains unexamined. METHODS: In this study, gene expression profiles of day-15 placenta and fetal brain were compared to identify families or groups of paralogous genes expressed in the placenta and brain of male versus female fetuses in mice. Bayesian modeling was applied to infer the directional relationship of transcriptional variation of the paralogs relative to the phylogenetic variation of the genes in each family. Gas chromatography-mass spectrometry (GC-MS) was used to perform untargeted metabolomics analysis of day-15 placenta and fetal brain of both sexes. RESULTS: We identified paralog groups that were expressed in a sex- and/or tissue-biased manner between the placenta and fetal brain. Bayesian modeling showed evidence for a directional relationship between expression and phylogeny of specific paralogs. These relationships were sex specific. GC-MS analysis identified metabolites that were expressed in a sex-biased manner between the placenta and fetal brain. By performing integrative analysis of the metabolomics and gene expression data, we showed that specific groups of metabolites and paralogous genes were expressed in a coordinated manner between the placenta and fetal brain. DISCUSSION: The findings of this study collectively suggest that paralogs play an influential role in the regulation of the brain-placental axis in mice.
Affiliation(s)
- Maliha Islam
- Division of Animal Sciences, University of Missouri, 920 East Campus Drive, Columbia, Missouri, 65211, USA
- Susanta K Behura
- Division of Animal Sciences, University of Missouri, 920 East Campus Drive, Columbia, Missouri, 65211, USA; MU Institute for Data Science and Informatics, University of Missouri, USA; Interdisciplinary Reproduction and Health Group, University of Missouri, USA; Interdisciplinary Neuroscience Program, University of Missouri, USA.
7
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. Genome Biol Evol 2023; 15:evad211. [PMID: 38000902 PMCID: PMC10709115 DOI: 10.1093/gbe/evad211] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/08/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023] Open
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Shanyun Wu
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
- Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Daohan Jiang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
- Department of Zoology, University of British Columbia, Vancouver, BC, Canada
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
8
Huang J, Chen Z, Li B, Qu L, Yang J. RetroSeeker reveals the characteristics, expression, and evolution of a large set of novel retrotransposons. Advanced Biotechnology 2023; 1:5. [PMID: 39883328 PMCID: PMC11727581 DOI: 10.1007/s44307-023-00005-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/05/2023] [Revised: 10/12/2023] [Accepted: 10/13/2023] [Indexed: 01/31/2025]
Abstract
Retrotransposons are highly prevalent in most animals and account for more than 35% of the human genome. However, the prevalence, biogenesis mechanism and function of retrotransposons remain largely unknown. Here, we developed retroSeeker, a novel computational software that identifies novel retrotransposons from pairwise alignments of genomes and decodes their biogenesis, expression, evolution and potential functions. We discovered that the majority of new retrotransposons exhibit a specific L1 endonuclease cleavage motif, with some motifs precisely located ten nucleotides upstream of the insertion site. We identified that a large number of candidate functional genes might be generated through a retrotransposition mechanism. Importantly, we uncovered previously uncharacterized classes of retrotransposons related to histone genes, mitochondrial genes and vault RNAs. Moreover, we elucidated the tissue-specific expression of retrotransposons and demonstrated their ubiquitous expression in various cancer types. We also revealed the complex evolutionary patterns of retrotransposons and identified numerous species-specific retrotransposition events. Taken together, our findings establish a paradigm for discovering novel classes of retrotransposons and elucidating their new characteristics in any species.
Affiliation(s)
- Junhong Huang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275, Guangdong, China
- Zhirong Chen
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275, Guangdong, China
- Bin Li
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275, Guangdong, China
- Lianghu Qu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275, Guangdong, China.
- Jianhua Yang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275, Guangdong, China.
- The Fifth Affiliated Hospital, Sun Yat-Sen University, Zhuhai, 519000, Guangdong, China.
9
Sobala ŁF. Evolution and phylogenetic distribution of endo-α-mannosidase. Glycobiology 2023; 33:687-699. [PMID: 37202179 PMCID: PMC11025385 DOI: 10.1093/glycob/cwad041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 05/12/2023] [Accepted: 05/16/2023] [Indexed: 05/20/2023] Open
Abstract
While glycans underlie many biological processes, such as protein folding, cell adhesion, and cell-cell recognition, deep evolution of glycosylation machinery remains an understudied topic. N-linked glycosylation is a conserved process in which mannosidases are key trimming enzymes. One of them is the glycoprotein endo-α-1,2-mannosidase, which participates in the initial trimming of mannose moieties from an N-linked glycan inside the cis-Golgi. It is unique as the only endo-acting mannosidase found in this organelle. Relatively little is known about its origins and evolutionary history; so far it has been reported to occur only in vertebrates. In this work, a taxon-rich bioinformatic survey to unravel the evolutionary history of this enzyme, including all major eukaryotic clades and a wide representation of animals, is presented. The endomannosidase was found to be more widely distributed in animals and other eukaryotes. The protein motif changes in the context of the canonical animal enzyme were tracked. Additionally, the data show the two canonical vertebrate endomannosidase genes, MANEA and MANEAL, arose at the second round of the two vertebrate genome duplications and one more vertebrate paralog, CMANEAL, is uncovered. Finally, a framework where N-glycosylation co-evolved with complex multicellularity is described. A better understanding of the evolution of core glycosylation pathways is pivotal to understanding the biology of eukaryotes in general, and the Golgi apparatus in particular. This systematic analysis of the endomannosidase evolution is one step toward this goal.
Affiliation(s)
- Łukasz F Sobala
- Laboratory of Glycobiology, Hirszfeld Institute of Immunology and Experimental Therapy, Weigla 12, 53-114 Wroclaw, Poland
10
Jain A, Begum T, Ahmad S. Analysis and Prediction of Pathogen Nucleic Acid Specificity for Toll-like Receptors in Vertebrates. J Mol Biol 2023; 435:168208. [PMID: 37479078 DOI: 10.1016/j.jmb.2023.168208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 06/20/2023] [Accepted: 07/13/2023] [Indexed: 07/23/2023]
Abstract
Identification of key sequence, expression and function related features of nucleic acid-sensing host proteins is of fundamental importance to understand the dynamics of pathogen-specific host responses. To meet this objective, we considered toll-like receptors (TLRs), a representative class of membrane-bound sensor proteins, from 17 vertebrate species covering mammals, birds, reptiles, amphibians, and fishes in this comparative study. We identified the molecular signatures of host TLRs that are responsible for sensing pathogen nucleic acids or other pathogen-associated molecular patterns (PAMPs), and potentially play important roles in host defence mechanism. Interestingly, our findings reveal that such host-specific features are directly related to the strand (single or double) specificity of nucleic acid from pathogens. However, during host-pathogen interactions, such features were unable to explain the pathogenic PAMP (i.e., DNA, RNA or other) selectivity, suggesting a more complex mechanism. Using these features, we developed a number of machine learning models, of which Random Forest achieved a high performance (94.57% accuracy) to predict strand specificity of TLRs from protein-derived features. We applied the trained model to propose strand specificity of some previously uncharacterized distinct fish-specific novel TLRs (TLR18, TLR23, TLR24, TLR25, TLR27).
Affiliation(s)
- Anuja Jain
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India. https://twitter.com/@Anuja334
- Tina Begum
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
- Shandar Ahmad
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
11
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. bioRxiv: the preprint server for biology 2023:2023.02.09.527893. [PMID: 37645857 PMCID: PMC10461906 DOI: 10.1101/2023.02.09.527893] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/31/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well-described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred model for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
- Department of Zoology, University of British Columbia, Canada
- Michael Smith Laboratories, University of British Columbia, Canada
- Shanyun Wu
- Department of Zoology, University of British Columbia, Canada
- Department of Genetics, Washington University School of Medicine, USA
- Daohan Jiang
- Department of Quantitative and Computational Biology, University of Southern California, USA
- Matt Pennell
- Department of Zoology, University of British Columbia, Canada
- Department of Quantitative and Computational Biology, University of Southern California, USA
- Department of Biological Sciences, University of Southern California, USA
12
Marlétaz F, Couloux A, Poulain J, Labadie K, Da Silva C, Mangenot S, Noel B, Poustka AJ, Dru P, Pegueroles C, Borra M, Lowe EK, Lhomond G, Besnardeau L, Le Gras S, Ye T, Gavriouchkina D, Russo R, Costa C, Zito F, Anello L, Nicosia A, Ragusa MA, Pascual M, Molina MD, Chessel A, Di Carlo M, Turon X, Copley RR, Exposito JY, Martinez P, Cavalieri V, Ben Tabou de Leon S, Croce J, Oliveri P, Matranga V, Di Bernardo M, Morales J, Cormier P, Geneviève AM, Aury JM, Barbe V, Wincker P, Arnone MI, Gache C, Lepage T. Analysis of the P. lividus sea urchin genome highlights contrasting trends of genomic and regulatory evolution in deuterostomes. Cell Genomics 2023; 3:100295. [PMID: 37082140 PMCID: PMC10112332 DOI: 10.1016/j.xgen.2023.100295] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 05/08/2022] [Revised: 12/24/2022] [Accepted: 03/06/2023] [Indexed: 04/22/2023]
Abstract
Sea urchins are emblematic models in developmental biology and display several characteristics that set them apart from other deuterostomes. To uncover the genomic cues that may underlie these specificities, we generated a chromosome-scale genome assembly for the sea urchin Paracentrotus lividus and extensive gene expression and epigenetic profiles of its embryonic development. We found that, unlike vertebrates, sea urchins retained ancestral chromosomal linkages but underwent very fast intrachromosomal gene order mixing. We identified a burst of gene duplication in the echinoid lineage and showed that some of these expanded genes have been recruited in novel structures (water vascular system, Aristotle's lantern, and skeletogenic micromere lineage). Finally, we identified gene-regulatory modules conserved between sea urchins and chordates. Our results suggest that gene-regulatory networks controlling development can be conserved despite extensive gene order rearrangement.
Affiliation(s)
- Ferdinand Marlétaz
  - Center for Life’s Origin & Evolution, Department of Genetics, Evolution, & Environment, University College London, WC1 6BT London, UK
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
  - Genoscope, Institut de Biologie François-Jacob, Commissariat à l’Énergie Atomique (CEA), Université Paris-Saclay, Évry, France
- Arnaud Couloux
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Julie Poulain
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Karine Labadie
  - Genoscope, Institut de Biologie François-Jacob, Commissariat à l’Énergie Atomique (CEA), Université Paris-Saclay, Évry, France
- Corinne Da Silva
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Sophie Mangenot
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Benjamin Noel
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Albert J. Poustka
  - Evolution and Development Group, Max-Planck-Institut für Molekulare Genetik, 14195 Berlin, Germany
  - Dahlem Center for Genome Research and Medical Systems Biology (Environmental and Phylogenomics Group), 12489 Berlin, Germany
- Philippe Dru
  - Laboratoire de Biologie du Développement de Villefranche-sur-Mer (LBDV), Sorbonne Université, CNRS, 06230 Villefranche-sur-Mer, France
- Cinta Pegueroles
  - Institute for Research on Biodiversity (IRBio), Department of Genetics, Microbiology, and Statistics, University of Barcelona, 08028 Barcelona, Spain
- Marco Borra
  - Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Napoli, Italy
- Elijah K. Lowe
  - Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Napoli, Italy
- Guy Lhomond
  - Laboratoire de Biologie du Développement de Villefranche-sur-Mer (LBDV), Sorbonne Université, CNRS, 06230 Villefranche-sur-Mer, France
- Lydia Besnardeau
  - Laboratoire de Biologie du Développement de Villefranche-sur-Mer (LBDV), Sorbonne Université, CNRS, 06230 Villefranche-sur-Mer, France
- Stéphanie Le Gras
  - Plateforme GenomEast, IGBMC, CNRS UMR7104, INSERM U1258, Université de Strasbourg, 67404 Illkirch Cedex, France
- Tao Ye
  - Plateforme GenomEast, IGBMC, CNRS UMR7104, INSERM U1258, Université de Strasbourg, 67404 Illkirch Cedex, France
- Daria Gavriouchkina
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology, 904-0495 Onna-son, Japan
- Roberta Russo
  - Consiglio Nazionale delle Ricerche, Istituto per la Ricerca e l’Innovazione Biomedica (IRIB), 90146 Palermo, Italy
- Caterina Costa
  - Consiglio Nazionale delle Ricerche, Istituto per la Ricerca e l’Innovazione Biomedica (IRIB), 90146 Palermo, Italy
- Francesca Zito
  - Consiglio Nazionale delle Ricerche, Istituto per la Ricerca e l’Innovazione Biomedica (IRIB), 90146 Palermo, Italy
- Letizia Anello
  - Consiglio Nazionale delle Ricerche, Istituto per la Ricerca e l’Innovazione Biomedica (IRIB), 90146 Palermo, Italy
- Aldo Nicosia
  - Consiglio Nazionale delle Ricerche, Istituto per la Ricerca e l’Innovazione Biomedica (IRIB), 90146 Palermo, Italy
- Maria Antonietta Ragusa
  - Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, University of Palermo, 90128 Palermo, Italy
- Marta Pascual
  - Institute for Research on Biodiversity (IRBio), Department of Genetics, Microbiology, and Statistics, University of Barcelona, 08028 Barcelona, Spain
- M. Dolores Molina
  - Departament de Genètica, Microbiologia, i Estadística, Universitat de Barcelona, 08028 Barcelona, Spain
  - Institut Biology Valrose, Université Côte d’Azur, 06108 Nice Cedex 2, France
- Aline Chessel
  - Institut Biology Valrose, Université Côte d’Azur, 06108 Nice Cedex 2, France
- Marta Di Carlo
  - Institute for Biomedical Research and Innovation (CNR), 90146 Palermo, Italy
- Xavier Turon
  - Department of Marine Ecology, Centre d’Estudis Avançats de Blanes (CEAB, CSIC), 17300 Blanes, Spain
- Richard R. Copley
  - Laboratoire de Biologie du Développement de Villefranche-sur-Mer (LBDV), Sorbonne Université, CNRS, 06230 Villefranche-sur-Mer, France
- Jean-Yves Exposito
  - Laboratoire de Biologie Tissulaire et d’Ingénierie Thérapeutique (LBTI), UMR CNRS 5305, Institut de Biologie et Chimie des Protéines, Université Lyon 1, 69367 Lyon, France
- Pedro Martinez
  - Departament de Genètica, Microbiologia, i Estadística, Universitat de Barcelona, 08028 Barcelona, Spain
  - Institut Català de Recerca i Estudis Avançats (ICREA), 08028 Barcelona, Spain
- Vincenzo Cavalieri
  - Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, University of Palermo, 90128 Palermo, Italy
- Smadar Ben Tabou de Leon
  - Department of Marine Biology, Charney School of Marine Sciences, University of Haifa, 31095 Haifa, Israel
- Jenifer Croce
  - Laboratoire de Biologie du Développement de Villefranche-sur-Mer (LBDV), Sorbonne Université, CNRS, 06230 Villefranche-sur-Mer, France
- Paola Oliveri
  - Center for Life’s Origin & Evolution, Department of Genetics, Evolution, & Environment, University College London, WC1 6BT London, UK
- Valeria Matranga
  - Consiglio Nazionale delle Ricerche, Istituto per la Ricerca e l’Innovazione Biomedica (IRIB), 90146 Palermo, Italy
- Maria Di Bernardo
  - Consiglio Nazionale delle Ricerche, Istituto di Farmacologia Traslazionale, 90146 Palermo, Italy
- Julia Morales
  - Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, CNRS, Sorbonne Université, 29680 Roscoff, France
- Patrick Cormier
  - Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff, CNRS, Sorbonne Université, 29680 Roscoff, France
- Anne-Marie Geneviève
  - Sorbonne Université, CNRS, Biologie Intégrative des Organismes Marins, BIOM, 66650 Banyuls/Mer, France
- Jean Marc Aury
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Valérie Barbe
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Patrick Wincker
  - Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l’Énergie Atomique, CNRS, Université Évry, Université Paris-Saclay, 91057 Évry, France
- Maria Ina Arnone
  - Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Napoli, Italy
- Christian Gache
  - Laboratoire de Biologie du Développement de Villefranche-sur-Mer (LBDV), Sorbonne Université, CNRS, 06230 Villefranche-sur-Mer, France
- Thierry Lepage
  - Institut Biology Valrose, Université Côte d’Azur, 06108 Nice Cedex 2, France
|
13
|
Lüleci HB, Yılmaz A. Robust and rigorous identification of tissue-specific genes by statistically extending tau score. BioData Min 2022; 15:31. [PMID: 36494766 PMCID: PMC9733102 DOI: 10.1186/s13040-022-00315-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/12/2022] [Accepted: 11/11/2022] [Indexed: 12/13/2022] Open
Abstract
OBJECTIVES: In this study, we aimed to identify tissue-specific genes for various human tissues/organs more robustly and rigorously by extending the tau score algorithm.
INTRODUCTION: Tissue-specific genes are genes whose function and expression are restricted to one or a few tissues. Identification of tissue-specific genes is essential for understanding multicellular biological processes such as tissue-specific molecular regulation, tissue development, physiology, and the pathogenesis of tissue-associated diseases.
MATERIALS AND METHODS: Gene expression data derived from five large RNA sequencing (RNA-seq) projects, spanning 96 different human tissues, were retrieved from ArrayExpress and ExpressionAtlas. The first step was to categorize genes using significant filters and the tau score as a specificity index. After tau was calculated for each gene in all datasets separately, the statistical distance from the maximum expression level was estimated using a new procedure. Specific expression of a gene in one or several tissues was then determined by integrating tau with the statistical distance estimate, an approach called extended tau. The tissue-specific genes obtained for 96 different human tissues were functionally annotated, and comparisons were carried out to show the effectiveness of the extended tau method.
RESULTS AND DISCUSSION: Genes were categorized by expression level, and tissue-specific genes were identified for a large number of tissues/organs. Unlike the original tau score, which can assign tissue specificity to a single tissue only, the extended tau approach successfully assigned genes to multiple tissues.
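The original tau score that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of tau as it is commonly defined, not the authors' extended method: the statistical distance step is not specified in the abstract and is therefore left out. The function name `tau_score` and the convention of returning 0.0 for a gene with no expression in any tissue are assumptions made for this example.

```python
from typing import Sequence


def tau_score(expression: Sequence[float]) -> float:
    """Tissue-specificity index tau for one gene.

    tau = sum_i (1 - x_i / max(x)) / (n - 1), where x_i is the gene's
    expression in tissue i and n is the number of tissues.
    tau is 0 for uniform expression and 1 for expression in a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs expression values for at least two tissues")
    if any(x < 0 for x in expression):
        raise ValueError("expression values must be non-negative")

    peak = max(expression)
    if peak == 0:
        # Assumption for this sketch: a gene that is silent everywhere
        # is reported as non-specific rather than raising an error.
        return 0.0

    # Normalise each tissue by the peak, then average the shortfall from 1.
    return sum(1 - x / peak for x in expression) / (n - 1)


if __name__ == "__main__":
    print(tau_score([5, 5, 5, 5]))   # uniform expression: 0.0
    print(tau_score([0, 0, 10, 0]))  # one tissue only: 1.0
    print(tau_score([10, 5, 0]))     # intermediate: 0.75
```

Because tau collapses a whole expression profile into one number anchored on the peak tissue, it cannot by itself say which additional tissues share the specificity, which is the limitation the extended approach is described as addressing.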
Affiliation(s)
- Hatice Büşra Lüleci
  - Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey (GRID: grid.448834.7; ISNI: 0000 0004 0595 7127)
- Alper Yılmaz
  - Department of Bioengineering, Yildiz Technical University, Istanbul, Turkey (GRID: grid.38575.3c; ISNI: 0000 0001 2337 3561)
|
14
|
Cervantes-Pérez SA, Thibivillliers S, Tennant S, Libault M. Review: Challenges and perspectives in applying single nuclei RNA-seq technology in plant biology. Plant Science: An International Journal of Experimental Plant Biology 2022; 325:111486. [PMID: 36202294 DOI: 10.1016/j.plantsci.2022.111486] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/31/2022] [Revised: 09/12/2022] [Accepted: 09/30/2022] [Indexed: 06/16/2023]
Abstract
Plant single-cell RNA-seq technology quantifies the abundance of plant transcripts at a single-cell resolution. Deciphering the transcriptomes of each plant cell, their regulation during plant cell development, and their response to environmental stresses will support the functional study of genes, the establishment of precise transcriptional programs, the prediction of more accurate gene regulatory networks, and, in the long term, the design of de novo gene pathways to enhance selected crop traits. In this review, we will discuss the opportunities, challenges, and problems, and share tentative solutions associated with the generation and analysis of plant single-cell transcriptomes. We will discuss the benefit and limitations of using plant protoplasts vs. nuclei to conduct single-cell RNA-seq experiments on various plant species and organs, the functional annotation of plant cell types based on their transcriptomic profile, the characterization of the dynamic regulation of the plant genes during cell development or in response to environmental stress, the need to characterize and integrate additional layers of -omics datasets to capture new molecular modalities at the single-cell level and reveal their causalities, the deposition and access to single-cell datasets, and the accessibility of this technology to plant scientists.
Affiliation(s)
- Sergio Alan Cervantes-Pérez
  - Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA
- Sandra Thibivillliers
  - Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA
  - Center for Biotechnology, University of Nebraska, Lincoln, NE 68588, USA
  - Single Cell Genomics Core Facility, University of Nebraska-Lincoln, NE 68588, USA
- Sutton Tennant
  - Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA
- Marc Libault
  - Department of Agronomy and Horticulture, Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE, 68503, USA
  - Center for Biotechnology, University of Nebraska, Lincoln, NE 68588, USA
  - Single Cell Genomics Core Facility, University of Nebraska-Lincoln, NE 68588, USA
|
15
|
Nawade B, Kumar A, Maurya R, Subramani R, Yadav R, Singh K, Rangan P. Longer Duration of Active Oil Biosynthesis during Seed Development Is Crucial for High Oil Yield-Lessons from Genome-Wide In Silico Mining and RNA-Seq Validation in Sesame. Plants (Basel, Switzerland) 2022; 11:2980. [PMID: 36365434 PMCID: PMC9657858 DOI: 10.3390/plants11212980] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/13/2022] [Revised: 09/29/2022] [Accepted: 09/30/2022] [Indexed: 06/16/2023]
Abstract
Sesame, one of the ancient oil crops, is an important oilseed due to its nutritionally rich seeds with high protein content. Genomic scale information for sesame has become available in the public databases in recent years. The genes and their families involved in oil biosynthesis in sesame are less studied than in other oilseed crops. Therefore, we retrieved a total of 69 genes and their translated amino acid sequences, associated with gene families linked to the oil biosynthetic pathway. Genome-wide in silico mining helped identify key regulatory genes for oil biosynthesis, though the findings require functional validation. Comparing sequences of the SiSAD (stearoyl-acyl carrier protein (ACP)-desaturase) coding genes with known SADs helped identify two SiSAD family members that may be palmitoyl-ACP-specific. Based on homology with lysophosphatidic acid acyltransferase (LPAAT) sequences, an uncharacterized gene has been identified as SiLPAAT1. Identified key regulatory genes associated with high oil content were also validated using publicly available transcriptome datasets of genotypes contrasting for oil content at different developmental stages. Our study provides evidence that a longer duration of active oil biosynthesis is crucial for high oil accumulation during seed development. This underscores the importance of early onset of oil biosynthesis in developing seeds. Up-regulating the identified key regulatory genes of oil biosynthesis early in seed development should help increase oil yields.
Affiliation(s)
- Bhagwat Nawade
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
- Ajay Kumar
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
- Rasna Maurya
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
- Rajkumar Subramani
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
- Rashmi Yadav
  - Division of Germplasm Evaluation, ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
- Kuldeep Singh
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
- Parimalan Rangan
  - Division of Genomic Resources, ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072, Australia
|
16
|
Alonso-Alvarez C, Andrade P, Cantarero A, Morales J, Carneiro M. Relocation to avoid costs: A hypothesis on red carotenoid-based signals based on recent CYP2J19 gene expression data. Bioessays 2022; 44:e2200037. [PMID: 36209392 DOI: 10.1002/bies.202200037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 07/25/2022] [Accepted: 09/22/2022] [Indexed: 11/11/2022]
Abstract
In many vertebrates, the enzymatic oxidation of dietary yellow carotenoids generates red keto-carotenoids giving color to ornaments. The oxidase CYP2J19 is here a key effector. Its purported intracellular location suggests a shared biochemical pathway between trait expression and cell functioning. This might guarantee the reliability of red colorations as individual quality signals independent of production costs. We hypothesize that the ornament type (feathers vs. bare parts) and production costs (probably CYP2J19 activity compromising vital functions) could have promoted tissue-specific gene relocation. We review current avian tissue-specific CYP2J19 expression data. Among the ten red-billed species showing CYP2J19 bill expression, only one showed strong hepatic expression. Moreover, a phylogenetically-controlled analysis of 25 red-colored species shows that those producing red bare parts are less likely to have strong hepatic CYP2J19 expression than species with only red plumages. Thus, both production costs and shared pathways might have contributed to the evolution of red signals.
Affiliation(s)
- Carlos Alonso-Alvarez
  - Department of Evolutionary Ecology, National Museum of Natural Sciences - CSIC. C/ José Gutiérrez Abascal 2, Madrid, Spain
- Pedro Andrade
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
- Alejandro Cantarero
  - Department of Evolutionary Ecology, National Museum of Natural Sciences - CSIC. C/ José Gutiérrez Abascal 2, Madrid, Spain
  - Department of Physiology, Veterinary School, Complutense University of Madrid, Madrid, Spain
- Judith Morales
  - Department of Evolutionary Ecology, National Museum of Natural Sciences - CSIC. C/ José Gutiérrez Abascal 2, Madrid, Spain
- Miguel Carneiro
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Vairão, Portugal
|
17
|
Escorcia-Rodríguez JM, Esposito M, Freyre-González JA, Moreno-Hagelsieb G. Non-synonymous to synonymous substitutions suggest that orthologs tend to keep their functions, while paralogs are a source of functional novelty. PeerJ 2022; 10:e13843. [PMID: 36065404 PMCID: PMC9440661 DOI: 10.7717/peerj.13843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/25/2020] [Accepted: 07/14/2022] [Indexed: 01/18/2023] Open
Abstract
Orthologs separate after lineages split from each other and paralogs after gene duplications. Thus, orthologs are expected to remain more functionally coherent across lineages, while paralogs have been proposed as a source of new functions. Because protein functional divergence follows from non-synonymous substitutions, we performed an analysis based on the ratio of non-synonymous to synonymous substitutions (dN/dS) as a proxy for functional divergence. We used five working definitions of orthology, including reciprocal best hits (RBH), among other definitions based on network analyses and clustering. The results showed that orthologs, by all definitions tested, had values of dN/dS noticeably lower than those of paralogs, suggesting that orthologs generally tend to be more functionally stable than paralogs. The differences in dN/dS ratios, and with them the evidence for the functional stability of orthologs, remained after eliminating gene comparisons with potential problems, such as genes with high codon usage biases, low coverage of either of the aligned sequences, or sequences with very high similarities. Separation by percent identity of the encoded proteins showed that the differences between the dN/dS ratios of orthologs and paralogs were more evident at high sequence identity and less so as identity dropped. These last results suggest that the differences between dN/dS ratios were partially related to differences in protein identity. However, they also suggest that paralogs undergo functional divergence relatively early after duplication. Our analyses indicate that choosing orthologs as probably functionally coherent remains the right approach in comparative genomics.
Affiliation(s)
- Juan M. Escorcia-Rodríguez
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Mario Esposito
  - Department of Biology, Wilfrid Laurier University, Waterloo, Canada
- Julio A. Freyre-González
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
|
18
|
Schmidbaur H, Kawaguchi A, Clarence T, Fu X, Hoang OP, Zimmermann B, Ritschard EA, Weissenbacher A, Foster JS, Nyholm SV, Bates PA, Albertin CB, Tanaka E, Simakov O. Emergence of novel cephalopod gene regulation and expression through large-scale genome reorganization. Nat Commun 2022; 13:2172. [PMID: 35449136 PMCID: PMC9023564 DOI: 10.1038/s41467-022-29694-7] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/19/2020] [Accepted: 03/28/2022] [Indexed: 12/17/2022] Open
Abstract
Coleoid cephalopods (squid, cuttlefish, octopus) have the largest nervous system among invertebrates, which, together with many lineage-specific morphological traits, enables complex behaviors. The genomic basis underlying these innovations remains unknown. Using comparative and functional genomics in the model squid Euprymna scolopes, we reveal the unique genomic, topological, and regulatory organization of cephalopod genomes. We show that coleoid cephalopod genomes have been extensively restructured compared to other animals, leading to the emergence of hundreds of tightly linked and evolutionarily unique gene clusters (microsyntenies). Such novel microsyntenies correspond to topological compartments with a distinct regulatory structure and contribute to complex expression patterns. In particular, we identify a set of microsyntenies associated with cephalopod innovations (MACIs) broadly enriched in cephalopod nervous system expression. We posit that the emergence of MACIs was instrumental to cephalopod nervous system evolution and propose that microsyntenic profiling will be central to understanding cephalopod innovations.
Affiliation(s)
- Hannah Schmidbaur
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- Tereza Clarence
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
- Xiao Fu
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
- Oi Pui Hoang
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- Bob Zimmermann
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- Elena A Ritschard
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
  - Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Naples, Italy
- Jamie S Foster
  - Department of Microbiology and Cell Science, University of Florida, Space Life Science Lab, Merritt Island, FL, USA
- Spencer V Nyholm
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Paul A Bates
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
- Caroline B Albertin
  - Bell Center for Regenerative Biology and Tissue Engineering, Marine Biological Laboratory, Woods Hole, MA, USA
- Elly Tanaka
  - Institute for Molecular Pathology, Vienna, Austria
- Oleg Simakov
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
|
19
|
Begum T, Serrano‐Serrano ML, Robinson‐Rechavi M. Performance of a phylogenetic independent contrast method and an improved pairwise comparison under different scenarios of trait evolution after speciation and duplication. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13680] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Affiliation(s)
- Tina Begum
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Martha Liliana Serrano‐Serrano
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson‐Rechavi
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
20
|
Matsubara S, Osugi T, Shiraishi A, Wada A, Satake H. Comparative analysis of transcriptomic profiles among ascidians, zebrafish, and mice: Insights from tissue-specific gene expression. PLoS One 2021; 16:e0254308. [PMID: 34559810 PMCID: PMC8462739 DOI: 10.1371/journal.pone.0254308] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/15/2021] [Accepted: 09/12/2021] [Indexed: 11/18/2022] Open
Abstract
Tissue/organ-specific genes (TSGs) are important not only for understanding organ development and function, but also for investigating the evolutionary lineages of organs in animals. Here, we investigate the TSGs of 9 adult tissues of an ascidian, Ciona intestinalis Type A (Ciona robusta), which lies in the important position of being the sister group of vertebrates. RNA-seq and qRT-PCR identified the Ciona TSGs in each tissue, and BLAST searches identified their homologs in zebrafish and mice. Tissue distributions of the vertebrate homologs were analyzed and clustered using public RNA-seq data for 12 zebrafish and 30 mouse tissues. Among the vertebrate homologs of the Ciona TSGs in the neural complex, 48% and 63% showed high expression in the zebrafish and mouse brain, respectively, suggesting that the central nervous system is evolutionarily conserved in chordates. In contrast, vertebrate homologs of Ciona TSGs in the ovary, pharynx, and intestine were not consistently highly expressed in the corresponding tissues of vertebrates, suggesting that these organs have evolved in Ciona-specific lineages. Intriguingly, more TSG homologs of the Ciona stomach were highly expressed in the vertebrate liver (17-29%) and intestine (22-33%) than in the mouse stomach (5%). Expression profiles for these genes suggest that the biological roles of the Ciona stomach are distinct from those of their vertebrate counterparts. Collectively, Ciona tissues were categorized into 3 groups: i) high similarity to the corresponding vertebrate tissues (neural complex and heart), ii) low similarity to the corresponding vertebrate tissues (ovary, pharynx, and intestine), and iii) low similarity to the corresponding vertebrate tissues, but high similarity to other vertebrate tissues (stomach, endostyle, and siphons). The present study provides transcriptomic catalogs of adult ascidian tissues and significant insights into the evolutionary lineages of the brain, heart, and digestive tract of chordates.
Affiliation(s)
- Shin Matsubara
  - Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Tomohiro Osugi
  - Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Akira Shiraishi
  - Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Azumi Wada
  - Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Honoo Satake
  - Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
|
21
|
Begum T, Robinson-Rechavi M. Special Care Is Needed in Applying Phylogenetic Comparative Methods to Gene Trees with Speciation and Duplication Nodes. Mol Biol Evol 2021; 38:1614-1626. [PMID: 33169790 PMCID: PMC8042747 DOI: 10.1093/molbev/msaa288] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/03/2022] Open
Abstract
How gene function evolves is a central question of evolutionary biology. It can be investigated by comparing functional genomics results between species and between genes. Most comparative studies of functional genomics have used pairwise comparisons. Yet it has been shown that this can provide biased results, as genes, like species, are phylogenetically related. Phylogenetic comparative methods should be used to correct for this, but they depend on strong assumptions, including unbiased tree estimates relative to the hypothesis being tested. Such methods have recently been used to test the “ortholog conjecture,” the hypothesis that functional evolution is faster in paralogs than in orthologs. Although pairwise comparisons of tissue specificity (τ) provided support for the ortholog conjecture, phylogenetic independent contrasts did not. Our reanalysis on the same gene trees identified problems with the time calibration of duplication nodes. We find that the gene trees used suffer from important biases, due to the inclusion of trees with no duplication nodes, to the relative age of speciations and duplications, to systematic differences in branch lengths, and to non-Brownian motion of tissue specificity on many trees. We find that incorrectly applying phylogenetic methods to empirical gene trees with duplications can be problematic. Controlling for biases allows successful use of phylogenetic methods to study the evolution of gene function and provides some support for the ortholog conjecture using three different phylogenetic approaches.
Affiliation(s)
- Tina Begum
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
22
Robic A, Cerutti C, Kühn C, Faraut T. Comparative Analysis of the Circular Transcriptome in Muscle, Liver, and Testis in Three Livestock Species. Front Genet 2021; 12:665153. [PMID: 34040640 PMCID: PMC8141914 DOI: 10.3389/fgene.2021.665153] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/07/2021] [Accepted: 04/07/2021] [Indexed: 12/13/2022] Open
Abstract
Circular RNAs have been observed in a large number of species and tissues and are now recognized as a clear component of the transcriptome. Our study takes advantage of functional datasets produced within the FAANG consortium to investigate the pervasiveness of circular RNA transcription in farm animals. We describe here the circular transcriptional landscape in pig, sheep and bovine testicular, muscular and liver tissues using a total of 66 RNA-seq datasets. After an exhaustive detection of circular RNAs, we propose an annotation of exonic, intronic and sub-exonic circRNAs and comparative analyses of circRNA content to evaluate the variability between individuals, tissues and species. Despite technical bias due to the various origins of the datasets, we were able to characterize some features: (i) (ruminant) liver contains more exonic circRNAs than muscle; (ii) in testis, the number of exonic circRNAs seems associated with the sexual maturity of the animal; (iii) a particular class of circRNAs, sub-exonic circRNAs, is produced by a large variety of multi-exonic genes (protein-coding genes, long non-coding RNAs and pseudogenes) and mono-exonic genes (protein-coding genes from mitochondrial genome and small non-coding genes). Moreover, for multi-exonic genes there seems to be a relationship between the sub-exonic circRNAs transcription level and the linear transcription level. Finally, sub-exonic circRNAs produced by mono-exonic genes (mitochondrial protein-coding genes, ribozyme, and sno) exhibit a particular behavior. Caution has to be taken regarding the interpretation of the unannotated circRNA proportion in a given tissue/species: clusters of circRNAs without annotation were characterized in genomic regions with annotation and/or assembly problems of the respective animal genomes. This study highlights the importance of improving genome annotation to better consider candidate circRNAs and to better understand the circular transcriptome. Furthermore, it emphasizes the need for considering the relative “weight” of circRNAs/parent genes for comparative analyses of several circular transcriptomes. Although there are points of agreement in the circular transcriptome of the same tissue in two species, it will not be possible to do without characterizing it in both species.
Affiliation(s)
- Annie Robic
- INRAE, ENVT, GenPhySE, Université de Toulouse, Castanet-Tolosan, France
- Chloé Cerutti
- INRAE, ENVT, GenPhySE, Université de Toulouse, Castanet-Tolosan, France
- Christa Kühn
- Institute Genome Biology, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany; Faculty of Agricultural and Environmental Sciences, University of Rostock, Rostock, Germany
- Thomas Faraut
- INRAE, ENVT, GenPhySE, Université de Toulouse, Castanet-Tolosan, France
23
Soto-Cerda BJ, Aravena G, Cloutier S. Genetic dissection of flowering time in flax (Linum usitatissimum L.) through single- and multi-locus genome-wide association studies. Mol Genet Genomics 2021; 296:877-891. [PMID: 33903955 DOI: 10.1007/s00438-021-01785-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/29/2020] [Accepted: 04/09/2021] [Indexed: 01/19/2023]
Abstract
In a rapidly changing climate, flowering time (FL) adaptation is important to maximize seed yield in flax (Linum usitatissimum L.). However, our understanding of the genetic mechanism underlying FL in this multipurpose crop remains limited. With the aim of dissecting the genetic architecture of FL in flax, a genome-wide association study (GWAS) was performed on 200 accessions of the flax core collection evaluated in four environments. Two single-locus and six multi-locus models were applied using 70,935 curated single nucleotide polymorphism (SNP) markers. A total of 40 quantitative trait nucleotides (QTNs) associated with 27 quantitative trait loci (QTL) were identified in at least two environments. The number of QTL with positive-effect alleles in accessions was significantly correlated with FL (r = 0.77 to 0.82), indicating principally additive gene actions. Nine QTL were significant in at least three of the four environments accounting for 3.06-14.71% of FL variation. These stable QTL spanned regions that harbored 27 Arabidopsis thaliana and Oryza sativa FL-related orthologous genes including FLOWERING LOCUS T (Lus10013532), FLOWERING LOCUS D (Lus10028817), transcriptional regulator SUPERMAN (Lus10021215), and gibberellin 2-beta-dioxygenase 2 (Lus10037816). In silico gene expression analysis of the 27 FL candidate gene orthologs suggested that they might play roles in the transition from vegetative to reproductive phase, flower development and fertilization. Our results provide new insights into the QTL architecture of flowering time in flax, identify potential candidate genes for further studies, and demonstrate the effectiveness of combining different GWAS models for the genetic dissection of complex traits.
Affiliation(s)
- Braulio J Soto-Cerda
- Agriaquaculture Nutritional Genomic Center (CGNA), Las Heras 350, 4781158, Temuco, Chile
- Gabriela Aravena
- Agriaquaculture Nutritional Genomic Center (CGNA), Las Heras 350, 4781158, Temuco, Chile
- Sylvie Cloutier
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, ON, K1A 0C6, Canada
24
Brohard-Julien S, Frouin V, Meyer V, Chalabi S, Deleuze JF, Le Floch E, Battail C. Region-specific expression of young small-scale duplications in the human central nervous system. BMC Ecol Evol 2021; 21:59. [PMID: 33882820 PMCID: PMC8059171 DOI: 10.1186/s12862-021-01794-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/07/2020] [Accepted: 04/11/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The duplication of genes is one of the main genetic mechanisms that led to the gain in complexity of biological tissue. Although the implication of duplicated gene expression in brain evolution was extensively studied through comparisons between organs, their role in the regional specialization of the adult human central nervous system has not yet been well described. RESULTS: Our work explored intra-organ expression properties of paralogs through multiple territories of the human central nervous system (CNS) using transcriptome data generated by the Genotype-Tissue Expression (GTEx) consortium. Interestingly, we found that paralogs were associated with region-specific expression in CNS, suggesting their involvement in the differentiation of these territories. Besides the influence of gene expression level on region-specificity, we observed the contribution of both duplication age and duplication type to the CNS region-specificity of paralogs. Indeed, we found that small scale duplicated genes (SSDs) and in particular ySSDs (SSDs younger than the 2 rounds of whole genome duplications) were more CNS region-specific than other paralogs. Next, by studying the two paralogs of ySSD pairs, we observed that when they were region-specific, they tended to be specific to the same region more often than other paralogs, showing the high co-expression of ySSD pairs. The extension of this analysis to families of paralogs showed that the families with co-expressed gene members (i.e. homogeneous families) were enriched in ySSDs. Furthermore, these homogeneous families tended to be region-specific families, where the majority of their gene members were specifically expressed in the same region. CONCLUSIONS: Overall, our study suggests the involvement of ySSDs in the differentiation of human central nervous system territories. Therefore, we show the relevance of exploring region-specific expression of paralogs at the intra-organ level.
Affiliation(s)
- Solène Brohard-Julien
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut François Jacob, CEA, Université Paris-Saclay, Evry, France
- UNATI, Neurospin, Institut Joliot, CEA, Université Paris-Saclay, 91191, Gif-sur-Yvette, France
- Université Paris-Sud, Université Paris-Saclay, Orsay, France
- Vincent Frouin
- UNATI, Neurospin, Institut Joliot, CEA, Université Paris-Saclay, 91191, Gif-sur-Yvette, France
- Vincent Meyer
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut François Jacob, CEA, Université Paris-Saclay, Evry, France
- Smahane Chalabi
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut François Jacob, CEA, Université Paris-Saclay, Evry, France
- Jean-François Deleuze
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut François Jacob, CEA, Université Paris-Saclay, Evry, France
- Centre d'Etude du Polymorphisme Humain, Fondation Jean Dausset, Paris, France
- Centre de Référence, d'Innovation, d'expertise et de transfert (CREFIX), Evry, France
- Edith Le Floch
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut François Jacob, CEA, Université Paris-Saclay, Evry, France
- Christophe Battail
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut François Jacob, CEA, Université Paris-Saclay, Evry, France
- CEA, Univ. Grenoble Alpes, INSERM, IRIG, Biology of Cancer and Infection UMR1292, 38000, Grenoble, France
25
Li B, Veturi Y, Verma A, Bradford Y, Daar ES, Gulick RM, Riddler SA, Robbins GK, Lennox JL, Haas DW, Ritchie MD. Tissue specificity-aware TWAS (TSA-TWAS) framework identifies novel associations with metabolic, immunologic, and virologic traits in HIV-positive adults. PLoS Genet 2021; 17:e1009464. [PMID: 33901188 PMCID: PMC8102009 DOI: 10.1371/journal.pgen.1009464] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/08/2020] [Revised: 05/06/2021] [Accepted: 03/03/2021] [Indexed: 01/01/2023] Open
Abstract
As a relatively new methodology, the transcriptome-wide association study (TWAS) has gained interest due to its capacity for gene-level association testing. However, the development of TWAS has outpaced statistical evaluation of TWAS gene prioritization performance. Current TWAS methods vary in underlying biological assumptions about tissue specificity of transcriptional regulatory mechanisms. In a previous study from our group, this may have affected whether TWAS methods better identified associations in single tissues versus multiple tissues. We therefore designed simulation analyses to examine how the interplay between particular TWAS methods and tissue specificity of gene expression affects power and type I error rates for gene prioritization. We found that cross-tissue identification of expression quantitative trait loci (eQTLs) improved TWAS power. Single-tissue TWAS (i.e., PrediXcan) had robust power to identify genes expressed in single tissues, but often found significant associations in the wrong tissues as well (and therefore had high false-positive rates). Cross-tissue TWAS (i.e., UTMOST) had overall equal or greater power and controlled type I error rates for genes expressed in multiple tissues. Based on these simulation results, we applied a tissue specificity-aware TWAS (TSA-TWAS) analytic framework to look for gene-based associations with pre-treatment laboratory values from AIDS Clinical Trial Group (ACTG) studies. We replicated several proof-of-concept transcriptionally regulated gene-trait associations, including UGT1A1 (encoding bilirubin uridine diphosphate glucuronosyltransferase enzyme) and total bilirubin levels (p = 3.59×10⁻¹²), and CETP (cholesteryl ester transfer protein) with high-density lipoprotein cholesterol (p = 4.49×10⁻¹²). We also identified several novel genes associated with metabolic and virologic traits, as well as pleiotropic genes that linked plasma viral load, absolute basophil count, and/or triglyceride levels. By highlighting the advantages of different TWAS methods, our simulation study promotes a tissue specificity-aware TWAS analytic framework that revealed novel aspects of HIV-related traits.
Affiliation(s)
- Binglan Li
- Department of Biomedical Data Science, Stanford University, Stanford, California, United States of America
- Yogasudha Veturi
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Anurag Verma
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Yuki Bradford
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Eric S. Daar
- Lundquist Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America
- Roy M. Gulick
- Weill Cornell Medicine, New York City, New York, United States of America
- Sharon A. Riddler
- Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Gregory K. Robbins
- Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Jeffrey L. Lennox
- Emory University School of Medicine, Atlanta, Georgia, United States of America
- David W. Haas
- Departments of Medicine, Pharmacology, Pathology, Microbiology & Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Department of Internal Medicine, Meharry Medical College, Nashville, Tennessee, United States of America
- Marylyn D. Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
26
Liu Q, Jiang F, Zhang J, Li X, Kang L. Transcription initiation of distant core promoters in a large-sized genome of an insect. BMC Biol 2021; 19:62. [PMID: 33785021 PMCID: PMC8011201 DOI: 10.1186/s12915-021-01004-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/25/2020] [Accepted: 03/16/2021] [Indexed: 12/30/2022] Open
Abstract
Background: Core promoters have a substantial influence on various steps of transcription, including initiation, elongation, termination, polyadenylation, and finally, translation. The characterization of core promoters is crucial for exploring the regulatory code of transcription initiation. However, the current understanding of insect core promoters is focused on those of Diptera (especially Drosophila) species with small genome sizes. Results: Here, we present an analysis of the transcription start sites (TSSs) in the migratory locust, Locusta migratoria, which has a genome size of 6.5 Gb. The genomic differences, including lower precision of transcription initiation and fewer constraints on the distance from transcription factor binding sites or regulatory elements to TSSs, were revealed in locusts compared with Drosophila insects. Furthermore, we found a distinct bimodal log distribution of the distances from the start codons to the core promoters of locust genes. We found stricter constraints on the exon length of mRNA leaders and widespread expression activity of the distant core promoters in locusts compared with fruit flies. We further compared core promoters in seven arthropod species across a broad range of genome sizes to reinforce our results on the emergence of distant core promoters in large-sized genomes. Conclusions: In summary, our results provide novel insights into the effects of genome size expansion on distant transcription initiation. Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-021-01004-5.
Affiliation(s)
- Qing Liu
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing, China; Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Feng Jiang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China; CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
- Jie Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Xiao Li
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Le Kang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China; CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
27
Petrova N, Nazipova A, Gorshkov O, Mokshina N, Patova O, Gorshkova T. Gene Expression Patterns for Proteins With Lectin Domains in Flax Stem Tissues Are Related to Deposition of Distinct Cell Wall Types. Front Plant Sci 2021; 12:634594. [PMID: 33995436 PMCID: PMC8121149 DOI: 10.3389/fpls.2021.634594] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/28/2020] [Accepted: 03/16/2021] [Indexed: 05/10/2023]
Abstract
The genomes of higher plants encode a variety of proteins with lectin domains that are able to specifically recognize certain carbohydrates. Plants are enriched in a variety of potentially complementary glycans, many of which are located in the cell wall. We performed a genome-wide search for flax proteins with lectin domains and compared the expression of the encoding genes in different stem tissues that have distinct cell wall types with different sets of major polysaccharides. Over 400 genes encoding proteins with lectin domains that belong to different families were revealed in the flax genome; three quarters of these genes were expressed in stem tissues. Hierarchical clustering of the data for all expressed lectins grouped the analyzed samples according to their characteristic cell wall type. Most lectins differentially expressed in tissues with primary, secondary, and tertiary cell walls were predicted to localize at the plasma membrane or cell wall. These lectins were from different families and had various architectural types. Three out of four flax genes for proteins with jacalin-like domains were highly upregulated in bast fibers at the stage of tertiary cell wall deposition. The dynamic changes in transcript level of many genes for lectins from various families were detected in stem tissue over the course of gravitropic response induced by plant gravistimulation. The data obtained in this study indicate a large number of lectin-mediated events in plants and provide insight into the proteins that take part in tissue specialization and reaction to abiotic stress.
Affiliation(s)
- Natalia Petrova
- Laboratory of Plant Glycobiology, Kazan Institute of Biochemistry and Biophysics, FRC Kazan Scientific Center of RAS, Kazan, Russia
- Alsu Nazipova
- Laboratory of Plant Cell Growth Mechanisms, Kazan Institute of Biochemistry and Biophysics, FRC Kazan Scientific Center of RAS, Kazan, Russia
- Oleg Gorshkov
- Laboratory of Plant Cell Growth Mechanisms, Kazan Institute of Biochemistry and Biophysics, FRC Kazan Scientific Center of RAS, Kazan, Russia
- Natalia Mokshina
- Laboratory of Plant Glycobiology, Kazan Institute of Biochemistry and Biophysics, FRC Kazan Scientific Center of RAS, Kazan, Russia
- Olga Patova
- Institute of Physiology, FRC Komi Science Centre of Ural Branch of Russian Academy of Sciences, Syktyvkar, Russia
- Tatyana Gorshkova
- Laboratory of Plant Cell Growth Mechanisms, Kazan Institute of Biochemistry and Biophysics, FRC Kazan Scientific Center of RAS, Kazan, Russia
- *Correspondence: Tatyana Gorshkova,
28
Fodoulian L, Tuberosa J, Rossier D, Boillat M, Kan C, Pauli V, Egervari K, Lobrinus JA, Landis BN, Carleton A, Rodriguez I. SARS-CoV-2 Receptors and Entry Genes Are Expressed in the Human Olfactory Neuroepithelium and Brain. iScience 2020; 23:101839. [PMID: 33251489 PMCID: PMC7685946 DOI: 10.1016/j.isci.2020.101839] [Citation(s) in RCA: 144] [Impact Index Per Article: 28.8] [Received: 08/07/2020] [Revised: 09/25/2020] [Accepted: 11/18/2020] [Indexed: 12/21/2022] Open
Abstract
Reports indicate an association between COVID-19 and anosmia, as well as the presence of SARS-CoV-2 virions in the olfactory bulb. To test whether the olfactory neuroepithelium may represent a target of the virus, we generated RNA-seq libraries from human olfactory neuroepithelia, in which we found substantial expression of the genes coding for the virus receptor angiotensin-converting enzyme-2 (ACE2) and for the virus internalization enhancer TMPRSS2. We analyzed a human olfactory single-cell RNA-seq dataset and determined that sustentacular cells, which maintain the integrity of olfactory sensory neurons, express ACE2 and TMPRSS2. ACE2 protein was highly expressed in a subset of sustentacular cells in human and mouse olfactory tissues. Finally, we found ACE2 transcripts in specific brain cell types, both in mice and humans. Sustentacular cells thus represent a potential entry door for SARS-CoV-2 in a neuronal sensory system that is in direct connection with the brain.
Affiliation(s)
- Leon Fodoulian
- Department of Genetics and Evolution, Faculty of Sciences, University of Geneva, quai Ernest-Ansermet 30, 1211 Geneva, Switzerland
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1 rue Michel-Servet, 1211 Geneva, Switzerland
- Joël Tuberosa
- Department of Genetics and Evolution, Faculty of Sciences, University of Geneva, quai Ernest-Ansermet 30, 1211 Geneva, Switzerland
- Daniel Rossier
- Department of Genetics and Evolution, Faculty of Sciences, University of Geneva, quai Ernest-Ansermet 30, 1211 Geneva, Switzerland
- Madlaina Boillat
- Department of Genetics and Evolution, Faculty of Sciences, University of Geneva, quai Ernest-Ansermet 30, 1211 Geneva, Switzerland
- Chenda Kan
- Department of Genetics and Evolution, Faculty of Sciences, University of Geneva, quai Ernest-Ansermet 30, 1211 Geneva, Switzerland
- Véronique Pauli
- Department of Genetics and Evolution, Faculty of Sciences, University of Geneva, quai Ernest-Ansermet 30, 1211 Geneva, Switzerland
- Kristof Egervari
- Service of Clinical Pathology, Department of Genetic Medicine, Geneva University Hospitals, Geneva, Switzerland
- Department of Pathology and Immunology, Faculty of Medicine, University of Geneva, 1 rue Michel-Servet, 1211 Geneva, Switzerland
- Johannes A. Lobrinus
- Service of Clinical Pathology, Department of Genetic Medicine, Geneva University Hospitals, Geneva, Switzerland
- Basile N. Landis
- Rhinology-Olfactology Unit, Department of Otorhinolaryngology, Head and Neck Surgery, Geneva University Hospitals, Geneva, Switzerland
- Alan Carleton
- Department of Basic Neurosciences, Faculty of Medicine, University of Geneva, 1 rue Michel-Servet, 1211 Geneva, Switzerland
- Ivan Rodriguez
- Department of Genetics and Evolution, Faculty of Sciences, University of Geneva, quai Ernest-Ansermet 30, 1211 Geneva, Switzerland
29
Fodoulian L, Tuberosa J, Rossier D, Boillat M, Kan C, Pauli V, Egervari K, Lobrinus JA, Landis BN, Carleton A, Rodriguez I. SARS-CoV-2 Receptors and Entry Genes Are Expressed in the Human Olfactory Neuroepithelium and Brain. iScience 2020; 23:101839. [PMID: 33251489 DOI: 10.1101/2020.03.31.013268] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 08/07/2020] [Revised: 09/25/2020] [Accepted: 11/18/2020] [Indexed: 05/23/2023] Open
30
Fu Y, Xu J, Tang Z, Wang L, Yin D, Fan Y, Zhang D, Deng F, Zhang Y, Zhang H, Wang H, Xing W, Yin L, Zhu S, Zhu M, Yu M, Li X, Liu X, Yuan X, Zhao S. A gene prioritization method based on a swine multi-omics knowledgebase and a deep learning model. Commun Biol 2020; 3:502. [PMID: 32913254 PMCID: PMC7483748 DOI: 10.1038/s42003-020-01233-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 03/18/2020] [Accepted: 08/07/2020] [Indexed: 12/27/2022] Open
Abstract
The analyses of multi-omics data have revealed candidate genes for objective traits. However, they are integrated poorly, especially in non-model organisms, and they pose a great challenge for prioritizing candidate genes for follow-up experimental verification. Here, we present a general convolutional neural network model that integrates multi-omics information to prioritize the candidate genes of objective traits. By applying this model to Sus scrofa, which is a non-model organism, but one of the most important livestock animals, the model precision was 72.9%, recall 73.5%, and F1-Measure 73.4%, demonstrating a good prediction performance compared with previous studies in Arabidopsis thaliana and Oryza sativa. Additionally, to facilitate the use of the model, we present ISwine (http://iswine.iomics.pro/), which is an online comprehensive knowledgebase in which we incorporated almost all the published swine multi-omics data. Overall, the results suggest that the deep learning strategy will greatly facilitate analyses of multi-omics integration in the future. Yuhua Fu et al. develop a CNN model that integrates multi-omics information to prioritize candidate genes of objective traits. Their model performs well when applied to important livestock non-model animals like Sus scrofa. Finally, the authors present ISwine, an online comprehensive knowledgebase which includes all published swine omics data to facilitate the integration of heterogeneous data.
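The abstract above reports precision, recall and F1-Measure. As a minimal sketch of how these quantities relate, the snippet below computes F1 as the harmonic mean of precision and recall. The function name `f1_score` is illustrative and is not taken from the paper. Applied to the rounded figures quoted in the abstract it gives roughly 0.732, so the reported 73.4% was presumably obtained by the authors in a different way (for example by averaging over evaluation runs or using unrounded values); that is an assumption, not something the abstract states.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if precision + recall == 0:
        # No true positives at all: define F1 as 0 to avoid dividing by zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract (precision 72.9%, recall 73.5%).
f1_from_abstract = f1_score(0.729, 0.735)
```

The harmonic mean always lies between precision and recall and sits closer to the smaller of the two, which is why it is often preferred to a plain average when ranking candidate genes.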
Affiliation(s)
- Yuhua Fu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China; School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
- Jingya Xu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
- Zhenshuang Tang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
- Lu Wang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
- Dong Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Yu Fan
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Dongdong Zhang
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Fei Deng
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Yanping Zhang
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Haohao Zhang
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Haiyan Wang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Wenhui Xing
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Lilin Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Shilin Zhu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Mengjin Zhu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Mei Yu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Xiaolei Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.
| | - Xiaohui Yuan
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China.
| | - Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.
| |
Collapse
|
31
|
Amalgamated cross-species transcriptomes reveal organ-specific propensity in gene expression evolution. Nat Commun 2020; 11:4459. [PMID: 32900997 PMCID: PMC7479108 DOI: 10.1038/s41467-020-18090-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 10/16/2019] [Accepted: 07/29/2020] [Indexed: 12/24/2022]
Abstract
The origins of multicellular physiology are tied to evolution of gene expression. Genes can shift expression as organisms evolve, but how ancestral expression influences altered descendant expression is not well understood. To examine this, we amalgamate 1,903 RNA-seq datasets from 182 research projects, including 6 organs in 21 vertebrate species. Quality control eliminates project-specific biases, and expression shifts are reconstructed using gene-family-wise phylogenetic Ornstein-Uhlenbeck models. Expression shifts following gene duplication result in more drastic changes in expression properties than shifts without gene duplication. The expression properties are tightly coupled with protein evolutionary rate, depending on whether and how gene duplication occurred. Fluxes in expression patterns among organs are nonrandom, forming modular connections that are reshaped by gene duplication. Thus, if expression shifts, ancestral expression in some organs induces a strong propensity for expression in particular organs in descendants. Regardless of whether the shifts are adaptive or not, this supports a major role for what might be termed preadaptive pathways of gene expression evolution.
|
32
|
Stamboulian M, Guerrero RF, Hahn MW, Radivojac P. The ortholog conjecture revisited: the value of orthologs and paralogs in function prediction. Bioinformatics 2020; 36:i219-i226. [PMID: 32657391 PMCID: PMC7355290 DOI: 10.1093/bioinformatics/btaa468] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 11/18/2022]
Abstract
MOTIVATION The computational prediction of gene function is a key step in making full use of newly sequenced genomes. Function is generally predicted by transferring annotations from homologous genes or proteins for which experimental evidence exists. The 'ortholog conjecture' proposes that orthologous genes should be preferred when making such predictions, as they evolve functions more slowly than paralogous genes. Previous research has provided little support for the ortholog conjecture, though the incomplete nature of the data cast doubt on the conclusions. RESULTS We use experimental annotations from over 40 000 proteins, drawn from over 80 000 publications, to revisit the ortholog conjecture in two pairs of species: (i) Homo sapiens and Mus musculus and (ii) Saccharomyces cerevisiae and Schizosaccharomyces pombe. By making a distinction between questions about the evolution of function versus questions about the prediction of function, we find strong evidence against the ortholog conjecture in the context of function prediction, though questions about the evolution of function remain difficult to address. In both pairs of species, we quantify the amount of information that would be ignored if paralogs are discarded, as well as the resulting loss in prediction accuracy. Taken as a whole, our results support the view that the types of homologs used for function transfer are largely irrelevant to the task of function prediction. Maximizing the amount of data used for this task, regardless of whether it comes from orthologs or paralogs, is most likely to lead to higher prediction accuracy. AVAILABILITY AND IMPLEMENTATION https://github.com/predragradivojac/oc. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Moses Stamboulian
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
- Rafael F Guerrero
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Matthew W Hahn
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
|
33
|
Geirsdottir L, David E, Keren-Shaul H, Weiner A, Bohlen SC, Neuber J, Balic A, Giladi A, Sheban F, Dutertre CA, Pfeifle C, Peri F, Raffo-Romero A, Vizioli J, Matiasek K, Scheiwe C, Meckel S, Mätz-Rensing K, van der Meer F, Thormodsson FR, Stadelmann C, Zilkha N, Kimchi T, Ginhoux F, Ulitsky I, Erny D, Amit I, Prinz M. Cross-Species Single-Cell Analysis Reveals Divergence of the Primate Microglia Program. Cell 2020; 179:1609-1622.e16. [PMID: 31835035 DOI: 10.1016/j.cell.2019.11.010] [Citation(s) in RCA: 293] [Impact Index Per Article: 58.6] [Received: 03/12/2019] [Revised: 07/30/2019] [Accepted: 11/06/2019] [Indexed: 02/08/2023]
Abstract
Microglia, the brain-resident immune cells, are critically involved in many physiological and pathological brain processes, including neurodegeneration. Here we characterize microglia morphology and transcriptional programs across ten species spanning more than 450 million years of evolution. We find that microglia express a conserved core gene program of orthologous genes from rodents to humans, including ligands and receptors associated with interactions between glia and neurons. In most species, microglia show a single dominant transcriptional state, whereas human microglia display significant heterogeneity. In addition, we observed notable differences in several gene modules of rodents compared with primate microglia, including complement, phagocytic, and susceptibility genes to neurodegeneration, such as Alzheimer's and Parkinson's disease. Our study provides an essential resource of conserved and divergent microglia pathways across evolution, with important implications for future development of microglia-based therapies in humans.
Affiliation(s)
- Laufey Geirsdottir
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Eyal David
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Hadas Keren-Shaul
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel; Life Science Core Facility-Israel National Center for Personalized Medicine (G-INCPM), Weizmann Institute of Science, Rehovot, Israel
- Assaf Weiner
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Jana Neuber
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Adam Balic
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, EH25 9RG, United Kingdom
- Amir Giladi
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Fadi Sheban
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Charles-Antoine Dutertre
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore; Program in Emerging Infectious Disease, Duke-NUS Medical School, 8 College Road, Singapore, Singapore
- Christine Pfeifle
  - Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, Ploen, Germany
- Francesca Peri
  - Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Antonella Raffo-Romero
  - Universite Lille, Inserm, U-1192-Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse-PRISM, Lille, France
- Jacopo Vizioli
  - Universite Lille, Inserm, U-1192-Laboratoire Protéomique, Réponse Inflammatoire et Spectrométrie de Masse-PRISM, Lille, France
- Kaspar Matiasek
  - Section of Clinical & Comparative Neuropathology, Centre for Clinical Veterinary Medicine, Ludwig-Maximilians-Universität München, Munich, Germany
- Christian Scheiwe
  - Clinic for Neurosurgery, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Stephan Meckel
  - Department of Neuroradiology, Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Kerstin Mätz-Rensing
  - German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- Christine Stadelmann
  - Institute of Neuropathology, University Medical Center Göttingen, Göttingen, Germany
- Noga Zilkha
  - Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
- Tali Kimchi
  - Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
- Florent Ginhoux
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore; Shanghai Institute of Immunology, Shanghai JiaoTong University School of Medicine, Shanghai, China; Translational Immunology Institute, Singhealth/Duke-NUS Academic Medical Centre, the Academia, Singapore, Singapore
- Igor Ulitsky
  - Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel
- Daniel Erny
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Berta-Ottenstein-Programme, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Ido Amit
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Marco Prinz
  - Institute of Neuropathology, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Signaling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany; Center for NeuroModulation, Faculty of Medicine, University of Freiburg, Freiburg, Germany
|
34
|
Almudi I, Vizueta J, Wyatt CDR, de Mendoza A, Marlétaz F, Firbas PN, Feuda R, Masiero G, Medina P, Alcaina-Caro A, Cruz F, Gómez-Garrido J, Gut M, Alioto TS, Vargas-Chavez C, Davie K, Misof B, González J, Aerts S, Lister R, Paps J, Rozas J, Sánchez-Gracia A, Irimia M, Maeso I, Casares F. Genomic adaptations to aquatic and aerial life in mayflies and the origin of insect wings. Nat Commun 2020; 11:2631. [PMID: 32457347 PMCID: PMC7250882 DOI: 10.1038/s41467-020-16284-8] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 12/10/2019] [Accepted: 04/27/2020] [Indexed: 01/11/2023]
Abstract
The evolution of winged insects revolutionized terrestrial ecosystems and led to the largest animal radiation on Earth. However, we still have an incomplete picture of the genomic changes that underlay this diversification. Mayflies, as one of the sister groups of all other winged insects, are key to understanding this radiation. Here, we describe the genome of the mayfly Cloeon dipterum and its gene expression throughout its aquatic and aerial life cycle and specific organs. We discover an expansion of odorant-binding-protein genes, some expressed specifically in breathing gills of aquatic nymphs, suggesting a novel sensory role for this organ. In contrast, flying adults use an enlarged opsin set in a sexually dimorphic manner, with some expressed only in males. Finally, we identify a set of wing-associated genes deeply conserved in the pterygote insects and find transcriptomic similarities between gills and wings, suggesting a common genetic program. Globally, this comprehensive genomic and transcriptomic study uncovers the genetic basis of key evolutionary adaptations in mayflies and winged insects.
Affiliation(s)
- Isabel Almudi
  - GEM-DMC2 Unit, The CABD (CSIC-UPO-JA), Ctra. de Utrera km 1, 41013, Seville, Spain
- Joel Vizueta
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Christopher D R Wyatt
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Centre for Biodiversity and Environment Research, University College London, Gower Street, London, WC1E 6BT, UK
- Alex de Mendoza
  - Australian Research Council Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
  - Harry Perkins Institute of Medical Research, Perth, Western Australia, Australia
  - Queen Mary University of London, School of Biological and Chemical Sciences, Mile End Road, E1 4NS, London, UK
- Ferdinand Marlétaz
  - Molecular Genetics Unit, Okinawa Institute of Science and Technology, Onna-son, Japan
- Panos N Firbas
  - GEM-DMC2 Unit, The CABD (CSIC-UPO-JA), Ctra. de Utrera km 1, 41013, Seville, Spain
- Roberto Feuda
  - Department of Genetics and Genome Biology, University of Leicester, University Road, Leicester, LE1 7RH, UK
- Giulio Masiero
  - GEM-DMC2 Unit, The CABD (CSIC-UPO-JA), Ctra. de Utrera km 1, 41013, Seville, Spain
- Patricia Medina
  - GEM-DMC2 Unit, The CABD (CSIC-UPO-JA), Ctra. de Utrera km 1, 41013, Seville, Spain
- Ana Alcaina-Caro
  - GEM-DMC2 Unit, The CABD (CSIC-UPO-JA), Ctra. de Utrera km 1, 41013, Seville, Spain
- Fernando Cruz
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028, Barcelona, Spain
- Jessica Gómez-Garrido
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028, Barcelona, Spain
- Marta Gut
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Tyler S Alioto
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Carlos Vargas-Chavez
  - Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Kristofer Davie
  - Laboratory of Computational Biology, VIB Center for Brain and Disease Research, Herestraat 49, 3000, Louvain, Belgium
  - Department of Human Genetics, KU Leuven, Oude Markt 13, 3000, Louvain, Belgium
- Bernhard Misof
  - Zoological Research Museum Alexander Koenig, Adenauerallee 160, 53113, Bonn, Germany
- Josefa González
  - Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Stein Aerts
  - Laboratory of Computational Biology, VIB Center for Brain and Disease Research, Herestraat 49, 3000, Louvain, Belgium
  - Department of Human Genetics, KU Leuven, Oude Markt 13, 3000, Louvain, Belgium
- Ryan Lister
  - Australian Research Council Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, The University of Western Australia, Perth, Western Australia, Australia
  - Harry Perkins Institute of Medical Research, Perth, Western Australia, Australia
- Jordi Paps
  - School of Biological Sciences, University of Bristol, 24 Tyndall Avenue, Bristol, BS8 1TQ, UK
- Julio Rozas
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Alejandro Sánchez-Gracia
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Manuel Irimia
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - ICREA, Barcelona, Spain
- Ignacio Maeso
  - GEM-DMC2 Unit, The CABD (CSIC-UPO-JA), Ctra. de Utrera km 1, 41013, Seville, Spain
- Fernando Casares
  - GEM-DMC2 Unit, The CABD (CSIC-UPO-JA), Ctra. de Utrera km 1, 41013, Seville, Spain
|
35
|
David KT, Oaks JR, Halanych KM. Patterns of gene evolution following duplications and speciations in vertebrates. PeerJ 2020; 8:e8813. [PMID: 32266119 PMCID: PMC7120047 DOI: 10.7717/peerj.8813] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/10/2019] [Accepted: 02/27/2020] [Indexed: 11/24/2022]
Abstract
BACKGROUND Eukaryotic genes typically form independent evolutionary lineages through either speciation or gene duplication events. Generally, gene copies resulting from speciation events (orthologs) are expected to maintain similarity over time with regard to sequence, structure and function. After a duplication event, however, resulting gene copies (paralogs) may experience a broader set of possible fates, including partial (subfunctionalization) or complete loss of function, as well as gain of new function (neofunctionalization). This assumption, known as the Ortholog Conjecture, is prevalent throughout molecular biology and notably plays an important role in many functional annotation methods. Unfortunately, studies that explicitly compare evolutionary processes between speciation and duplication events are rare and conflicting. METHODS To provide an empirical assessment of ortholog/paralog evolution, we estimated ratios of nonsynonymous to synonymous substitutions (ω = dN/dS) for 251,044 lineages in 6,244 gene trees across 77 vertebrate taxa. RESULTS Overall, we found ω to be more similar between lineages descended from speciation events (p < 0.001) than lineages descended from duplication events, providing strong support for the Ortholog Conjecture. The asymmetry in ω following duplication events appears to be largely driven by an increase along one of the paralogous lineages, while the other remains similar to the parent. This trend is commonly associated with neofunctionalization, suggesting that gene duplication is a significant mechanism for generating novel gene functions.
Affiliation(s)
- Kyle T. David
  - Department of Biological Sciences, Auburn University, Auburn, AL, USA
- Jamie R. Oaks
  - Department of Biological Sciences, Auburn University, Auburn, AL, USA
|
36
|
Shafer MER. Cross-Species Analysis of Single-Cell Transcriptomic Data. Front Cell Dev Biol 2019; 7:175. [PMID: 31552245 PMCID: PMC6743501 DOI: 10.3389/fcell.2019.00175] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 03/31/2019] [Accepted: 08/12/2019] [Indexed: 01/30/2023]
Abstract
The ability to profile hundreds of thousands to millions of single cells using scRNA-sequencing has revolutionized the fields of cell and developmental biology, providing incredible insights into the diversity of forms and functions of cell types across many species. These technologies hold the promise of developing detailed cell type phylogenies which can describe the evolutionary and developmental relationships between cell types across species. This will require sampling of many species and taxa using single-cell transcriptomics, and methods to classify cell type homologies and diversifications. Many tools currently exist for analyzing single cell data and identifying cell types. However, cross-species comparisons are complicated by many biological and technical factors. These factors include batch effects common to deep-sequencing approaches, well known evolutionary relationships between orthologous and paralogous genes, and less well-understood evolutionary forces shaping transcriptome variation between species. In this review, I discuss recent developments in computational methods for the comparison of single-cell-omic data across species. These approaches have the potential to provide invaluable insight into how evolutionary forces act at the level of the cell and will further our understanding of the evolutionary origins of animal and cellular diversity.
Affiliation(s)
- Maxwell E. R. Shafer
  - Biozentrum, University of Basel, Basel, Switzerland
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, United States
|
37
|
Bai Y, Dai X, Li Y, Wang L, Li W, Liu Y, Cheng Y, Qin Y. Identification and characterization of pineapple leaf lncRNAs in crassulacean acid metabolism (CAM) photosynthesis pathway. Sci Rep 2019; 9:6658. [PMID: 31040312 PMCID: PMC6491598 DOI: 10.1038/s41598-019-43088-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/23/2018] [Accepted: 04/11/2019] [Indexed: 01/08/2023]
Abstract
Long noncoding RNAs (lncRNAs) have been identified in many mammals and plants and are known to play crucial roles in multiple biological processes. Pineapple is an important tropical fruit and a good model for studying the plant evolutionary adaptation to the dry environment and the crassulacean acid metabolism (CAM) photosynthesis strategy; however, the lncRNAs involved in CAM pathway remain poorly characterized. Here, we analyzed the available RNA-seq data sets derived from 26 pineapple leaf samples at 13 time points and identified 2,888 leaf lncRNAs, including 2,046 long intergenic noncoding RNAs (lincRNAs) and 842 long noncoding natural antisense transcripts (lncNATs). Pineapple leaf lncRNAs are expressed in a highly tissue-specific manner. Co-expression analysis of leaf lncRNA and mRNA revealed that leaf lncRNAs are preferentially associated with photosynthesis genes. We further identified leaf lncRNAs that potentially function as competing endogenous RNAs (ceRNAs) of two CAM photosynthesis pathway genes, PPCK and PEPC, and revealed their diurnal expression pattern in leaves. Moreover, we found that 48% of lncRNAs exhibit diurnal expression patterns in leaves, suggesting their important roles in CAM. This study conducted a comprehensive genome-wide analysis of leaf lncRNAs and identified their role in gene expression regulation of the CAM photosynthesis pathway in pineapple.
Affiliation(s)
- Youhuang Bai
  - College of life science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Xiaozhuan Dai
  - College of life science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Yi Li
  - College of life science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Lulu Wang
  - College of life science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China; College of Resources and Environment, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Weimin Li
  - College of life science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China; College of Resources and Environment, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Yanhui Liu
  - College of life science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Yan Cheng
  - College of life science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China; College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Yuan Qin
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Lab of Sugarcane Biology, College of Agriculture, Guangxi University, Nanning, 530004, Guangxi, China
|
38
|
Feedforward regulation of Myc coordinates lineage-specific with housekeeping gene expression during B cell progenitor cell differentiation. PLoS Biol 2019; 17:e2006506. [PMID: 30978178 PMCID: PMC6481923 DOI: 10.1371/journal.pbio.2006506] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/29/2018] [Revised: 04/24/2019] [Accepted: 03/16/2019] [Indexed: 12/18/2022]
Abstract
The differentiation of self-renewing progenitor cells requires not only the regulation of lineage- and developmental stage–specific genes but also the coordinated adaptation of housekeeping functions from a metabolically active, proliferative state toward quiescence. How metabolic and cell-cycle states are coordinated with the regulation of cell type–specific genes is an important question, because dissociation between differentiation, cell cycle, and metabolic states is a hallmark of cancer. Here, we use a model system to systematically identify key transcriptional regulators of Ikaros-dependent B cell–progenitor differentiation. We find that the coordinated regulation of housekeeping functions and tissue-specific gene expression requires a feedforward circuit whereby Ikaros down-regulates the expression of Myc. Our findings show how coordination between differentiation and housekeeping states can be achieved by interconnected regulators. Similar principles likely coordinate differentiation and housekeeping functions during progenitor cell differentiation in other cell lineages.

The human body is made from billions of cells comprising many specialized cell types. All of these cells ultimately come from a single fertilized oocyte in a process that has two key features: proliferation, which expands cell numbers, and differentiation, which diversifies cell types. Here, we have examined the transition from proliferation to differentiation using B lymphocytes as an example. We find that the transition from proliferation to differentiation involves changes in the expression of genes, which can be categorized into cell-type–specific genes and broadly expressed “housekeeping” genes. The expression of many housekeeping genes is controlled by the gene regulatory factor Myc, whereas the expression of many B lymphocyte–specific genes is controlled by the Ikaros family of gene regulatory proteins. Myc is repressed by Ikaros, which means that changes in housekeeping and tissue-specific gene expression are coordinated during the transition from proliferation to differentiation.
|
39
|
Ritschard EA, Fitak RR, Simakov O, Johnsen S. Genomic signatures of G-protein-coupled receptor expansions reveal functional transitions in the evolution of cephalopod signal transduction. Proc Biol Sci 2019; 286:20182929. [PMID: 30963849 PMCID: PMC6408891 DOI: 10.1098/rspb.2018.2929] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/26/2018] [Accepted: 02/04/2019] [Indexed: 01/29/2023]
Abstract
Coleoid cephalopods show unique morphological and neural novelties, such as arms with tactile and chemosensory suckers and a large complex nervous system. The evolution of such cephalopod novelties has been attributed at a genomic level to independent gene family expansions, yet the exact association and the evolutionary timing remain unclear. In the octopus genome, one such expansion occurred in the G-protein-coupled receptors (GPCRs) repertoire, a superfamily of proteins that mediate signal transduction. Here, we assessed the evolutionary history of this expansion and its relationship with cephalopod novelties. Using phylogenetic analyses, at least two cephalopod- and two octopus-specific GPCR expansions were identified. Signatures of positive selection were analysed within the four groups, and the locations of these sequences in the Octopus bimaculoides genome were inspected. Additionally, the expression profiles of cephalopod GPCRs across various tissues were extracted from available transcriptomic data. Our results reveal the evolutionary history of cephalopod GPCRs. Unexpanded cephalopod GPCRs shared with other bilaterians were found to be mainly nervous tissue specific. By contrast, duplications that are shared between octopus and the bobtail squid or specific to the octopus' lineage generated copies with divergent expression patterns devoted to tissues outside of the brain. The acquisition of novel expression domains was accompanied by gene order rearrangement through either translocation or duplication and gene loss. Lastly, expansions showed signs of positive selection and some were found to form tandem clusters with shared conserved expression profiles in cephalopod innovations such as the axial nerve cord. Altogether, our results contribute to the understanding of the molecular and evolutionary history of signal transduction and provide insights into the role of this expansion during the emergence of cephalopod novelties and/or adaptations.
Affiliation(s)
- Elena A. Ritschard
  - Department of Molecular Evolution and Development, University of Vienna, Vienna, Austria
  - Department of Biology, Duke University, Durham, NC, USA
- Oleg Simakov
  - Department of Molecular Evolution and Development, University of Vienna, Vienna, Austria
- Sönke Johnsen
  - Department of Biology, Duke University, Durham, NC, USA
40
Huang JH, Kwan RSY, Tsai ZTY, Lin TC, Tsai HK. Borders of Cis-Regulatory DNA Sequences Preferentially Harbor the Divergent Transcription Factor Binding Motifs in the Human Genome. Front Genet 2018; 9:571. [PMID: 30524473 PMCID: PMC6261980 DOI: 10.3389/fgene.2018.00571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2018] [Accepted: 11/06/2018] [Indexed: 11/17/2022]
Abstract
Changes in cis-regulatory DNA sequences and transcription factor (TF) repertoires provide major sources of phenotypic diversity that shape the evolution of gene regulation in eukaryotes. The DNA-binding specificities of TFs may be diversified or produce new variants in different eukaryotic species. However, it is currently unclear how various levels of divergence in TF DNA-binding specificities or motifs became introduced into the cis-regulatory DNA regions of the genome over evolutionary time. Here, we first estimated the evolutionary divergence levels of TF binding motifs and quantified their occurrence at DNase I-hypersensitive sites. Results from our in silico motif scan and experimentally derived chromatin immunoprecipitation (TF-ChIP) show that the divergent motifs tend to be introduced in the edges of cis-regulatory regions, which is probably accompanied by the expansion of the accessible core of promoter-associated regulatory elements during evolution. We also find that the genes neighboring the expanded cis-regulatory regions with the most divergent motifs are associated with functions like development and morphogenesis. Accordingly, we propose that the accumulation of divergent motifs in the edges of cis-regulatory regions provides a functional mechanism for the evolution of divergent regulatory circuits.
Affiliation(s)
- Jia-Hsin Huang
  - Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan
- Zing Tsung-Yeh Tsai
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
- Tzu-Chieh Lin
  - Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan
- Huai-Kuang Tsai
  - Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan
41
Mier P, Pérez-Pulido AJ, Andrade-Navarro MA. Automated selection of homologs to track the evolutionary history of proteins. BMC Bioinformatics 2018; 19:431. [PMID: 30453878 PMCID: PMC6245638 DOI: 10.1186/s12859-018-2457-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/25/2018] [Accepted: 10/31/2018] [Indexed: 11/26/2022]
Abstract
Background: The selection of distant homologs of a query protein under study is a common and useful application of protein sequence databases. Such sets of homologs are often applied to investigate the function of a protein and the degree to which experimental results can be transferred from one organism to another. In particular, a variety of databases facilitate static browsing for orthologs. However, these resources have limited power when identifying orthologs between taxonomically distant species. In addition, in some situations, for a given query protein, it is advantageous to compare the sets of orthologs from different specific organisms: this recursive step-wise search might give an idea of the evolutionary path of the protein as a series of consecutive steps, for example gaining or losing domains. However, a step-wise orthology search is a time-consuming task if the number of steps is high. Results: To illustrate a solution for this problem, we present the web tool ProteinPathTracker, which allows users to track the evolutionary history of a query protein by locating homologs in selected proteomes along several evolutionary paths. Additional functionalities include locking a region of interest to follow its evolution in the discovered homologous sequences and the study of the protein function evolution by analysis of the annotations of the homologs. Conclusions: ProteinPathTracker is an easy-to-use web tool that automates the practice of looking for selected homologs in distant species in a straightforward way for non-expert users.
Affiliation(s)
- Pablo Mier
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128, Mainz, Germany
- Miguel A Andrade-Navarro
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128, Mainz, Germany
42
Palasca O, Santos A, Stolte C, Gorodkin J, Jensen LJ. TISSUES 2.0: an integrative web resource on mammalian tissue expression. Database (Oxford) 2018; 2018:4851151. [PMID: 29617745 PMCID: PMC5808782 DOI: 10.1093/database/bay003] [Citation(s) in RCA: 139] [Impact Index Per Article: 19.9] [Received: 04/28/2017] [Accepted: 01/04/2018] [Indexed: 11/13/2022]
Abstract
Physiological and molecular similarities between organisms make it possible to translate findings from simpler experimental systems—model organisms—into more complex ones, such as human. This translation facilitates the understanding of biological processes under normal or disease conditions. Researchers aiming to identify the similarities and differences between organisms at the molecular level need resources collecting multi-organism tissue expression data. We have developed a database of gene–tissue associations in human, mouse, rat and pig by integrating multiple sources of evidence: transcriptomics covering all four species and proteomics (human only), manually curated and mined from the scientific literature. Through a scoring scheme, these associations are made comparable across all sources of evidence and across organisms. Furthermore, the scoring produces a confidence score assigned to each of the associations. The TISSUES database (version 2.0) is publicly accessible through a user-friendly web interface and as part of the STRING app for Cytoscape. In addition, we analyzed the agreement between datasets, across and within organisms, and identified that the agreement is mainly affected by the quality of the datasets rather than by the technologies used or organisms compared. Database URL: http://tissues.jensenlab.org/
Affiliation(s)
- Oana Palasca
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  - Center for non-coding RNA in Technology and Health, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  - Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Alberto Santos
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jan Gorodkin
  - Center for non-coding RNA in Technology and Health, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  - Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Lars Juhl Jensen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  - Center for non-coding RNA in Technology and Health, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
43
Liu J, Robinson-Rechavi M. Developmental Constraints on Genome Evolution in Four Bilaterian Model Species. Genome Biol Evol 2018; 10:2266-2277. [PMID: 30137380 PMCID: PMC6130771 DOI: 10.1093/gbe/evy177] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Accepted: 08/17/2018] [Indexed: 12/12/2022]
Abstract
Developmental constraints on genome evolution have been suggested to follow either an early conservation model or an "hourglass" model. Both models agree that late development strongly diverges between species, but disagree on which developmental period is the most conserved. Here, based on a modified "Transcriptome Age Index" approach, that is, weighting trait measures by expression level, we analyzed the constraints acting on three evolutionary traits of protein coding genes (strength of purifying selection on protein sequences, phyletic age, and duplicability) in four species: the nematode worm Caenorhabditis elegans, the fly Drosophila melanogaster, the zebrafish Danio rerio, and the mouse Mus musculus. In general, we found that both models can be supported by different genomic properties. Sequence evolution follows an hourglass model, but the evolution of phyletic age and of duplicability follows an early conservation model. Further analyses indicate that stronger purifying selection on sequences in mid-development is driven by temporal pleiotropy of these genes. In addition, we report evidence that expression in late development is enriched with retrogenes, which usually lack efficient regulatory elements. This implies that expression in late development could facilitate transcription of new genes, and provide opportunities for acquisition of function. Finally, in C. elegans, we suggest that dosage imbalance could be one of the main factors that cause depleted expression of high duplicability genes in early development.
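The expression-weighting idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the index for one developmental stage is simply the mean of a per-gene trait value weighted by each gene's expression at that stage; the study itself may apply further transformations that are not described here. The function name, trait values, and stage names are hypothetical and serve only as an illustration.

```python
from typing import Dict, List


def weighted_trait_index(trait: List[float], expression: List[float]) -> float:
    """Return the expression-weighted mean of a per-gene trait for one stage."""
    if len(trait) != len(expression):
        raise ValueError("trait and expression must have the same length")
    total_expression = sum(expression)
    if total_expression <= 0:
        raise ValueError("total expression must be positive")
    weighted_sum = sum(t * e for t, e in zip(trait, expression))
    return weighted_sum / total_expression


# Hypothetical toy data: three genes, each with one trait value
# (for example a phyletic age rank), and expression at two stages.
trait_values = [1.0, 2.0, 4.0]
expression_by_stage: Dict[str, List[float]] = {
    "early": [10.0, 5.0, 1.0],  # low-trait genes dominate expression
    "late": [1.0, 5.0, 10.0],   # high-trait genes dominate expression
}

index_by_stage = {
    stage: weighted_trait_index(trait_values, expr)
    for stage, expr in expression_by_stage.items()
}
```

In this toy case the index is higher for the "late" stage because the genes with the largest trait values contribute most of the expression there.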
Affiliation(s)
- Jialin Liu
  - Department of Ecology and Evolution, University of Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
  - Department of Ecology and Evolution, University of Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
44
Besray Unal E, Kiel C, Benisty H, Campbell A, Pickering K, Blüthgen N, Sansom OJ, Serrano L. Systems level expression correlation of Ras GTPase regulators. Cell Commun Signal 2018; 16:46. [PMID: 30111366 PMCID: PMC6094892 DOI: 10.1186/s12964-018-0256-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/21/2018] [Accepted: 08/02/2018] [Indexed: 01/30/2023]
Abstract
Background: Proteins of the ubiquitously expressed core proteome are quantitatively correlated across multiple eukaryotic species. In addition, it was found that many protein paralogues exhibit expression anticorrelation, suggesting that the total level of protein with a given functionality must be kept constant. Methods: We performed Spearman's rank correlation analyses of gene expression levels for the RAS GTPase subfamily and their regulatory GEF and GAP proteins across tissues and across individuals for each tissue. A large set of published data for normal tissues from a wide range of species, human cancer tissues and human cell lines was analysed. Results: We show that although the multidomain regulatory proteins of Ras GTPases exhibit considerable tissue and individual gene expression variability, their total amounts are balanced in normal tissues. In a given tissue, the sum of activating (GEFs) and deactivating (GAPs) domains of Ras GTPases can vary considerably, but each person has balanced GEF and GAP levels. This balance is impaired in cell lines and in cancer tissues for some individuals. Conclusions: Our results are relevant for critical considerations of knockout experiments, where functionally related homologs may compensate for the downregulation of a protein.
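The correlation measure named in the Methods can be sketched in a few lines. The example below is a generic, standard-library implementation of Spearman's rank correlation (Pearson correlation computed on average ranks), not the authors' pipeline; the function names and the GEF/GAP expression values are hypothetical toy data.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical summed GEF and GAP expression across four individuals.
gef_levels = [1.0, 2.5, 3.0, 8.0]
gap_levels = [2.0, 3.0, 5.0, 50.0]
rho = spearman_rho(gef_levels, gap_levels)
```

Because the measure depends only on ranks, the monotonic but non-linear toy relationship above still yields a perfect positive correlation.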
Affiliation(s)
- E. Besray Unal
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, 10117 Berlin, Germany
  - Integrative Research Institute Life Sciences, Humboldt Universität Berlin, 10115 Berlin, Germany
- Christina Kiel
  - Centre for Genomic Regulation (CRG), Systems Biology Programme. The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003 Spain
  - Present address: Systems Biology Ireland & Charles Institute of Dermatology & School of Medicine, University College Dublin, Belfield, Dublin 4, Ireland
- Hannah Benisty
  - Centre for Genomic Regulation (CRG), Systems Biology Programme. The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003 Spain
- Andrew Campbell
  - Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow, G61 1BD UK
- Karen Pickering
  - Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow, G61 1BD UK
- Nils Blüthgen
  - Institute of Pathology, Charité - Universitätsmedizin Berlin, 10117 Berlin, Germany
  - Integrative Research Institute Life Sciences, Humboldt Universität Berlin, 10115 Berlin, Germany
- Owen J. Sansom
  - Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow, G61 1BD UK
- Luis Serrano
  - Centre for Genomic Regulation (CRG), Systems Biology Programme. The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003 Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluís Companys 23, 08010 Barcelona, Spain
45
Dunn CW, Zapata F, Munro C, Siebert S, Hejnol A. Pairwise comparisons across species are problematic when analyzing functional genomic data. Proc Natl Acad Sci U S A 2018; 115:E409-E417. [PMID: 29301966 PMCID: PMC5776959 DOI: 10.1073/pnas.1707515115] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Indexed: 11/26/2022]
Abstract
There is considerable interest in comparing functional genomic data across species. One goal of such work is to provide an integrated understanding of genome and phenotype evolution. Most comparative functional genomic studies have relied on multiple pairwise comparisons between species, an approach that does not incorporate information about the evolutionary relationships among species. The statistical problems that arise from not considering these relationships can lead pairwise approaches to the wrong conclusions and are a missed opportunity to learn about biology that can only be understood in an explicit phylogenetic context. Here, we examine two recently published studies that compare gene expression across species with pairwise methods, and find reason to question the original conclusions of both. One study interpreted pairwise comparisons of gene expression as support for the ortholog conjecture, the hypothesis that orthologs tend to have more similar attributes (expression in this case) than paralogs. The other study interpreted pairwise comparisons of embryonic gene expression across distantly related animals as evidence for a distinct evolutionary process that gave rise to phyla. In each study, distinct patterns of pairwise similarity among species were originally interpreted as evidence of particular evolutionary processes, but instead, we find that they reflect species relationships. These reanalyses concretely show the inadequacy of pairwise comparisons for analyzing functional genomic data across species. It will be critical to adopt phylogenetic comparative methods in future functional genomic work. Fortunately, phylogenetic comparative biology is also a rapidly advancing field with many methods that can be directly applied to functional genomic data.
Affiliation(s)
- Casey W Dunn
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI 02912
- Felipe Zapata
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
- Catriona Munro
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI 02912
- Stefan Siebert
  - Department of Molecular and Cellular Biology, University of California, Davis, CA 95616
- Andreas Hejnol
  - Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5006, Norway
46
Ramakrishnan Varadarajan A, Mopuri R, Streelman JT, McGrath PT. Genome-wide protein phylogenies for four African cichlid species. BMC Evol Biol 2018; 18:1. [PMID: 29368592 PMCID: PMC5784529 DOI: 10.1186/s12862-017-1072-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/10/2017] [Accepted: 11/15/2017] [Indexed: 11/29/2022]
Abstract
Background: The thousands of species of closely related cichlid fishes in the great lakes of East Africa are a powerful model for understanding speciation and the genetic basis of trait variation. Recently, the genomes of five species of African cichlids representing five distinct lineages were sequenced and used to predict protein products at a genome-wide level. Here we characterize the evolutionary relationship of each cichlid protein to previously sequenced animal species. Results: We used the Treefam database, a set of preexisting protein phylogenies built using 109 previously sequenced genomes, to identify Treefam families for each protein annotated from four cichlid species: Metriaclima zebra, Astatotilapia burtoni, Pundamilia nyererei and Neolamprologus brichardi. For each of these Treefam families, we built new protein phylogenies containing each of the cichlid protein hits. Using these new phylogenies we identified the evolutionary relationship of each cichlid protein to its nearest human and zebrafish protein. This data is available either through download or through a webserver we have implemented. Conclusion: These phylogenies will be useful for any cichlid researchers trying to predict biological and protein function for a given cichlid gene, understanding the evolutionary history of a given cichlid gene, identifying recently duplicated cichlid genes, or performing genome-wide analysis in cichlids that relies on using databases generated from other species.
Affiliation(s)
- Rohini Mopuri
  - Department of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Dr., Atlanta, GA, 30332, USA
- J Todd Streelman
  - Department of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Dr., Atlanta, GA, 30332, USA
- Patrick T McGrath
  - Department of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Dr., Atlanta, GA, 30332, USA
47
Abstract
Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast ‘twilight zone’ in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species’ population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.
Affiliation(s)
- Chris P Ponting
  - MRC Human Genetics Unit, The Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
48
Martín-Durán JM, Ryan JF, Vellutini BC, Pang K, Hejnol A. Increased taxon sampling reveals thousands of hidden orthologs in flatworms. Genome Res 2017; 27:1263-1272. [PMID: 28400424 PMCID: PMC5495077 DOI: 10.1101/gr.216226.116] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 09/30/2016] [Accepted: 04/10/2017] [Indexed: 11/25/2022]
Abstract
Gains and losses shape the gene complement of animal lineages and are a fundamental aspect of genomic evolution. Acquiring a comprehensive view of the evolution of gene repertoires is limited by the intrinsic limitations of common sequence similarity searches and available databases. Thus, a subset of the gene complement of an organism consists of hidden orthologs, i.e., those with no apparent homology to sequenced animal lineages—mistakenly considered new genes—but actually representing rapidly evolving orthologs or undetected paralogs. Here, we describe Leapfrog, a simple automated BLAST pipeline that leverages increased taxon sampling to overcome long evolutionary distances and identify putative hidden orthologs in large transcriptomic databases by transitive homology. As a case study, we used 35 transcriptomes of 29 flatworm lineages to recover 3427 putative hidden orthologs, some unidentified by OrthoFinder and HaMStR, two common orthogroup inference algorithms. Unexpectedly, we do not observe a correlation between the number of putative hidden orthologs in a lineage and its “average” evolutionary rate. Hidden orthologs do not show unusual sequence composition biases that might account for systematic errors in sequence similarity searches. Instead, gene duplication with divergence of one paralog and weak positive selection appear to underlie hidden orthology in Platyhelminthes. By using Leapfrog, we identify key centrosome-related genes and homeodomain classes previously reported as absent in free-living flatworms, e.g., planarians. Altogether, our findings demonstrate that hidden orthologs comprise a significant proportion of the gene repertoire in flatworms, qualifying the impact of gene losses and gains in gene complement evolution.
Affiliation(s)
- José M Martín-Durán
  - Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5006, Norway
- Joseph F Ryan
  - Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5006, Norway
  - Whitney Laboratory for Marine Bioscience, University of Florida, St. Augustine, Florida 32080, USA
- Bruno C Vellutini
  - Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5006, Norway
- Kevin Pang
  - Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5006, Norway
- Andreas Hejnol
  - Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen 5006, Norway