1
|
A biallelic SNIP1 Amish founder variant causes a recognizable neurodevelopmental disorder. PLoS Genet 2021; 17:e1009803. [PMID: 34570759 PMCID: PMC8496849 DOI: 10.1371/journal.pgen.1009803] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/14/2021] [Revised: 10/07/2021] [Accepted: 09/02/2021] [Indexed: 12/13/2022] Open
Abstract
SNIP1 (Smad nuclear interacting protein 1) is a widely expressed transcriptional suppressor of the TGF-β signal-transduction pathway which plays a key role in human spliceosome function. Here, we describe extensive genetic studies and clinical findings of a complex inherited neurodevelopmental disorder in 35 individuals associated with a SNIP1 NM_024700.4:c.1097A>G, p.(Glu366Gly) variant, present at high frequency in the Amish community. The cardinal clinical features of the condition include hypotonia, global developmental delay, intellectual disability, seizures, and a characteristic craniofacial appearance. Our gene transcript studies in affected individuals define altered gene expression profiles of a number of molecules with well-defined neurodevelopmental and neuropathological roles, potentially explaining clinical outcomes. Together these data confirm this SNIP1 gene variant as a cause of an autosomal recessive complex neurodevelopmental disorder and provide important insight into the molecular roles of SNIP1, which likely explain the cardinal clinical outcomes in affected individuals, defining potential therapeutic avenues for future research.
|
2
|
Changes to the identity of EndoC-βH1 beta cells may be mediated by stress-induced depletion of HNRNPD. Cell Biosci 2021; 11:144. [PMID: 34301309 PMCID: PMC8305497 DOI: 10.1186/s13578-021-00658-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2021] [Accepted: 07/14/2021] [Indexed: 12/02/2022] Open
Abstract
Background Beta cell identity changes occur in the islets of donors with diabetes, but the molecular basis of this remains unclear. Protecting residual functional beta cells from cell identity changes may be beneficial for patients with diabetes. Results A somatostatin-positive cell population was induced in stressed clonal human EndoC-βH1 beta cells and was isolated using FACS. A transcriptomic characterisation of somatostatin-positive cells was then carried out. Gain of somatostatin-positivity was associated with marked dysregulation of the non-coding genome. Very few coding genes were differentially expressed. Potential candidate effector genes were assessed by targeted gene knockdown. Targeted knockdown of the HNRNPD gene induced the emergence of a somatostatin-positive cell population in clonal EndoC-βH1 beta cells comparable with that we have previously reported in stressed cells. Conclusions We report here a role for the HNRNPD gene in determination of beta cell identity in response to cellular stress. These findings widen our understanding of the role of RNA binding proteins and RNA biology in determining cell identity and may be important for protecting remaining beta cell reserve in diabetes. Supplementary Information The online version contains supplementary material available at 10.1186/s13578-021-00658-6.
|
3
|
Transcriptomic meta-analysis of disuse muscle atrophy vs. resistance exercise-induced hypertrophy in young and older humans. J Cachexia Sarcopenia Muscle 2021; 12:629-645. [PMID: 33951310 PMCID: PMC8200445 DOI: 10.1002/jcsm.12706] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/14/2020] [Revised: 02/26/2021] [Accepted: 03/29/2021] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Skeletal muscle atrophy manifests across numerous diseases; however, the extent of similarities/differences in causal mechanisms between atrophying conditions is unclear. Ageing and disuse represent two of the most prevalent and costly atrophic conditions, with resistance exercise training (RET) being the most effective lifestyle countermeasure. We employed gene-level and network-level meta-analyses to contrast transcriptomic signatures of disuse and RET, plus young and older RET to establish a consensus on the molecular features of, and therapeutic targets against, muscle atrophy in conditions of high socio-economic relevance. METHODS Integrated gene-level and network-level meta-analysis was performed on publicly available microarray data sets generated from young (18-35 years) m. vastus lateralis muscle subjected to disuse (unilateral limb immobilization or bed rest) lasting ≥7 days or RET lasting ≥3 weeks, and resistance-trained older (≥60 years) muscle. RESULTS Disuse and RET displayed predominantly separate transcriptional responses, and transcripts altered across conditions were mostly unidirectional. However, disuse and RET induced directly inverted expression profiles for mitochondrial function and translation regulation genes, with COX4I1, ENDOG, GOT2, MRPL12, and NDUFV2, the central hub components of altered mitochondrial networks, and ZMYND11, a hub gene of altered translation regulation. A substantial number of genes (n = 140) up-regulated post-RET in younger muscle were not similarly up-regulated in older muscle, with young muscle displaying a more pronounced extracellular matrix (ECM) and immune/inflammatory gene expression response. Both young and older muscle exhibited similar RET-induced ubiquitination/RNA processing gene signatures with associated PWP1, PSMB1, and RAF1 hub genes. CONCLUSIONS Despite limited opposing gene profiles, transcriptional signatures of disuse are not simply the converse of RET. Thus, the mechanisms of unloading cannot be derived from studying muscle loading alone, which provides a molecular basis for understanding why RET fails to target all transcriptional features of disuse. Loss of RET-induced ECM mechanotransduction and inflammatory profiles might also contribute to suboptimal ageing muscle adaptations to RET. Disuse and age-dependent molecular candidates further establish a framework for understanding and treating disuse/ageing atrophy.
|
4
|
The acute transcriptional response to resistance exercise: impact of age and contraction mode. Aging (Albany NY) 2020; 11:2111-2126. [PMID: 30996129 PMCID: PMC6503873 DOI: 10.18632/aging.101904] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/03/2019] [Accepted: 03/31/2019] [Indexed: 01/02/2023]
Abstract
Optimization of resistance exercise (RE) remains a hotbed of research for muscle building and maintenance. However, the interactions between the contractile components of RE (i.e. concentric (CON) and eccentric (ECC)) and age, are poorly defined. We used transcriptomics to compare age-related molecular responses to acute CON and ECC exercise. Eight young (21±1 y) and eight older (70±1 y) exercise-naïve male volunteers had vastus lateralis biopsies collected at baseline and 5 h post unilateral CON and contralateral ECC exercise. RNA was subjected to next-generation sequencing and differentially expressed (DE) genes tested for pathway enrichment using Gene Ontology (GO). The young transcriptional response to CON and ECC was highly similar and older adults displayed moderate contraction-specific profiles, with no GO enrichment. Age-specific responses to ECC revealed 104 DE genes unique to young, and 170 DE genes in older muscle, with no GO enrichment. Following CON, 15 DE genes were young muscle-specific, whereas older muscle uniquely expressed 147 up-regulated genes enriched for cell adhesion and blood vessel development, and 28 down-regulated genes involved in mitochondrial respiration, amino acid and lipid metabolism. Thus, older age is associated with contraction-specific regulation often without clear functional relevance, perhaps reflecting a degree of stochastic age-related dysregulation.
|
5
|
Identifying Candida albicans Gene Networks Involved in Pathogenicity. Front Genet 2020; 11:375. [PMID: 32391057 PMCID: PMC7193023 DOI: 10.3389/fgene.2020.00375] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/29/2020] [Accepted: 03/26/2020] [Indexed: 11/17/2022] Open
Abstract
Candida albicans is a normal member of the human microbiome. It is also an opportunistic pathogen, which can cause life-threatening systemic infections in severely immunocompromised individuals. Despite the availability of antifungal drugs, mortality rates of systemic infections are high and new drugs are needed to overcome therapeutic challenges including the emergence of drug resistance. Targeting known disease pathways has been suggested as a promising avenue for the development of new antifungals. However, <30% of C. albicans genes are verified with experimental evidence of a gene product, and the full complement of genes involved in important disease processes is currently unknown. Tools to predict the function of partially or uncharacterized genes and generate testable hypotheses will, therefore, help to identify potential targets for new antifungal development. Here, we employ a network-extracted ontology to leverage publicly available transcriptomics data and identify potential candidate genes involved in disease processes. A subset of these genes has been phenotypically screened using available deletion strains and we present preliminary data that one candidate, PEP8, is involved in hyphal development and immune evasion. This work demonstrates the utility of network-extracted ontologies in predicting gene function to generate testable hypotheses that can be applied to pathogenic systems. This could represent a novel first step to identifying targets for new antifungal therapies.
|
6
|
Islet-expressed circular RNAs are associated with type 2 diabetes status in human primary islets and in peripheral blood. BMC Med Genomics 2020; 13:64. [PMID: 32312268 PMCID: PMC7171860 DOI: 10.1186/s12920-020-0713-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 11/23/2019] [Accepted: 04/14/2020] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Circular RNAs are non-coding RNA molecules with gene regulatory potential that have been associated with several human diseases. They are stable and present in the circulation, making them excellent candidates for biomarkers of disease. Despite their promise as biomarkers or future therapeutic targets, information on their expression and functionality in human pancreatic islets is a relatively unexplored subject. METHODS Here we aimed to produce an enriched circRNAome profile for human pancreatic islets by CircleSeq, and to explore the relationship between circRNA expression, diabetes status, genotype at T2D risk loci and measures of glycaemia (insulin secretory index; SI and HbA1c) in human islet preparations from healthy control donors and donors with type 2 diabetes using ANOVA or linear regression as appropriate. We also assessed the effect of elevated glucose, cytokine and lipid and hypoxia on circRNA expression in the human beta cell line EndoC-βH1. RESULTS We identified over 2600 circRNAs present in human islets. Of the five most abundant circRNAs in human islets, four (circCIRBP, circZKSCAN, circRPH3AL and circCAMSAP1) demonstrated marked associations with diabetes status. CircCIRBP demonstrated an association with insulin secretory index in isolated human islets and circCIRBP and circRPH3AL displayed altered expression with elevated fatty acid in treated EndoC-βH1 cells. CircCAMSAP1 was also noted to be associated with T2D status in human peripheral blood. No associations between circRNA expression and genotype at T2D risk loci were identified in our samples. CONCLUSIONS Our data suggest that circRNAs are abundantly expressed in human islets, and that some are differentially regulated in the islets of donors with type 2 diabetes. Some islet circRNAs are also expressed in peripheral blood and the expression of one, circCAMSAP1, correlates with diabetes status. These findings highlight the potential of circRNAs as biomarkers for T2D.
|
7
|
Network analysis of human muscle adaptation to aging and contraction. Aging (Albany NY) 2020; 12:740-755. [PMID: 31910159 PMCID: PMC6977671 DOI: 10.18632/aging.102653] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/29/2019] [Accepted: 12/24/2019] [Indexed: 12/21/2022]
Abstract
Resistance exercise (RE) remains a primary approach for minimising aging muscle decline. Understanding muscle adaptation to individual contractile components of RE (eccentric, concentric) might optimise RE-based intervention strategies. Herein, we employed a network-driven pipeline to identify putative molecular drivers of muscle aging and contraction mode responses. RNA-sequencing data was generated from young (21±1 y) and older (70±1 y) human skeletal muscle before and following acute unilateral concentric and contralateral eccentric contractions. Application of weighted gene co-expression network analysis identified 33 distinct gene clusters ('modules') with an expression profile regulated by aging, contraction and/or linked to muscle strength. These included two contraction 'responsive' modules (related to 'cell adhesion' and 'transcription factor' processes) that also correlated with the magnitude of post-exercise muscle strength decline. Module searches for 'hub' genes and enriched transcription factor binding sites established a refined set of candidate module-regulatory molecules (536 hub genes and 60 transcription factors) as possible contributors to muscle aging and/or contraction responses. Thus, network-driven analysis can identify new molecular candidates of functional relevance to muscle aging and contraction mode adaptations.
|
8
|
circRNAs expressed in human peripheral blood are associated with human aging phenotypes, cellular senescence and mouse lifespan. GeroScience 2019; 42:183-199. [PMID: 31811527 PMCID: PMC7031184 DOI: 10.1007/s11357-019-00120-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 10/02/2019] [Accepted: 10/04/2019] [Indexed: 10/25/2022] Open
Abstract
Circular RNAs (circRNAs) are an emerging class of non-coding RNA molecules that are thought to regulate gene expression and human disease. Despite the observation that circRNAs are known to accumulate in older organisms and have been reported in cellular senescence, their role in aging remains relatively unexplored. Here, we have assessed circRNA expression in aging human blood and followed up age-associated circRNA in relation to human aging phenotypes, mammalian longevity as measured by mouse median strain lifespan and cellular senescence in four different primary human cell types. We found that circRNAs circDEF6, circEP300, circFOXO3 and circFNDC3B demonstrate associations with parental longevity or hand grip strength in 306 subjects from the InCHIANTI study of aging, and furthermore, circFOXO3 and circEP300 also demonstrate differential expression in one or more human senescent cell types. Finally, four circRNAs tested showed evidence of conservation in mouse. Expression levels of one of these, circPlekhm1, were nominally associated with lifespan. These data suggest that circRNA may represent a novel class of regulatory RNA involved in the determination of aging phenotypes, which may show future promise as both biomarkers and future therapeutic targets for age-related disease.
|
9
|
The feasibility of using citizens to segment anatomy from medical images: Accuracy and motivation. PLoS One 2019; 14:e0222523. [PMID: 31600225 PMCID: PMC6786545 DOI: 10.1371/journal.pone.0222523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/07/2019] [Accepted: 09/02/2019] [Indexed: 11/18/2022] Open
Abstract
The development of automatic methods for segmenting anatomy from medical images is an important goal for many medical and healthcare research areas. Datasets that can be used to train and test computer algorithms, however, are often small due to the difficulties in obtaining experts to segment enough examples. Citizen science provides a potential solution to this problem but the feasibility of using the public to identify and segment anatomy in a medical image has not been investigated. Our study therefore aimed to explore the feasibility, in terms of performance and motivation, of using citizens for such purposes. Public involvement was woven into the study design and evaluation. Twenty-nine citizens were recruited and, after brief training, asked to segment the spine from a dataset of 150 magnetic resonance images. Participants segmented as many images as they could within three one-hour sessions. Their accuracy was evaluated by comparing them, as individuals and as a combined consensus, to the segmentations of three experts. Questionnaires and a focus group were used to determine the citizens' motivation for taking part and their experience of the study. Citizen segmentation accuracy, in terms of agreement with the expert consensus segmentation, varied considerably between individual citizens. The citizen consensus, however, was close to the expert consensus, indicating that when pooled, citizens may be able to replace or supplement experts for generating large image datasets. Personal interest and a desire to help were the two most common reasons for taking part in the study.
|
10
|
Rapid functional and evolutionary changes follow gene duplication in yeast. Proc Biol Sci 2017; 284:20171393. [PMID: 28835561 PMCID: PMC5577496 DOI: 10.1098/rspb.2017.1393] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 07/03/2017] [Accepted: 07/13/2017] [Indexed: 12/22/2022] Open
Abstract
Duplication of genes or genomes provides the raw material for evolutionary innovation. After duplication a gene may be lost, recombine with another gene, have its function modified or be retained in an unaltered state. The fate of duplication is usually studied by comparing extant genomes and reconstructing the most likely ancestral states. Valuable as this approach is, it may miss the most rapid evolutionary events. Here, we engineered strains of Saccharomyces cerevisiae carrying tandem and non-tandem duplications of the singleton gene IFA38 to monitor (i) the fate of the duplicates in different conditions, including time scale and asymmetry of gene loss, and (ii) the changes in fitness and transcriptome of the strains immediately after duplication and after experimental evolution. We found that the duplication brings widespread transcriptional changes, but a fitness advantage is only present in fermentable media. In respiratory conditions, the yeast strains consistently lose the non-tandem IFA38 gene copy in a surprisingly short time, within only a few generations. This gene loss appears to be asymmetric and dependent on genome location, since the original IFA38 copy and the tandem duplicate are retained. Overall, this work shows for the first time that gene loss can be extremely rapid and context dependent.
|
11
|
Abstract
Background Previous studies have suggested that modern obesogenic environments accentuate the genetic risk of obesity. However, these studies have proven controversial as to which, if any, measures of the environment accentuate genetic susceptibility to high body mass index (BMI). Methods We used up to 120 000 adults from the UK Biobank study to test the hypothesis that high-risk obesogenic environments and behaviours accentuate genetic susceptibility to obesity. We used BMI as the outcome and a 69-variant genetic risk score (GRS) for obesity and 12 measures of the obesogenic environment as exposures. These measures included Townsend deprivation index (TDI) as a measure of socio-economic position, TV watching, a 'Westernized' diet and physical activity. We performed several negative control tests, including randomly selecting groups of different average BMIs, using a simulated environment and including sun-protection use as an environment. Results We found gene-environment interactions with TDI (P_interaction = 3 × 10⁻¹⁰), self-reported TV watching (P_interaction = 7 × 10⁻⁵) and self-reported physical activity (P_interaction = 5 × 10⁻⁶). Within the group of 50% living in the most relatively deprived situations, carrying 10 additional BMI-raising alleles was associated with approximately 3.8 kg extra weight in someone 1.73 m tall. In contrast, within the group of 50% living in the least deprivation, carrying 10 additional BMI-raising alleles was associated with approximately 2.9 kg extra weight. The interactions were weaker, but present, with the negative controls, including sun-protection use, indicating that residual confounding is likely. Conclusions Our findings suggest that the obesogenic environment accentuates the risk of obesity in genetically susceptible adults. Of the factors we tested, relative social deprivation best captures the aspects of the obesogenic environment responsible.
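The weight figures quoted in this abstract can be read back into BMI units with a short calculation. The sketch below is illustrative only: it assumes the standard definition BMI = weight / height², and the function and variable names are hypothetical. The resulting BMI values are back-conversions from the quoted kilograms, not numbers reported by the authors.

```python
def weight_to_bmi_units(extra_weight_kg: float, height_m: float) -> float:
    """Convert an extra-weight figure (kg) into BMI units (kg/m^2) at a given height."""
    return extra_weight_kg / height_m ** 2


# Figures quoted in the abstract: extra weight per 10 additional
# BMI-raising alleles, for a person 1.73 m tall.
HEIGHT_M = 1.73
most_deprived_bmi = weight_to_bmi_units(3.8, HEIGHT_M)
least_deprived_bmi = weight_to_bmi_units(2.9, HEIGHT_M)

# Difference between the two halves of the deprivation distribution,
# expressed per 10 alleles and per single allele.
difference_per_10_alleles = most_deprived_bmi - least_deprived_bmi
difference_per_allele = difference_per_10_alleles / 10

print(f"Most deprived half:  {most_deprived_bmi:.2f} kg/m^2 per 10 alleles")
print(f"Least deprived half: {least_deprived_bmi:.2f} kg/m^2 per 10 alleles")
print(f"Difference:          {difference_per_10_alleles:.2f} kg/m^2 per 10 alleles")
```

Under these assumptions the gap between the two groups is roughly 0.3 BMI units per 10 alleles, which gives a sense of the scale of the reported interaction.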
|
12
|
Binding interface change and cryptic variation in the evolution of protein-protein interactions. BMC Evol Biol 2016; 16:40. [PMID: 26892785 PMCID: PMC4758157 DOI: 10.1186/s12862-016-0608-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/25/2015] [Accepted: 02/02/2016] [Indexed: 12/03/2022] Open
Abstract
Background Physical interactions between proteins are essential for almost all biological functions and systems. To understand the evolution of function it is therefore important to understand the evolution of molecular interactions. Of key importance is the evolution of binding specificity, the set of interactions made by a protein, since change in specificity can lead to “rewiring” of interaction networks. Unfortunately, the interfaces through which proteins interact are complex, typically containing many amino-acid residues that collectively must contribute to binding specificity as well as binding affinity, structural integrity of the interface and solubility in the unbound state. Results In order to study the relationship between interface composition and binding specificity, we make use of paralogous pairs of yeast proteins. Immediately after duplication these paralogues will have identical sequences and protein products that make an identical set of interactions. As the sequences diverge, we can correlate amino-acid change in the interface with any change in the specificity of binding. We show that change in interface regions correlates only weakly with change in specificity, and many variants in interfaces are functionally equivalent. We show that many of the residue replacements within interfaces are silent with respect to their contribution to binding specificity. Conclusions We conclude that such functionally-equivalent change has the potential to contribute to evolutionary plasticity in interfaces by creating cryptic variation, which in turn may provide the raw material for functional innovation and coevolution. Electronic supplementary material The online version of this article (doi:10.1186/s12862-016-0608-1) contains supplementary material, which is available to authorized users.
|
13
|
Abstract
BACKGROUND Biological processes at the molecular level are usually represented by molecular interaction networks. Function is organised and modularity identified based on network topology; however, this approach often fails to account for the dynamic and multifunctional nature of molecular components. For example, a molecule engaging in spatially or temporally independent functions may be inappropriately clustered into a single functional module. To capture biologically meaningful sets of interacting molecules, we use experimentally defined pathways as spatial/temporal units of molecular activity. RESULTS We defined functional profiles of Saccharomyces cerevisiae based on a minimal set of Gene Ontology terms sufficient to represent each pathway's genes. The Gene Ontology terms were used to annotate 271 pathways, accounting for pathway multi-functionality and gene pleiotropy. Pathways were then arranged into a network, linked by shared functionality. Of the genes in our data set, 44% appeared in multiple pathways performing a diverse set of functions. Linking pathways by overlapping functionality revealed a modular network with energy metabolism forming a sparse centre, surrounded by several denser clusters comprised of regulatory and metabolic pathways. Signalling pathways formed a relatively discrete cluster connected to the centre of the network. Genetic interactions were enriched within the clusters of pathways by a factor of 5.5, confirming the organisation of our pathway network is biologically significant. CONCLUSIONS Our representation of molecular function according to pathway relationships enables analysis of gene/protein activity in the context of specific functional roles, as an alternative to typical molecule-centric graph-based methods. The pathway network demonstrates the cooperation of multiple pathways to perform biological processes and organises pathways into functionally related clusters with interdependent outcomes.
|
14
|
Abstract
Summary: Gene duplication and loss are important processes in the evolution of gene families. Moreover, growth of families by duplication and retention is an important mechanism by which organisms gain new functions. Therefore the ability to infer the evolutionary histories of families is an important step in understanding the evolution of function. We have recently developed DupliPHY, a software tool to infer gene family histories using parsimony and maximum likelihood. Here, we present DupliPHY-Web, a web server for DupliPHY that implements additional maximum likelihood functionality and provides users an intuitive interface to run DupliPHY. Availability and implementation: DupliPHY-Web is available at www.bioinf.manchester.ac.uk/dupliphy/ Contact: ryan.ames@manchester.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
|
15
|
Inferring gene family histories in yeast identifies lineage specific expansions. PLoS One 2014; 9:e99480. [PMID: 24921666 PMCID: PMC4055711 DOI: 10.1371/journal.pone.0099480] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 08/21/2013] [Accepted: 05/15/2014] [Indexed: 11/24/2022] Open
Abstract
The complement of genes found in the genome is a balance between gene gain and gene loss. Knowledge of the specific genes that are gained and lost over evolutionary time allows an understanding of the evolution of biological functions. Here we use new evolutionary models to infer gene family histories across complete yeast genomes; these models allow us to estimate the relative genome-wide rates of gene birth, death, innovation and extinction (loss of an entire family) for the first time. We show that the rates of gene family evolution vary both between gene families and between species. We are also able to identify those families that have experienced rapid lineage specific expansion/contraction and show that these families are enriched for specific functions. Moreover, we find that families with specific functions are repeatedly expanded in multiple species, suggesting the presence of common adaptations and that these family expansions/contractions are not random. Additionally, we identify potential specialisations, unique to specific species, in the functions of lineage specific expanded families. These results suggest that an important mechanism in the evolution of genome content is the presence of lineage-specific gene family changes.
|
16
|
Abstract
MOTIVATION Recent large-scale studies of individuals within a population have demonstrated that there is widespread variation in copy number in many gene families. In addition, there is increasing evidence that the variation in gene copy number can give rise to substantial phenotypic effects. In some cases, these variations have been shown to be adaptive. These observations show that a full understanding of the evolution of biological function requires an understanding of gene gain and gene loss. Accurate, robust evolutionary models of gain and loss events are, therefore, required. RESULTS We have developed weighted parsimony and maximum likelihood methods for inferring gain and loss events. To test these methods, we have used Markov models of gain and loss to simulate data with known properties. We examine three models: a simple birth-death model, a single rate model and a birth-death innovation model with parameters estimated from Drosophila genome data. We find that for all simulations maximum likelihood-based methods are very accurate for reconstructing the number of duplication events on the phylogenetic tree, and that maximum likelihood and weighted parsimony have similar accuracy for reconstructing the ancestral state. Our implementations are robust to different model parameters and provide accurate inferences of ancestral states and the number of gain and loss events. For ancestral reconstruction, we recommend weighted parsimony because it has similar accuracy to maximum likelihood, but is much faster. For inferring the number of individual gene loss or gain events, maximum likelihood is noticeably more accurate, albeit at greater computational cost. AVAILABILITY www.bioinf.manchester.ac.uk/dupliphy CONTACT simon.lovell@manchester.ac.uk; simon.whelan@manchester.ac.uk SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
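To make the idea of weighted parsimony for gain and loss events concrete, here is a minimal sketch of a Sankoff-style dynamic programme over gene copy numbers. It is an illustration under stated assumptions, not the DupliPHY implementation: the tree encoding (nested tuples), the linear per-copy gain and loss costs, and all names are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (hypothetical encoding).
Tree = Union[str, Tuple["Tree", "Tree"]]


def sankoff_costs(
    tree: Tree,
    leaf_counts: Dict[str, int],
    max_copies: int,
    gain_cost: float = 1.0,
    loss_cost: float = 1.0,
) -> List[float]:
    """Return, for each candidate copy number 0..max_copies at the root of
    `tree`, the minimum weighted cost of gains and losses in its subtree."""
    states = range(max_copies + 1)

    if isinstance(tree, str):
        # Leaf: only the observed copy number is allowed.
        observed = leaf_counts[tree]
        return [0.0 if s == observed else float("inf") for s in states]

    def step(parent: int, child: int) -> float:
        # Assumed cost model: each gained or lost copy is charged separately.
        diff = child - parent
        return diff * gain_cost if diff > 0 else -diff * loss_cost

    child_costs = [
        sankoff_costs(child, leaf_counts, max_copies, gain_cost, loss_cost)
        for child in tree
    ]
    # For each parent state, add up the cheapest way to explain each child.
    return [
        sum(min(costs[c] + step(p, c) for c in states) for costs in child_costs)
        for p in states
    ]


# Toy example: three species with 2, 2 and 1 copies of a gene family.
toy_tree: Tree = (("A", "B"), "C")
toy_counts = {"A": 2, "B": 2, "C": 1}

equal_weights = sankoff_costs(toy_tree, toy_counts, max_copies=3)
loss_penalised = sankoff_costs(toy_tree, toy_counts, max_copies=3, loss_cost=2.0)

print("Equal weights:  ", equal_weights)
print("Loss penalised: ", loss_penalised)
print("Best root state:", loss_penalised.index(min(loss_penalised)))
```

With equal weights the root copy number is ambiguous between one and two copies in this toy case; making losses twice as costly as gains resolves it in favour of a single ancestral copy followed by one gain. This shows how the choice of weights shapes the ancestral reconstruction.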
|
17
|
Diversification at transcription factor binding sites within a species and the implications for environmental adaptation. Mol Biol Evol 2011; 28:3331-44. [PMID: 21693437 DOI: 10.1093/molbev/msr167] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/18/2023] Open
Abstract
Evolution of new cellular functions can be achieved both by changes in protein coding sequences and by alteration of expression patterns. Variation of expression may lead to changes in cellular function with relatively little change in genomic sequence. We therefore hypothesize that one of the first signals of functional divergence should be evolution of transcription factor-binding sites (TFBSs). This adaptation should be detectable as substantial variation in the TFBSs of alleles. New data sets allow the first analyses of intraspecies variation from a large number of whole-genome sequences. Using data from the Saccharomyces Genome Resequencing Project, we have analyzed variation in TFBSs. We find a large degree of variation both between these closely related strains and between pairs of duplicated genes. There is a correlation between changes in promoter regions and changes in coding sequences, indicating a coupling of changes in expression and function. We show that 1) the types of genes with diverged promoters vary between strains from different environments and 2) that patterns of divergence in promoters consistent with positive selection are detectable in alleles between strains and on duplicate promoters. This variation is likely to reflect adaptation to each strain's natural environment. We conclude that, even within a species, we detect signs of selection acting on promoter regions that may act to alter expression patterns. These changes may indicate functional innovation in multiple genes and across the whole genome. Change in function could represent adaptation to the environment and be a precursor to speciation.
|
18
|
Abstract
Population-level differences in the number of copies of genes resulting from gene duplication and loss have recently been recognized as an important source of variation in eukaryotes. However, except for a small number of cases, the phenotypic effects of this variation are unknown. Data from the Saccharomyces Genome Resequencing Project permit the study of duplication in genome sequences from a set of individuals within the same population. These sequences can be correlated with available information on the environments from which these yeast strains were isolated. We find that yeast show an abundance of duplicate genes that are lineage specific, leading to a large degree of variation in gene content between individual strains. There is a detectable bias for specific functions, indicating that selection is acting to preferentially retain certain duplicates. Most strikingly, we find that sets of over- and underrepresented duplicates correlate with the environment from which they were isolated. Together, these observations indicate that gene duplication can give rise to substantial phenotypic differences within populations that in turn can offer a shortcut to evolutionary adaptation.
|