1
Cope AL, Schraiber JG, Pennell M. Macroevolutionary divergence of gene expression driven by selection on protein abundance. Science 2025; 387:1063-1068. [PMID: 40048509 DOI: 10.1126/science.ads2658] [Received: 08/05/2024] [Accepted: 01/24/2025] [Indexed: 03/28/2025]
Abstract
The regulation of messenger RNA (mRNA) and protein abundances is well-studied, but less is known about the evolutionary processes shaping their relationship. To address this, we derived a new phylogenetic model and applied it to multispecies mammalian data. Our analyses reveal (i) strong stabilizing selection on protein abundances over macroevolutionary time, (ii) mutations affecting mRNA abundances minimally impact protein abundances, (iii) mRNA abundances evolve under selection to align with protein abundances, and (iv) mRNA abundances adapt faster than protein abundances owing to greater mutational opportunity. These conclusions are supported by comparisons of model parameters with independent functional genomic data. By decomposing mutational and selective influences on mRNA-protein dynamics, our approach provides a framework for discovering the evolutionary rules that drive divergence in gene expression.
Affiliation(s)
- Alexander L Cope
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
  - Department of Genetics, Rutgers University, New Brunswick, NJ, USA
  - Human Genetics Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
  - Robert Wood Johnson Medical School, Rutgers University, New Brunswick, NJ, USA
- Joshua G Schraiber
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
2
Wen H, Liu X, Zhao X, Zhao T, Feng C, Chang H, Wang J, Lin J. Evolutionary analysis of the DHHCs in Saccharinae. Sci Rep 2025; 15:2290. [PMID: 39833334 PMCID: PMC11756399 DOI: 10.1038/s41598-025-86463-4] [Received: 09/09/2024] [Accepted: 01/10/2025] [Indexed: 01/22/2025] Open
Abstract
The DHHC domain genes are crucial for protein lipid modification, a key post-translational modification influencing membrane targeting, subcellular trafficking, and protein function. Despite their significance, the DHHC gene family in Saccharinae remains understudied. Here, we identified 32 (110 alleles), 28, 53, and 48 DHHC genes in Saccharum spontaneum Np-X, Erianthus rufipilus, Miscanthus sinensis, and Miscanthus lutarioriparius, respectively. Collinearity analysis uncovered the loss of two M. lutarioriparius genes, homologues of EruDHHC1C and EruDHHC3A. Phylogenetic and classification analyses categorized DHHC family members into six subgroups (A-F). Ka/Ks ratio analysis indicated that gene duplication in these species was primarily driven by whole-genome duplication (WGD) and dispersed duplication (DSD), with DHHC genes evolving under strong purifying selection. Gene expression and trait correlation analysis revealed a significant negative correlation between SspDHHC28A expression in S. spontaneum and sucrose content, suggesting a role in photosynthesis product transport during rapid growth. This study deepens our understanding of the DHHC gene family's functional dynamics and evolutionary path in Saccharinae, laying a foundation for future research.
Affiliation(s)
- Hao Wen
  - Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian, China
- Xinyu Liu
  - Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian, China
- Xueting Zhao
  - Sanya Institute of Chinese Academy of Tropical Agricultural Sciences, Sanya, 572024, Hainan, China
- Tingting Zhao
  - National Key Laboratory for Tropical Crop Breeding, Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Sanya 572024, Haikou, 571101, Hainan, China
  - Sanya Institute of Chinese Academy of Tropical Agricultural Sciences, Sanya, 572024, Hainan, China
- Cuilian Feng
  - National Key Laboratory for Tropical Crop Breeding, Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Sanya 572024, Haikou, 571101, Hainan, China
  - Sanya Institute of Chinese Academy of Tropical Agricultural Sciences, Sanya, 572024, Hainan, China
- Hailong Chang
  - Institute of Nanfan & Seed Industry, Zhanjiang Research Center, Guangdong Academy of Sciences, Guangzhou, 510000, Guangdong, China
- Jungang Wang
  - National Key Laboratory for Tropical Crop Breeding, Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Sanya 572024, Haikou, 571101, Hainan, China
  - Sanya Institute of Chinese Academy of Tropical Agricultural Sciences, Sanya, 572024, Hainan, China
- Jishan Lin
  - Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian, China
  - National Key Laboratory for Tropical Crop Breeding, Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences, Sanya 572024, Haikou, 571101, Hainan, China
  - Sanya Institute of Chinese Academy of Tropical Agricultural Sciences, Sanya, 572024, Hainan, China
3
Sword TT, Dinglasan JLN, Abbas GSK, Barker JW, Spradley ME, Greene ER, Gooden DS, Emrich SJ, Gilchrist MA, Doktycz MJ, Bailey CB. Profiling expression strategies for a type III polyketide synthase in a lysate-based, cell-free system. Sci Rep 2024; 14:12983. [PMID: 38839808 PMCID: PMC11153635 DOI: 10.1038/s41598-024-61376-w] [Received: 12/16/2023] [Accepted: 05/06/2024] [Indexed: 06/07/2024] Open
Abstract
Some of the most metabolically diverse species of bacteria (e.g., Actinobacteria) have higher GC content in their DNA, differ substantially in codon usage, and have distinct protein folding environments compared to tractable expression hosts like Escherichia coli. Consequently, expressing biosynthetic gene clusters (BGCs) from these bacteria in E. coli often results in a myriad of unpredictable issues with regard to protein expression and folding, delaying the biochemical characterization of new natural products. Current strategies to achieve soluble, active expression of these enzymes in tractable hosts can be a lengthy trial-and-error process. Cell-free expression (CFE) has emerged as a valuable expression platform and testbed for rapidly prototyping expression parameters. Here, we use a type III polyketide synthase from Streptomyces griseus, RppA, which catalyzes the formation of the red pigment flaviolin, as a reporter to investigate BGC refactoring techniques. We applied a library of constructs with different combinations of promoters and rppA coding sequences to investigate the synergies between promoter and codon usage. Subsequently, we assess the utility of cell-free systems for prototyping these refactoring tactics prior to their implementation in cells. Overall, codon harmonization improves natural product synthesis more than traditional codon optimization across cell-free and cellular environments. More importantly, the choice of coding sequences and promoters impacts protein expression synergistically, which should be considered for future efforts to use CFE for high-yield protein expression. The promoter strategy when applied to RppA was not completely correlated with that observed with GFP, indicating that different promoter strategies should be applied for different proteins. In vivo experiments suggest that there is correlation, but not complete alignment, between expression in cell-free systems and in vivo. Refactoring promoters and/or coding sequences via CFE can be a valuable strategy to rapidly screen for catalytically functional production of enzymes from BGCs, which advances CFE as a tool for natural product research.
Affiliation(s)
- Tien T Sword
  - Department of Chemistry, University of Tennessee-Knoxville, Knoxville, TN, USA
- Jaime Lorenzo N Dinglasan
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
  - Graduate School of Genome Science and Technology, University of Tennessee-Knoxville, Knoxville, TN, USA
- Ghaeath S K Abbas
  - Department of Chemistry, University of Tennessee-Knoxville, Knoxville, TN, USA
  - School of Chemistry, University of Sydney, Sydney, NSW, Australia
- J William Barker
  - Department of Chemistry, University of Tennessee-Knoxville, Knoxville, TN, USA
- Madeline E Spradley
  - Department of Biochemistry, Cellular, and Molecular Biology, University of Tennessee-Knoxville, Knoxville, TN, USA
- Elijah R Greene
  - Department of Chemistry, University of Tennessee-Knoxville, Knoxville, TN, USA
- Damian S Gooden
  - Department of Chemistry, University of Tennessee-Knoxville, Knoxville, TN, USA
- Scott J Emrich
  - Graduate School of Genome Science and Technology, University of Tennessee-Knoxville, Knoxville, TN, USA
  - Department of Electrical Engineering and Computer Science, University of Tennessee-Knoxville, Knoxville, TN, USA
  - Department of Ecology and Evolutionary Biology, University of Tennessee-Knoxville, Knoxville, TN, USA
- Michael A Gilchrist
  - Graduate School of Genome Science and Technology, University of Tennessee-Knoxville, Knoxville, TN, USA
  - Department of Ecology and Evolutionary Biology, University of Tennessee-Knoxville, Knoxville, TN, USA
- Mitchel J Doktycz
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
  - Graduate School of Genome Science and Technology, University of Tennessee-Knoxville, Knoxville, TN, USA
- Constance B Bailey
  - Department of Chemistry, University of Tennessee-Knoxville, Knoxville, TN, USA
  - Graduate School of Genome Science and Technology, University of Tennessee-Knoxville, Knoxville, TN, USA
  - School of Chemistry, University of Sydney, Sydney, NSW, Australia
4
Kotari I, Kosiol C, Borges R. The Patterns of Codon Usage between Chordates and Arthropods are Different but Co-evolving with Mutational Biases. Mol Biol Evol 2024; 41:msae080. [PMID: 38667829 PMCID: PMC11108087 DOI: 10.1093/molbev/msae080] [Received: 05/05/2023] [Revised: 03/22/2024] [Accepted: 04/15/2024] [Indexed: 05/22/2024] Open
Abstract
Different frequencies amongst codons that encode the same amino acid (i.e. synonymous codons) have been observed in multiple species. Studies focused on uncovering the forces that drive such codon usage showed that a combined effect of mutational biases and translational selection works to produce different frequencies of synonymous codons. However, only few have been able to measure and distinguish between these forces that may leave similar traces on the coding regions. Here, we have developed a codon model that allows the disentangling of mutation, selection on amino acids and synonymous codons, and GC-biased gene conversion (gBGC) which we employed on an extensive dataset of 415 chordates and 191 arthropods. We found that chordates need 15 more synonymous codon categories than arthropods to explain the empirical codon frequencies, which suggests that the extent of codon usage can vary greatly between animal phyla. Moreover, methylation at CpG sites seems to partially explain these patterns of codon usage in chordates but not in arthropods. Despite the differences between the two phyla, our findings demonstrate that in both, GC-rich codons are disfavored when mutations are GC-biased, and the opposite is true when mutations are AT-biased. This indicates that selection on the genomic coding regions might act primarily to stabilize its GC/AT content on a genome-wide level. Our study shows that the degree of synonymous codon usage varies considerably among animals, but is likely governed by a common underlying dynamic.
Affiliation(s)
- Ioanna Kotari
  - Institut für Populationsgenetik, University of Veterinary Medicine, Veterinärplatz 1, Vienna 1210, Austria
  - Vienna Graduate School of Population Genetics, Vienna, Austria
- Carolin Kosiol
  - Centre for Biological Diversity, School of Biology, University of St Andrews, Fife KY16 9TH, UK
- Rui Borges
  - Institut für Populationsgenetik, University of Veterinary Medicine, Veterinärplatz 1, Vienna 1210, Austria
5
Akeju OJ, Cope AL. Re-examining Correlations Between Synonymous Codon Usage and Protein Bond Angles in Escherichia coli. Genome Biol Evol 2024; 16:evae080. [PMID: 38619010 PMCID: PMC11077309 DOI: 10.1093/gbe/evae080] [Received: 07/16/2023] [Revised: 04/05/2024] [Accepted: 04/10/2024] [Indexed: 04/16/2024] Open
Abstract
Rosenberg AA, Marx A, Bronstein AM (Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon. Nat Commun. 2022:13:2815) recently found a surprising correlation between synonymous codon usage and the dihedral bond angles of the resulting amino acid. However, their analysis did not account for the strongest known correlate of codon usage: gene expression. We re-examined the relationship between bond angles and codon usage by applying the approach of Rosenberg et al. to simulated protein-coding sequences that (i) have random codon usage, (ii) have codon usage determined by mutation biases, and (iii) maintain the general relationship between codon usage and gene expression via the assumption of selection-mutation-drift equilibrium. We observed correlations between dihedral bond angle and codon usage when codon usage is entirely random, indicating possible conflation of noise with differences in bond angle distributions between synonymous codons. More relevant to the general analysis of codon usage patterns, we found surprisingly good agreement between the analysis of the real sequences and the analysis of sequences simulated assuming selection-mutation-drift equilibrium, with 91% of significant synonymous codon pairs detected in the former also detected in the latter. We believe the correlation between codon usage and dihedral bond angles resulted from the variation in codon usage across genes due to the interplay between mutation bias, natural selection for translation efficiency, and gene expression, further underscoring that these factors must be controlled for when looking for novel patterns related to codon usage.
Affiliation(s)
- Alexander L Cope
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, USA
  - Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, USA
6
Sword TT, Dinglasan JLN, Abbas GS, William Barker J, Spradley ME, Greene ER, Gooden DS, Emrich SJ, Gilchrist MA, Doktycz MJ, Bailey CB. Profiling Expression Strategies for a Type III Polyketide Synthase in a Lysate-Based, Cell-free System. bioRxiv 2023:2023.11.30.569483. [PMID: 38077034 PMCID: PMC10705458 DOI: 10.1101/2023.11.30.569483] [Indexed: 12/17/2023]
Abstract
Some of the most metabolically diverse species of bacteria (e.g., Actinobacteria) have higher GC content in their DNA, differ substantially in codon usage, and have distinct protein folding environments compared to tractable expression hosts like Escherichia coli. Consequently, expressing biosynthetic gene clusters (BGCs) from these bacteria in E. coli frequently results in a myriad of unpredictable issues with protein expression and folding, delaying the biochemical characterization of new natural products. Current strategies to achieve soluble, active expression of these enzymes in tractable hosts, such as BGC refactoring, can be a lengthy trial-and-error process. Cell-free expression (CFE) has emerged as 1) a valuable expression platform for enzymes that are challenging to synthesize in vivo, and as 2) a testbed for rapid prototyping that can improve cellular expression. Here, we use a type III polyketide synthase from Streptomyces griseus, RppA, which catalyzes the formation of the red pigment flaviolin, as a reporter to investigate BGC refactoring techniques. We synergistically tune promoter and codon usage to improve flaviolin production from cell-free expressed RppA. We then assess the utility of cell-free systems for prototyping these refactoring tactics prior to their implementation in cells. Overall, codon harmonization improves natural product synthesis more than traditional codon optimization across cell-free and cellular environments. Refactoring promoters and/or coding sequences via CFE can be a valuable strategy to rapidly screen for catalytically functional production of enzymes from BGCs. By showing the coordinators between CFE versus in vivo expression, this work advances CFE as a tool for natural product research.
Affiliation(s)
- Tien T. Sword
  - Department of Chemistry, University of Tennessee-Knoxville (Knoxville, TN USA)
- Jaime Lorenzo N. Dinglasan
  - Biosciences Division, Oak Ridge National Laboratory (Oak Ridge, TN USA)
  - Graduate School of Genome Science & Technology, University of Tennessee-Knoxville (Knoxville, TN USA)
- Ghaeath S.K. Abbas
  - Department of Chemistry, University of Tennessee-Knoxville (Knoxville, TN USA)
  - University of Sydney, School of Chemistry (Sydney, NSW, Australia)
- J. William Barker
  - Department of Chemistry, University of Tennessee-Knoxville (Knoxville, TN USA)
- Madeline E. Spradley
  - Department of Biochemistry, Cellular, and Molecular Biology, University of Tennessee-Knoxville (Knoxville, TN USA)
- Elijah R. Greene
  - Department of Chemistry, University of Tennessee-Knoxville (Knoxville, TN USA)
- Damian S. Gooden
  - Department of Chemistry, University of Tennessee-Knoxville (Knoxville, TN USA)
- Scott J. Emrich
  - Graduate School of Genome Science & Technology, University of Tennessee-Knoxville (Knoxville, TN USA)
  - Department of Electrical Engineering and Computer Science, University of Tennessee-Knoxville (Knoxville, TN USA)
  - Department of Ecology & Evolutionary Biology, University of Tennessee-Knoxville (Knoxville, TN USA)
- Michael A. Gilchrist
  - Graduate School of Genome Science & Technology, University of Tennessee-Knoxville (Knoxville, TN USA)
  - Department of Ecology & Evolutionary Biology, University of Tennessee-Knoxville (Knoxville, TN USA)
- Mitchel J. Doktycz
  - Biosciences Division, Oak Ridge National Laboratory (Oak Ridge, TN USA)
  - Graduate School of Genome Science & Technology, University of Tennessee-Knoxville (Knoxville, TN USA)
- Constance B. Bailey
  - Department of Chemistry, University of Tennessee-Knoxville (Knoxville, TN USA)
  - Graduate School of Genome Science & Technology, University of Tennessee-Knoxville (Knoxville, TN USA)
  - University of Sydney, School of Chemistry (Sydney, NSW, Australia)
7
Triandafillou CG, Pan RW, Dinner AR, Drummond DA. Pervasive, conserved secondary structure in highly charged protein regions. PLoS Comput Biol 2023; 19:e1011565. [PMID: 37844070 PMCID: PMC10602382 DOI: 10.1371/journal.pcbi.1011565] [Received: 03/01/2023] [Revised: 10/26/2023] [Accepted: 10/02/2023] [Indexed: 10/18/2023] Open
Abstract
Understanding how protein sequences confer function remains a defining challenge in molecular biology. Two approaches have yielded enormous insight yet are often pursued separately: structure-based, where sequence-encoded structures mediate function, and disorder-based, where sequences dictate physicochemical and dynamical properties which determine function in the absence of stable structure. Here we study highly charged protein regions (>40% charged residues), which are routinely presumed to be disordered. Using recent advances in structure prediction and experimental structures, we show that roughly 40% of these regions form well-structured helices. Features often used to predict disorder (high charge density, low hydrophobicity, low sequence complexity, and evolutionarily varying length) are also compatible with solvated, variable-length helices. We show that a simple composition classifier predicts the existence of structure far better than well-established heuristics based on charge and hydropathy. We show that helical structure is more prevalent than previously appreciated in highly charged regions of diverse proteomes and characterize the conservation of highly charged regions. Our results underscore the importance of integrating, rather than choosing between, structure- and disorder-based approaches.
Affiliation(s)
- Catherine G. Triandafillou
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Rosalind Wenshan Pan
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Aaron R. Dinner
  - Department of Chemistry, University of Chicago, Chicago, Illinois, United States of America
- D. Allan Drummond
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Medicine, Section of Genetic Medicine, The University of Chicago, Chicago, Illinois, United States of America
8
Triandafillou CG, Pan RW, Dinner AR, Drummond DA. Pervasive, conserved secondary structure in highly charged protein regions. bioRxiv 2023:2023.02.15.528637. [PMID: 36824805 PMCID: PMC9949069 DOI: 10.1101/2023.02.15.528637] [Indexed: 02/17/2023]
Abstract
Understanding how protein sequences confer function remains a defining challenge in molecular biology. Two approaches have yielded enormous insight yet are often pursued separately: structure-based, where sequence-encoded structures mediate function, and disorder-based, where sequences dictate physicochemical and dynamical properties which determine function in the absence of stable structure. Here we study highly charged protein regions (>40% charged residues), which are routinely presumed to be disordered. Using recent advances in structure prediction and experimental structures, we show that roughly 40% of these regions form well-structured helices. Features often used to predict disorder (high charge density, low hydrophobicity, low sequence complexity, and evolutionarily varying length) are also compatible with solvated, variable-length helices. We show that a simple composition classifier predicts the existence of structure far better than well-established heuristics based on charge and hydropathy. We show that helical structure is more prevalent than previously appreciated in highly charged regions of diverse proteomes and characterize the conservation of highly charged regions. Our results underscore the importance of integrating, rather than choosing between, structure- and disorder-based approaches.
Affiliation(s)
- Rosalind Wenshan Pan
  - Department of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA
- D. Allan Drummond
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL