1
Bernaola-Galván P, Carpena P, Gómez-Martín C, Oliver JL. Compositional Structure of the Genome: A Review. Biology (Basel) 2023; 12:849. [PMID: 37372134 DOI: 10.3390/biology12060849] [Received: 04/21/2023] [Revised: 06/06/2023] [Accepted: 06/07/2023] [Indexed: 06/29/2023]
Abstract
As the genome carries the historical information of a species' biotic and environmental interactions, analyzing changes in genome structure over time by using powerful statistical physics methods (such as entropic segmentation algorithms, fluctuation analysis in DNA walks, or measures of compositional complexity) provides valuable insights into genome evolution. Nucleotide frequencies tend to vary along the DNA chain, resulting in a hierarchically patchy chromosome structure with heterogeneities at different length scales that range from a few nucleotides to tens of millions of them. Fluctuation analysis reveals that these compositional structures can be classified into three main categories: (1) short-range heterogeneities (below a few kilobase pairs (Kbp)) primarily attributed to the alternation of coding and noncoding regions, interspersed or tandem repeats densities, etc.; (2) isochores, spanning tens to hundreds of tens of Kbp; and (3) superstructures, reaching sizes of tens of megabase pairs (Mbp) or even larger. The obtained isochore and superstructure coordinates in the first complete T2T human sequence are now shared in a public database. In this way, interested researchers can use T2T isochore data, as well as the annotations for different genome elements, to check a specific hypothesis about genome structure. Similarly to other levels of biological organization, a hierarchical compositional structure is prevalent in the genome. Once the compositional structure of a genome is identified, various measures can be derived to quantify the heterogeneity of such structure. The distribution of segment G+C content has recently been proposed as a new genome signature that proves to be useful for comparing complete genomes. Another meaningful measure is the sequence compositional complexity (SCC), which has been used for genome structure comparisons. Lastly, we review the recent genome comparisons in species of the ancient phylum Cyanobacteria, conducted by phylogenetic regression of SCC against time, which have revealed positive trends towards higher genome complexity. These findings provide the first evidence for a driven progressive evolution of genome compositional structure.
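The entropic segmentation mentioned in the abstract rests on finding the cut point that maximizes the Jensen-Shannon divergence between the nucleotide compositions on either side. A minimal Python sketch of that single step follows. It is not the authors' implementation: the names `shannon_entropy`, `best_js_cut`, and `gc_content` are hypothetical, and the sketch omits the statistical significance test and the recursive splitting that a full segmentation would need.

```python
from collections import Counter
from math import log2


def shannon_entropy(counts, total):
    """Shannon entropy (in bits) of a composition given symbol counts."""
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def best_js_cut(seq, min_len=1):
    """Return (cut, jsd) where seq[:cut] and seq[cut:] have maximal
    Jensen-Shannon divergence. Illustrative helper, not a published API."""
    n = len(seq)
    total = Counter(seq)
    h_total = shannon_entropy(total, n)
    left = Counter()          # composition of seq[:i]
    right = Counter(total)    # composition of seq[i:]
    best_cut, best_jsd = None, -1.0
    for i, base in enumerate(seq[:-1], start=1):
        left[base] += 1
        right[base] -= 1
        if i < min_len or n - i < min_len:
            continue
        # JSD = H(whole) - weighted sum of the entropies of the two parts
        jsd = (h_total
               - (i / n) * shannon_entropy(left, i)
               - ((n - i) / n) * shannon_entropy(right, n - i))
        if jsd > best_jsd:
            best_cut, best_jsd = i, jsd
    return best_cut, best_jsd


def gc_content(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    # Toy sequence: an AT-only block followed by a GC-only block.
    toy = "AT" * 50 + "GC" * 50
    cut, jsd = best_js_cut(toy)
    print(cut, round(jsd, 3), gc_content(toy[:cut]), gc_content(toy[cut:]))
```

On the toy sequence the cut falls at the boundary between the two blocks, and the G+C content of each resulting segment is the kind of per-segment value whose distribution the abstract describes as a genome signature.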
Affiliation(s)
- Pedro Bernaola-Galván
  - Department of Applied Physics II and Institute Carlos I for Theoretical and Computational Physics, University of Málaga, 29071 Málaga, Spain
- Pedro Carpena
  - Department of Applied Physics II and Institute Carlos I for Theoretical and Computational Physics, University of Málaga, 29071 Málaga, Spain
- Cristina Gómez-Martín
  - Department of Pathology, Cancer Center Amsterdam, Amsterdam UMC, Vrije Universiteit Amsterdam, 1081 HV Amsterdam, The Netherlands
  - Department of Genetics, Faculty of Sciences, 18071 and Laboratory of Bioinformatics, Institute of Biotechnology, Center of Biomedical Research, University of Granada, 18100 Granada, Spain
- Jose L Oliver
  - Department of Genetics, Faculty of Sciences, 18071 and Laboratory of Bioinformatics, Institute of Biotechnology, Center of Biomedical Research, University of Granada, 18100 Granada, Spain
2
Steube N, Moldenhauer M, Weiland P, Saman D, Kilb A, Ramírez Rojas AA, Garg SG, Schindler D, Graumann PL, Benesch JLP, Bange G, Friedrich T, Hochberg GKA. Fortuitously compatible protein surfaces primed allosteric control in cyanobacterial photoprotection. Nat Ecol Evol 2023; 7:756-767. [PMID: 37012377 PMCID: PMC10172135 DOI: 10.1038/s41559-023-02018-8] [Received: 09/05/2022] [Accepted: 02/21/2023] [Indexed: 04/05/2023]
Abstract
Highly specific interactions between proteins are a fundamental prerequisite for life, but how they evolve remains an unsolved problem. In particular, interactions between initially unrelated proteins require that they evolve matching surfaces. It is unclear whether such surface compatibilities can only be built by selection in small incremental steps, or whether they can also emerge fortuitously. Here, we used molecular phylogenetics, ancestral sequence reconstruction and biophysical characterization of resurrected proteins to retrace the evolution of an allosteric interaction between two proteins that act in the cyanobacterial photoprotection system. We show that this interaction between the orange carotenoid protein (OCP) and its unrelated regulator, the fluorescence recovery protein (FRP), evolved when a precursor of FRP was horizontally acquired by cyanobacteria. FRP's precursors could already interact with and regulate OCP even before these proteins first encountered each other in an ancestral cyanobacterium. The OCP-FRP interaction exploits an ancient dimer interface in OCP, which also predates the recruitment of FRP into the photoprotection system. Together, our work shows how evolution can fashion complex regulatory systems easily out of pre-existing components.
Affiliation(s)
- Niklas Steube
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Marcus Moldenhauer
  - Institute of Chemistry PC14, Technische Universität Berlin, Berlin, Germany
- Paul Weiland
  - Department of Chemistry, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), Marburg, Germany
- Dominik Saman
  - Department of Chemistry, Oxford University, Oxford, UK
  - Kavli Institute for Nanoscience Discovery, Oxford University, Oxford, UK
- Alexandra Kilb
  - Department of Chemistry, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), Marburg, Germany
- Sriram G Garg
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Daniel Schindler
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), Marburg, Germany
- Peter L Graumann
  - Department of Chemistry, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), Marburg, Germany
- Justin L P Benesch
  - Department of Chemistry, Oxford University, Oxford, UK
  - Kavli Institute for Nanoscience Discovery, Oxford University, Oxford, UK
- Gert Bange
  - Department of Chemistry, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), Marburg, Germany
- Thomas Friedrich
  - Institute of Chemistry PC14, Technische Universität Berlin, Berlin, Germany
- Georg K A Hochberg
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Department of Chemistry, University of Marburg, Marburg, Germany
  - Center for Synthetic Microbiology (SYNMIKRO), Marburg, Germany
3
de la Fuente R, Díaz-Villanueva W, Arnau V, Moya A. Genomic Signature in Evolutionary Biology: A Review. Biology (Basel) 2023; 12:322. [PMID: 36829597 PMCID: PMC9953303 DOI: 10.3390/biology12020322] [Received: 12/08/2022] [Revised: 02/11/2023] [Accepted: 02/13/2023] [Indexed: 02/19/2023]
Abstract
Organisms are unique physical entities in which information is stored and continuously processed. The digital nature of DNA sequences enables the construction of a dynamic information reservoir. However, the distinction between the hardware and software components in the information flow is crucial to identify the mechanisms generating specific genomic signatures. In this work, we perform a bibliometric analysis to identify the different purposes of looking for particular patterns in DNA sequences associated with a given phenotype. This study has enabled us to make a conceptual breakdown of the genomic signature and differentiate the leading applications. On the one hand, it refers to gene expression profiling associated with a biological function, which may be shared across taxa. This signature is the focus of study in precision medicine. On the other hand, it also refers to characteristic patterns in species-specific DNA sequences. This interpretation plays a key role in comparative genomics, identifying evolutionary relationships. Looking at the relevant studies in our bibliographic database, we highlight the main factors causing heterogeneities in genome composition and how they can be quantified. All these findings lead us to reformulate some questions relevant to evolutionary biology.
Affiliation(s)
- Rebeca de la Fuente
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
  - Correspondence:
- Wladimiro Díaz-Villanueva
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Vicente Arnau
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Andrés Moya
  - Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
  - Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
  - CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
4
Mohanta TK, Mohanta YK, Avula SK, Nongbet A, Al-Harrasi A. Virtual 2D map of cyanobacterial proteomes. PLoS One 2022; 17:e0275148. [PMID: 36190972 DOI: 10.1371/journal.pone.0275148] [Received: 05/03/2022] [Accepted: 09/12/2022] [Indexed: 11/05/2022]
Abstract
Cyanobacteria are prokaryotic Gram-negative organisms prevalent in nearly all habitats. A detailed proteomics study of Cyanobacteria has not been conducted despite extensive study of their genome sequences. Therefore, we conducted a proteome-wide analysis of the Cyanobacteria proteome and found Calothrix desertica as the largest (680331.825 kDa) and Candidatus Synechococcus spongiarum as the smallest (42726.77 kDa) proteome of the cyanobacterial kingdom. A cyanobacterial proteome encodes an average of 312.018 amino acids per protein, with a molecular weight of 182173.1324 kDa per proteome. The isoelectric point (pI) of the cyanobacterial proteome ranges from 2.13 to 13.32. It was found that the cyanobacterial proteome encodes a greater number of acidic-pI proteins, and their average pI is 6.437. The proteins with higher pI are likely to contain repetitive amino acids. A virtual 2D map of the cyanobacterial proteome showed a bimodal distribution of molecular weight and pI. Several proteins within the cyanobacterial proteome were found to encode the amino acid selenocysteine (Sec), while pyrrolysine was not detected. The study can enable us to generate a high-resolution cell map to monitor proteomic dynamics. Through this computational analysis, we can gain a better understanding of the bias in codon usage by analyzing the amino acid composition of the cyanobacterial proteome.
5
Walker PL, Pakrasi HB. A Ubiquitously Conserved Cyanobacterial Protein Phosphatase Essential for High Light Tolerance in a Fast-Growing Cyanobacterium. Microbiol Spectr 2022;:e0100822. [PMID: 35727069 DOI: 10.1128/spectrum.01008-22] [Indexed: 11/20/2022]
Abstract
Synechococcus elongatus UTEX 2973, the fastest-growing cyanobacterial strain known, optimally grows under extreme high light (HL) intensities of 1,500-2,500 μmol photons m⁻² s⁻¹, which is lethal to most other photosynthetic microbes. We leveraged the few genetic differences between Synechococcus 2973 and the HL sensitive strain Synechococcus elongatus PCC 7942 to unravel factors essential for the high light tolerance. We identified a novel protein in Synechococcus 2973 that we have termed HltA for High light tolerance protein A. Using bioinformatic tools, we determined that HltA contains a functional PP2C-type protein phosphatase domain. Phylogenetic analysis showed that the PP2C domain belongs to the bacterial-specific Group II family and is closely related to the environmental stress response phosphatase RsbU. Additionally, we showed that unlike any previously described phosphatases, HltA contains a single N-terminal regulatory GAF domain. We found hltA to be ubiquitous throughout cyanobacteria, indicative of its potentially important role in the photosynthetic lifestyle of these oxygenic phototrophs. Mutations in the hltA gene resulted in severe defects specific to high light growth. These results provide evidence that hltA is a key factor in the tolerance of Synechococcus 2973 to high light and will open new insights into the mechanisms of cyanobacterial light stress response.
IMPORTANCE Cyanobacteria are a diverse group of photosynthetic prokaryotes. The cyanobacterium Synechococcus 2973 is a high light tolerant strain with industrial promise due to its fast growth under high light conditions and the availability of genetic modification tools. Currently, little is known about the high light tolerance mechanisms of Synechococcus 2973, and there are many unknowns overall regarding high light tolerance of cyanobacteria. In this study, a comparative genomic analysis of Synechococcus 2973 identified a single nucleotide polymorphism in a locus encoding a serine phosphatase as a key factor for high light tolerance. This novel GAF-containing phosphatase was found to be the sole Group II metal-dependent protein phosphatase that is evolutionarily conserved throughout cyanobacteria. These results shed new light on the light response mechanisms of Synechococcus 2973, improving our understanding of environmental stress response. Additionally, this work will help facilitate the development of Synechococcus 2973 as an industrially useful organism.