1
Bogomolov A, Zolotareva K, Filonov S, Chadaeva I, Rasskazov D, Sharypova E, Podkolodnyy N, Ponomarenko P, Savinkova L, Tverdokhleb N, Khandaev B, Kondratyuk E, Podkolodnaya O, Zemlyanskaya E, Kolchanov NA, Ponomarenko M. AtSNP_TATAdb: Candidate Molecular Markers of Plant Advantages Related to Single Nucleotide Polymorphisms within Proximal Promoters of Arabidopsis thaliana L. Int J Mol Sci 2024; 25:607. [PMID: 38203780 PMCID: PMC10779315 DOI: 10.3390/ijms25010607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/18/2023] [Accepted: 12/30/2023] [Indexed: 01/12/2024]
Abstract
The mainstream of the post-genome target-assisted breeding in crop plant species includes biofortification such as high-throughput phenotyping along with genome-based selection. Therefore, in this work, we used the Web-service Plant_SNP_TATA_Z-tester, which we have previously developed, to run a uniform in silico analysis of the transcriptional alterations of 54,013 protein-coding transcripts from 32,833 Arabidopsis thaliana L. genes caused by 871,707 SNPs located in the proximal promoter region. The analysis identified 54,993 SNPs as significantly decreasing or increasing gene expression through changes in TATA-binding protein affinity to the promoters. The existence of these SNPs in highly conserved proximal promoters may be explained as intraspecific diversity kept by the stabilizing natural selection. To support this, we hand-annotated papers on some of the Arabidopsis genes possessing these SNPs or on their orthologs in other plant species and demonstrated the effects of changes in these gene expressions on plant vital traits. We integrated in silico estimates of the TBP-promoter affinity in the AtSNP_TATAdb knowledge base and showed their significant correlations with independent in vivo experimental data. These correlations appeared to be robust to variations in statistical criteria, genomic environment of TATA box regions, plants species and growing conditions.
Affiliation(s)
- Anton Bogomolov: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Karina Zolotareva: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Sergey Filonov: Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Natural Science Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Irina Chadaeva: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Dmitry Rasskazov: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Ekaterina Sharypova: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Nikolay Podkolodnyy: Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Institute of Computational Mathematics and Mathematical Geophysics, Novosibirsk 630090, Russia
- Petr Ponomarenko: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Ludmila Savinkova: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Natalya Tverdokhleb: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Bato Khandaev: Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Natural Science Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Ekaterina Kondratyuk: Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Siberian Federal Scientific Centre of Agro-BioTechnologies of the Russian Academy of Sciences, Krasnoobsk 630501, Novosibirsk Region, Russia
- Olga Podkolodnaya: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
- Elena Zemlyanskaya: Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Natural Science Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Nikolay A. Kolchanov: Institute of Cytology and Genetics, Novosibirsk 630090, Russia; Natural Science Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Mikhail Ponomarenko: Institute of Cytology and Genetics, Novosibirsk 630090, Russia
2
Rasskazov D, Chadaeva I, Sharypova E, Zolotareva K, Khandaev B, Ponomarenko P, Podkolodnyy N, Tverdokhleb N, Vishnevsky O, Bogomolov A, Podkolodnaya O, Savinkova L, Zemlyanskaya E, Golubyatnikov V, Kolchanov N, Ponomarenko M. Plant_SNP_TATA_Z-Tester: A Web Service That Unequivocally Estimates the Impact of Proximal Promoter Mutations on Plant Gene Expression. Int J Mol Sci 2022; 23:8684. [PMID: 35955817 PMCID: PMC9369029 DOI: 10.3390/ijms23158684] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/30/2022] [Revised: 08/01/2022] [Accepted: 08/03/2022] [Indexed: 11/16/2022]
Abstract
Synthetic targeted optimization of plant promoters is becoming a part of progress in mainstream postgenomic agriculture along with hybridization of cultivated plants with wild congeners, as well as marker-assisted breeding. Therefore, here, for the first time, we compiled all the experimental data—on mutational effects in plant proximal promoters on gene expression—that we could find in PubMed. Some of these datasets cast doubt on both the existence and the uniqueness of the sought solution, which could unequivocally estimate effects of proximal promoter mutation on gene expression when plants are grown under various environmental conditions during their development. This means that the inverse problem under study is ill-posed. Furthermore, we found experimental data on in vitro interchangeability of plant and human TATA-binding proteins allowing the application of Tikhonov’s regularization, making this problem well-posed. Within these frameworks, we created our Web service Plant_SNP_TATA_Z-tester and then determined the limits of its applicability using those data that cast doubt on both the existence and the uniqueness of the sought solution. We confirmed that the effects (of proximal promoter mutations on gene expression) predicted by Plant_SNP_TATA_Z-tester correlate statistically significantly with all the experimental data under study. Lastly, we exemplified an application of Plant_SNP_TATA_Z-tester to agriculturally valuable mutations in plant promoters.
Affiliation(s)
- Irina Chadaeva: Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Bato Khandaev: Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Nikolay Podkolodnyy: Institute of Cytology and Genetics, 630090 Novosibirsk, Russia; Institute of Computational Mathematics and Mathematical Geophysics, 630090 Novosibirsk, Russia
- Oleg Vishnevsky: Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Anton Bogomolov: Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Mikhail Ponomarenko: Institute of Cytology and Genetics, 630090 Novosibirsk, Russia. Correspondence: Tel.: +7-(383)-363-4963 (ext. 1311)
3
Ficarelli M, Antzin-Anduetza I, Hugh-White R, Firth AE, Sertkaya H, Wilson H, Neil SJD, Schulz R, Swanson CM. CpG Dinucleotides Inhibit HIV-1 Replication through Zinc Finger Antiviral Protein (ZAP)-Dependent and -Independent Mechanisms. J Virol 2020; 94:e01337-19. [PMID: 31748389 PMCID: PMC7158733 DOI: 10.1128/jvi.01337-19] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 08/09/2019] [Accepted: 11/06/2019] [Indexed: 02/07/2023]
Abstract
CpG dinucleotides are suppressed in the genomes of many vertebrate RNA viruses, including HIV-1. The cellular antiviral protein ZAP (zinc finger antiviral protein) binds CpGs and inhibits HIV-1 replication when CpGs are introduced into the viral genome. However, it is not known if ZAP-mediated restriction is the only mechanism driving CpG suppression. To determine how CpG dinucleotides affect HIV-1 replication, we increased their abundance in multiple regions of the viral genome and analyzed the effect on RNA expression, protein abundance, and infectious-virus production. We found that the antiviral effect of CpGs was not correlated with their abundance. Interestingly, CpGs inserted into some regions of the genome sensitize the virus to ZAP antiviral activity more efficiently than insertions into other regions, and this sensitivity can be modulated by interferon treatment or ZAP overexpression. Furthermore, the sensitivity of the virus to endogenous ZAP was correlated with its sensitivity to the ZAP cofactor KHNYN. Finally, we show that CpGs in some contexts can also inhibit HIV-1 replication by ZAP-independent mechanisms, and one of these is the activation of a cryptic splice site at the expense of a canonical splice site. Overall, we show that the location and sequence context of the CpG in the viral genome determines its antiviral activity.
IMPORTANCE: Some RNA virus genomes are suppressed in the nucleotide combination of a cytosine followed by a guanosine (CpG), indicating that they are detrimental to the virus. The antiviral protein ZAP binds viral RNA containing CpGs and prevents the virus from multiplying. However, it remains unknown how the number and position of CpGs in viral genomes affect restriction by ZAP and whether CpGs have other antiviral mechanisms. Importantly, manipulating the CpG content in viral genomes could help create new vaccines. HIV-1 shows marked CpG suppression, and by introducing CpGs into its genome, we show that ZAP efficiently targets a specific region of the viral genome, that the number of CpGs does not predict the magnitude of antiviral activity, and that CpGs can inhibit HIV-1 gene expression through a ZAP-independent mechanism. Overall, the position of CpGs in the HIV-1 genome determines the magnitude and mechanism through which they inhibit the virus.
Affiliation(s)
- Mattia Ficarelli: Department of Infectious Diseases, King's College London, London, United Kingdom
- Rupert Hugh-White: Department of Medical and Molecular Genetics, King's College London, London, United Kingdom
- Andrew E Firth: Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Helin Sertkaya: Department of Infectious Diseases, King's College London, London, United Kingdom
- Harry Wilson: Department of Infectious Diseases, King's College London, London, United Kingdom
- Stuart J D Neil: Department of Infectious Diseases, King's College London, London, United Kingdom
- Reiner Schulz: Department of Medical and Molecular Genetics, King's College London, London, United Kingdom
- Chad M Swanson: Department of Infectious Diseases, King's College London, London, United Kingdom
4
Antzin-Anduetza I, Mahiet C, Granger LA, Odendall C, Swanson CM. Increasing the CpG dinucleotide abundance in the HIV-1 genomic RNA inhibits viral replication. Retrovirology 2017; 14:49. [PMID: 29121951 PMCID: PMC5679385 DOI: 10.1186/s12977-017-0374-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 09/12/2017] [Accepted: 11/01/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The human immunodeficiency virus type 1 (HIV-1) structural protein Gag is necessary and sufficient to form viral particles. In addition to encoding the amino acid sequence for Gag, the underlying RNA sequence could encode cis-acting elements or nucleotide biases that are necessary for viral replication. Furthermore, RNA sequences that inhibit viral replication could be suppressed in gag. However, the functional relevance of RNA elements and nucleotide biases that promote or repress HIV-1 replication remain poorly understood.
RESULTS: To characterize if the RNA sequence in gag controls HIV-1 replication, the matrix (MA) region was codon modified, allowing the RNA sequence to be altered without affecting the protein sequence. Codon modification of nucleotides (nt) 22-261 or 22-378 in gag inhibited viral replication by decreasing genomic RNA (gRNA) abundance, gRNA stability, Gag expression, virion production and infectivity. Comparing the effect of these point mutations to deletions of the same region revealed that the mutations inhibited infectious virus production while the deletions did not. This demonstrated that codon modification introduced inhibitory sequences. There is a much lower than expected frequency of CpG dinucleotides in HIV-1 and codon modification introduced a substantial increase in CpG abundance. To determine if they are necessary for inhibition of HIV-1 replication, codons introducing CpG dinucleotides were mutated back to the wild type codon, which restored efficient Gag expression and infectious virion production. To determine if they are sufficient to inhibit viral replication, CpG dinucleotides were inserted into gag in the absence of other changes. The increased CpG dinucleotide content decreased HIV-1 infectivity and viral replication.
CONCLUSIONS: The HIV-1 RNA sequence contains low abundance of CpG dinucleotides. Increasing the abundance of CpG dinucleotides inhibits multiple steps of the viral life cycle, providing a functional explanation for why CpG dinucleotides are suppressed in HIV-1.
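The "lower than expected frequency of CpG dinucleotides" mentioned in this abstract is commonly quantified with an observed/expected ratio. The abstract does not say which measure the authors used, so the sketch below assumes the widely used form (N_CG x L) / (N_C x N_G); the function name `cpg_obs_exp` and the toy sequences are illustrative only and are not HIV-1 data.

```python
import math


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio, (N_CG * L) / (N_C * N_G) (assumed measure)."""
    s = seq.upper()
    n_c, n_g = s.count("C"), s.count("G")
    if n_c == 0 or n_g == 0:
        return math.nan  # the ratio is undefined without both C and G
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    return s.count("CG") * len(s) / (n_c * n_g)


# Hypothetical toy sequences with identical base composition.
print(cpg_obs_exp("ACGTACGT"))  # CpG-rich arrangement
print(cpg_obs_exp("ACAGTCTG"))  # same bases, no CpG
```

A ratio well below 1 indicates CpG suppression relative to what the base composition alone would predict.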
Affiliation(s)
- Irati Antzin-Anduetza: Department of Infectious Diseases, King's College London, 3rd Floor Borough Wing, Guy's Hospital, London, SE1 9RT, UK
- Charlotte Mahiet: Department of Infectious Diseases, King's College London, 3rd Floor Borough Wing, Guy's Hospital, London, SE1 9RT, UK
- Luke A Granger: Department of Infectious Diseases, King's College London, 3rd Floor Borough Wing, Guy's Hospital, London, SE1 9RT, UK
- Charlotte Odendall: Department of Infectious Diseases, King's College London, 3rd Floor Borough Wing, Guy's Hospital, London, SE1 9RT, UK
- Chad M Swanson: Department of Infectious Diseases, King's College London, 3rd Floor Borough Wing, Guy's Hospital, London, SE1 9RT, UK
5
Laprevotte I, Pupin M, Coward E, Didier G, Terzian C, Devauchelle C, Hénaut A. HIV-1 and HIV-2 LTR nucleotide sequences: assessment of the alignment by N-block presentation, "retroviral signatures" of overrepeated oligonucleotides, and a probable important role of scrambled stepwise duplications/deletions in molecular evolution. Mol Biol Evol 2001; 18:1231-45. [PMID: 11420363 DOI: 10.1093/oxfordjournals.molbev.a003909] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
Previous analyses of retroviral nucleotide sequences suggest a so-called "scrambled duplicative stepwise molecular evolution" (many sectors with successive duplications/deletions of short and longer motifs) that could have stemmed from one or several starter tandemly repeated short sequence(s). In the present report, we tested this hypothesis by focusing on the long terminal repeats (LTRs) (and flanking sequences) of 24 human and 3 simian immunodeficiency viruses. By using a calculation strategy applicable to short sequences, we found consensus overrepresented motifs (often containing CTG or CAG) that were congruent with the previously defined "retroviral signature." We also show many local repetition patterns that are significant when compared with simply shuffled sequences. First- and second-order Markov chain analyses demonstrate that a major portion of the overrepresented oligonucleotides can be predicted from the dinucleotide compositions of the sequences, but by no means can biological mechanisms be deduced from these results: some of the listed local repetitions remain significant against dinucleotide-conserving shuffled sequences; together with previous results, this suggests that interspersed and/or local mononucleotide and oligonucleotide repetitions could have biased the dinucleotide compositions of the sequences. We searched for suggestive evolutionary patterns by scrutinizing a reliable multiple alignment of the 27 sequences. A manually constructed alignment based on homology blocks was in good agreement with the polypeptide alignment in the coding sectors and has been exhaustively assessed by using a multiplied alphabet obtained by the promising mathematical strategy called the N-block presentation (taking into account the environment of each nucleotide in a sequence). Sector by sector, we hypothesize many successive duplication/deletion scenarios that fit our previous evolutionary hypotheses. This suggests an important duplication/deletion role for the reverse transcriptase, particularly in inducing stuttering cryptic simplicity patterns.
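To illustrate what "predicted from the dinucleotide compositions" can mean in a first-order Markov analysis, here is a minimal sketch using the standard estimator: the expected count of a word is the product of its overlapping dinucleotide counts divided by the product of its inner mononucleotide counts. The abstract does not describe the authors' exact procedure, so the function names and the toy sequence are assumptions made for illustration.

```python
from collections import Counter


def overlapping_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov1_expected(seq: str, word: str) -> float:
    """Expected count of `word` under a first-order Markov model of `seq`."""
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    mono = overlapping_counts(seq, 1)
    di = overlapping_counts(seq, 2)
    expected = 1.0
    for i in range(len(word) - 1):
        expected *= di[word[i:i + 2]]  # Counter returns 0 for unseen dinucleotides
    for base in word[1:-1]:
        if mono[base] == 0:
            return 0.0
        expected /= mono[base]
    return expected


# Hypothetical toy sequence: a perfect ACG repeat.
seq = "ACGACGACG"
observed = overlapping_counts(seq, 3)["ACG"]
print(observed, markov1_expected(seq, "ACG"))
```

An oligonucleotide whose observed count greatly exceeds this expectation is overrepresented beyond what the dinucleotide composition explains.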
Affiliation(s)
- I Laprevotte: Laboratoire Génome et Informatique, Université de Versailles Saint Quentin-en-Yvelines, Versailles, France
6
Häring D, Kypr J. Variations of the mononucleotide and short oligonucleotide distributions in the genomes of various organisms. J Theor Biol 1999; 201:141-56. [PMID: 10556022 DOI: 10.1006/jtbi.1999.1019] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
Abstract
We calculated the variation coefficients of the mononucleotide and short oligonucleotide distributions in over 1700 long genomic sequences originating from six organisms to demonstrate that the human and Escherichia coli genomic sequences were the least and the most uniform, respectively. The most non-random genomic distributions were exhibited by the four canonical nucleotides, followed by the strong and weak nucleotides, while the distributions of purine or pyrimidine nucleotides and especially the distributions of (A+C) and (G+T) were significantly more uniform even in the human genome. In the human and mouse genomes, the highest coefficients of variation were further observed with the oligonucleotides where CG was combined with the strong nucleotides while its combination with the weak nucleotides significantly decreased the variation which, however, was still very high. High variation was also exhibited by the remaining oligonucleotides composed exclusively of the strong nucleotides or those containing only weak nucleotides. On the other hand, the distributions of oligonucleotides containing similar and especially the same numbers of the strong and weak nucleotides, but no CG or TA dinucleotide, were the most uniform. The information following from the present analysis will be useful not only in the identification of important genomic regions but also in computer simulations of the genomic nucleotide sequences in order to trace and reproduce the pathways of genome evolution.
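As a rough illustration of a variation coefficient for an oligonucleotide distribution, the sketch below computes the frequency of a k-mer in each of several sequences and then the coefficient of variation (standard deviation divided by mean) across them. The abstract does not specify how frequencies were normalized or which standard deviation was used; the population standard deviation, the function names, and the toy sequences are assumptions.

```python
from statistics import mean, pstdev


def kmer_frequency(seq: str, kmer: str) -> float:
    """Fraction of overlapping windows of len(kmer) in seq that equal kmer."""
    k = len(kmer)
    windows = len(seq) - k + 1
    if windows <= 0:
        return 0.0
    hits = sum(seq[i:i + k] == kmer for i in range(windows))
    return hits / windows


def coefficient_of_variation(values) -> float:
    """Population standard deviation divided by the mean (assumed definition)."""
    m = mean(values)
    return pstdev(values) / m if m else float("nan")


# Hypothetical toy "genomic" sequences.
sequences = ["ACGT", "AAAA"]
freqs = [kmer_frequency(s, "A") for s in sequences]
print(freqs, coefficient_of_variation(freqs))
```

A higher coefficient means the k-mer is distributed less uniformly across the sequences being compared.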
Affiliation(s)
- D Häring: Academy of Sciences of the Czech Republic, Královopolská 135, Brno, CZ-61265, Czech Republic
7
Yamaguchi Y, Gojobori T. Evolutionary mechanisms and population dynamics of the third variable envelope region of HIV within single hosts. Proc Natl Acad Sci U S A 1997; 94:1264-9. [PMID: 9037041 PMCID: PMC19779 DOI: 10.1073/pnas.94.4.1264] [Citation(s) in RCA: 69] [Impact Index Per Article: 2.5] [Indexed: 02/03/2023]
Abstract
Clonal diversifications of HIV virus were monitored by periodic samplings on each of the six patients with regard to 183- to 335-bp segments of the env gene, which invariably included the functionally critical V3 region. Subsequently, six individual phylogenetic trees of viral variants were constructed. It was found that at one time or another during the course of disease progression, viral variants were inexplicably released from a strong negative selection against nonsynonymous base substitutions, possibly indicating positive selection. This resulted in concentrated amino acid substitutions at five specific sites within the V3 region. It was noted that these sites were often involved as antigenic determinants that provoked the host immune response and that these sites were also involved in the determination of viral phenotypes as to their cell tropism, syncytium formation capability, and replication rates.
Affiliation(s)
- Y Yamaguchi: Center for Information Biology, National Institute of Genetics, Mishima, Japan
8
9
Berkhout B. Structure and function of the human immunodeficiency virus leader RNA. Prog Nucleic Acid Res Mol Biol 1996; 54:1-34. [PMID: 8768071 DOI: 10.1016/s0079-6603(08)60359-1] [Citation(s) in RCA: 199] [Impact Index Per Article: 6.9] [Indexed: 02/02/2023]
Affiliation(s)
- B Berkhout: Department of Virology, Academic Medical Center, University of Amsterdam, The Netherlands
10
Berkhout B, van Hemert FJ. The unusual nucleotide content of the HIV RNA genome results in a biased amino acid composition of HIV proteins. Nucleic Acids Res 1994; 22:1705-11. [PMID: 8202375 PMCID: PMC308053 DOI: 10.1093/nar/22.9.1705] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.3] [Indexed: 01/29/2023]
Abstract
Extremely high frequencies of the A nucleotide are found in the RNA genomes of the lentivirus group of retroviruses. It is presently unknown what molecular force is responsible for this A-pressure. In this manuscript, we demonstrate a correlation between this 'A-pressure' and the amino acid-usage of the lentivirus family. We compared the amino acid composition of the Gag and Pol proteins of the human immunodeficiency viruses type 1 and 2 (HIV-1 and HIV-2) with that of the second group of human retroviruses; the human T-cell leukemia viruses type I and II (HTLV-I and HTLV-II). Differences in total amino acid content correlate with the preference for A-rich codons in the HIV genome. A pair-wise comparison of homologous amino acid positions in the Pol proteins indicates that both conservative and non-conservative changes can be accounted for by this A-bias. The putative molecular mechanism underlying this A-pressure and the evolutionary consequences are discussed.
Affiliation(s)
- B Berkhout: Department of Virology, University of Amsterdam, The Netherlands
11
Bronson EC, Anderson JN. Nucleotide composition as a driving force in the evolution of retroviruses. J Mol Evol 1994; 38:506-32. [PMID: 8028030 DOI: 10.1007/bf00178851] [Citation(s) in RCA: 56] [Impact Index Per Article: 1.8] [Indexed: 01/28/2023]
Abstract
All complete retrovirus sequences in the GenEMBL database were examined with the goal of assessing possible relationships between the nucleotide composition of retroviral genomes, the amino acid composition of retroviral proteins, and evolutionary strategies used by retroviruses. The results demonstrated that the genome of each viral lineage has a characteristic base composition and that the variations between groups are related to retroviral phylogeny. By analogy to microbial species, we suggest that the variations arise from group-specific patterns of directional mutations where the bias can be exerted on any of the four nucleotides. It is most likely that the mutational patterns are introduced during reverse transcription, and a direct participation of reverse transcriptase in the process is suspected. A straightforward strategy was used to analyze the compositional relationship between nucleotides and encoded amino acids. The procedure entailed calculations of amino acid frequencies from nucleotide content and the comparison of the calculated values to the observed amino acid frequencies in retroviruses. The results revealed an excellent correspondence between variation in genomic base composition and variation in amino acid composition of proteins with the compositional differences extending into all major coding regions of the viruses. Because of the magnitude and dispersion of these effects, and because of the nonconservative nature of many of the substitutions between groups with different genomic biases, we suggest that the variations in protein composition driven by biased nucleotide frequencies are an important factor in shaping the characteristic phenotypes of the different viral lineages. A clue to the nature of the evolutionary forces that are responsible for the generation of nucleotide biases was provided by the observation that viruses with radically different base frequencies most often inhabit the same cell type. This observation, along with analysis of amino acid and nucleotide replacement patterns between and within reverse transcriptase sequences from the various groups, permitted us to advance a model for the evolution of retroviruses. According to the model, speciation could initiate when daughter virions from a single progenitor vary in the direction of their mutational bias. These variations would exert a pleiotropic effect on the frequencies of nucleotides in all viral genes and consequently on the frequencies of amino acids in the encoded proteins. The variants with the most extreme compositional differences would have a selective advantage because their different precursor requirements would enable them to occupy different ecological niches within a single cell. (ABSTRACT TRUNCATED AT 400 WORDS)
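The "calculations of amino acid frequencies from nucleotide content" can be sketched as follows. The abstract does not give the exact procedure, so this example assumes the simplest model: the three codon positions are drawn independently from the genomic base frequencies, stop codons are excluded, and the standard genetic code applies. The function name `expected_aa_freqs` and the example compositions are illustrative only.

```python
from itertools import product

# Standard genetic code in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_aa_freqs(base_freqs: dict) -> dict:
    """Amino acid frequencies implied by base frequencies (independence assumed)."""
    raw = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons do not contribute to protein composition
        p = 1.0
        for base in codon:
            p *= base_freqs[base]
        raw[aa] = raw.get(aa, 0.0) + p
    total = sum(raw.values())
    return {aa: p / total for aa, p in raw.items()}


# Hypothetical compositions: uniform versus A-rich.
uniform = expected_aa_freqs({"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25})
a_rich = expected_aa_freqs({"A": 0.4, "C": 0.2, "G": 0.2, "T": 0.2})
print(round(uniform["K"], 4), round(a_rich["K"], 4))  # lysine uses AAA and AAG
```

Comparing such calculated values with observed protein composition is the kind of check the abstract describes; an A-rich genome, for example, is expected to encode more lysine under this model.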
Affiliation(s)
- E C Bronson: Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
12
Rodin S, Ohno S, Rodin A. On concerted origin of transfer RNAs with complementary anticodons. Orig Life Evol Biosph 1993; 23:393-418. [PMID: 7509479 DOI: 10.1007/bf01582088] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.5] [Indexed: 01/25/2023]
Abstract
Pairs of antiparallely oriented consensus tRNAs with complementary anticodons show surprisingly small numbers of mispairings within the 17-bp-long anticodon stem and loop region. Even smaller such complementary distances are shown by illegitimately complementary anticodons, i.e. those with allowed pairing between G and U bases. Accordingly, we suppose that transfer RNAs have emerged concertedly as complementary strands of primordial double helix-like RNA molecules. Replication of such molecules with illegitimately complementary anticodons might generate new synonymous codons for the same pair of amino acids. Logically, the idea of tRNA concerted origin dictates very ancient establishment of direct links between anticodons and the type of amino acids with which pre-tRNAs were to be charged. More specifically, anticodons (first of all, the 2nd base) could selectively target 'their' amino acids, reaction of acylating itself being performed by another non-specific site of pre-tRNA or even by another ribozyme. In all, the above findings and speculations are consistent with the hypercyclic concept (Eigen and Schuster, 1979), and throw new light on the genetic code origin and associated problems. Also favoring this idea are data on complementary codon usage patterns in different genomes.
Affiliation(s)
- S Rodin: Institute of Cytology & Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk
13
Rodin S, Ohno S, Rodin A. Transfer RNAs with complementary anticodons: could they reflect early evolution of discriminative genetic code adaptors? Proc Natl Acad Sci U S A 1993; 90:4723-7. [PMID: 8506325 PMCID: PMC46585 DOI: 10.1073/pnas.90.10.4723] [Citation(s) in RCA: 62] [Impact Index Per Article: 1.9] [Indexed: 01/31/2023]
Abstract
In accordance with the hypercycle theory of M. Eigen and P. Schuster [(1979) Hypercycle: A Principle of Natural Self-Organization (Springer, New York)], the ancestors of modern tRNAs appear to have emerged via the shortest possible way, both complementary strands of a short symmetrical double helix serving as pre-tRNAs with complementary anticodons. This conclusion is based upon results of comparative sequence analysis of the 17-base-long anticodon loop and stem of tRNAs totaling 896 and especially of 22 pairs of consensus tRNAs with complementary or quasi-complementary anticodons. With regard to the anticodon loop and stem of pairs of consensus tRNAs, complementary distances were considerably less than direct distances--i.e., antiparallel pairing invariably yielded fewer mismatches than direct pairing. Furthermore, the smallest complementary distance was detected when two antiparallel sequences formed irregular G-U bonds in their anticodon triplets. The above implies that pre-tRNAs in peribiotic times were long hairpin structures having 73 bases or more, the middle base of an anticodon being the center of symmetry. Accordingly, each pair of pre-tRNAs with complementary anticodons should have been almost identical with each other except for their three central bases. The above situation appears to have dictated the early establishment of direct links between anticodons and the type of amino acids with which tRNAs are to be charged. This direct link is still maintained between modern aminoacyl-tRNA synthetases and anticodons. Replication of the double helices concertedly generated new codons for the same pair of amino acids. Thus, occurrence of synonymous as well as certain "palindromic" features of the genetic code table might have been determined by this mechanism.
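The contrast between "direct" and "complementary" distances can be made concrete with a small sketch. The definitions here are assumptions inferred from the abstract: the direct distance counts positions where two equal-length RNA sequences differ when read in the same orientation, and the complementary distance counts positions that fail to base-pair when the second sequence is read antiparallel, optionally allowing G-U pairs. The function names and toy sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def direct_distance(a: str, b: str) -> int:
    """Mismatches when both sequences are compared in the same orientation."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def complementary_distance(a: str, b: str, allow_gu: bool = False) -> int:
    """Unpaired positions when b is aligned antiparallel to a."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_gu else WATSON_CRICK
    return sum((x, y) not in allowed for x, y in zip(a, reversed(b)))


# Hypothetical 5-nt toy sequences (real comparisons used 17-nt stem-loops).
print(direct_distance("GGAUC", "GAUCC"))
print(complementary_distance("GGAUC", "GAUCC"))
print(complementary_distance("GGAUU", "GAUCC"), complementary_distance("GGAUU", "GAUCC", allow_gu=True))
```

In this toy case the antiparallel comparison gives fewer mismatches than the direct one, and allowing G-U pairing lowers the complementary distance further, mirroring the pattern the abstract reports.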
Affiliation(s)
- S Rodin
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk
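The comparison of direct and complementary distances described in the abstract above can be illustrated with a minimal Python sketch. The function names, the equal-length requirement, and the `allow_gu` switch are assumptions made for illustration; the abstract does not give the authors' exact distance definitions.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def direct_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def complementary_distance(a: str, b: str, allow_gu: bool = False) -> int:
    """Count unpaired positions when `a` is aligned antiparallel to `b`.

    With allow_gu=True, G-U wobble pairs are treated as matches
    (an assumption; the paper's handling of G-U pairs may differ).
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    mismatches = 0
    for x, y in zip(a, reversed(b)):
        paired = COMPLEMENT[x] == y
        wobble = allow_gu and {x, y} == {"G", "U"}
        if not (paired or wobble):
            mismatches += 1
    return mismatches
```

For two sequences that are exact reverse complements, such as `GGAUC` and `GAUCC`, the complementary distance is zero while the direct distance is not, which is the kind of asymmetry the abstract reports for pairs of consensus tRNAs.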
|
14
|
Laprevotte I. Mo-MuLV nucleotide sequence exhibits three levels of oligomeric repetitions, suggesting a stepwise molecular evolution. J Mol Evol 1992; 35:420-8. [PMID: 1336800 DOI: 10.1007/bf00171820] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 12/26/2022]
Abstract
An exhaustive computer-assisted analysis of the Moloney murine leukemia virus nucleotide sequence shows numerous deviations in the oligomeric distribution, suggesting three overlapping levels of a stepwise duplicative evolution. (1) The sequence fits the universal rule of TG/CT excess which has been proposed as the construction principle of all sequences, and maintains some degree of symmetry between the two complementary strands. (2) Oligomeric repeating units share a core consensus regularly scattered throughout the sequence. This consensus is not merely predictable from the doublet frequencies and codon usage, but could correspond to an intermediary stage in a so-called periodic-to-chaotic transition. (3) Probable stepwise local duplications could be accounted for by slippagelike mechanisms. Comparison with the human spumaretrovirus (HSRV) shows similar segments in the overrepresented oligomers of the two sequences. The intermediary stage of transition oligomeric repeating units is not so clearly suggested in HSRV, perhaps because of numerous stepwise local duplications. In any case, a common evolutionary origin for the two viruses is not ruled out.
Affiliation(s)
- I Laprevotte
- UPR 41 CNRS Recombinaisons Génétiques, Centre Hayem Hôpital Saint-Louis, Paris, France
|
15
|
Sell SM. V(D)J recombinase precursors and coding structure of signal sequence directed rearrangement. Comput Chem 1992. [DOI: 10.1016/0097-8485(92)80039-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 10/18/2022]
|
16
|
Doi H. Importance of purine and pyrimidine content of local nucleotide sequences (six bases long) for evolution of the human immunodeficiency virus type 1. Proc Natl Acad Sci U S A 1991; 88:9282-6. [PMID: 1924392 PMCID: PMC52698 DOI: 10.1073/pnas.88.20.9282] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3] [Indexed: 12/29/2022]
Abstract
Human immunodeficiency virus type 1 evolves rapidly, and random base change is thought to act as a major factor in this evolution. However, segments of the viral genome differ in their variability: there is the highly variable env gene, particularly hypervariable regions located within env, and, in contrast, the conservative gag and pol genes. Computer analysis of the nucleotide sequences of human immunodeficiency virus type 1 isolates reveals that base substitution in this virus is nonrandom and affected by local nucleotide sequences. Certain local sequences 6 base pairs long are excessively frequent in the hypervariable regions. These sequences exhibit base-substitution hotspots at specific positions in their 6 bases. The hotspots tend to be nonsilent letters of codons in the hypervariable regions--thus leading to marked amino acid substitutions there. Conversely, in the conservative gag and pol genes the hotspots tend to be silent letters because of a difference in codon frame from the hypervariable regions. Furthermore, base substitutions in the local sequences that frequently appear in the conservative genes occurred at a low level, even within the variable env. Thus, despite the high variability of this virus, the conservative genes and their products could be conserved. These may be some of the strategies evolved in human immunodeficiency virus type 1 to allow for positive-selection pressures, such as the host immune system, and negative-selection pressures on the conservative gene products.
Affiliation(s)
- H Doi
- Biological Informatics Section, International Institute for Advanced Study of Social Information Science, Fujitsu Laboratories Ltd., Tokyo, Japan
|
17
|
Abstract
Selfish DNA, coding sequences, and junk DNA in the genome are no stranger to each other; rather, they represent three phases in the life cycle of DNA. Accordingly, they all obey the same grammatical rule of TG/CA/CT excess and CG/TA deficiency. On the one hand, it is this very rule which keeps isoelectric points of most proteins near the neutral range. On the other hand, this rule creates numerous palindromes, thus maintaining symmetry between complementary strands. Many of these palindromes encode identical oligopeptides on both strands.
Affiliation(s)
- S Ohno
- Beckman Research Institute of the City of Hope, Department of Theoretical Biology, Duarte, CA 91010-0269
|
18
|
Shpaer EG, Mullins JI. Selection against CpG dinucleotides in lentiviral genes: a possible role of methylation in regulation of viral expression. Nucleic Acids Res 1990; 18:5793-7. [PMID: 2170945 PMCID: PMC332316 DOI: 10.1093/nar/18.19.5793] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.2] [Indexed: 12/30/2022]
Abstract
Extremely low frequencies of CpG dinucleotides are found in the genomes of the lentivirus subfamily of retroviruses, including the human, simian and feline immunodeficiency viruses (HIV1, HIV2, SIV, and FIV, respectively), equine infectious anemia virus (EIAV), and the ovine lentivirus, Visna. The occurrence of CpG dinucleotides is greater in the 2-3 (NCG) than in the 1-2 (CGN) codon-defined frame, as well as in the gag and env genes, compared to the more conserved pol gene. These differences suggest that CpG depletion in lentiviruses occurs as a result of selection against CpG rather than due to mutational bias, the latter is responsible for low CpG frequencies in vertebrate genomes. CpG levels in the onco-retrovirus subfamily are reduced to a lesser extent, principally due to mutational bias. The difference between the retrovirus subfamilies appears to reflect their evolutionary origin, that is, lentiviruses have no known endogenous counterparts whereas most oncoviruses have endogenous cellular counterparts with which they can undergo recombination. Furthermore, we suggest that the number of CpG dinucleotides in a lentiviral genome determines the maximum potential DNA methylation level of the provirus, which in turn affects viral transcription in host cells.
Affiliation(s)
- E G Shpaer
- Department of Microbiology and Immunology, Stanford University School of Medicine, CA 94305-5402
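The CpG measurements discussed in the abstract above can be sketched in Python as follows. The observed/expected formula used here, (number of CpG × length) / (number of C × number of G), is a common convention and is an assumption; the abstract does not state which normalization the authors applied. The function names and the frame labels are likewise illustrative.

```python
from collections import Counter


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact count.
    return seq.count("CG") * n / (c * g)


def cpg_by_codon_frame(cds: str) -> Counter:
    """Tally CpG dinucleotides by codon position in an in-frame coding sequence.

    "1-2" is CGN, "2-3" is NCG, and "3-1" spans two adjacent codons.
    """
    cds = cds.upper()
    labels = {0: "1-2", 1: "2-3", 2: "3-1"}
    frames = Counter()
    for i in range(len(cds) - 1):
        if cds[i:i + 2] == "CG":
            frames[labels[i % 3]] += 1
    return frames
```

Comparing the `"2-3"` and `"1-2"` tallies across a set of coding sequences is one way to reproduce the kind of frame-dependent contrast the abstract describes.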
|
19
|
Ohno S. Grammatical analysis of DNA sequences provides a rationale for the regulatory control of an entire chromosome. Genet Res (Camb) 1990; 56:115-20. [PMID: 2272500 DOI: 10.1017/s0016672300035187] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 12/31/2022]
Abstract
Regardless of their origins, functions, and base compositions, all DNAs are scriptures written following the same grammatical rule. At the level of syllables, two, CG and TA are seldom used, while three, TG, CT and CA are utilized with abundance. Accordingly, at the level of three-letter words, two complementary base trimers, CTG and CAG, invariably enjoy frequent usage. Inasmuch as two of the three frequently used syllables, TG and CA are complementary to each other, while two seldom used syllables, CG and TA, are both palindromes, two complementary strands of DNA are inherently symmetrical with each other. Consequently, palindromic sequences as favourite targets of DNA-binding proteins occur at unsuspectedly high frequencies, if they contain TG and CA or CTG and CAG. Nevertheless, there are grammatical rules operating among these high frequency palindromes as well; e.g. the palindromic tetramer TGCA occurs nearly two times more often than its reciprocal; CATG. Thus, DNA-binding proteins are provided with a wealth of abundant targets whose densities are influenced by a regional difference in GC/AT ratios to variable degrees. One palindromic heptamer CAGNCTG is an ideal target of one DNA-binding protein engaged in chromosome packaging and in generation of banding patterns. This heptamer occurs once every 1000 bases in moderately GC-rich sequences, while its incidence is reduced to once every 3000 bases in extremely AT-rich sequences. The above must be the very reason that a solitary human X-chromosome DNA coated with mouse DNA-binding proteins in mouse-man somatic hybrids still maintains the original banding pattern and that the inactive X remains inactive, while the active X remains active.
Affiliation(s)
- S Ohno
- Beckman Research Institute, City of Hope, Duarte, California 91010-0269
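The dinucleotide and palindrome frequencies described in the abstract above could be measured with a short Python sketch like the one below. The odds-ratio normalization (observed count divided by the count expected from single-base frequencies) and the function names are assumptions for illustration, not the author's stated procedure.

```python
import re
from collections import Counter


def dinucleotide_odds(seq: str) -> dict:
    """Observed/expected ratio for each dinucleotide present in `seq`."""
    seq = seq.upper()
    n = len(seq)
    base = Counter(seq)
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    odds = {}
    for pair, count in pairs.items():
        x, y = pair
        # Expected count if neighbouring bases were independent.
        expected = (base[x] / n) * (base[y] / n) * (n - 1)
        odds[pair] = count / expected
    return odds


def count_motif(seq: str, motif: str) -> int:
    """Count overlapping occurrences of `motif`; N matches any of A, C, G, T."""
    pattern = "(?=" + motif.upper().replace("N", "[ACGT]") + ")"
    return len(re.findall(pattern, seq.upper()))
```

Ratios below 1 for `CG` and `TA` and above 1 for `TG`, `CT`, and `CA` would correspond to the deficiency and excess the abstract reports, and `count_motif(seq, "TGCA")` versus `count_motif(seq, "CATG")` gives the tetramer comparison; `count_motif(seq, "CAGNCTG")` counts the heptamer.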
|