1. Wang L, Zhao H, Wang Z, Ding S, Qin L, Jiang R, Deng X, He Z, Li L. An Evolutionary Perspective of Codon Usage Pattern, Dinucleotide Composition and Codon Pair Bias in Prunus Necrotic Ringspot Virus. Genes (Basel) 2023; 14:1712. [PMID: 37761852 PMCID: PMC10530913 DOI: 10.3390/genes14091712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 08/24/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023] Open
Abstract
Prunus necrotic ringspot virus (PNRSV) is a significant virus of ornamental plants and fruit trees, and its impact on the horticultural industry makes it an important subject of study. Several studies on PNRSV diversity and phytosanitary detection technology have been reported, but the codon usage bias (CUB), dinucleotide preference and codon pair bias (CPB) of PNRSV remain poorly characterized. We performed comprehensive analyses on a dataset of 359 coat protein (CP) gene sequences of PNRSV to examine the characteristics of CUB, dinucleotide composition, and CPB. The CUB analysis of PNRSV CP sequences showed that codon usage was shaped by both natural selection and mutation, with natural selection the stronger driving force. The dinucleotide composition analysis showed an over-representation of the CpC/GpA dinucleotides and an under-representation of the UpA/GpC dinucleotides. The dinucleotide composition of the PNRSV CP gene showed a weak association with viral lineages and hosts, but a strong association with codon positions. Furthermore, the CPB of the PNRSV CP gene is low and is related to dinucleotide preference and codon usage patterns. This work provides a reference for future research on PNRSV genetic diversity and the mechanisms of its gene evolution.
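The dinucleotide over- and under-representation reported in this abstract is usually expressed as an odds ratio, the observed dinucleotide frequency divided by the product of its two base frequencies. The sketch below illustrates that calculation only; it is not the authors' pipeline, and the function names and the 0.78/1.23 cut-offs (a widely used convention that the abstract does not state) are assumptions.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: rho}, where rho = f_xy / (f_x * f_y)."""
    seq = seq.upper().replace("T", "U")  # report in RNA alphabet, as in the abstract
    n = len(seq)
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1
    return {
        pair: (count / total_pairs) / (base_freq[pair[0]] * base_freq[pair[1]])
        for pair, count in pair_counts.items()
    }


def classify(rho, low=0.78, high=1.23):
    """Label an odds ratio using conventional cut-offs (assumed, not from the paper)."""
    if rho < low:
        return "under-represented"
    if rho > high:
        return "over-represented"
    return "within expected range"


if __name__ == "__main__":
    ratios = dinucleotide_odds_ratios("ATGGCCGATTACCGATAA")  # toy sequence
    for pair, rho in sorted(ratios.items()):
        print(pair, round(rho, 3), classify(rho))
```

Running the same function separately on the bases at codon positions 1-2, 2-3 and 3-1 would give the per-position view that the abstract refers to.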
Affiliation(s)
- Lingqi Wang
  - College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou 225009, China
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Haiting Zhao
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Zhilei Wang
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Shiwen Ding
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Lang Qin
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Runzhou Jiang
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Xiaolong Deng
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
- Zhen He
  - College of Plant Protection, Yangzhou University, Yangzhou 225009, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou 225009, China
- Liangjun Li
  - College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou 225009, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou 225009, China
2. Qin L, Ding S, Wang Z, Jiang R, He Z. Host Plants Shape the Codon Usage Pattern of Turnip Mosaic Virus. Viruses 2022; 14:2267. [PMID: 36298822 PMCID: PMC9607058 DOI: 10.3390/v14102267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 10/11/2022] [Accepted: 10/14/2022] [Indexed: 01/25/2023] Open
Abstract
Turnip mosaic virus (TuMV), an important pathogen that causes mosaic diseases in vegetable crops worldwide, belongs to the genus Potyvirus of the family Potyviridae. The genetic variation, population structure, timescale, and migration of TuMV have been well studied. However, the codon usage pattern and host adaptation of TuMV remain unclear. Here, we analyzed the compositional bias and codon usage of TuMV using 184 non-recombinant sequences. We found a relatively stable genomic composition and a slightly lower codon usage choice in TuMV protein-coding sequences. Statistical analysis showed that the codon usage patterns of TuMV protein-coding sequences were mainly affected by natural selection and mutation pressure, with natural selection the key influencing factor. The codon adaptation index (CAI) and relative codon deoptimization index (RCDI) revealed that, on the present data, TuMV genes were strongly adapted to Brassica oleracea. Similarity index (SiD) analysis also indicated that B. oleracea is potentially the preferred host of TuMV. Our study provides the first insights into the codon usage bias of TuMV based on complete genomes and will inform future research on TuMV origins and evolution patterns.
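For readers unfamiliar with the codon adaptation index mentioned here, CAI is the geometric mean of per-codon relative adaptiveness values derived from a host reference set. The following is a minimal sketch under stated assumptions: the host codon counts are a hypothetical input, the function names are illustrative, and the 0.5 floor for unseen codons is one common convention rather than anything the abstract specifies.

```python
import math
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def relative_adaptiveness(host_counts):
    """w = codon count / count of the most used synonymous codon in the host set."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    weights = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:  # skip stop codons, Met and Trp
            continue
        top = max(host_counts.get(c, 0) for c in codons)
        if top == 0:  # amino acid absent from the reference set
            continue
        for c in codons:
            # 0.5 floor avoids log(0) for codons unseen in the reference (assumption)
            weights[c] = max(host_counts.get(c, 0), 0.5) / top
    return weights


def cai(cds, weights):
    """Geometric mean of the weights of all scorable codons in a coding sequence."""
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

A value close to 1 means the sequence mostly uses the codons the reference host uses most often; comparing CAI across candidate hosts is how adaptation is typically ranked.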
Affiliation(s)
- Lang Qin
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, China
- Shiwen Ding
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, China
- Zhilei Wang
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, China
- Runzhou Jiang
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, China
- Zhen He
  - College of Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou 225009, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou 225009, China
  - Correspondence:
3. He Z, Qin L, Xu X, Ding S. Evolution and host adaptability of plant RNA viruses: Research insights on compositional biases. Comput Struct Biotechnol J 2022; 20:2600-2610. [PMID: 35685354 PMCID: PMC9160401 DOI: 10.1016/j.csbj.2022.05.021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/12/2022] [Revised: 05/10/2022] [Accepted: 05/12/2022] [Indexed: 01/23/2023] Open
Abstract
During recent decades, many new emerging or re-emerging RNA viruses have been found in plants through the development of deep-sequencing technology and big data analysis. These findings largely changed our understanding of the origin, evolution and host range of plant RNA viruses. There is evidence that their genetic composition originates from viruses, and host populations play a key role in the evolution and host adaptability of plant RNA viruses. In this mini-review, we describe the state of our understanding of the evolution of plant RNA viruses in view of compositional biases and explore how they adapt to the host. It appears that adenine rich (A-rich) coding sequences, low CpG and UpA dinucleotide frequencies and lower codon usage patterns were found in the vast majority of plant RNA viruses. The codon usage pattern of plant RNA viruses was influenced by both natural selection and mutation pressure, and natural selection mostly from hosts was the dominant factor. The codon adaptation analyses support that plant RNA viruses probably evolved a dynamic balance between codon adaptation and deoptimization to maintain efficient replication cycles in multiple hosts with various codon usage patterns. In the future, additional combinations of computational and experimental analyses of the nucleotide composition and codon usage of plant RNA viruses should be addressed.
Affiliation(s)
- Zhen He
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
  - Corresponding author.
- Lang Qin
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
- Xiaowei Xu
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
- Shiwen Ding
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
4. He Z, Dong Z, Qin L, Gan H. Phylodynamics and Codon Usage Pattern Analysis of Broad Bean Wilt Virus 2. Viruses 2021; 13:198. [PMID: 33525612 PMCID: PMC7912035 DOI: 10.3390/v13020198] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/07/2020] [Revised: 01/19/2021] [Accepted: 01/25/2021] [Indexed: 12/13/2022] Open
Abstract
Broad bean wilt virus 2 (BBWV2), which belongs to the genus Fabavirus of the family Secoviridae, is an important pathogen that causes damage to broad bean, pepper, yam, spinach and other economically important ornamental and horticultural crops worldwide. Previously, only limited reports have shown the genetic variation of BBWV2. Meanwhile, the detailed evolutionary changes, synonymous codon usage bias and host adaptation of this virus are largely unclear. Here, we performed comprehensive analyses of the phylodynamics, reassortment, composition bias and codon usage pattern of BBWV2 using forty-two complete genome sequences of BBWV2 isolates together with two other full-length RNA1 sequences and six full-length RNA2 sequences. Both recombination and reassortment had a significant influence on the genomic evolution of BBWV2. Through phylogenetic analysis we detected three and four lineages based on the ORF1 and ORF2 nonrecombinant sequences, respectively. The evolutionary rates of the two BBWV2 ORF coding sequences were 8.895 × 10⁻⁴ and 4.560 × 10⁻⁴ subs/site/year, respectively. We found a relatively conserved and stable genomic composition with a lower codon usage choice in the two BBWV2 protein coding sequences. ENC-plot and neutrality plot analyses showed that natural selection is the key factor shaping the codon usage pattern of BBWV2. Strong correlations between BBWV2 and broad bean and pepper were observed from similarity index (SiD), codon adaptation index (CAI) and relative codon deoptimization index (RCDI) analyses. Our study is the first to evaluate the phylodynamics, codon usage patterns and adaptive evolution of a fabavirus, and our results may be useful for the understanding of the origin of this virus.
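The neutrality plot named in this abstract regresses GC12 (mean GC content at codon positions 1 and 2) on GC3 across sequences. By the usual reading, a slope near 1 points to mutation pressure and a slope near 0 to natural selection. The sketch below is an illustration with hypothetical function names; it uses plain GC3 rather than GC3s (which excludes Met, Trp and stop codons), a simplification the paper may not share.

```python
def gc_by_position(cds):
    """Return (GC12, GC3) for one in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    positions = [cds[i:usable:3] for i in range(3)]
    gc = [sum(base in "GC" for base in p) / len(p) for p in positions]
    return (gc[0] + gc[1]) / 2, gc[2]


def neutrality_slope(cds_list):
    """Least-squares slope of GC12 (y) against GC3 (x) over a set of sequences."""
    points = [gc_by_position(s) for s in cds_list]
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 is constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

The slope times 100 is often quoted as the percentage contribution of mutation pressure, with the remainder attributed to selection and other factors.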
Affiliation(s)
- Zhen He
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou 225009, China
  - Correspondence:
- Zhuozhuo Dong
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
- Lang Qin
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
- Haifeng Gan
  - School of Horticulture and Plant Protection, Yangzhou University, Yangzhou 225009, China
5. He Z, Dong Z, Gan H. Comprehensive codon usage analysis of rice black-streaked dwarf virus based on P8 and P10 protein coding sequences. Infection, Genetics and Evolution 2020; 86:104601. [PMID: 33122052 DOI: 10.1016/j.meegid.2020.104601] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/05/2020] [Revised: 10/05/2020] [Accepted: 10/18/2020] [Indexed: 12/21/2022]
Abstract
Rice black-streaked dwarf virus (RBSDV) belongs to the genus Fijivirus of the family Reoviridae and is an important pathogen that damages rice, maize and wheat worldwide. Previously, several reports have described the genetic variation and population structure of RBSDV. However, the details of the evolutionary changes, synonymous codon usage patterns and host adaptation of the virus are largely unclear. Here, we performed a detailed analysis of the codon usage and host adaptability of RBSDV based on 130 full-length P8 and 234 full-length P10 sequences. Infrequent recombination and frequent segment reassortment influence the genomic evolution of RBSDV. Our phylogenetic analysis found three and four lineages based on the P8 and P10 non-recombinant sequences respectively. We found relatively stable and conserved genomic composition with lower codon usage choice in the RBSDV P8 and P10 protein coding sequences. Both ENC-plot and neutrality-plot analyses showed that natural selection is the key factor that shapes the codon usage pattern of RBSDV. Codon adaptation index (CAI), relative codon deoptimization index (RCDI) and similarity index (SiD) analyses indicated strong correlation between RBSDV and rice rather than maize, wheat or Laodelphax striatellus. Our study provides deep insight into the evaluation of the codon usage pattern and adaptive evolution of RBSDV based on P8 and P10 sequences and should be taken into consideration for the prevention and control of this virus.
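An ENC-plot, as used in this abstract, compares each gene's observed effective number of codons (ENC) with the value expected if GC3s alone determined codon usage. The expected curve follows Wright's formula, ENC = 2 + s + 29 / (s² + (1 − s)²), with s the GC3s fraction. The sketch below covers only that expected curve; the function names are illustrative, and computing the observed ENC is left out.

```python
def expected_enc(gc3s):
    """Expected ENC under mutation bias alone, for GC3s given as a fraction in [0, 1]."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def enc_deviation(observed_enc, gc3s):
    """Positive values mean the gene lies below the expected curve."""
    return expected_enc(gc3s) - observed_enc
```

Genes that fall well below the curve are usually taken as evidence that selection, not only mutational bias, shapes codon usage, which is the inference drawn in the abstract.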
Affiliation(s)
- Zhen He
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou, 225009, Jiangsu Province, PR China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Wenhui East Road No.48, Yangzhou, 225009, Jiangsu Province, PR China
- Zhuozhuo Dong
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou, 225009, Jiangsu Province, PR China
- Haifeng Gan
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No.48, Yangzhou, 225009, Jiangsu Province, PR China
6. Das D, Deb B, Malakar AK, Chakraborty S. Allele frequency analysis of GALC gene causing Krabbe disease in human and its codon usage. Gene 2020; 747:144673. [PMID: 32304783 DOI: 10.1016/j.gene.2020.144673] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/06/2020] [Revised: 04/05/2020] [Accepted: 04/14/2020] [Indexed: 10/24/2022]
Abstract
Krabbe disease is one of the rarest autosomal recessive disorders in humans, caused by mutation in the GALC (β-galactosylceramidase) gene and resulting in several mental and physical health issues. Due to its rarity and phenotypic heterogeneity, the diagnosis rate of this disease is very low. This study generated information on the recessive allele frequency dynamics of the GALC gene across 15 global populations, with the highest frequency detected in the Druze (Israel) population and the lowest frequency in Turkey and the United States. The recessive allele would take a longer period (about 24,975 years) to be completely removed from the population having the lowest frequency and vice versa. The codon usage patterns of four isoforms of the GALC gene revealed that a few synonymous codons were used more frequently than others in the isoforms. The codon AGA (arginine) was found to be overrepresented in the GALC gene, except for galactocerebrosidase isoform a precursor. Further, the GALC gene showed low codon usage bias (CUB), as evident from high ENC values (55.7-58.2), with A/T-ending codons preferred over G/C-ending codons. CUB analysis elucidated the dual role of mutational pressure (major role) and natural selection (minor role) in GALC gene evolution.
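Statements such as "AGA was overrepresented" are normally based on relative synonymous codon usage (RSCU): a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below shows that calculation for a single coding sequence; the function name is illustrative, and the frequently quoted 1.6 (over-represented) and 0.6 (under-represented) cut-offs are a convention assumed here, not taken from the abstract.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every sense codon of each amino acid present in cds."""
    cds = cds.upper().replace("U", "T")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:  # amino acid not used in this sequence
            continue
        for c in codons:
            # observed count over the equal-usage expectation (total / family size)
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    toy = "AGAAGAAGACGT"  # three AGA and one CGT, all arginine
    for codon, value in sorted(rscu(toy).items()):
        print(codon, value, "over-represented" if value > 1.6 else "")
```

An RSCU of 1 means no bias for that codon; values are comparable across amino acids because each is scaled by its own family size.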
Affiliation(s)
- Debaroti Das
  - Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Bornali Deb
  - Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Arup Kumar Malakar
  - Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Supriyo Chakraborty
  - Department of Biotechnology, Assam University, Silchar 788011, Assam, India
7. He Z, Dong Z, Gan H. Genetic changes and host adaptability in sugarcane mosaic virus based on complete genome sequences. Mol Phylogenet Evol 2020; 149:106848. [PMID: 32380283 DOI: 10.1016/j.ympev.2020.106848] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/08/2020] [Revised: 04/10/2020] [Accepted: 04/28/2020] [Indexed: 12/15/2022]
Abstract
Sugarcane mosaic virus (SCMV), a member of the genus Potyvirus in the family Potyviridae, is an important pathogen that causes mosaic diseases in maize, sugarcane, canna and other graminaceous species worldwide. Previously, several reports have shown the genetic variation and population structure of SCMV. However, the evolutionary dynamics, synonymous codon usage pattern and adaptive evolution of the virus are unclear. In this study, we performed comprehensive analyses of phylodynamics, composition bias and codon usage of SCMV using 108 complete genomic sequences. Our phylogenetic analysis found six host- and geographically confined phylogenetic lineages within the SCMV non-recombinant isolates. We found a relatively stable and conserved genomic composition with a lower codon usage choice in the SCMV protein coding sequences. Mutation pressure and natural selection have shaped the codon usage patterns of the SCMV protein coding sequences, with natural selection being the dominant factor. The codon adaptation index (CAI), relative codon deoptimization index (RCDI) and similarity index (SiD) analyses revealed a stronger correlation between SCMV and maize than between SCMV and sugarcane or canna. Our study is the first to evaluate the codon usage pattern of SCMV based on complete sequences and may provide a better understanding of the origin of SCMV and its evolutionary patterns for future research.
Affiliation(s)
- Zhen He
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
- Zhuozhuo Dong
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
- Haifeng Gan
  - School of Horticulture and Plant Protection, Yangzhou University, Wenhui East Road No. 48, Yangzhou 225009, Jiangsu Province, PR China
8. Pal A, Saha BK, Saha J. Comparative in silico analysis of ftsZ gene from different bacteria reveals the preference for core set of codons in coding sequence structuring and secondary structural elements determination. PLoS One 2019; 14:e0219231. [PMID: 31841523 PMCID: PMC6913975 DOI: 10.1371/journal.pone.0219231] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/17/2019] [Accepted: 11/28/2019] [Indexed: 11/19/2022] Open
Abstract
The deluge of sequence information in recent times provides us with an excellent opportunity to compare organisms on a large genomic scale. In this study we have tried to decipher the variation in the gene organization and structuring of a vital bacterial gene called ftsZ, which codes for an integral component of bacterial cell division, the FtsZ protein. FtsZ is homologous to the tubulin protein and has been found to be ubiquitous in eubacteria. FtsZ is showing increasing promise as a target for antibacterial drug discovery. Our study of ftsZ from 143 different bacterial species spanning a wide range of morphological and physiological types demonstrates that the ftsZ gene of about ninety-three percent of the organisms shows a relatively biased codon usage profile and significant GC deviation from the genomic GC content. Comparative codon usage analysis of ftsZ and a core housekeeping gene, rpoB, demonstrated that the codon usage pattern of the ftsZ CDS is shaped by natural selection to a large extent and mimics that of a housekeeping gene. We have also detected a tendency among the different organisms to utilize a core set of codons in structuring the ftsZ coding sequence. We observed that the compositional frequency of the amino acid serine in the FtsZ protein appears to be an indicator of the bacterial lifestyle. Our meticulous analysis of the ftsZ gene linked with the corresponding FtsZ protein shows that there is a bias towards the use of specific synonymous codons, particularly in the helix and strand regions of the multi-domain FtsZ protein. Overall, our findings suggest that in an indispensable and vital protein such as FtsZ, there is an inherent tendency to maintain form for optimized performance in spite of the extrinsic variability in coding features.
Affiliation(s)
- Ayon Pal
  - Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
- Barnan Kumar Saha
  - Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
- Jayanti Saha
  - Microbiology & Computational Biology Laboratory, Department of Botany, Raiganj University, Raiganj, West Bengal, India
9. Analysis of Synonymous Codon Usage Bias in Potato Virus M and Its Adaption to Hosts. Viruses 2019; 11:752. [PMID: 31416257 PMCID: PMC6722529 DOI: 10.3390/v11080752] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 07/10/2019] [Revised: 08/12/2019] [Accepted: 08/13/2019] [Indexed: 02/07/2023] Open
Abstract
Potato virus M (PVM) is a member of the genus Carlavirus of the family Betaflexviridae and causes large economic losses in nightshade crops. Several previous studies have elucidated the population structure, evolutionary timescale and adaptive evolution of PVM. However, the synonymous codon usage pattern of PVM remains unclear. In this study, we performed comprehensive analyses of the codon usage and composition of PVM based on 152 nucleotide sequences of the coat protein (CP) gene and 125 sequences of the cysteine-rich nucleic acid binding protein (NABP) gene. We observed that the PVM CP and NABP coding sequences were GC- and AU-rich, respectively, whereas U- and G-ending codons were preferred in the PVM CP and NABP coding sequences. The lower codon usage of the PVM CP and NABP coding sequences indicated a relatively stable and conserved genomic composition. Natural selection and mutation pressure shaped the codon usage patterns of PVM, with natural selection being the most important factor. The codon adaptation index (CAI) and relative codon deoptimization index (RCDI) analysis revealed that the greatest adaptation of PVM was to pepino, followed by tomato and potato. Moreover, similarity index (SiD) analysis showed that pepino had a greater impact on PVM than tomato and potato. Our study is the first attempt to evaluate the codon usage pattern of the PVM CP and NABP genes to better understand the evolutionary changes of a carlavirus.
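The similarity index (SiD) used to compare hosts in this abstract is commonly computed from virus and host RSCU vectors as D(A, B) = (1 − R(A, B)) / 2, where R is the cosine similarity of the two vectors over the shared synonymous codons. The sketch below assumes that definition and takes precomputed RSCU dictionaries as hypothetical inputs; it is not drawn from the paper's methods section.

```python
import math


def similarity_index(virus_rscu, host_rscu):
    """SiD D(A, B) = (1 - R) / 2, with R the cosine similarity of RSCU vectors."""
    codons = sorted(set(virus_rscu) & set(host_rscu))
    if not codons:
        raise ValueError("no codons shared between the two RSCU tables")
    dot = sum(virus_rscu[c] * host_rscu[c] for c in codons)
    norm = math.sqrt(
        sum(virus_rscu[c] ** 2 for c in codons)
        * sum(host_rscu[c] ** 2 for c in codons)
    )
    if norm == 0:
        raise ValueError("an RSCU vector is all zeros")
    return (1.0 - dot / norm) / 2.0
```

Since RSCU values are non-negative, D ranges from 0 to 0.5. A higher D for one host than another is conventionally read as that host having the greater influence on the virus's codon usage, which is how the pepino result above is phrased.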
10. Chakraborty S, Deb B, Barbhuiya PA, Uddin A. Analysis of codon usage patterns and influencing factors in Nipah virus. Virus Res 2019; 263:129-138. [PMID: 30664908 PMCID: PMC7114725 DOI: 10.1016/j.virusres.2019.01.011] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 12/14/2018] [Revised: 01/18/2019] [Accepted: 01/18/2019] [Indexed: 11/28/2022]
Abstract
Nipah virus (NiV) genes are AT-rich. Codon usage bias of NiV genes is low. Patterns of codon usage bias differ across the genomes of NiV. Both mutation pressure and natural selection influenced codon usage bias of NiV genes.
Codon usage bias (CUB) is the unequal usage of synonymous codons of an amino acid, in which some codons are used more often than others, and is widely used in understanding molecular biology, genetics, and functional regulation of gene expression. Nipah virus (NiV) is an emerging zoonotic paramyxovirus that causes fatal disease in both humans and animals. NiV was first identified during a disease outbreak in Malaysia in 1998 and has occurred periodically since 2001 in India, Bangladesh, and the Philippines. We used bioinformatics tools to analyze the codon usage patterns in a genome-wide manner among 11 genomes of NiV, as no such work had yet been reported. The compositional properties revealed that the overall GC and AT contents were 41.96% and 58.04%, respectively; i.e., NiV genes were AT-rich. Correlation analysis between overall nucleotide composition and composition at the 3rd codon position suggested that both mutation pressure and natural selection might influence the CUB across NiV genomes. The neutrality plot revealed that natural selection might have played a major role, and mutation pressure a minor role, in shaping the codon usage bias in NiV genomes.
Affiliation(s)
- Supriyo Chakraborty
  - Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Bornali Deb
  - Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Parvin A Barbhuiya
  - Department of Biotechnology, Assam University, Silchar 788011, Assam, India
- Arif Uddin
  - Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India
11. Gardin J, Yeasmin R, Yurovsky A, Cai Y, Skiena S, Futcher B. Measurement of average decoding rates of the 61 sense codons in vivo. eLife 2014; 3. [PMID: 25347064 PMCID: PMC4371865 DOI: 10.7554/elife.03735] [Citation(s) in RCA: 138] [Impact Index Per Article: 13.8] [Received: 06/19/2014] [Accepted: 10/24/2014] [Indexed: 12/19/2022] Open
Abstract
Most amino acids can be encoded by several synonymous codons, which are used at unequal frequencies. The significance of unequal codon usage remains unclear. One hypothesis is that frequent codons are translated relatively rapidly. However, there is little direct, in vivo, evidence regarding codon-specific translation rates. In this study, we generate high-coverage data using ribosome profiling in yeast, analyze using a novel algorithm, and deduce events at the A- and P-sites of the ribosome. Different codons are decoded at different rates in the A-site. In general, frequent codons are decoded more quickly than rare codons, and AT-rich codons are decoded more quickly than GC-rich codons. At the P-site, proline is slow in forming peptide bonds. We also apply our algorithm to short footprints from a different conformation of the ribosome and find strong amino acid-specific (not codon-specific) effects that may reflect interactions with the exit tunnel of the ribosome.
DOI: http://dx.doi.org/10.7554/eLife.03735.001

Genes contain the instructions for making proteins from molecules called amino acids. These instructions are encoded in the order of the four building blocks that make up DNA, which are symbolized by the letters A, T, C, and G. The DNA of a gene is first copied to make a molecule of RNA, and then the letters in the RNA are read in groups of three (called ‘codons’) by a cellular machine called a ribosome. ‘Sense codons’ each specify one amino acid, and the ribosome decodes hundreds or thousands of these codons into a chain of amino acids to form a protein. ‘Stop codons’ do not encode amino acids but instead instruct the ribosome to stop building a protein when the chain is completed.

Most proteins are built from 20 different kinds of amino acid, but there are 61 sense codons. As such, up to six codons can code for the same amino acid. The multiple codons for a single amino acid, however, are not used equally in gene sequences: some are used much more often than others.

Now, Gardin, Yeasmin et al. have instantly halted the on-going processes of decoding genes and building proteins in yeast cells. Codons being translated into amino acids are trapped inside the ribosome; and codons that take the longest to decode are trapped most often. By using a computer algorithm, Gardin, Yeasmin et al. were able to measure just how often each kind of sense codon was trapped inside the ribosome and use this as a measure of how quickly each codon is decoded. The more often a given codon is used in a gene sequence, the less likely it was found to be trapped inside the ribosome, which suggests that these codons are decoded quicker than other codons and pass through the ribosome more quickly. Put another way, it appears that genes tend to use the codons that can be read the fastest.

Certain properties of a codon also affected its decoding speed. Codons with more As and Ts, for example, are decoded faster than codons with more Cs and Gs. Furthermore, whenever a chemically unusual amino acid called proline has to be added to a new protein chain, it slowed down the speed at which the protein was built. The method described by Gardin, Yeasmin et al. for peering into a decoding ribosome may now help future studies that aim to answer other questions about how proteins are built.
DOI: http://dx.doi.org/10.7554/eLife.03735.002
Affiliation(s)
- Justin Gardin
  - Department of Molecular Genetics and Microbiology, Stony Brook University, Stony Brook, United States
- Rukhsana Yeasmin
  - Department of Computer Science, Stony Brook University, Stony Brook, United States
- Alisa Yurovsky
  - Department of Molecular Genetics and Microbiology, Stony Brook University, Stony Brook, United States
- Ying Cai
  - Department of Molecular Genetics and Microbiology, Stony Brook University, Stony Brook, United States
- Steve Skiena
  - Department of Computer Science, Stony Brook University, Stony Brook, United States
- Bruce Futcher
  - Department of Molecular Genetics and Microbiology, Stony Brook University, Stony Brook, United States
12. Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses. J Virol 2013; 88:1972-89. [PMID: 24284329 DOI: 10.1128/jvi.03031-13] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 12/17/2022] Open
Abstract
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.
|
13
|
Genome-wide patterns of codon bias are shaped by natural selection in the purple sea urchin, Strongylocentrotus purpuratus. G3-GENES GENOMES GENETICS 2013; 3:1069-83. [PMID: 23637123 PMCID: PMC3704236 DOI: 10.1534/g3.113.005769] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/25/2023]
Abstract
Codon usage bias has been documented in a wide diversity of species, but the relative contributions of mutational bias and various forms of natural selection remain unclear. Here, we describe for the first time genome-wide patterns of codon bias at 4623 genes in the purple sea urchin, Strongylocentrotus purpuratus. Preferred codons were identified at 18 amino acids that exclusively used G or C at third positions, which contrasted with the strong AT bias of the genome (overall GC content is 36.9%). The GC content of third positions and coding regions exhibited significant correlations with the magnitude of codon bias. In contrast, the GC content of introns and flanking regions was indistinguishable from the genome-wide background, which suggested a limited contribution of mutational bias to synonymous codon usage. Five distinct clusters of genes were identified that had significantly different synonymous codon usage patterns. A significant correlation was observed between codon bias and mRNA expression supporting translational selection, but this relationship was driven by only one highly biased cluster that represented only 8.6% of all genes. In all five clusters preferred codons were evolutionarily conserved to a similar degree despite differences in their synonymous codon usage distributions and magnitude of codon bias. The third positions of preferred codons in two codon usage groups also paired significantly more often in stems than in loops of mRNA secondary structure predictions, which suggested that codon bias might also affect mRNA stability. Our results suggest that mutational bias has played a minor role in determining codon bias in S. purpuratus and that preferred codon usage may be heterogeneous across different genes and subject to different forms of natural selection.
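The abstract above relies on the GC content of third codon positions (often called GC3) alongside overall coding-region GC content. As a minimal sketch of how such values are commonly computed, assuming an in-frame coding sequence given as a plain string (the function names and the toy sequence are hypothetical and are not taken from the study):

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_content(cds: str) -> float:
    """GC fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_positions = cds[2:usable:3]  # bases at indices 2, 5, 8, ...
    return gc_content(third_positions)


# Hypothetical toy coding sequence, used only for illustration.
toy_cds = "ATGGCCAAGCTGTTC"
print(gc_content(toy_cds), gc3_content(toy_cds))
```

Comparing the two values per gene is one simple way to see whether third positions are more GC-rich than the sequence as a whole; the study's own pipeline may differ.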
|
14
|
On the origin of synonymous codon usage divergence between thermophilic and mesophilic prokaryotes. FEBS Lett 2007; 581:5825-30. [DOI: 10.1016/j.febslet.2007.11.054] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 09/19/2007] [Revised: 11/14/2007] [Accepted: 11/16/2007] [Indexed: 01/24/2023]
|
15
|
Chamary JV, Hurst LD. Evidence for selection on synonymous mutations affecting stability of mRNA secondary structure in mammals. Genome Biol 2005; 6:R75. [PMID: 16168082 PMCID: PMC1242210 DOI: 10.1186/gb-2005-6-9-r75] [Citation(s) in RCA: 236] [Impact Index Per Article: 12.4] [Received: 04/27/2005] [Revised: 06/08/2005] [Accepted: 07/20/2005] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND In mammals, contrary to what is usually assumed, recent evidence suggests that synonymous mutations may not be selectively neutral. This position has proven contentious, not least because of the absence of a viable mechanism. Here we test whether synonymous mutations might be under selection owing to their effects on the thermodynamic stability of mRNA, mediated by changes in secondary structure. RESULTS We provide numerous lines of evidence that are all consistent with the above hypothesis. Most notably, by simulating evolution and reallocating the substitutions observed in the mouse lineage, we show that the location of synonymous mutations is non-random with respect to stability. Importantly, the preference for cytosine at 4-fold degenerate sites, diagnostic of selection, can be explained by its effect on mRNA stability. Likewise, by interchanging synonymous codons, we find naturally occurring mRNAs to be more stable than simulant transcripts. Housekeeping genes, whose proteins are under strong purifying selection, are also under the greatest pressure to maintain stability. CONCLUSION Taken together, our results provide evidence that, in mammals, synonymous sites do not evolve neutrally, at least in part owing to selection on mRNA stability. This has implications for the application of synonymous divergence in estimating the mutation rate.
Affiliation(s)
- JV Chamary
- Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK
|
16
|
Desai D, Zhang K, Barik S, Srivastava A, Bolander ME, Sarkar G. Intragenic codon bias in a set of mouse and human genes. J Theor Biol 2004; 230:215-25. [PMID: 15302553 DOI: 10.1016/j.jtbi.2004.05.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 01/28/2004] [Revised: 05/06/2004] [Accepted: 05/06/2004] [Indexed: 11/20/2022]
Abstract
To better conceptualize the mechanism underlying the evolution of synonymous codons, we have analysed intragenic codon usage in chosen "regions" of some mouse and human genes. We divided a given gene into two regions: one consisting of a trinucleotide repeat (TNR) and the other consisting of the "rest of the coding region" (RCR). Usually, a TNR is composed of a repetitive single codon, which may reflect its frequency in a gene. In contrast, a non-random frequency of a codon in the RCR versus TNR (or vice versa) of a gene should indicate a bias for that codon within the TNR. We examined this scenario by comparing codon frequency between the RCR and the cognate TNR(s) for a set of human and mouse genes. A TNR length of six amino acids or more was used to identify genes from the Genbank database. Twenty nine human and twenty one mouse genes containing TNRs coding for nine different amino acid runs were identified. The ratio of codon frequency in a TNR versus the corresponding RCR was expressed as "fold change" which was also regarded as a measure of codon bias (defined as preferential use either in TNR or in RCR). Chi-square values were then determined from the distribution of codon frequency in a TNR vs. the cognate RCR. At p<0.001, 22% and 27%, respectively, of human and mouse TNRs showed codon bias. Greater than 40% of the TNRs (29 out of 69 in human, and 18 of 42 in mouse) showed codon bias at p<0.05. In addition, we identify eight single-codon TNRs in mouse and ten in human genes. Thus, our results show intragenic codon bias in both mouse and human genes expressed in diverse tissue types. Since our results are independent of the Codon Adaptation Index (CAI) and starvation CAI, and since the tRNA repertoire in a cell or in a tissue is constant, our data suggest that other constraints besides tRNA abundance played a role in creating intragenic codon bias in these genes.
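The comparison described in this abstract (a "fold change" of codon frequency in a TNR versus the RCR, followed by a chi-square value) can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' code: the function names are hypothetical, and it assumes that every listed synonymous codon occurs at least once across the two regions and that the codon of interest occurs in the RCR.

```python
from collections import Counter


def codon_counts(seq: str, codons: list[str]) -> list[int]:
    """Count in-frame occurrences of each listed codon in a coding sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    return [counts[c] for c in codons]


def fold_change(tnr: list[int], rcr: list[int], index: int) -> float:
    """Relative frequency of one codon in the TNR divided by that in the RCR.

    Assumes the codon occurs at least once in the RCR (non-zero denominator).
    """
    freq_tnr = tnr[index] / sum(tnr)
    freq_rcr = rcr[index] / sum(rcr)
    return freq_tnr / freq_rcr


def chi_square(tnr: list[int], rcr: list[int]) -> float:
    """Pearson chi-square statistic for a 2 x k table of codon counts.

    Assumes every column (codon) has a non-zero total across both regions.
    """
    total = sum(tnr) + sum(rcr)
    stat = 0.0
    for row in (tnr, rcr):
        for col, observed in enumerate(row):
            expected = sum(row) * (tnr[col] + rcr[col]) / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts of two synonymous codons (e.g. CAG, CAA) in each region.
tnr_counts = [9, 1]
rcr_counts = [10, 10]
print(fold_change(tnr_counts, rcr_counts, 0), chi_square(tnr_counts, rcr_counts))
```

The statistic would then be compared with a chi-square distribution with k - 1 degrees of freedom to obtain a p-value of the kind the abstract reports; that step is left out here to keep the sketch dependency-free.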
Affiliation(s)
- Dinakar Desai
- Department of Orthopedics, Mayo Clinic and Foundation, Medical Science Building 3-69, 200 1st Street, SW, Rochester, MN 55905, USA
|
17
|
Chen SL, Lee W, Hottes AK, Shapiro L, McAdams HH. Codon usage between genomes is constrained by genome-wide mutational processes. Proc Natl Acad Sci U S A 2004; 101:3480-5. [PMID: 14990797 PMCID: PMC373487 DOI: 10.1073/pnas.0307827100] [Citation(s) in RCA: 230] [Impact Index Per Article: 11.5] [Indexed: 11/18/2022] Open
Abstract
Analysis of genome-wide codon bias shows that only two parameters effectively differentiate the genome-wide codon bias of 100 eubacterial and archaeal organisms. The first parameter correlates with genome GC content, and the second parameter correlates with context-dependent nucleotide bias. Both of these parameters may be calculated from intergenic sequences. Therefore, genome-wide codon bias in eubacteria and archaea may be predicted from intergenic sequences that are not translated. When these two parameters are calculated for genes from nonmammalian eukaryotic organisms, genes from the same organism again have similar values, and genome-wide codon bias may also be predicted from intergenic sequences. In mammals, genes from the same organism are similar only in the second parameter, because GC content varies widely among isochores. Our results suggest that, in general, genome-wide codon bias is determined primarily by mutational processes that act throughout the genome, and only secondarily by selective forces acting on translated sequences.
Affiliation(s)
- Swaine L Chen
- Department of Developmental Biology, Stanford University School of Medicine, Beckman Center, B300, Stanford, CA 94304, USA.
|
18
|
Knight RD, Freeland SJ, Landweber LF. A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol 2001; 2:RESEARCH0010. [PMID: 11305938 PMCID: PMC31479 DOI: 10.1186/gb-2001-2-4-research0010] [Citation(s) in RCA: 210] [Impact Index Per Article: 9.1] [Received: 11/21/2000] [Revised: 02/01/2001] [Accepted: 02/13/2001] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND Correlations between genome composition (in terms of GC content) and usage of particular codons and amino acids have been widely reported, but poorly explained. We show here that a simple model of processes acting at the nucleotide level explains codon usage across a large sample of species (311 bacteria, 28 archaea and 257 eukaryotes). The model quantitatively predicts responses (slope and intercept of the regression line on genome GC content) of individual codons and amino acids to genome composition. RESULTS Codons respond to genome composition on the basis of their GC content relative to their synonyms (explaining 71-87% of the variance in response among the different codons, depending on measure). Amino-acid responses are determined by the mean GC content of their codons (explaining 71-79% of the variance). Similar trends hold for genes within a genome. Position-dependent selection for error minimization explains why individual bases respond differently to directional mutation pressure. CONCLUSIONS Our model suggests that GC content drives codon usage (rather than the converse). It unifies a large body of empirical evidence concerning relationships between GC content and amino-acid or codon usage in disparate systems. The relationship between GC content and codon and amino-acid usage is ahistorical; it is replicated independently in the three domains of living organisms, reinforcing the idea that genes and genomes at mutation/selection equilibrium reproduce a unique relationship between nucleic acid and protein composition. Thus, the model may be useful in predicting amino-acid or nucleotide sequences in poorly characterized taxa.
Affiliation(s)
- Robin D Knight
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Stephen J Freeland
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Laura F Landweber
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
|
19
|
Abstract
The compositional evolution of vertebrate genomes is characterized: (i) by one predominant conservative mode, in which nucleotide changes occur, but the base composition of DNA sequences in general, and of coding sequences in particular, does not change; and (ii) by three different shifting or transitional modes, in which nucleotide changes are accompanied by changes in the base composition of sequences. Investigations on these evolutionary modes have shed new light on a central problem in molecular evolution, namely the role played by natural selection in modulating the mutational input. This review will present first the intragenomic shifts, the 'major shifts' and the 'minor shift', and then the 'whole-genome', or 'horizontal', shift. In each case, the shifts were preceded and followed by a conservative mode of evolution. This review expands on a previous one [Bernardi, Gene 241 (2000) 3-17], and summarizes the evidence that the changes of the compositional patterns of the genome and their maintenance are controlled by Darwinian natural selection.
Affiliation(s)
- G Bernardi
- Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Napoli 80121, Italy.
|
20
|
Abstract
The nuclear genomes of vertebrates are mosaics of isochores, very long stretches (>>300kb) of DNA that are homogeneous in base composition and are compositionally correlated with the coding sequences that they embed. Isochores can be partitioned in a small number of families that cover a range of GC levels (GC is the molar ratio of guanine+cytosine in DNA), which is narrow in cold-blooded vertebrates, but broad in warm-blooded vertebrates. This difference is essentially due to the fact that the GC-richest 10-15% of the genomes of the ancestors of mammals and birds underwent two independent compositional transitions characterized by strong increases in GC levels. The similarity of isochore patterns across mammalian orders, on the one hand, and across avian orders, on the other, indicates that these higher GC levels were then maintained, at least since the appearance of ancestors of warm-blooded vertebrates. After a brief review of our current knowledge on the organization of the vertebrate genome, evidence will be presented here in favor of the idea that the generation and maintenance of the GC-richest isochores in the genomes of warm-blooded vertebrates were due to natural selection.
Affiliation(s)
- G Bernardi
- Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Napoli, Italy.
|
21
|
D'Onofrio G, Jabbari K, Musto H, Alvarez-Valin F, Cruveiller S, Bernardi G. Evolutionary genomics of vertebrates and its implications. Ann N Y Acad Sci 1999; 870:81-94. [PMID: 10415475 DOI: 10.1111/j.1749-6632.1999.tb08867.x] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.2] [Indexed: 12/01/2022]
Abstract
The discovery that the vertebrate genomes of warm-blooded vertebrates are mosaics of isochores, long DNA segments homogeneous in base composition, yet belonging to families covering a broad spectrum of GC levels, has led to two major observations. The first is that gene density is strikingly non-uniform in the genome of all vertebrates, gene concentration increasing with increasing GC levels. (Although the genomes of cold-blooded vertebrates are characterized by smaller compositional heterogeneities than those of warm-blooded vertebrates and high GC levels are not attained, their gene distribution is basically similar to that of warm-blooded vertebrates.) The second observation is that the GC-richest and gene-richest isochores underwent a compositional transition (characterized by a strong increase in GC level) between cold- and warm-blooded vertebrates. Evidence to be discussed favors the idea that this compositional transition and the ensuing highly heterogeneous compositional pattern was due to, and was maintained by, natural selection.
Affiliation(s)
- G D'Onofrio
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod 2, Paris, France.
|
22
|
Llopart A, Aguadé M. Synonymous rates at the RpII215 gene of Drosophila: variation among species and across the coding region. Genetics 1999; 152:269-80. [PMID: 10224259 PMCID: PMC1460604 DOI: 10.1093/genetics/152.1.269] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
The region encompassing the RpII215 gene that encodes the largest component of the RNA polymerase II complex (1889 amino acids) has been sequenced in Drosophila subobscura, D. madeirensis, D. guanche, and D. pseudoobscura. Nonsynonymous divergence estimates (Ka) indicate that this gene has a very low rate of amino acid replacements. Given its low Ka and constitutive expression, synonymous substitution rates are, however, unexpectedly high. Sequence comparisons have allowed the molecular clock hypothesis to be tested. D. guanche is an insular species and it is therefore expected to have a reduced effective size relative to D. subobscura. The significantly higher rate of synonymous substitutions detected in the D. guanche lineage could be explained if synonymous mutations behave as nearly neutral. Significant departure from the molecular clock hypothesis for synonymous and nonsynonymous substitutions was detected when comparing the D. subobscura, D. pseudoobscura, and D. melanogaster lineages. Codon bias and synonymous divergence between D. subobscura and D. melanogaster were negatively correlated across the RpII215 coding region, which indicates that selection coefficients for synonymous mutations vary across the gene. The C-terminal domain (CTD) of the RpII215 protein is structurally and functionally differentiated from the rest of the protein. Synonymous substitution rates were significantly different in both regions, which strongly indicates that synonymous mutations in the CTD and in the non-CTD regions are under detectably different selection coefficients.
Affiliation(s)
- A Llopart
- Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, 08071 Barcelona, Spain.
|
23
|
Ragaini F, Cenini S. Mechanistic studies of palladium-catalysed carbonylation reactions of nitro compounds to isocyanates, carbamates and ureas. ACTA ACUST UNITED AC 1996. [DOI: 10.1016/1381-1169(96)00004-0] [Citation(s) in RCA: 54] [Impact Index Per Article: 1.9] [Indexed: 10/18/2022]
|
24
|
Abstract
Recognition of function of newly sequenced DNA fragments is an important area of computational molecular biology. Here we present an extensive review of methods for prediction of functional sites, tRNA, and protein-coding genes and discuss possible further directions of research in this area.
Affiliation(s)
- M S Gelfand
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow region, Russia
|
25
|
Morton BR. Chloroplast DNA codon use: evidence for selection at the psb A locus based on tRNA availability. J Mol Evol 1993; 37:273-80. [PMID: 8230251 DOI: 10.1007/bf00175504] [Citation(s) in RCA: 114] [Impact Index Per Article: 3.7] [Indexed: 01/29/2023]
Abstract
Codon use in the three sequenced chloroplast genomes (Marchantia, Oryza, and Nicotiana) is examined. The chloroplast has a bias in that codons NNA and NNT are favored over synonymous NNC and NNG codons. This appears to be a consequence of an overall high A + T content of the genome. This pattern of codon use is not followed by the psb A gene of all three genomes and other psb A sequences examined. In this gene, the codon use favors NNC over NNT for twofold degenerate amino acids. In each case the only tRNA coded by the genome is complementary to the NNC codon. This codon use is similar to the codon use by chloroplast genes examined from Chlamydomonas reinhardtii. Since psb A is the major translation product of the chloroplast, this suggests that selection is acting on the codon use of this gene to adapt codons to tRNA availability, as previously suggested for unicellular organisms.
Affiliation(s)
- B R Morton
- Department of Botany and Plant Sciences, University of California, Riverside 92521
|
26
|
Huynen MA, Konings DA, Hogeweg P. Equal G and C contents in histone genes indicate selection pressures on mRNA secondary structure. J Mol Evol 1992; 34:280-91. [PMID: 1569583 DOI: 10.1007/bf00160235] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]
Abstract
Protein-specific versus taxon-specific patterns of nucleotide frequencies were studied in histone genes. The third positions of codons have a (well-known) taxon-specific G+C level and a histone type-specific G/C ratio. This ratio counterbalances the G/C ratio in the first and second positions so that the overall G and C levels in the coding region become approximately equal. The compensation of the G/C ratio indicates a selection pressure at the mRNA level rather than a selection pressure or mutation bias at the DNA level or a selection pressure on codon usage. The structure of histone mRNAs is compatible with the hypothesis that the G/C compensation is due to selection pressures on mRNA secondary structure. Nevertheless, no specific motifs seem to have been selected, and the free energy of the secondary structures is only slightly lower than that expected on the basis of nucleotide frequencies.
Affiliation(s)
- M A Huynen
- Bioinformatics Group, University of Utrecht, The Netherlands
|
27
|
Caron F, Ruiz F. A method for the amplification of Paramecium micronuclear DNA by polymerase chain reaction and its application to the central repeats of Paramecium primaurelia G surface antigen genes. THE JOURNAL OF PROTOZOOLOGY 1992; 39:312-8. [PMID: 1578405 DOI: 10.1111/j.1550-7408.1992.tb01321.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 12/27/2022]
Abstract
This paper describes a method which allows the amplification of Paramecium micronuclear DNA. Amacronucleate cells are first obtained by an appropriate treatment with nocodazole, a microtubule depolymerizing agent which blocks the elongation of the macronucleus and the distribution of the micronuclei at cell division between the two daughter cells; then, DNA from such cells is amplified by the polymerase chain reaction technique. We have applied this method to the problem of the central repeats of the G surface antigen of P. primaurelia (strain 156). The central repeats consist of a 74 amino acid sequence repeated in tandem. The sequence identity of these repeats is also found in the nucleotide sequence even at silent codon positions, suggesting the existence of a mechanism of identity maintenance acting at the nucleotide level. Mechanisms based on RNA secondary structure which are frequently proposed as an explanation of this phenomenon are unlikely to be valid in this case. One can, therefore, imagine that these repeats might originate from one micronuclear sequence through duplicative processes which could occur during the formation of the macronucleus. We have used the described technique to amplify the micronuclear version of the central repeats and showed that it is identical to the macronuclear version, thus ruling out the above hypothesis. Therefore, intragenic recombination appears to be the most likely explanation of the sequence identity of these central repeats.
Affiliation(s)
- F Caron
- Laboratoire de Génétique Moléculaire, Ecole Normale Supérieure, Paris, France
|
28
|
Abstract
Ubiquitin is ubiquitous in all eukaryotes and its amino acid sequence shows extreme conservation. Ubiquitin genes comprise direct repeats of the ubiquitin coding unit with no spacers. The nucleotide sequences coding for 13 ubiquitin genes from 11 species reported so far have been compiled and analyzed. The G + C content of codon third base reveals a positive linear correlation with the genome G + C content of the corresponding species. The slope strongly suggests that the overall G + C content of codons of polyubiquitin genes clearly reflects the genome G + C content by AT/GC substitutions at the codon third position. The G + C content of ubiquitin codon third base also shows a positive linear correlation with the overall G + C content of coding regions of compiled genes, indicating the codon choices among synonymous codons reflect the average codon usage pattern of corresponding species. On the other hand, the monoubiquitin gene, which is different from the polyubiquitin gene in gene organization, gene expression, and function of the encoding protein, shows a different codon usage pattern compared with that of the polyubiquitin gene. From comparisons of the levels of synonymous substitutions among ubiquitin repeats and the homology of the amino acid sequence of the tail of monomeric ubiquitin genes, we propose that the molecular evolution of ubiquitin genes occurred as follows: Plural primitive ubiquitin sequences were dispersed on genome in ancestral eukaryotes. Some of them situated in a particular environment fused with the tail sequence to produce monomeric ubiquitin genes that were maintained across species. After divergence of species, polyubiquitin genes were formed by duplication of the other primitive ubiquitin sequences on different chromosomes. Differences in the environments in which ubiquitin genes are embedded reflect the differences in codon choice and in gene expression pattern between poly- and monomeric ubiquitin genes.
Affiliation(s)
- K Mita
- Division of Biology, National Institute of Radiological Sciences, Chiba, Japan
|
29
|
Eberle R, Black D. The simian herpesvirus SA8 homologue of the herpes simplex virus gB gene: mapping, sequencing, and comparison to the HSV gB. Arch Virol 1991; 118:67-86. [PMID: 1646593 DOI: 10.1007/bf01311304] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.5] [Indexed: 12/28/2022]
Abstract
The genomic location and DNA sequence of the simian herpesvirus SA8 gene encoding a homologue of the HSV1 gB glycoprotein was determined. Using a cloned gB gene of herpes simplex virus type 1 (HSV1) as probe in Southern blot hybridizations, the SA8 gB gene was localized to a 10-kbp KpnI fragment mapping in the unique long part of the genome. A 2.8 kbp, 68.4% GC segment of this fragment was sequenced. It contained a 2649 nucleotide ORF possibly encoding a 98.4 kDa polypeptide. The predicted amino acid sequence of the SA8 gB polypeptide is 78.4% and 78.9% identical to the sequence of the HSV1 and HSV2 gBs, respectively, and was 88.4% similar or identical to both HSV gB sequences. Structural characteristics predicted for the SA8 gB polypeptide were very similar to those of HSV1 gB. These included a hydrophobic signal sequence of 29 amino acids, conservation of all 10 cysteine residues and 5 of 6 potential N-linked glycosylation sites present in the HSV1 gB, a triple hydrophobic transmembrane domain, and a highly charged cytoplasmic tail region. Both hierarchical cluster analysis and phylogenetic analysis of sequences for gB polypeptides of 12 different herpesviruses demonstrated that the gB glycoprotein of SA8 is most closely related to the HSV gB glycoproteins. Comparison of these closely related gB sequences identified four regions in which non-conservative amino acid substitutions were clustered. Localized regions of the gB polypeptide were identified which are likely to be associated with the conserved structure/function of the polypeptide.
Affiliation(s)
- R Eberle
- Department of Veterinary Parasitology, Microbiology, and Public Health, College of Veterinary Medicine, Oklahoma State University, Stillwater
|
30
|
Evolution of DNA Sequence Contributions of Mutational Bias and Selection to the Origin of Chromosomal Compartments. ACTA ACUST UNITED AC 1990. [DOI: 10.1007/978-3-642-75599-6_1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3]
|
31
|
Marín A, Bertranpetit J, Oliver JL, Medina JR. Variation in G + C-content and codon choice: differences among synonymous codon groups in vertebrate genes. Nucleic Acids Res 1989; 17:6181-9. [PMID: 2570402 PMCID: PMC318270 DOI: 10.1093/nar/17.15.6181] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.1] [Indexed: 01/01/2023] Open
Abstract
The relationship between G + C-content and codon usage in genes of human, mouse, rat, bovine and chicken nuclear genomes was investigated. Correlation and linear regression analyses were carried out on plots that related the frequency of each codon within each synonymous codon group to the G + C-content of the coding sequence as a whole. Under GC pressure, in most of the quartet codon groups there is a preferential choice of the C-ending codon, except in leucine and valine codon groups where the choice of the G-ending codon is preferred. Among duets, the choice of codons specifying phenylalanine and glutamate shows the strongest dependence on G + C-content. The relationship found between G + C-content and codon usage in these genomes correlates with taxonomic distance.
Affiliation(s)
- A Marín
- Departamento de Genética y Biotecnia, Facultad de Biología, Universidad de Sevilla, Spain
|
32
|
Abstract
MS2 is an RNA bacteriophage (3569 bases). The secondary structure of the RNA has been determined, and is known to play an important role in regulating translation. Paired regions of the genome have a higher G+C content than unpaired regions. It has been suggested that this reflects selection for high G+C content to encourage pairing, but a re-analysis of the data together with computer simulation suggest that it is an automatic consequence in any RNA sequence of the way it folds up to minimise its free energy. It has also been suggested that the three registers in which pairing can occur in a coding region are used differentially to optimise the use of the redundancy of the genetic code, but re-analysis of the data shows only weak statistical support for this hypothesis.
Affiliation(s)
- M Bulmer
- Department of Statistics, Oxford, UK
|
33
|
Mita K, Ichimura S, Zama M, James TC. Specific codon usage pattern and its implications on the secondary structure of silk fibroin mRNA. J Mol Biol 1988; 203:917-25. [PMID: 3210244 DOI: 10.1016/0022-2836(88)90117-9] [Citation(s) in RCA: 80] [Impact Index Per Article: 2.2] [Indexed: 01/04/2023]
Abstract
We have identified two distinctive regions of the repetitive unit nucleotide sequence of fibroin mRNA of Bombyx mori. The codon usage for the major amino acids, glycine, alanine and serine is distinctly different in these two regions, indicating that it is determined by the fibroin mRNA or gene structure but not by the tRNA population. Comparative computer analyses of nucleotide substitutions in the unit sequence suggest that selection has operated on the codon usage to optimize the secondary structure characteristic of the fibroin mRNA.
Affiliation(s)
- K Mita
- Division of Chemistry, National Institute of Radiological Sciences, Chiba, Japan
|
34
|
Hanai R, Suyama A, Wada A. Characteristic features of thermal stability map of DNA in Escherichia coli and eukaryotic genes. J Biomol Struct Dyn 1988; 6:51-62. [PMID: 3078238 DOI: 10.1080/07391102.1988.10506482] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
Distribution of double-helix thermal stability of Escherichia coli and eukaryotic DNAs was analyzed. The results confirmed the previous propositions based on the study of the stability distribution in phage DNAs: (1) stability fluctuation appears near the boundaries of protein coding regions (PCRs) and non protein coding regions (NPCRs); (2) PCRs have less fluctuation than NPCRs. The present analysis also revealed that the local G + C content is lower in the beginning of PCRs of E. coli than the average G + C content of PCR and that deviations in the amino acid composition and the third letter usage of PCRs are involved in the low G + C content; the biological meaning of this is discussed in relation to mRNA structure.
Affiliation(s)
- R Hanai
- Department of Physics, Faculty of Science, University of Tokyo, Japan
|
35
|
Filipski J, Salinas J, Rodier F. Two distinct compositional classes of vertebrate gene-bearing DNA stretches, their structures and possible evolutionary origin. DNA (MARY ANN LIEBERT, INC.) 1987; 6:109-18. [PMID: 3582090 DOI: 10.1089/dna.1987.6.109] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Abstract
Genomes of vertebrates are built of long, compositionally uniform DNA regions differing in guanine and cytidine (G + C) content. Examination of G + C distribution and CpG dinucleotide frequency in the longest stretches of vertebrate DNA base sequences available show that the long-range structural features are correlated with the structure of genes. Two classes of DNA stretches are conspicuous: (i) the stretches having low G + C content and low CpG doublet frequency and (ii) stretches rich in G + C containing CpG-rich islands. Both classes show other compositional islands containing exons. These structural features result from evolutionary pressures acting on the DNA or RNA level, as well as from mutations and repair differently biased in different genomic compartments. The analysis presented provides a rationale for a discussion of evolution of the long-range structural characteristics of DNA.
|
36
|
Abstract
I briefly discuss some aspects of theoretical molecular biology. Specifically, I include the issues of searches for homologies via string matchings, for patterns of specific nucleotide groupings and of sequence-structure relationship. The various approaches developed in order to achieve this end are described, attempting to convey some of the excitement in this quickly growing field.
Affiliation(s)
- R Nussinov
- Sackler Institute of Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv, Israel
|
37
|
Abstract
Nucleotide sequences of all genomes are subject to compositional constraints that affect, to about the same extent, both coding and noncoding sequences; influence not only the structure and function of the genome, but also those of transcripts and proteins; are the result of environmental pressures; and largely control the fixation of mutations. These findings indicate that noncoding sequences are associated with biological functions; that the organismal phenotype comprises two components, the classical phenotype, corresponding to the "gene products," and a "genome phenotype," which is defined by the compositional constraints; and that natural selection plays a more important role in genome evolution than do random events.
|
38
|
Bernardi G, Olofsson B, Filipski J, Zerial M, Salinas J, Cuny G, Meunier-Rotival M, Rodier F. The mosaic genome of warm-blooded vertebrates. Science 1985; 228:953-8. [PMID: 4001930 DOI: 10.1126/science.4001930] [Citation(s) in RCA: 717] [Impact Index Per Article: 18.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
Most of the nuclear genome of warm-blooded vertebrates is a mosaic of very long (much greater than 200 kilobases) DNA segments, the isochores; these isochores are fairly homogeneous in base composition and belong to a small number of major classes distinguished by differences in guanine-cytosine (GC) content. The families of DNA molecules derived from such classes can be separated and used to study the genome distribution of any sequence which can be probed. This approach has revealed (i) that the distribution of genes, integrated viral sequences, and interspersed repeats is highly nonuniform in the genome, and (ii) that the base composition and ratio of CpG to GpC in both coding and noncoding sequences, as well as codon usage, mainly depend on the GC content of the isochores harboring the sequences. The compositional compartmentalization of the genome of warm-blooded vertebrates is discussed with respect to its evolutionary origin, its causes, and its effects on chromosome structure and function.
|
39
|
Abstract
We present theoretical considerations that suggest that synonymous-codon usage might be expected to be close to an equilibrium distribution given a very homogeneous process of silent substitution. By homogeneous we mean that substitution depends only on the two bases involved, so that 12 base-substitution rates completely describe the silent substitution process. We have developed a method of statistically testing for such homogeneous equilibrium and applied it to reported data on the codon usages of different classes of organisms. Weakly expressed bacterial sequences and both mammalian and nonmammalian eukaryotic sequences deviate significantly from a random pattern of codon usage, in the direction of homogeneous equilibrium. On the other hand, highly expressed bacterial sequences do not exhibit homogeneous equilibrium, which may be correlated with recent experimental results showing that they are optimized to accept the most abundant tRNAs. To examine the effect of amino acid replacements on the homogeneous model of silent substitution, we divided the amino acids with degenerate codes into two classes, those with high mutabilities and those with low, and performed the same analysis on bacterial and eukaryotic data sets. The codon sets of the highly mutable class of amino acids are not further from homogeneous equilibrium than are the codon sets of the class with low mutabilities. We also found for the eukaryotic data that these independent classes of codon sets show very similar equilibrium patterns. The various results suggest a high level of uniformity in the process of silent fixation in the different synonymous-codon sets, especially in eukaryotes.
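The homogeneous model described in this abstract, in which 12 base-substitution rates fully describe silent substitution, implies an equilibrium base composition. The sketch below illustrates only that idea and is not the authors' statistical test: it treats the 12 rates as a four-state Markov chain and iterates to the stationary composition. The function name `equilibrium_composition`, the dictionary layout of `rates`, and the example rate values are assumptions made for illustration.

```python
BASES = "ACGT"


def equilibrium_composition(rates, tol=1e-12, max_iter=100_000):
    """Approximate the stationary base composition implied by 12 rates.

    `rates` maps (from_base, to_base) to a non-negative substitution rate
    for every ordered pair of distinct bases (hypothetical input layout).
    """
    for a in BASES:
        for b in BASES:
            if a != b and (a, b) not in rates:
                raise ValueError(f"missing rate for {a}->{b}")

    # Total rate of leaving each base.
    leaving = {a: sum(rates[(a, b)] for b in BASES if b != a) for a in BASES}
    scale = 2.0 * max(leaving.values())
    if scale == 0:
        raise ValueError("all rates are zero")

    # Start from a uniform composition and apply small substitution steps
    # until the composition stops changing.
    freq = {b: 0.25 for b in BASES}
    for _ in range(max_iter):
        new = {}
        for b in BASES:
            stay = freq[b] * (1.0 - leaving[b] / scale)
            gain = sum(freq[a] * rates[(a, b)] / scale
                       for a in BASES if a != b)
            new[b] = stay + gain
        if max(abs(new[b] - freq[b]) for b in BASES) < tol:
            return new
        freq = new
    return freq


# Illustrative rates only: substitutions towards G or C are twice as fast.
gc_biased = {(a, b): (2.0 if b in "GC" else 1.0)
             for a in BASES for b in BASES if a != b}
composition = equilibrium_composition(gc_biased)
```

With these made-up rates the equilibrium composition is enriched in G and C, which shows how a biased but homogeneous substitution process alone can shape codon third positions.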
|
40
|
Lipman DJ, Wilbur WJ. Interaction of silent and replacement changes in eukaryotic coding sequences. J Mol Evol 1985; 21:161-7. [PMID: 6442990 DOI: 10.1007/bf02100090] [Citation(s) in RCA: 30] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
We examined the codon usages in well-conserved and less-well-conserved regions of vertebrate protein genes and found them to be similar. Despite this similarity, there is a statistically significant decrease in codon bias in the less-well-conserved regions. Our analysis suggests that although those codon changes initially fixed under amino acid replacements tend to follow the overall codon usage pattern, they also reduce the bias in codon usage. This decrease in codon bias leads one to predict that the rate of change of synonymous codons should be greater in those regions that are less well conserved at the amino acid level than in the better-conserved regions. Our analysis supports this prediction. Furthermore, we demonstrate a significantly elevated rate of change of synonymous codons among the adjacent codons 5' to amino acid replacement positions. This provides further support for the idea that there are contextual constraints on the choice of synonymous codons in eukaryotes.
|
41
|
Shpaer EG. The secondary structure of mRNAs from Escherichia coli: its possible role in increasing the accuracy of translation. Nucleic Acids Res 1985; 13:275-88. [PMID: 3889832 PMCID: PMC340990 DOI: 10.1093/nar/13.1.275] [Citation(s) in RCA: 36] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
A secondary structure model was proposed for mRNAs during translation (in a polysome) where the secondary structure is described by a set of small unbranched hairpins. Computer simulation experiments reveal that the number of hairpins is much greater (P < 10^-6) in highly expressed mRNAs from E. coli as compared with the random sequences coding for the same amino acid sequence, i.e. certain synonymous codons are used in definite mRNA positions to increase the number of hairpins. No constraints on the amino acid sequence, which would affect the secondary structure of mRNAs, were found. The codons UGU, UGC (Cys), GCC (Ala), ACA, ACG (Thr), CCU, CCC (Pro), etc. translated by minor tRNAs were found to occur significantly more frequently in the position 5' to the hairpins than the other codons translated by major tRNAs (P < 5 × 10^-6). This correlation leads to the hypothesis that the process of hairpin unfolding can increase the time of translocation from the A to P ribosome site of the codon 5' to the hairpin, thus decreasing the probability of translational error (the latter would likely occur more frequently in the codons translated by minor tRNAs).
|
42
|
Abstract
Because the genetic code is redundant for most amino acids, different codons can be used in a given position without altering the structure of the protein for which the gene codes. This flexibility permits information encoding structural, and therefore functional, properties of RNA and DNA to be transmitted simultaneously by a protein-coding sequence of DNA. Among the other messages that might be transmitted, it is proposed, is one modulating the evolution of the DNA itself.
|
43
|
Perrin P. Coding strategy differences between constant and variable segments of immunoglobulin genes. Nucleic Acids Res 1984; 12:5515-27. [PMID: 6462913 PMCID: PMC318936 DOI: 10.1093/nar/12.13.5515] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023] Open
Abstract
Vertebrate immunoglobulin (Ig) mRNAs reveal intraspecies variation in codon usage distinct from that seen with yeast or bacterial genes. Comparison of all available Ig gene sequences shows that %(G + C) in codon position III is consistently lower in variable (V) segments than in constant (C) segments. I find an even lower %(G + C) in the hypervariable domains of V segments. This analysis suggests that base substitution in Ig genes correlates positively with local A + T content.
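The quantity compared in this abstract, %(G + C) in codon position III, can be computed as in the minimal sketch below. It assumes an in-frame coding sequence that starts at the first base of a codon; the function name `gc3_percent` and the example sequence are illustrative and do not come from the paper.

```python
def gc3_percent(cds: str) -> float:
    """Return %(G + C) at the third position of each complete codon."""
    # Accept RNA or DNA letters; treat U as T.
    seq = cds.upper().replace("U", "T")
    # Ignore any trailing bases that do not form a complete codon.
    usable = len(seq) - len(seq) % 3
    third_positions = seq[2:usable:3]
    if not third_positions:
        raise ValueError("sequence has no complete codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return 100.0 * gc_count / len(third_positions)


# Illustrative sequence: codons ATG, GCC, AAA have third bases G, C, A.
example = gc3_percent("ATGGCCAAA")
```

Applying the same function separately to variable and constant segments would give the kind of comparison the abstract reports, provided each segment is supplied in frame.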
|
44
|
Abstract
We have studied the statistical constraints on synonymous codon choice to evaluate various proposals regarding the origin of the bias in synonymous codon usage observed by Fiers et al. (1975), Air et al. (1976), Grantham et al. (1980) and others. We have determined the statistical dependence of the degenerate third base on either of its nearest neighbors in mitochondrial, prokaryotic, and eukaryotic coding sequences. We noted an increasing dependence of the third base on its nearest neighbors in moving from mitochondria to prokaryotes to eukaryotes. A statistical model assuming random equiprobable selection of synonymous codons was found grossly adequate for the mitochondria, but totally inadequate for prokaryotes and eukaryotes. A model assuming selection of synonymous codons reflecting a genomic strategy, i.e. the genome hypothesis of Grantham et al. (1980), gave a good approximation of the mitochondrial sequences. A statistical model which exactly maintains codon frequency, but allows the position of corresponding synonymous codons to vary was only grossly adequate for prokaryotes and totally inadequate for eukaryotes. The results of these simulations are consistent with the measures on experimental sequences and suggest that a "frequency constraint" model such as that of Grantham et al. (1980) may be an adequate explanation of the codon usage in mitochondria. However, in addition to this frequency constraint, there may be constraints on synonymous codon choice in prokaryotes due to codon context. Furthermore, any proposal to explain codon usage in eukaryotes must involve a constraint on the context of a codon in the sequence.
|
45
|
|
46
|
Aruja A, Vilu R, Raukas E. Detection of periodic patterns in RNA sequences: the first encapsidated region of the TMV RNA. J Theor Biol 1982; 94:457-70. [PMID: 7078214 DOI: 10.1016/0022-5193(82)90321-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
|
47
|
Miyata T, Hayashida H. Extraordinarily high evolutionary rate of pseudogenes: evidence for the presence of selective pressure against changes between synonymous codons. Proc Natl Acad Sci U S A 1981; 78:5739-43. [PMID: 6795634 PMCID: PMC348847 DOI: 10.1073/pnas.78.9.5739] [Citation(s) in RCA: 77] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023] Open
Abstract
Comparisons of nucleotide sequences of several pseudogenes described to date, including alpha- and beta-globin and immunoglobulin kappa-type variable domain pseudogenes, with those of functional counterparts revealed that pseudogenes accumulate mutations at an extremely high rate uniformly over their entirety. It is remarkable that the evolutionary rate exceeds the rate of changes between synonymous codons, the highest known rate, in functional genes. Because no pseudogenes appear to function, this result strongly supports the neutral theory. In addition this result apparently indicates the presence of selective pressure against changes between synonymous codons in functional genes. Close examinations of codon utilization patterns in pseudogenes and functional genes revealed a significant correlation between the rate of changes at synonymous codon sites and the strength of bias in code word usage. This implies that even synonymous codon changes are not completely free from selective pressure but are constrained in part, although presumably weakly, depending on the degree of bias in code word usage. We also reexamined alignment between mouse beta h3 (pseudogene) and beta maj sequences and found a unique structure of the beta h3 that is homologous in sequence to the beta maj gene overall but contains a long deletion (about 150 base pairs) in the middle of the gene.
|
48
|
Singer BS, Gold L, Shinedling ST, Colkitt M, Hunter LR, Pribnow D, Nelson MA. Analysis in vivo of translational mutants of the rIIB cistron of bacteriophage T4. J Mol Biol 1981; 149:405-32. [PMID: 7031268 DOI: 10.1016/0022-2836(81)90479-4] [Citation(s) in RCA: 59] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
|
49
|
Napoli C, Gold L, Singer BS. Translational reinitiation in the rIIB cistron of bacteriophage T4. J Mol Biol 1981; 149:433-49. [PMID: 7310886 DOI: 10.1016/0022-2836(81)90480-0] [Citation(s) in RCA: 61] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]
|
50
|
Miller WL, Coit D, Baxter JD, Martial JA. Cloning of bovine prolactin cDNA and evolutionary implications of its sequence. DNA (MARY ANN LIEBERT, INC.) 1981; 1:37-50. [PMID: 6299665 DOI: 10.1089/dna.1.1981.1.37] [Citation(s) in RCA: 33] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]
Abstract
Prolactin, growth hormone, and chorionic somatomammotropin (placental lactogen) constitute a set of related polypeptides believed to derive from a common evolutionary ancestor protein. We have cloned and sequenced DNA complementary to the mRNA coding for bovine prolactin. This cDNA contains 702 bases corresponding to 10 amino acids in the leader peptide, all 199 amino acids of the hormone, and 75 nucleotides in the 3' untranslated region of the mRNA. Nucleotide sequence analysis of this cDNA permitted the identification of 10 amino acids in the signal peptide, plus the correction or elucidation of amino acid assignments at 16 sites where aspartic and glutamic acids had not been distinguished from their amides by amino acid sequencing. Codon usage in bovine prolactin mRNA is nonrandom, but, similarly to rat and human prolactins, it does not exhibit the strong preference for G or C in codon third positions seen in bovine, rat, and human growth hormone mRNAs. The translational termination signal in bovine prolactin is UAA, also the same as in rat and human prolactins and differing from the UAG "stop" codon used in bovine, rat, and human growth hormones and human chorionic somatomammotropin. The amino acid and mRNA nucleotide sequences of bovine, rat, and human prolactins and growth hormones were compared by several techniques based on various theories of molecular evolution. The comparison of prolactin to growth hormone is consistent in all three species, suggesting that the genes for these two hormones diverged about 350 million years ago. However, comparisons among the three prolactins or among the three growth hormones to determine the times of evolutionary divergence of the three species generated values that were inconsistent with each other and with the fossil record. Analysis of these discrepancies suggests that the genes for prolactin and growth hormone may now be evolving by different mechanisms.
|