1
|
Cheregi O, Vermaas W, Funk C. The search for new chlorophyll-binding proteins in the cyanobacterium Synechocystis sp. PCC 6803. J Biotechnol 2012; 162:124-33. [PMID: 22759916 DOI: 10.1016/j.jbiotec.2012.06.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 12/22/2011] [Revised: 06/21/2012] [Accepted: 06/25/2012] [Indexed: 01/24/2023]
Abstract
Light harvesting provides a major challenge in the production of biofuels from microorganisms; while sunlight provides the energy necessary for biomass/biofuel production, at the same time it damages the cells. The genome of Synechocystis sp. PCC 6803 was searched for open reading frames that might code for yet unidentified chlorophyll-binding proteins with low molecular mass that could be involved in stress-adaptation. Amongst 9167 hypothetical ORFs corresponding to potential polypeptides of 100 amino acids or less, two were identified that had the potential to be pigment-binding, because they (i) encoded a potential transmembrane region, (ii) showed sequence similarity with known chlorophyll-binding domains, (iii) were conserved in other cyanobacterial species, and (iv) their codon adaptation index indicated significant translation probability. The two ORFs were located complementary (antisense) and internal to the ferrochelatase (hemH) and the pyruvate dehydrogenase (pdh) genes and therefore were named a-fch and a-pdh, respectively. Transcription of both genes was confirmed; however, no translated proteins could be detected immunologically. Whereas mutations within a-pdh or a-fch did not lead to any obvious phenotype, it is clear that transcripts and proteins over and above the currently known set may play a role in defining the physiology of cyanobacteria and other organisms.
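The first screening step described in the abstract above, enumerating short ORFs on both strands, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `reverse_complement` and `small_orfs`, the ATG-only start rule, and the TAA/TAG/TGA stop set are assumptions made for the example. The later filters (transmembrane prediction, similarity to chlorophyll-binding domains, conservation, codon adaptation index) are not covered.

```python
# Hypothetical sketch: six-frame scan for short ATG..stop ORFs.
# Assumes the standard stop codons and ATG as the only start codon.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def small_orfs(seq, max_codons=100, min_codons=1):
    """Yield (strand, frame, start, codons) for each ORF found.

    start is the 0-based position of the ATG on the scanned strand;
    codons counts the coding codons, excluding the stop codon.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    n = (i - start) // 3
                    if min_codons <= n <= max_codons:
                        yield (strand, frame, start, n)
                    start = None  # look for the next ORF in this frame
```

With the default `max_codons=100`, the generator keeps only candidates matching the abstract's "100 amino acids or less" criterion; antisense ORFs such as a-fch and a-pdh would appear with strand `"-"` relative to the annotated gene.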
Affiliation(s)
- Otilia Cheregi
- Department of Chemistry, Umeå University, SE 90187 Umeå, Sweden.
|
2
|
Predicting statistical properties of open reading frames in bacterial genomes. PLoS One 2012; 7:e45103. [PMID: 23028785 PMCID: PMC3454372 DOI: 10.1371/journal.pone.0045103] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 04/02/2012] [Accepted: 08/14/2012] [Indexed: 11/26/2022]
Abstract
An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
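To illustrate why GC content matters for ORF length statistics, here is a deliberately simplified null model. It is not the analytical model of the paper above: it assumes independent nucleotides with P(G) = P(C) = gc/2 and P(A) = P(T) = (1 - gc)/2, and the function names are hypothetical. Under those assumptions the three stop codons (TAA, TAG, TGA) are AT-rich, so their frequency drops as GC content rises and the expected run of codons before a stop grows.

```python
# Hypothetical sketch: stop-codon probability under an i.i.d. base model.
def stop_probability(gc):
    """Probability that a random codon is TAA, TAG or TGA."""
    at = (1.0 - gc) / 2.0  # P(A) = P(T)
    g = gc / 2.0           # P(G) = P(C)
    # TAA contributes at^3; TAG and TGA each contribute at^2 * g.
    return at ** 3 + 2 * at ** 2 * g


def expected_codons_to_stop(gc):
    """Mean of the geometric distribution of codons up to the first stop."""
    return 1.0 / stop_probability(gc)
```

At 50% GC this gives a stop probability of 3/64, i.e. about 21 codons per stop on average; the abstract's observation is that real alternative frames contain long ORFs more often than a statistical expectation of this kind predicts.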
|
3
|
Sabath N, Morris JS, Graur D. Is there a twelfth protein-coding gene in the genome of influenza A? A selection-based approach to the detection of overlapping genes in closely related sequences. J Mol Evol 2011; 73:305-15. [PMID: 22187135 DOI: 10.1007/s00239-011-9477-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 03/30/2011] [Accepted: 12/02/2011] [Indexed: 02/06/2023]
Abstract
Protein-coding genes often contain long overlapping open-reading frames (ORFs), which may or may not be functional. Current methods that utilize the signature of purifying selection to detect functional overlapping genes are limited to the analysis of sequences from divergent species, thus rendering them inapplicable to genes found only in closely related sequences. Here, we present a method for the detection of selection signatures on overlapping reading frames by using closely related sequences, and apply the method to several known overlapping genes, and to an overlapping ORF on the negative strand of segment 8 of influenza A virus (NEG8), for which the suggestion has been made that it is functional. We find no evidence that NEG8 is under selection, suggesting that the intact reading frame might be non-functional, although we cannot fully exclude the possibility that the method is not sensitive enough to detect the signature of selection acting on this gene. We present the limitations of the method using known overlapping genes and suggest several approaches to improve it in future studies. Finally, we examine alternative explanations for the sequence conservation of NEG8 in the absence of selection. We show that overlap type and genomic context affect the conservation of intact overlapping ORFs and should therefore be considered in any attempt of estimating the signature of selection in overlapping genes.
Affiliation(s)
- Niv Sabath
- Institute of Evolutionary Biology and Environmental Studies, University of Zurich, 8057 Zurich, Switzerland.
|
4
|
Griffin BD, Nagy É. Coding potential and transcript analysis of fowl adenovirus 4: insight into upstream ORFs as common sequence features in adenoviral transcripts. J Gen Virol 2011; 92:1260-1272. [PMID: 21430092 DOI: 10.1099/vir.0.030064-0] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.6] [Indexed: 12/17/2022]
Abstract
Recombinant fowl adenoviruses (FAdVs) have been successfully used as veterinary vaccine vectors. However, insufficient definitions of the protein-coding and non-coding regions and an incomplete understanding of virus-host interactions limit the progress of next-generation vectors. FAdVs are known to cause several diseases of poultry. Certain isolates of species FAdV-C are the aetiological agent of inclusion body hepatitis/hydropericardium syndrome (IBH/HPS). In this study, we report the complete 45667 bp genome sequence of FAdV-4 of species FAdV-C. Assessment of the protein-coding potential of FAdV-4 was carried out with the Bio-Dictionary-based Gene Finder together with an evaluation of sequence conservation among species FAdV-A and FAdV-D. On this basis, 46 potentially protein-coding ORFs were identified. Of these, 33 and 13 ORFs were assigned high and low protein-coding potential, respectively. Homologues of the ancestral adenoviral genes were, with few exceptions, assigned high protein-coding potential. ORFs that were unique to the FAdVs were differentiated into high and low protein-coding potential groups. Notable putative genes with high protein-coding capacity included the previously unreported fiber 1, hypothetical 10.3K and hypothetical 10.5K genes. Transcript analysis revealed that several of the small ORFs less than 300 nt in length that were assigned low coding potential contributed to upstream ORFs (uORFs) in important mRNAs, including the ORF22 mRNA. Subsequent analysis of the previously reported transcripts of FAdV-1, FAdV-9, human adenovirus 2 and bovine adenovirus 3 identified widespread uORFs in AdV mRNAs that have the potential to act as important translational regulatory elements.
Affiliation(s)
- Bryan D Griffin
- Department of Pathobiology, Ontario Veterinary College, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1, Canada
- Éva Nagy
- Department of Pathobiology, Ontario Veterinary College, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1, Canada
|
5
|
Sabath N, Graur D. Detection of functional overlapping genes: simulation and case studies. J Mol Evol 2010; 71:308-16. [PMID: 20820768 DOI: 10.1007/s00239-010-9386-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 12/18/2009] [Accepted: 07/26/2010] [Indexed: 12/16/2022]
Abstract
As far as protein-coding genes are concerned, there is a non-zero probability that at least one of the five possible overlapping sequences of any gene will contain an open-reading frame (ORF) of a length that may be suitable for coding a functional protein. It is, however, very difficult to determine whether or not such an ORF is functional. Recently, we proposed a method that predicts functionality of an overlapping ORF if it can be shown that it has been subject to purifying selection during its evolution. Here, we use simulation to test this method under several conditions and compare it with the method of Firth and Brown. We found that under most conditions, our method detects functional overlapping genes with higher sensitivity than Firth and Brown's method, while maintaining high specificity. Further, we tested the hypothesis that the two aminoacyl tRNA synthetase classes have originated from a pair of overlapping genes. A central piece of evidence ostensibly supporting this hypothesis is the assertion that an overlapping ORF of a heat-shock protein-70 gene, which exhibits some similarity to class 2 aminoacyl tRNA synthetases, is functional. We found signature of purifying selection only in highly divergent sequences, suggesting that the method yields false-positives in high sequence divergence and that the overlapping ORF is not a functional gene. Finally, we examined three cases of overlap in the human genome. We find varying signatures of purifying selection acting on these overlaps, raising the possibility that two of the overlapping genes may not be functional.
Affiliation(s)
- Niv Sabath
- Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA.
|
6
|
Abstract
High-throughput DNA sequencing is increasing the number of publicly available complete genomes, even though a precise gene catalogue for each organism is not yet available. In this context, computational gene finders play a key role in producing a first and cost-effective annotation. A large number of gene prediction tools are now available to the scientific community and, despite their number, they can be divided into two main categories: (1) ab initio and (2) evidence based. In the following, we provide an overview of the main methodologies in these categories for predicting correct exon-intron structures of eukaryotic genes. We also take into account new strategies that refine ab initio predictions by employing comparative genomics or other evidence such as expression data. Finally, we briefly introduce metrics for in-house evaluation of gene predictions in terms of sensitivity and specificity at the nucleotide, exon, and gene levels.
Affiliation(s)
- Ernesto Picardi
- Dipartimento di Biochimica e Biologia Molecolare E Quagliariello, University of Bari, Bari, Italy
|
7
|
Rodin AS, Rodin SN, Carter CW. On primordial sense-antisense coding. J Mol Evol 2009; 69:555-67. [PMID: 19956936 PMCID: PMC2853367 DOI: 10.1007/s00239-009-9288-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Received: 08/14/2009] [Accepted: 09/18/2009] [Indexed: 11/29/2022]
Abstract
The genetic code is implemented by aminoacyl-tRNA synthetases (aaRS). These 20 enzymes are divided into two classes that, despite performing the same functions, have nothing in common in structure. The mystery of this striking partition of aaRSs might have been concealed in their sterically complementary modes of tRNA recognition that, as we have found recently, protect the tRNAs with complementary anticodons from confusion in translation. This finding implies that, in the beginning, life increased its coding repertoire by pairs of complementary codons (rather than one-by-one) and used both complementary strands of genes as templates for translation. The class I and class II aaRSs may represent one of the most important examples of such primordial sense-antisense (SAS) coding (Rodin and Ohno, Orig Life Evol Biosph 25:565-589, 1995). In this report, we address the issue of SAS coding in a wider scope. We suggest a variety of advantages that such coding would have had in exploring a wider sequence space before translation became highly specific. In particular, we confirm that in Achlya klebsiana a single gene might have originally coded for an HSP70 chaperonin (class II aaRS homolog) and an NAD-specific GDH-like enzyme (class I aaRS homolog) via its sense and antisense strands. Thus, in contrast to the conclusions in Williams et al. (Mol Biol Evol 26:445-450, 2009), this could indeed be a "Rosetta stone" gene (Carter and Duax, Mol Cell 10:705-708, 2002) (eroded somewhat, though) for the SAS origin of the two aaRS classes.
Affiliation(s)
- Andrei S Rodin
- Human Genetics Center, School of Public Health, University of Texas, Houston, TX 77225, USA.
|
8
|
Williams TA, Wolfe KH, Fares MA. No Rosetta Stone for a Sense–Antisense Origin of Aminoacyl tRNA Synthetase Classes. Mol Biol Evol 2008; 26:445-50. [DOI: 10.1093/molbev/msn267] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
|
9
|
Abstract
Human cytomegalovirus (HCMV) contains a large and complex E-type genome. There are both clinical isolates of the virus that have been passaged minimally in fibroblasts and so-called laboratory strains that have been extensively passaged and adapted to growth in fibroblasts. The genomes of laboratory strains have undergone rearrangements. To date, the genomes of five clinical isolates have been sequenced. We have re-evaluated the coding content of clinical isolates by identifying the set of open reading frames (ORFs) that are conserved in all five sequenced clinical isolates. We have further determined which of these ORFs are present in the chimpanzee cytomegalovirus (CCMV) genome. A total of 173 ORFs are present in all HCMV genomes and the CCMV genome, and we conclude that these ORFs are very likely to be functional. An additional 59 ORFs are present in the genomes of all five HCMV isolates, but not in CCMV. We have discounted 26 of this latter set of ORFs, because they reside in regions of the genome unlikely to encode functional ORFs. The remaining 33 ORFs are potentially functional ORFs that are specific to HCMV.
Affiliation(s)
- E Murphy
- Department of Molecular Biology, Princeton University, Princeton, NJ 08544-1014, USA
|
10
|
Chen R, Yan H, Zhao KN, Martinac B, Liu GB. Comprehensive analysis of prokaryotic mechanosensation genes: their characteristics in codon usage. 2007; 18:269-78. [PMID: 17541832 DOI: 10.1080/10425170601136564] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/23/2022]
Abstract
In the present study, we examined GC nucleotide composition, relative synonymous codon usage (RSCU), effective number of codons (ENC), codon adaptation index (CAI) and gene length for 308 prokaryotic mechanosensitive ion channel (MSC) genes from six evolutionary groups: Euryarchaeota, Actinobacteria, Alphaproteobacteria, Betaproteobacteria, Firmicutes, and Gammaproteobacteria. Results showed that: (1) a wide variation of overrepresentation of nucleotides exists in the MSC genes; (2) codon usage bias varies considerably among the MSC genes; (3) both nucleotide constraint and gene length play an important role in shaping codon usage of the bacterial MSC genes; and (4) synonymous codon usage of prokaryotic MSC genes is phylogenetically conserved. Knowledge of codon usage in prokaryotic MSC genes may benefit the study of MSC genes in eukaryotes, in which few MSC genes have been identified and functionally analysed.
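Of the indices named in this abstract, RSCU is the simplest to compute: for each amino acid, the observed count of a codon is divided by the count expected if all synonymous codons were used equally. The sketch below assumes the standard genetic code and an in-frame coding sequence; the function name `rscu` is hypothetical, and this is not the code used in the study.

```python
# Hypothetical sketch: relative synonymous codon usage (RSCU) for one CDS.
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in the sequence.

    Stop codons and single-codon amino acids (Met, Trp) are skipped,
    since they have no synonymous alternatives.
    """
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue
        for c in codons:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(codons) / total
    return values
```

An RSCU of 1.0 means no bias within the synonymous family; for example, a sequence using AAA twice and AAG once for lysine gives AAA an RSCU of 4/3 and AAG an RSCU of 2/3.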
Affiliation(s)
- Rong Chen
- School of Medicine, Xi'an Jiaotong University, Xi'an, People's Republic of China
|
11
|
Jensen KT, Petersen L, Falk S, Iversen P, Andersen P, Theisen M, Krogh A. Novel overlapping coding sequences in Chlamydia trachomatis. FEMS Microbiol Lett 2006; 265:106-17. [PMID: 17038047 DOI: 10.1111/j.1574-6968.2006.00480.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Chlamydia trachomatis is the aetiological agent of trachoma and sexually transmitted infections. The C. trachomatis genome sequence revealed an organism adapted to the intracellular habitat with a high coding ratio and a small genome of 1,042 kilobases (kb) with 895 annotated protein-coding genes. Here, we repredict the protein-coding genes of the C. trachomatis genome using the gene-finder EasyGene, which was trained specifically for C. trachomatis, and compare the result with the primary C. trachomatis annotation. Our work predicts 15 genes not listed in the primary annotation and 853 that are in agreement with the primary annotation. Forty-two genes from the primary annotation are not predicted by EasyGene. The majority of these genes are listed as hypothetical in the primary annotation. The 15 novel predicted genes all overlap with genes on the complementary strand. We find homologues of several of the novel genes in C. trachomatis Serovar A and Chlamydia muridarum. Several of the genes have typical gene-like and protein-like features. Furthermore, we confirm transcriptional activity from 10 of the putative genes. The combined evidence suggests that at least seven of the 15 are protein-coding genes. The data suggest the presence of overlapping active genes in C. trachomatis.
Affiliation(s)
- Klaus T Jensen
- Laboratory of Infectious Diseases Immunology, Chlamydia Vaccine Unit, Statens Serum Institut, 5 Artillerivej, DK-2300 Copenhagen, Denmark.
|
12
|
Veloso F, Riadi G, Aliaga D, Lieph R, Holmes DS. Large-scale, multi-genome analysis of alternate open reading frames in bacteria and archaea. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2005; 9:91-105. [PMID: 15805780 DOI: 10.1089/omi.2005.9.91] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
Analysis of over 300,000 annotated genes in 105 bacterial and archaeal genomes reveals an unexpectedly high frequency of large (>300 nucleotides) alternate open reading frames (ORFs). Especially notable is the very high frequency of alternate ORFs in frames +3 and -1 (where the annotated gene is defined as frame +1). The occurrence of alternate ORFs is correlated with genomic G+C content and is strongly influenced by synonymous codon usage bias. The frequency of alternate ORFs in frame -1 is also influenced by the occurrence of codons encoding leucine and serine in frame +1. Although some alternate ORFs have been shown to encode proteins, many others are probably not expressed because they lack appropriate signals for transcription and translation. These latter can be mis-annotated by automatic gene finding programs leading to errors in public databases. Especially prone to mis-annotation is frame -1, because it exhibits a potential codon usage and theoretical capacity to encode proteins with an amino acid composition most similar to real genes. Some alternate ORFs are conserved across bacterial or archaeal species, and can give rise to misannotated "conserved hypothetical" genes, while others are unique to a genome and are misidentified as "hypothetical orphan" genes, contributing significantly to the orphan gene paradox.
Affiliation(s)
- Felipe Veloso
- Laboratory of Bioinformatics and Genome Biology, Andrés Bello University and Millennium Institute of Fundamental and Applied Biology, Santiago, Chile
|
13
|
Abstract
We analyzed the codon usage bias of eight open reading frames (ORFs) across up to 79 human papillomavirus (HPV) genotypes from three distinct phylogenetic groups. All eight ORFs across HPV genotypes show a strong codon usage bias, amongst degenerately encoded amino acids, toward 18 codons mainly with T at the 3rd position. For all 18 degenerately encoded amino acids, codon preferences amongst human and animal PV ORFs are significantly different from those averaged across mammalian genes. Across the HPV types, the L2 ORFs show the highest codon usage bias (73.2 ± 1.6%) and the E4 ORFs the lowest (51.1 ± 0.5%), reflecting a similar bias in codon 3rd position A+T content (L2: 76.1 ± 4.2%; E4: 58.6 ± 4.5%). The E4 ORF, uniquely amongst the HPV ORFs, is G+C rich, while the other ORFs are A+T rich. Codon usage bias correlates positively with A+T content at the codon 3rd position in the E2, E6, L1 and L2 ORFs, but negatively in the E4 ORFs. A general conservation of preferred codon usage across human and non-human PV genotypes, whether or not they originate from the same supergroup, together with the observed difference between the preferred codon usage for HPV ORFs and for genes of the cells they infect, suggests that specific codon usage bias and A+T content variation may somehow increase the replicational fitness of HPVs in mammalian epithelial cells, and may have practical implications for gene therapy of HPV infection.
Affiliation(s)
- Kong-Nan Zhao
- Centre for Immunology and Cancer Research, Princess Alexandra Hospital, University of Queensland, Qld 4102, Woolloongabba, Australia.
|
14
|
Murphy E, Yu D, Grimwood J, Schmutz J, Dickson M, Jarvis MA, Hahn G, Nelson JA, Myers RM, Shenk TE. Coding potential of laboratory and clinical strains of human cytomegalovirus. Proc Natl Acad Sci U S A 2003; 100:14976-81. [PMID: 14657367 PMCID: PMC299866 DOI: 10.1073/pnas.2136652100] [Citation(s) in RCA: 402] [Impact Index Per Article: 18.3] [Indexed: 02/08/2023]
Abstract
Six strains of human cytomegalovirus have been sequenced, including two laboratory strains (AD169 and Towne) that have been extensively passaged in fibroblasts and four clinical isolates that have been passaged to a limited extent in the laboratory (Toledo, FIX, PH, and TR). All of the sequenced viral genomes have been cloned as infectious bacterial artificial chromosomes. A total of 252 ORFs with the potential to encode proteins have been identified that are conserved in all four clinical isolates of the virus. Multiple sequence alignments revealed substantial variation in the amino acid sequences encoded by many of the conserved ORFs.
Affiliation(s)
- Eain Murphy
- Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
|
15
|
Murphy E, Rigoutsos I, Shibuya T, Shenk TE. Reevaluation of human cytomegalovirus coding potential. Proc Natl Acad Sci U S A 2003; 100:13585-90. [PMID: 14593199 PMCID: PMC263857 DOI: 10.1073/pnas.1735466100] [Citation(s) in RCA: 147] [Impact Index Per Article: 6.7] [Indexed: 11/18/2022]
Abstract
The Bio-Dictionary-based Gene Finder was used to reassess the coding potential of the AD169 laboratory strain of human cytomegalovirus and sequences in the Toledo strain that are missing in the laboratory strain of the virus. The gene-finder algorithm assesses the potential of an ORF to encode a protein based on matches to a database of amino acid patterns derived from a large collection of proteins. The algorithm was used to score all human cytomegalovirus ORFs with the potential to encode polypeptides ≥50 aa in length. As a further test for functionality, the genomes of the chimpanzee, rhesus, and murine cytomegaloviruses were searched for orthologues of the predicted human cytomegalovirus ORFs. The analysis indicates that 37 previously annotated ORFs ought to be discarded, and at least nine previously unrecognized ORFs with relatively strong coding potential should be added. Thus, the human cytomegalovirus genome appears to contain approximately 192 unique ORFs with the potential to encode a protein. Support for several of the predictions of our in silico analysis was obtained by sequencing several domains within a clinical isolate of human cytomegalovirus.
Affiliation(s)
- Eain Murphy
- Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
|
16
|
|
17
|
Monnerjahn C, Techel D, Mohamed SA, Rensing L. A non-stop antisense reading frame in the grp78 gene of Neurospora crassa is homologous to the Achlya klebsiana NAD-gdh gene but is not being transcribed. FEMS Microbiol Lett 2000; 183:307-12. [PMID: 10675602 DOI: 10.1111/j.1574-6968.2000.tb08976.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/23/2023]
Abstract
A long non-stop reading frame exists on the antisense strand of the grp78 gene (cDNA and genomic DNA) of Neurospora crassa. Computer analysis revealed a strong similarity of the putative antisense protein to the 10th exon of the NAD-dependent glutamate dehydrogenase gene (NAD-gdh) of Achlya klebsiana, which is itself located on the complementary strand of a transcribed hsc70 gene homologue. In Neurospora, no grp78 antisense mRNA was detected by Northern blot and reverse transcription-coupled polymerase chain reaction analyses, indicating that this long reading frame is not being transcribed. Hypotheses for the presence of such unexpressed non-stop reading frames are discussed.
Affiliation(s)
- C Monnerjahn
- Institute of Cell Biology, Biochemistry and Biotechnology, University of Bremen, D-28334, Bremen, Germany
|
18
|
McEwan NR, Gatherer D. Codon indices as a predictor of gene functionality in a Frankia operon. 1999. [DOI: 10.1139/b99-068] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]
Abstract
The mutational response index and measurements of codon bias were determined in eight potential open reading frames in a Frankia operon that encodes genes for nitrogen fixation. The functionality of the different open reading frames is assessed in light of these results and compared with previously published results, as is the applicability of these techniques to the assessment of translational function of putative open reading frames. Key words: Frankia, codon usage, codon bias, open reading frames, mutation pressure.
|
19
|
Walker ND, McEwan NR, Wallace RJ. Overlapping sequences with high homology to functional proteins coexist on complementary strands of DNA in the rumen bacterium Prevotella albensis. Biochem Biophys Res Commun 1999; 263:58-62. [PMID: 10486253 DOI: 10.1006/bbrc.1999.1316] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
The potential for two complementary fragments of DNA from a clone from the ruminal bacterium Prevotella albensis to encode sequences with homology to at least part of functional proteins is described. One strand contains a sequence with high homology to dnaK, a member of the hsp70 family, and the other strand contains a sequence with some homology to glutamate dehydrogenase genes. Overlapping of these two genes on opposite strands has been reported in eukaryotic species, and is now reported for the first time in a bacterial species. Further investigation of previously described dnaK genes demonstrates that it is more widespread than might be anticipated, with all thirty other dnaK genes investigated also retaining long sequences encoding at least part of a sequence with high homology to a glutamate dehydrogenase gene.
Affiliation(s)
- N D Walker
- Rowett Research Institute, Greenburn Road, Aberdeen, Bucksburn, AB21 9SB, Scotland
|
20
|
Li W. Statistical properties of open reading frames in complete genome sequences. COMPUTERS & CHEMISTRY 1999; 23:283-301. [PMID: 10404621 DOI: 10.1016/s0097-8485(99)00014-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
Abstract
Some statistical properties of open reading frames in all currently available complete genome sequences are analyzed (17 prokaryotic genomes and 16 chromosome sequences from the yeast genome). The size distribution of open reading frames is characterized by various techniques, such as quantile tables, QQ-plots, rank-size plots (Zipf's plots), and spatial densities. The issue of the influence of CG% on the size distribution is addressed. When yeast chromosomes are compared with archaeal and eubacterial genomes, they tend to have more long open reading frames. There is little or no evidence to reject the null hypothesis that open reading frames on six different reading frames and two strands distribute similarly. A topic of current interest, the base composition asymmetry in open reading frames between the two strands, is studied using regression analysis. The base composition asymmetry at three codon positions is analyzed separately. It was shown in these genome sequences that the first codon position is G- and A-rich (i.e. purine-rich); there is a co-existence of A- and T-rich branches at the second codon position; and the third codon position is weakly T-rich.
Affiliation(s)
- W Li
- Laboratory of Statistical Genetics, Rockefeller University, New York, NY 10021, USA.
|