1
Wichmann S, Ardern Z. Optimality in the standard genetic code is robust with respect to comparison code sets. Biosystems 2019; 185:104023. [DOI: 10.1016/j.biosystems.2019.104023] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/15/2019] [Revised: 08/22/2019] [Accepted: 08/24/2019] [Indexed: 01/22/2023]
2
Rao YS, Wang ZF, Chai XW, Nie QH, Zhang XQ. Relationship between 5′ UTR length and gene expression pattern in chicken. Genetica 2013; 141:311-8. [DOI: 10.1007/s10709-013-9730-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/29/2013] [Accepted: 08/11/2013] [Indexed: 11/29/2022]
3
Andersson JO, Sjögren ÅM, Horner DS, Murphy CA, Dyal PL, Svärd SG, Logsdon JM, Ragan MA, Hirt RP, Roger AJ. A genomic survey of the fish parasite Spironucleus salmonicida indicates genomic plasticity among diplomonads and significant lateral gene transfer in eukaryote genome evolution. BMC Genomics 2007; 8:51. [PMID: 17298675 PMCID: PMC1805757 DOI: 10.1186/1471-2164-8-51] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.5] [Received: 09/22/2006] [Accepted: 02/14/2007] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND: Comparative genomic studies of the mitochondrion-lacking protist group Diplomonadida (diplomonads) have been lacking, although Giardia lamblia has been intensively studied. We have performed a sequence survey project resulting in 2341 expressed sequence tags (EST) corresponding to 853 unique clones, 5275 genome survey sequences (GSS), and eleven finished contigs from the diplomonad fish parasite Spironucleus salmonicida (previously described as S. barkhanus). RESULTS: The analyses revealed a compact genome with few, if any, introns and very short 3' untranslated regions. Strikingly different patterns of codon usage were observed in genes corresponding to frequently sampled ESTs versus genes poorly sampled, indicating that translational selection is influencing the codon usage of highly expressed genes. Rigorous phylogenomic analyses identified 84 genes (mostly encoding metabolic proteins) that have been acquired by diplomonads or their relatively close ancestors via lateral gene transfer (LGT). Although most acquisitions were from prokaryotes, more than a dozen represent likely transfers of genes between eukaryotic lineages. Many genes that provide novel insights into the genetic basis of the biology and pathogenicity of this parasitic protist were identified, including 149 that putatively encode variant-surface cysteine-rich proteins which are candidate virulence factors. A number of genomic properties that distinguish S. salmonicida from its human parasitic relative G. lamblia were identified, such as nineteen putative lineage-specific gene acquisitions, distinct mutational biases and codon usage, and distinct polyadenylation signals. CONCLUSION: Our results highlight the power of comparative genomic studies to yield insights into the biology of parasitic protists and the evolution of their genomes, and suggest that genetic exchange between distantly related protist lineages may be occurring at an appreciable rate in eukaryote genome evolution.
Affiliation(s)
- Jan O Andersson
- Institute of Cell and Molecular Biology, Uppsala University, Biomedical Center, Uppsala, Sweden
- Åsa M Sjögren
- The Canadian Institute for Advanced Research, Program in Evolutionary Biology, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Microbiology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- David S Horner
- Department of Zoology, The Natural History Museum, London, UK
- Dipartimento di Scienze Biomolecolare e Biotecnologie, University of Milan, Milan, Italy
- Colleen A Murphy
- Institute for Marine Biosciences, National Research Council of Canada, Halifax, Nova Scotia, Canada
- Patricia L Dyal
- Department of Zoology, The Natural History Museum, London, UK
- Staffan G Svärd
- Institute of Cell and Molecular Biology, Uppsala University, Biomedical Center, Uppsala, Sweden
- John M Logsdon
- Roy J. Carver Center for Comparative Genomics, Department of Biological Sciences, University of Iowa, Iowa City, USA
- Mark A Ragan
- Institute for Marine Biosciences, National Research Council of Canada, Halifax, Nova Scotia, Canada
- ARC Centre in Bioinformatics, and Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Robert P Hirt
- Department of Zoology, The Natural History Museum, London, UK
- School of Biology, The Devonshire building, The University of Newcastle upon Tyne, UK
- Andrew J Roger
- The Canadian Institute for Advanced Research, Program in Evolutionary Biology, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
4
Abstract
Most of the phenotypic diversity that we perceive in the natural world is directly attributable to the peculiar structure of the eukaryotic gene, which harbors numerous embellishments relative to the situation in prokaryotes. The most profound changes include introns that must be spliced out of precursor mRNAs, transcribed but untranslated leader and trailer sequences (untranslated regions), modular regulatory elements that drive patterns of gene expression, and expansive intergenic regions that harbor additional diffuse control mechanisms. Explaining the origins of these features is difficult because they each impose an intrinsic disadvantage by increasing the genic mutation rate to defective alleles. To address these issues, a general hypothesis for the emergence of eukaryotic gene structure is provided here. Extensive information on absolute population sizes, recombination rates, and mutation rates strongly supports the view that eukaryotes have reduced genetic effective population sizes relative to prokaryotes, with especially extreme reductions being the rule in multicellular lineages. The resultant increase in the power of random genetic drift appears to be sufficient to overwhelm the weak mutational disadvantages associated with most novel aspects of the eukaryotic gene, supporting the idea that most such changes are simple outcomes of semi-neutral processes rather than direct products of natural selection. However, by establishing an essentially permanent change in the population-genetic environment permissive to the genome-wide repatterning of gene structure, the eukaryotic condition also promoted a reliable resource from which natural selection could secondarily build novel forms of organismal complexity. Under this hypothesis, arguments based on molecular, cellular, and/or physiological constraints are insufficient to explain the disparities in gene, genomic, and phenotypic complexity between prokaryotes and eukaryotes.
Affiliation(s)
- Michael Lynch
- Department of Biology, Indiana University, Bloomington, USA.
5
Veloso F, Riadi G, Aliaga D, Lieph R, Holmes DS. Large-scale, multi-genome analysis of alternate open reading frames in bacteria and archaea. OMICS 2005; 9:91-105. [PMID: 15805780 DOI: 10.1089/omi.2005.9.91] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
Analysis of over 300,000 annotated genes in 105 bacterial and archaeal genomes reveals an unexpectedly high frequency of large (>300 nucleotides) alternate open reading frames (ORFs). Especially notable is the very high frequency of alternate ORFs in frames +3 and -1 (where the annotated gene is defined as frame +1). The occurrence of alternate ORFs is correlated with genomic G+C content and is strongly influenced by synonymous codon usage bias. The frequency of alternate ORFs in frame -1 is also influenced by the occurrence of codons encoding leucine and serine in frame +1. Although some alternate ORFs have been shown to encode proteins, many others are probably not expressed because they lack appropriate signals for transcription and translation. These latter can be mis-annotated by automatic gene finding programs leading to errors in public databases. Especially prone to mis-annotation is frame -1, because it exhibits a potential codon usage and theoretical capacity to encode proteins with an amino acid composition most similar to real genes. Some alternate ORFs are conserved across bacterial or archaeal species, and can give rise to misannotated "conserved hypothetical" genes, while others are unique to a genome and are misidentified as "hypothetical orphan" genes, contributing significantly to the orphan gene paradox.
Affiliation(s)
- Felipe Veloso
- Laboratory of Bioinformatics and Genome Biology, Andrés Bello University and Millennium Institute of Fundamental and Applied Biology, Santiago, Chile
6
Johnson ZI, Chisholm SW. Properties of overlapping genes are conserved across microbial genomes. Genome Res 2004; 14:2268-72. [PMID: 15520290 PMCID: PMC525685 DOI: 10.1101/gr.2433104] [Citation(s) in RCA: 109] [Impact Index Per Article: 5.2] [Received: 02/09/2004] [Revised: 08/12/2004] [Indexed: 11/25/2022]
Abstract
There are numerous examples from the genomes of viruses, mitochondria, and chromosomes that adjacent genes can overlap, sharing at least one nucleotide. Overlaps have been hypothesized to be involved in genome size minimization and as a regulatory mechanism of gene expression. Here we show that overlapping genes are a consistent feature (approximately one-third of all genes) across all microbial genomes sequenced to date, have homologs in more microbes than do non-overlapping genes, and are therefore likely more conserved. In addition, the size, phase (reading frame offset), and distribution, among other characteristics, of overlapping genes are most consistent with the hypothesis that overlaps function in the regulation of gene expression. The upstream sequences and conservation of overlapping orthologs of two model organisms from the genus Prochlorococcus that have significantly different GC-content, and therefore different nucleotide sequences for orthologs, are also consistent with small overlapping sequence regions and programmed shifts in reading frame as a common mechanism in the regulation of microbial gene expression.
Affiliation(s)
- Zackary I Johnson
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
7
Abstract
Giardia lamblia is a ubiquitous intestinal pathogen of mammals. Evolutionary studies have also defined it as a member of one of the earliest diverging eukaryotic lineages that we are able to cultivate and study in the laboratory. Despite early recognition of its striking structure resembling a half pear endowed with eight flagella and a unique ventral disk, a molecular understanding of the cytoskeleton of Giardia has been slow to emerge. Perhaps most importantly, although the association of Giardia with diarrhoeal disease has been known for several hundred years, little is known of the mechanism by which Giardia exacts such a toll on its host. What is clear, however, is that the flagella and disk are essential for parasite motility and attachment to host intestinal epithelial cells. Because peristaltic flow expels intestinal contents, attachment is necessary for parasites to remain in the small intestine and cause diarrhoea, underscoring the essential role of the cytoskeleton in virulence. This review presents current day knowledge of the cytoskeleton, focusing on its role in motility and attachment. As the advent of new molecular technologies in Giardia sets the stage for a renewed focus on the cytoskeleton and its role in Giardia virulence, we discuss future research directions in cytoskeletal function and regulation.
Affiliation(s)
- Heidi G Elmendorf
- Department of Biology, Georgetown University, 348 Reiss Building 37th and O Sts. NW, Washington, DC 20057, USA.
8
Cheng C, Paddock CD, Reddy Ganta R. Molecular heterogeneity of Ehrlichia chaffeensis isolates determined by sequence analysis of the 28-kilodalton outer membrane protein genes and other regions of the genome. Infect Immun 2003; 71:187-95. [PMID: 12496165 PMCID: PMC143425 DOI: 10.1128/iai.71.1.187-195.2003] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.9] [Received: 09/04/2002] [Revised: 10/08/2002] [Accepted: 10/16/2002] [Indexed: 11/20/2022] Open
Abstract
Ehrlichia chaffeensis, a tick-transmitted rickettsial agent, is responsible for human monocytic ehrlichiosis (HME). In this study, we genetically mapped 10 isolates obtained from HME patients. Sequence analysis of the 28-kDa outer membrane protein (OMP) multigene locus spanning 6 of the 22 tandemly arranged genes identified three distinct genetic groups with shared homology among isolates within each group. Isolates in Groups I and III contained six genes each, while Group II isolates had a gene deletion. There were two regions on the locus where novel gene deletion or insertion mutations occurred, resulting in the net loss of one gene in Group II isolates. Numerous nucleotide differences among genes in isolates of each group also were detected. The shared homology among isolates in each group for the 28-kDa OMP locus suggests the derivation of clonal lineages. Transcription and translation analysis of the locus revealed differences in the expressed genes of different group isolates. Analysis of the 120-kDa OMP gene and variable-length PCR target gene showed size variations resulting from loss or gain of long, direct repeats within the protein coding sequences. To our knowledge this is the first study that looked at several regions of the genome simultaneously, and we provide the first evidence of heterogeneity resulting from gene deletion and insertion mutations in the E. chaffeensis genome. Diversity in different genomic regions could be the result of a selection process or of independently evolved genes.
Affiliation(s)
- Chuanmin Cheng
- Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan, Kansas 66506, USA
9
Morrison HG, Zamora G, Campbell RK, Sogin ML. Inferring protein function from genomic sequence: Giardia lamblia expresses a phosphatidylinositol kinase-related kinase similar to yeast and mammalian TOR. Comp Biochem Physiol B Biochem Mol Biol 2002; 133:477-91. [PMID: 12470813 DOI: 10.1016/s1096-4959(02)00218-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 01/08/2023]
Abstract
Functional assays of genes have historically led to insights about the activities of a protein or protein cascade. However, the rapid expansion of genomic and proteomic information for a variety of diverse taxa is an alternative and powerful means of predicting function by comparing the enzymes and metabolic pathways used by different organisms. As part of the Giardia lamblia genome sequencing project, we routinely survey the complement of predicted proteins and compare those found in this putatively early diverging eukaryote with those of prokaryotes and more recently evolved eukaryotic lineages. Such comparisons reveal the minimal composition of conserved metabolic pathways, suggest which proteins may have been acquired by lateral transfer, and, by their absence, hint at functions lost in the transition from a free-living to a parasitic lifestyle. Here, we describe the use of bioinformatic approaches to investigate the complement and conservation of proteins in Giardia involved in the regulation of translation. We compare an FK506 binding protein homologue and phosphatidylinositol kinase-related kinase present in Giardia to those found in other eukaryotes for which complete genomic sequence data are available. Our investigation of the Giardia genome suggests that PIK-related kinases are of ancient origin and are highly conserved.
Affiliation(s)
- Hilary G Morrison
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA 02543-1015, USA.
10
Rogozin IB, Makarova KS, Natale DA, Spiridonov AN, Tatusov RL, Wolf YI, Yin J, Koonin EV. Congruent evolution of different classes of non-coding DNA in prokaryotic genomes. Nucleic Acids Res 2002; 30:4264-71. [PMID: 12364605 PMCID: PMC140549 DOI: 10.1093/nar/gkf549] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022] Open
Abstract
Prokaryotic genomes are considered to be 'wall-to-wall' genomes, which consist largely of genes for proteins and structural RNAs, with only a small fraction of the genomic DNA allotted to intergenic regions, which are thought to typically contain regulatory signals. The majority of bacterial and archaeal genomes contain 6-14% non-coding DNA. Significant positive correlations were detected between the fraction of non-coding DNA and inter- and intra-operonic distances, suggesting that different classes of non-coding DNA evolve congruently. In contrast, no correlation was found between any of these characteristics of non-coding sequences and the number of genes or genome size. Thus, the non-coding regions and the gene sets in prokaryotes seem to evolve in different regimes. The evolution of non-coding regions appears to be determined primarily by the selective pressure to minimize the amount of non-functional DNA, while maintaining essential regulatory signals, because of which the content of non-coding DNA in different genomes is relatively uniform and intra- and inter-operonic non-coding regions evolve congruently. In contrast, the gene set is optimized for the particular environmental niche of the given microbe, which results in the lack of correlation between the gene number and the characteristics of non-coding regions.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
11
Iwabe N, Miyata T. Kinesin-related genes from diplomonad, sponge, amphioxus, and cyclostomes: divergence pattern of kinesin family and evolution of giardial membrane-bounded organella. Mol Biol Evol 2002; 19:1524-33. [PMID: 12200480 DOI: 10.1093/oxfordjournals.molbev.a004215] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022] Open
Abstract
To understand the question of whether divergence of eukaryotic genes by gene duplications and domain shufflings proceeded gradually or intermittently during evolution, we have cloned and sequenced Giardia lamblia cDNAs encoding kinesins and kinesin-related proteins and have obtained 13 kinesin-related cDNAs, some of which are likely homologs of vertebrate kinesins involved in vesicle transfer to ER, Golgi, and plasma membrane. A phylogenetic tree of the kinesin family revealed that most gene duplications that gave rise to different kinesin subfamilies with distinct functions have been completed before the earliest divergence of extant eukaryotes. This suggests that the complex endomembrane system has arisen very early in eukaryotic evolution, and the diminutive ER and Golgi apparatus recognized in the giardial cells, together with the absence of mitochondria, might be characters acquired secondarily during the evolution of parasitism. To understand the divergence pattern of the kinesin family in the lineage leading to vertebrates, seven more Unc104-related cDNAs have been cloned from sponge, amphioxus, hagfish, and lamprey. The divergence pattern of the animal Unc104/KIF1 subfamily is characterized by two active periods in gene duplication interrupted by a considerably long period of silence, instead of proceeding gradually: animals underwent extensive gene duplications before the parazoan-eumetazoan split. In the early evolution of vertebrates around the cyclostome-gnathostome split, further gene duplications occurred, by which a variety of genes with similar structures over the entire regions were generated. This pattern of divergence is similar to those of animal genes involved in cell-cell communication and developmental control.
Affiliation(s)
- Naoyuki Iwabe
- Department of Biophysics, Graduate School of Science, Kyoto University, Japan
12
Garbarino JE, Gibbons IR. Expression and genomic analysis of midasin, a novel and highly conserved AAA protein distantly related to dynein. BMC Genomics 2002; 3:18. [PMID: 12102729 PMCID: PMC117441 DOI: 10.1186/1471-2164-3-18] [Citation(s) in RCA: 77] [Impact Index Per Article: 3.3] [Received: 04/23/2002] [Accepted: 07/08/2002] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The largest open reading frame in the Saccharomyces genome encodes midasin (MDN1p, YLR106p), an AAA ATPase of 560 kDa that is essential for cell viability. Orthologs of midasin have been identified in the genome projects for Drosophila, Arabidopsis, and Schizosaccharomyces pombe. RESULTS: Midasin is present as a single-copy gene encoding a well-conserved protein of approximately 600 kDa in all eukaryotes for which data are available. In humans, the gene maps to 6q15 and encodes a predicted protein of 5596 residues (632 kDa). Sequence alignments of midasin from humans, yeast, Giardia and Encephalitozoon indicate that its domain structure comprises an N-terminal domain (35 kDa), followed by an AAA domain containing six tandem AAA protomers (approximately 30 kDa each), a linker domain (260 kDa), an acidic domain (approximately 70 kDa) containing 35-40% aspartate and glutamate, and a carboxy-terminal M-domain (30 kDa) that possesses MIDAS sequence motifs and is homologous to the I-domain of integrins. Expression of hemagglutinin-tagged midasin in yeast demonstrates a polypeptide of the anticipated size that is localized principally in the nucleus. CONCLUSIONS: The highly conserved structure of midasin in eukaryotes, taken in conjunction with its nuclear localization in yeast, suggests that midasin may function as a nuclear chaperone and be involved in the assembly/disassembly of macromolecular complexes in the nucleus. The AAA domain of midasin is evolutionarily related to that of dynein, but it appears to lack a microtubule-binding site.
Affiliation(s)
- Joan E Garbarino
- Molecular and Cell Biology Department, University of California Berkeley, Berkeley CA 94720-3200, USA
- I R Gibbons
- Molecular and Cell Biology Department, University of California Berkeley, Berkeley CA 94720-3200, USA