1
|
|
2
|
Construction and Characterization of Chimeric Proteins Composed of Type-1 and Type-2 Periplasmic Binding Proteins MglB and ArgT. Biosci Biotechnol Biochem 2004; 68:808-13. [PMID: 15118307 DOI: 10.1271/bbb.68.808] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Abstract
The type-1 and type-2 periplasmic binding proteins (PBPs) MglB and ArgT, respectively, are believed to have evolved from a common ancestor into siblings that differ topologically in their main chain connectivity. At first glance, their structures look similar. A more detailed examination, however, reveals that the chain connectivity of ArgT is more convoluted than that of MglB. Reflecting that complexity, the folding of ArgT is complicated and involves intermediate folds, whereas the folding of MglB is a simple two-state transition. In the present study, we constructed and characterized several chimeras made up of various subdomains of MglB and ArgT with the aim of gaining insight into the evolution of protein folding and protein structure. Although these chimeras did not fold as compactly as their parental proteins, some did exhibit cooperative folding, which suggests that novel proteins with new connectivity and new folding pathways could have emerged at a fairly high rate throughout the evolution of proteins.
|
3
|
SABRE2: A Database Connecting Plant EST/Full-Length cDNA Clones with Arabidopsis Information. Plant Cell Physiol 2014; 55:e5. [DOI: 10.1093/pcp/pct177] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 01/15/2023]
|
4
|
Plant Genome DataBase Japan (PGDBj): a portal website for the integration of plant genome-related databases. Plant Cell Physiol 2014; 55:e8. [PMID: 24363285 PMCID: PMC3894704 DOI: 10.1093/pcp/pct189] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 05/08/2023]
Abstract
The Plant Genome DataBase Japan (PGDBj, http://pgdbj.jp/?ln=en) is a portal website that aims to integrate plant genome-related information from databases (DBs) and the literature. The PGDBj is comprised of three component DBs and a cross-search engine, which provides a seamless search over the contents of the DBs. The three DBs are as follows. (i) The Ortholog DB, providing gene cluster information based on the amino acid sequence similarity. Over 500,000 amino acid sequences of 20 Viridiplantae species were subjected to reciprocal BLAST searches and clustered. Sequences from plant genome DBs (e.g. TAIR10 and RAP-DB) were also included in the cluster with a direct link to the original DB. (ii) The Plant Resource DB, integrating the SABRE DB, which provides cDNA and genome sequence resources accumulated and maintained in the RIKEN BioResource Center and National BioResource Projects. (iii) The DNA Marker DB, providing manually or automatically curated information of DNA markers, quantitative trait loci and related linkage maps, from the literature and external DBs. As the PGDBj targets various plant species, including model plants, algae, and crops important as food, fodder and biofuel, researchers in the field of basic biology as well as a wide range of agronomic fields are encouraged to perform searches using DNA sequences, gene names, traits and phenotypes of interest. The PGDBj will return the search results from the component DBs and various types of linked external DBs.
|
5
|
Development of full-length cDNAs from Chinese cabbage (Brassica rapa Subsp. pekinensis) and identification of marker genes for defence response. DNA Res 2011; 18:277-89. [PMID: 21745830 PMCID: PMC3158467 DOI: 10.1093/dnares/dsr018] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 01/30/2011] [Accepted: 05/25/2011] [Indexed: 11/13/2022] Open
Abstract
Arabidopsis belongs to the Brassicaceae family and plays an important role as a model plant for which researchers have developed fine-tuned genome resources. Genome sequencing projects have been initiated for other members of the Brassicaceae family. Among these projects, research on Chinese cabbage (Brassica rapa subsp. pekinensis) started early because of strong interest in this species. Here, we report the development of a library of Chinese cabbage full-length cDNA clones, the RIKEN BRC B. rapa full-length cDNA (BBRAF) resource, to accelerate research on Brassica species. We sequenced 10 000 BBRAF clones and confirmed 5476 independent clones. Most of these cDNAs showed high homology to Arabidopsis genes, but we also obtained more than 200 cDNA clones that lacked any sequence homology to Arabidopsis genes. We also successfully identified several possible candidate marker genes for plant defence responses from our analysis of the expression of the Brassica counterparts of Arabidopsis marker genes in response to salicylic acid and jasmonic acid. We compared gene expression of these markers in several Chinese cabbage cultivars. Our BBRAF cDNA resource will be publicly available from the RIKEN Bioresource Center and will help researchers to transfer Arabidopsis-related knowledge to Brassica crops.
|
6
|
Abstract
The RIKEN integrated database of mammals (http://scinets.org/db/mammal) is the official undertaking to integrate its mammalian databases produced from multiple large-scale programs that have been promoted by the institute. The database integrates not only RIKEN's original databases, such as FANTOM, the ENU mutagenesis program, the RIKEN Cerebellar Development Transcriptome Database and the Bioresource Database, but also imported data from public databases, such as Ensembl, MGI and biomedical ontologies. Our integrated database has been implemented on the infrastructure of publication medium for databases, termed SciNetS/SciNeS, or the Scientists' Networking System, where the data and metadata are structured as a semantic web and are downloadable in various standardized formats. The top-level ontology-based implementation of mammal-related data directly integrates the representative knowledge and individual data records in existing databases to ensure advanced cross-database searches and reduced unevenness of the data management operations. Through the development of this database, we propose a novel methodology for the development of standardized comprehensive management of heterogeneous data sets in multiple databases to improve the sustainability, accessibility, utility and publicity of the data of biomedical information.
|
7
|
Abstract
The National BioResource Project (NBRP) is a Japanese project that aims to establish a system for collecting, preserving and providing bioresources for use as experimental materials for life science research. It is promoted by 27 core resource facilities, each concerned with a particular group of organisms, and by one information center. The NBRP database is a product of this project. Thirty databases and an integrated database-retrieval system (BioResource World: BRW) have been created and made available through the NBRP home page (http://www.nbrp.jp). The 30 independent databases have individual features which directly reflect the data maintained by each resource facility. The BRW is designed for users who need to search across several resources without moving from one database to another. BRW provides access to a collection of 4.5-million records on bioresources including wild species, inbred lines, mutants, genetically engineered lines, DNA clones and so on. BRW supports summary browsing, keyword searching, and searching by DNA sequences or gene ontology. The results of searches provide links to online requests for distribution of research materials. A circulation system allows users to submit details of papers published on research conducted using NBRP resources.
|
8
|
Analysis of multiple occurrences of alternative splicing events in Arabidopsis thaliana using novel sequenced full-length cDNAs. DNA Res 2009; 16:155-64. [PMID: 19423640 PMCID: PMC2695776 DOI: 10.1093/dnares/dsp009] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 12/22/2022] Open
Abstract
Alternative splicing (AS) is a mechanism by which multiple types of mature mRNAs are generated from a single precursor mRNA (pre-mRNA). In this study, we completely sequenced 1800 full-length cDNAs from Arabidopsis thaliana whose 5′ and/or 3′ sequences had previously been found to contain AS events or alternative transcription start sites. Unexpectedly, these sequences gave us further evidence of AS, as 601 of the 1800 transcripts showed novel AS events. We focused on the combination patterns of multiple AS events within individual genes. Interestingly, some specific combinations of AS events tended to appear more frequently than expected. The two most common patterns were: (i) an alternative donor, then 0 to 12 exon skips, then an alternative acceptor, and (ii) several (up to 8) retained introns. We also found that multiple AS events in a transcript tend to have the same effect on the length of the mature mRNA. Our current results are consistent with our previous observations, which showed changes in AS profiles under different conditions, and suggest the involvement of hypothetical cis- and trans-acting factors in the regulation of AS events.
|
9
|
Four-dimensional quantitative analysis of the gait of mutant mice using coarse-grained motion capture. Annu Int Conf IEEE Eng Med Biol Soc 2009; 2009:5227-5230. [PMID: 19964861 DOI: 10.1109/iembs.2009.5334287] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/28/2023]
Abstract
To analyze an abnormal gait pattern in mutant mice (Hugger), we conducted coarse-grained motion capture. Using a simple retroreflective marker-based approach, we could detect high-resolution mutant-specific gait patterns. The phenotypic gait patterns are caused by extreme vertical motion of limbs, revealing inefficient motor functions. To elucidate the inefficiency, we developed a musculoskeletal computer model of the mouse hindlimb based on X-ray CT data. By integrating motion data with the model, we determined mutant-specific musculotendon lengths, suggesting that three major muscles were involved in the abnormal gait. This approach worked well on laboratory mice, which were putatively too small to be motion capture subjects. Motion capture technology was originally developed for human study, and our approach may help fill neuroscience gaps between mouse and human behavioral phenotypes.
|
10
|
Abstract
It is desirable to estimate a tree of life, that is, a species tree including all available species in the three superkingdoms (Archaea, Bacteria, and Eukaryota), using full-scale genome information rather than a limited number of genes. Here, we report a new method for constructing a tree of life based on protein domain organizations, that is, the sequential order of domains in a protein, for all proteins detected in the genome of an organism. The new method does not depend on identifying orthologous gene sets and therefore avoids that burdensome and error-prone computation. By pairwise comparisons of the repertoires of protein domain organizations of 17 archaeal, 136 bacterial, and 14 eukaryotic organisms, we computed evolutionary distances among them and constructed a tree of life. Our tree shows monophyly of Archaea, Bacteria, and Eukaryota, and then monophyly of each of the eukaryotic kingdoms and of most bacterial phyla. In addition, the branching pattern of the bacterial phyla in our tree is consistent with the widely accepted bacterial taxonomy and is very close to other genome-based trees. A few inconsistencies between the traditional trees and the genome-based trees, including ours, may nevertheless call for revising the conventional view, particularly on the phylogenetic positions of hyperthermophiles.
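As an illustration of what a pairwise comparison of domain-organization repertoires could look like, here is a minimal sketch. The abstract does not state which distance measure was used, so the Jaccard distance below is an assumption chosen for simplicity, and the function names and toy genomes are hypothetical.

```python
def repertoire(proteins):
    """Collect the set of domain organizations found in one genome.

    Each protein is given as a sequence of domain identifiers in
    N-terminal to C-terminal order; the order is part of the organization.
    """
    return {tuple(domains) for domains in proteins}


def jaccard_distance(rep_a, rep_b):
    """Distance between two repertoires (assumed measure, not the paper's)."""
    union = rep_a | rep_b
    if not union:
        return 0.0
    return 1.0 - len(rep_a & rep_b) / len(union)


# Toy genomes (hypothetical domain names)
genome_a = [("A", "B"), ("C",), ("A", "B")]
genome_b = [("A", "B"), ("D", "C")]
distance = jaccard_distance(repertoire(genome_a), repertoire(genome_b))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method; no orthologous gene sets are needed at any step.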
|
11
|
Abstract
To elucidate the origins of the MHC-B-MHC-C pair and the MHC class I chain-related molecule (MIC)A-MICB pair, we sequenced an MHC class I genomic region of humans, chimpanzees, and rhesus monkeys and analyzed the regions from an evolutionary standpoint, focusing first on long interspersed nuclear element (LINE) sequences that are paralogous within each of the first two species and orthologous between them. Because all the LINE sequences were fragmented and nonfunctional, they were suitable for phylogenetic study and, in particular, for estimating evolutionary time. Our study has revealed that MHC-B and MHC-C duplicated 22.3 million years (Myr) ago, and the ape MICA and MICB duplicated 14.1 Myr ago. We then estimated the divergence time of the rhesus monkey by using other orthologous LINE sequences in the class I regions of the three primate species. The result indicates that rhesus monkeys, and possibly the Old World monkeys in general, diverged from humans 27-30 Myr ago. Interestingly, rhesus monkeys were found to lack the MHC-B and MHC-C pair and instead to carry many repeated genes similar to MHC-B. These results support our inference that MHC-B and MHC-C duplicated after the divergence between apes and Old World monkeys.
|
12
|
Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol 2004; 2:e162. [PMID: 15103394 PMCID: PMC393292 DOI: 10.1371/journal.pbio.0020162] [Citation(s) in RCA: 267] [Impact Index Per Article: 13.4] [Received: 12/19/2003] [Accepted: 04/01/2004] [Indexed: 01/08/2023] Open
Abstract
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
|
13
|
Characterization of folding pathways of the type-1 and type-2 periplasmic binding proteins MglB and ArgT. J Biochem 2003; 133:371-6. [PMID: 12761173 DOI: 10.1093/jb/mvg049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
Abstract
The family of periplasmic binding proteins (PBPs) is believed to have arisen from a common ancestor and to have differentiated into two types. At first approximation, both types of PBPs have the same fold pattern, reflecting their common origin. However, the connection between the main chains of a type 2 PBP is more complicated than a type 1 PBP's. We have been interested in the possibility that such structural changes affect the folding of PBPs. In this study, we have characterized the folding pathways of MglB (a type 1 PBP) and ArgT (a type 2 PBP) by using urea gradient gel electrophoresis, fast protein size-exclusion liquid chromatography and hydrophobic dye ANS binding assay. We found a distinct difference in folding between these two proteins. The folding of MglB followed a simple two-state transition model, whereas the folding of ArgT was more complicated.
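For readers unfamiliar with the two-state transition model mentioned above, the following minimal sketch shows the standard linear-extrapolation form often used for urea denaturation curves. It is a generic textbook illustration, not the analysis performed in the paper; the parameter values (`dg_water`, `m_value`) are hypothetical and were not reported in the abstract.

```python
import math

R_GAS = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(urea, dg_water, m_value, temp=298.15):
    """Fraction of unfolded protein at a given urea concentration (M).

    Two-state model with linear extrapolation:
        dG(urea) = dg_water - m_value * urea
    where dg_water is the unfolding free energy in water (kJ/mol) and
    m_value is its dependence on denaturant (kJ/mol/M).
    """
    dg = dg_water - m_value * urea
    # K_unfold = exp(-dG/RT); f_U = K / (1 + K) = 1 / (1 + exp(dG/RT))
    return 1.0 / (1.0 + math.exp(dg / (R_GAS * temp)))


# Hypothetical parameters: the transition midpoint is dg_water / m_value = 4 M urea
curve = [(c, fraction_unfolded(c, dg_water=20.0, m_value=5.0)) for c in range(0, 9)]
```

A single sigmoidal transition of this kind is what a two-state protein such as MglB would be expected to show, whereas a protein with folding intermediates, as reported for ArgT, would deviate from it.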
|
14
|
Parallel evolution of ligand specificity between LacI/GalR family repressors and periplasmic sugar-binding proteins. Mol Biol Evol 2003; 20:267-77. [PMID: 12598694 DOI: 10.1093/molbev/msg038] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022] Open
Abstract
The bacterial LacI/GalR family repressors such as lactose operon repressor (LacI), purine nucleotide synthesis repressor (PurR), and trehalose operon repressor (TreR) consist of not only the N-terminal helix-turn-helix DNA-binding domain but also the C-terminal ligand-binding domain that is structurally homologous to periplasmic sugar-binding proteins. These structural features imply that the repressor family evolved by acquiring the DNA-binding domain in the N-terminal of an ancestral periplasmic binding protein (PBP). Phylogenetic analysis of the LacI/GalR family repressors and their PBP homologues revealed that the acquisition of the DNA-binding domain occurred first in the family, and ligand specificity then evolved. The phylogenetic tree also indicates that the acquisition occurred only once before the divergence of the major lineages of eubacteria, and that the LacI/GalR and the PBP families have since undergone extensive gene duplication/loss independently along the evolutionary lineages. Multiple alignments of the repressors and PBPs furthermore revealed that repressors and PBPs with the same ligand specificity have the same or similar residues in their binding sites. This result, together with the phylogenetic relationship, demonstrates that the repressors and the PBPs individually acquired the same ligand specificity by homoplasious replacement, even though their genes are encoded in the same operon.
|
15
|
[How to make good use of CLUSTALW]. Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid, Enzyme) 2002; 47:1237-9. [PMID: 12166070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/26/2023]
|
16
|
Abstract
When protein sequences divergently evolve under functional constraints, some individual amino acid replacements that reverse the charge (e.g. Lys to Asp) may be compensated by a replacement at a second position that reverses the charge in the opposite direction (e.g. Glu to Arg). When these side-chains are near in space (proximal), such double replacements might be driven by natural selection, if either is selectively disadvantageous, but both together restore fully the ability of the protein to contribute to fitness (are together "neutral"). Accordingly, many have sought to identify pairs of positions in a protein sequence that suffer compensatory replacements, often as a way to identify positions near in space in the folded structure. A "charge compensatory signal" might manifest itself in two ways. First, proximal charge compensatory replacements may occur more frequently than predicted from the product of the probabilities of individual positions suffering charge reversing replacements independently. Conversely, charge compensatory pairs of changes may be observed to occur more frequently in proximal pairs of sites than in the average pair. Normally, charge compensatory covariation is detected by comparing the sequences of extant proteins at the "leaves" of phylogenetic trees. We show here that the charge compensatory signal is more evident when it is sought by examining individual branches in the tree between reconstructed ancestral sequences at nodes in the tree. Here, we find that the signal is especially strong when the positions pairs are in a single secondary structural unit (e.g. alpha helix or beta strand) that brings the side-chains suffering charge compensatory covariation near in space, and may be useful in secondary structure prediction. 
Also, "node-node" and "node-leaf" compensatory covariation may be useful to identify the better of two equally parsimonious trees, in a way that is independent of the mathematical formalism used to construct the tree itself. Further, compensatory covariation may provide a signal that indicates whether an episode of sequence evolution contains more or less divergence in functional behavior. Compensatory covariation analysis on reconstructed evolutionary trees may become a valuable tool to analyze genome sequences, and use these analyses to extract biomedically useful information from proteome databases.
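The idea of charge compensatory covariation along a single branch can be sketched as follows. This is a simplified illustration under stated assumptions: Lys/Arg are treated as positive and Asp/Glu as negative (His is treated as neutral), the two sequences are already aligned without gaps, and spatial proximity is not checked. The function names are hypothetical and do not come from the paper.

```python
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}  # assumed charge assignment


def charge_reversals(ancestor, descendant):
    """Map each site whose charge flips sign along a branch to its new charge."""
    if len(ancestor) != len(descendant):
        raise ValueError("sequences must be aligned to the same length")
    reversals = {}
    for pos, (a, d) in enumerate(zip(ancestor, descendant)):
        if CHARGE.get(a, 0) * CHARGE.get(d, 0) == -1:
            reversals[pos] = CHARGE[d]
    return reversals


def compensatory_pairs(ancestor, descendant):
    """Pairs of sites that reverse charge in opposite directions on one branch."""
    reversals = charge_reversals(ancestor, descendant)
    sites = sorted(reversals)
    return [
        (i, j)
        for k, i in enumerate(sites)
        for j in sites[k + 1:]
        if reversals[i] + reversals[j] == 0
    ]


# Toy branch: K->D at site 1 is compensated by E->R at site 3
pairs = compensatory_pairs("AKLEG", "ADLRG")
```

In a fuller analysis, `ancestor` would be a reconstructed sequence at an internal node, and the pairs found would then be compared against distances in the folded structure.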
|
17
|
Abstract
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) has made an effort to collect as much data as possible mainly from Japanese researchers. The increase rates of the data we collected, annotated and released to the public in the past year are 43% for the number of entries and 52% for the number of bases. The increase rates are accelerated even after the human genome was sequenced, because sequencing technology has been remarkably advanced and simplified, and research in life science has been shifted from the gene scale to the genome scale. In addition, we have developed the Genome Information Broker (GIB, http://gib.genes.nig.ac.jp) that now includes more than 50 complete microbial genome and Arabidopsis genome data. We have also developed a database of the human genome, the Human Genomics Studio (HGS, http://studio.nig.ac.jp). HGS provides one with a set of sequences being as continuous as possible in any one of the 24 chromosomes. Both GIB and HGS have been updated incorporating newly available data and retrieval tools.
|
18
|
Domain dislocation: a change of core structure in periplasmic binding proteins in their evolutionary history. J Mol Biol 1999; 286:279-90. [PMID: 9931266 DOI: 10.1006/jmbi.1998.2454] [Citation(s) in RCA: 167] [Impact Index Per Article: 6.7] [Indexed: 11/22/2022]
Abstract
Periplasmic binding proteins (PBPs) serve as receptors for various water-soluble ligands in ATP-binding cassette (ABC) transport systems, and form one of the largest protein families in eubacterial and archaebacterial genomes. They are considered to be derived from a common ancestor, judging from their similarities of three-dimensional structure, their mechanism of ligand binding and the operon structure of their genes. Nevertheless, there are two types of topological arrangements of the central beta-sheets in their core structures. It follows that there must have been differentiation in the core structure, which we call "domain dislocation", in the course of evolution of the PBP family. To find a clue as to when the domain dislocation occurred, we constructed phylogenetic trees for PBPs based on their amino acid sequences and three-dimensional structures, respectively. The trees show that the proteins of each type clearly cluster together, strongly indicating that the change in the core structure occurred only once in the evolution of PBPs. We also constructed a phylogenetic tree for the ABC proteins that are encoded by the same operon of their partner PBP, and obtained the same result. Based on the phylogenetic relationship and comparison of the topological arrangements of PBPs, we obtained a reasonable genealogical chart of structural changes in the PBP family. The present analysis shows that the unidirectional change of protein evolution is clearly deduced at the level of protein three-dimensional structure rather than the level of amino acid sequence.
|
19
|
Formal design and implementation of an improved DDBJ DNA database with a new schema and object-oriented library. Bioinformatics 1998; 14:472-8. [PMID: 9694985 DOI: 10.1093/bioinformatics/14.6.472] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION The DNA Data Bank of Japan (DDBJ) has developed a new DNA database system with a new schema design to accommodate rapid change and growth of requirements on the system. RESULTS The new schema and systems were created using an object-oriented design approach. The design was accomplished in accordance with ANSI/SPARC three-level schema architecture. First, the conceptual schema was designed using a functional model named AIS (associative information structure) and was visualized in extended diagram format. The model is a natural extension of an ER (entity relationship) model and describes real-world objects in binary associations between entities with the concept of order. Second, the schema was mapped on a relational database as a physical schema. All details are concentrated in this schema and the layer lying above enjoys physical independence. Finally, as another layer, external modeling was introduced for the database applications interface. It provides set-at-a-time basis operations and was implemented as a C++ object-oriented library. On this common framework of a new schema, a new annotator's workbench named Yamato II and a World Wide Web (WWW) submission system named Sakura have been successfully developed to improve drastically daily transactions in the DDBJ. AVAILABILITY Sakura is available at the following address: http://sakura.ddbj.nig.ac.jp. CONTACT hsugawar@genes.nig.ac.jp
|
20
|
Abstract
We at the DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) have recently begun receiving, processing and releasing EST and genome sequence data submitted by various Japanese genome projects. The data include those for human, Arabidopsis thaliana, rice, nematode, Synechocystis sp. and Escherichia coli. Since the quantity of data is very large, we organized teams to conduct preliminary discussions with project teams about data submission and handling for release to the public. We also developed a mass submission tool to cope with a large quantity of data. In addition, to provide genome data on WWW, we developed a genome information system using Java. This system (http://mol.genes.nig.ac.jp/ecoli/) can in theory be used for any genome sequence data. These activities will facilitate processing of large quantities of EST and genome data.
|
21
|
Abstract
The gene (AK) encoding adenylate kinase (AK) of Halobacterium halobium was cloned. AK consisted of 648 bp and coded for 216 amino acids (aa). S1 mapping and primer extension experiments indicated that the transcription start point (tsp) was located immediately upstream from the start codon. The TAT-like promoter sequence was found at a position 20-24 bp upstream from tsp. The most striking property of the enzyme was a putative Zn finger-like structure with four cysteines. It might contribute to the structural stability of the molecule in high-salt conditions. Phylogenetic analysis indicated two lineages of the AK family, the short and long types which diverged a long time ago, possibly before the separation of prokaryotes and eukaryotes. Although the H. halobium AK belongs to the long-type AK lineage, it is located in an intermediary position between the two lineages of the phylogenetic tree, indicating early divergence of the gene along the long-type lineage.
|
22
|
Ancient divergence of long and short isoforms of adenylate kinase: molecular evolution of the nucleoside monophosphate kinase family. FEBS Lett 1996; 385:214-20. [PMID: 8647254 DOI: 10.1016/0014-5793(96)00367-5] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.0] [Indexed: 02/01/2023]
Abstract
Adenylate kinases (AK) from vertebrates are separated into three isoforms, AK1, AK2 and AK3, based on structure, subcellular localization and substrate specificity. AK1 is the short type with the amino acid sequence being 27 residues shorter than sequences of the long types, AK2 and AK3. A phylogenetic tree prepared for the AK isozymes and other members of the nucleoside monophosphate (NMP) kinase family shows that the divergence of long and short types occurred first and then differentiation in subcellular localization or substrate specificity took place. The first step involved a drastic change in the three-dimensional structure of the LID domain. The second step was caused mainly by smaller changes in amino acid sequences.
|
23
|
[Molecular evolution of proteins with RNA-binding domains]. Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid, Enzyme) 1994; 39:2177-2188. [PMID: 7972866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2023]
|
24
|
Abstract
RNA-binding proteins (RNPs) involved in splicing, processing and translation regulation contain one to four RNA-binding domains. We constructed a phylogenetic tree for the RNA-binding domains, including those of poly(A)-binding protein (PABP), splicing factors, chloroplast RNPs, hnRNPs, snRNP U1-70K, nucleolin and Drosophila sex determinants. Proteins with similar functions were found to have closely related RNA-binding domains and common domain organizations. In light of these observations, one can infer the function of an RNA-binding protein from the evolutionary relationships of its RNA-binding domain(s) and its domain organization, as compared with other RNPs.
|
25
|
Diversity of a ribonucleoprotein family in tobacco chloroplasts: two new chloroplast ribonucleoproteins and a phylogenetic tree of ten chloroplast RNA-binding domains. Nucleic Acids Res 1991; 19:6485-90. [PMID: 1721701 PMCID: PMC329204 DOI: 10.1093/nar/19.23.6485] [Citation(s) in RCA: 49] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022] Open
Abstract
Two new ribonucleoproteins (RNPs) have been identified from a tobacco chloroplast lysate. These two proteins (cp29A and cp29B) are nuclear-encoded and have a lower affinity for single-stranded DNA than three other chloroplast RNPs (cp28, cp31 and cp33) isolated previously. DNA sequencing revealed that both contain two consensus sequence-type homologous RNA-binding domains (CS-RBDs) and a very acidic amino-terminal domain that is shorter than those of cp28, cp31 and cp33. Comparison of cp29A and cp29B showed a 19 amino acid insertion in the region separating the two CS-RBDs in cp29B. This insertion results in three tandem repeats of a glycine-rich sequence of 10 amino acids, which is a novel feature in RNPs. The two proteins are encoded by different single nuclear genes, and no alternatively spliced transcripts could be identified. We constructed a phylogenetic tree for the ten chloroplast CS-RBDs. These results suggest that there is a sizable RNP family in chloroplasts and that the diversity was generated mainly through a series of gene duplications rather than through alternative pre-mRNA splicing. The gene for cp29B contains three introns. The first and second introns interrupt the first CS-RBD, and the third intron interrupts the second CS-RBD. The position of the first intron site is the same as that in the human hnRNP A1 protein gene.
|
26
|
Robustness of maximum likelihood tree estimation against different patterns of base substitutions. J Mol Evol 1991; 32:79-91. [PMID: 1849180 DOI: 10.1007/bf02099932] [Citation(s) in RCA: 65] [Impact Index Per Article: 2.0] [Indexed: 12/29/2022]
Abstract
In the maximum likelihood (ML) method for estimating a molecular phylogenetic tree, the pattern of nucleotide substitutions for computing likelihood values is assumed to be simpler than that of the actual evolutionary process, simply because the process, considered to be quite devious, is unknown. The problem, however, is that there has been no guarantee to endorse the simplification. To study this problem, we first evaluated the robustness of the ML method in the estimation of molecular trees against different nucleotide substitution patterns, including Jukes and Cantor's, the simplest ever proposed. Namely, we conducted computer simulations in which we could set up various evolutionary models of a hypothetical gene, and define a true tree to which an estimated tree by the ML method was to be compared. The results show that topology estimation by the ML method is considerably robust against different ratios of transitions to transversions and different GC contents, but branch length estimation is not so. The ML tree estimation based on Jukes and Cantor's model is also revealed to be resistant to GC content, but rather sensitive to the ratio of transitions to transversions. We then applied the ML method with different substitution patterns to nucleotide sequence data on tax gene from T-cell leukemia viruses whose evolutionary process must have been more complicated than that of the hypothetical gene. The results are in accordance with those from the simulation study, showing that Jukes and Cantor's model is as useful as a more complicated one for making inferences about molecular phylogeny of the viruses.
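To make the role of the substitution model concrete, here is a minimal sketch of Jukes and Cantor's model applied to a single pair of sequences. It is a textbook illustration only, not the simulation or tree-search procedure of the paper; the function names and the toy counts are hypothetical. Under this model all substitutions are equally likely, so it has no parameter for the ratio of transitions to transversions.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor distance (substitutions per site) from the observed
    proportion p of differing sites between two aligned sequences."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_log_likelihood(n_sites, n_diff, d):
    """Log-likelihood of a pairwise alignment for branch length d > 0.

    Each site contributes (1/4) * P(same) if the bases match and
    (1/4) * P(one specific other base) if they differ.
    """
    if d <= 0.0:
        raise ValueError("d must be positive")
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_other = 0.25 - 0.25 * decay
    return (n_sites - n_diff) * math.log(0.25 * p_same) + n_diff * math.log(0.25 * p_other)


# Toy data: 300 differing sites out of 1000
d_hat = jc69_distance(300 / 1000)
```

For a single pair, the closed-form distance is also the branch length that maximizes the likelihood, which can be checked by evaluating `jc69_log_likelihood` at values slightly above and below `d_hat`.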
|