1
|
Kulanayake S, Tikoo SK. Adenovirus Core Proteins: Structure and Function. Viruses 2021; 13:v13030388. [PMID: 33671079 PMCID: PMC7998265 DOI: 10.3390/v13030388] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 02/07/2021] [Revised: 02/19/2021] [Accepted: 02/24/2021] [Indexed: 01/04/2023] Open
Abstract
Adenoviruses have served as a model for investigating viral-cell interactions and discovering different cellular processes, such as RNA splicing and DNA replication. In addition, the development and evaluation of adenoviruses as viral vectors for vaccination and gene therapy has led to detailed investigations of adenovirus biology, including the structure and function of the adenovirus-encoded proteins. While the structure and function of the viral capsid proteins have been the subject of numerous reports, the last few years have seen increased interest in elucidating the structure and function of the adenovirus core proteins. Here, we provide a review of research on the structure and function of the adenovirus core proteins in adenovirus biology.
Affiliation(s)
- Shermila Kulanayake
- Vaccine and Infectious Disease Organization-International Vaccine Center (VIDO-InterVac), University of Saskatchewan, Saskatoon, SK S7N5E3, Canada;
- Vaccinology & Immunotherapeutics Program, School of Public Health, University of Saskatchewan, Saskatoon, SK S7N5E3, Canada
- Suresh K. Tikoo
- Vaccine and Infectious Disease Organization-International Vaccine Center (VIDO-InterVac), University of Saskatchewan, Saskatoon, SK S7N5E3, Canada;
- Vaccinology & Immunotherapeutics Program, School of Public Health, University of Saskatchewan, Saskatoon, SK S7N5E3, Canada
- Correspondence:
|
2
|
Issa M, Elaziz MA. Analyzing COVID-19 virus based on enhanced fragmented biological Local Aligner using improved Ions Motion Optimization algorithm. Appl Soft Comput 2020; 96:106683. [PMID: 32901204 PMCID: PMC7467904 DOI: 10.1016/j.asoc.2020.106683] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/06/2020] [Revised: 08/07/2020] [Accepted: 08/24/2020] [Indexed: 11/16/2022]
Abstract
The SARS-CoV-2 (COVID-19) virus has caused a devastating pandemic that has infected millions of people around the world and killed thousands of those infected. It is therefore vital to propose new intelligent data analysis tools, and to enhance the existing ones, to aid scientists in analyzing the COVID-19 virus. Fragmented Local Aligner Technique (FLAT) is a data analysis tool used to detect the longest common consecutive subsequence (LCCS) between a pair of biological sequences. FLAT can be used to find the LCCS between the COVID-19 virus and other viruses to support other biochemical and biological analyses. In this study, an enhancement of FLAT based on a modified Ions Motion Optimization (IMO) algorithm is developed to produce an acceptable LCCS efficiently and in a reasonable time. The proposed method was tested by finding the LCCS between the ORF1ab polyprotein and surface glycoprotein of COVID-19 and other viruses. The experimental results demonstrate that the proposed model produced the best LCCS among the compared algorithms, using the real LCCS measured by the Smith-Waterman (SW) algorithm as a reference.
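For orientation, the LCCS objective named in this abstract can be illustrated with a small exhaustive dynamic-programming sketch. This is not FLAT and not the IMO-based enhancement; it only shows what a longest common consecutive subsequence is on toy input. The function name `longest_common_consecutive` and the example sequences are hypothetical.

```python
def longest_common_consecutive(a: str, b: str) -> str:
    """Return one longest run of characters that occurs contiguously in both a and b."""
    best_len, best_end = 0, 0          # length of the best run and its end index in a
    prev = [0] * (len(b) + 1)          # run lengths for the previous row of the DP table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    # Toy protein-like strings, not real viral sequences.
    print(longest_common_consecutive("MKVLAAGIT", "TTVLAAGQ"))
```

The table costs time proportional to the product of the two sequence lengths, which is the kind of cost that heuristic aligners try to avoid on long sequences.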
Affiliation(s)
- Mohamed Issa
- Computer and Systems Department, Faculty of Engineering, Zagazig University, Zagazig 44519, Egypt
- Mohamed Abd Elaziz
- Hubei Engineering Research Center on Big Data Security, School of Cyber Science & Engineering, Huazhong University of Science and Technology, Wuhan 430074, China; Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt
|
3
|
Ma XX, Feng YP, Liu JL, Ma B, Chen L, Zhao YQ, Guo PH, Guo JZ, Ma ZR, Zhang J. The effects of the codon usage and translation speed on protein folding of 3Dpol of foot-and-mouth disease virus. Vet Res Commun 2013; 37:243-50. [DOI: 10.1007/s11259-013-9564-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Accepted: 04/10/2013] [Indexed: 10/26/2022]
|
4
|
Zhu Y, Li T, Li D, Zhang Y, Xiong W, Sun J, Tang Z, Chen G. Using predicted shape string to enhance the accuracy of γ-turn prediction. Amino Acids 2011; 42:1749-55. [DOI: 10.1007/s00726-011-0889-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/17/2010] [Accepted: 03/09/2011] [Indexed: 11/25/2022]
|
5
|
Analysis of the function of cytoplasmic fibers formed by the rubella virus nonstructural replicase proteins. Virology 2010; 406:212-27. [PMID: 20696450 DOI: 10.1016/j.virol.2010.07.025] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 04/21/2010] [Revised: 05/30/2010] [Accepted: 07/18/2010] [Indexed: 11/23/2022]
Abstract
The P150 and P90 replicase proteins of rubella virus (RUBV), a plus-strand RNA Togavirus, produce a unique cytoplasmic fiber network resembling microtubules. Pharmacological and mutagenic approaches were used to determine if these fibers functioned in virus replication. The pharmacological approach revealed that microtubules were required for fiber formation, but neither was necessary for virus replication. Through the mutagenic approach it was found that α-helices near both termini of P150 were necessary for fiber assembly and infectivity, but fiber formation and viability could not be correlated because most of these mutations were lethal. The N-terminal α-helix of P150 affected both proteolytic processing of P150 and P90 from the P200 precursor and targeting of P200, possibly through directing conformational folding of P200. Finally, we made the unexpected discovery that RUBV genomes can spread from cell-to-cell without virus particles, a process that we hypothesize utilizes RUBV-induced cytoplasmic projections containing fibers and replication complexes.
|
6
|
Sims JJ, Cohen RE. Linkage-specific avidity defines the lysine 63-linked polyubiquitin-binding preference of rap80. Mol Cell 2009; 33:775-83. [PMID: 19328070 DOI: 10.1016/j.molcel.2009.02.011] [Citation(s) in RCA: 184] [Impact Index Per Article: 12.3] [Received: 09/29/2008] [Revised: 12/23/2008] [Accepted: 02/12/2009] [Indexed: 11/28/2022]
Abstract
Linkage-specific polyubiquitin recognition is thought to make possible the diverse set of functional outcomes associated with ubiquitination. Thus far, mechanistic insight into this selectivity has been largely limited to single domains that preferentially bind to lysine 48-linked polyubiquitin (K48-polyUb) in isolation. Here, we propose a mechanism, linkage-specific avidity, in which multiple ubiquitin-binding domains are arranged in space so that simultaneous, high-affinity interactions are optimum with one polyUb linkage but unfavorable or impossible with other polyUb topologies and monoUb. Our model is human Rap80, which contains tandem ubiquitin interacting motifs (UIMs) that bind to K63-polyUb at DNA double-strand breaks. We show how the sequence between the Rap80 UIMs positions the domains for efficient avid binding across a single K63 linkage, thus defining selectivity. We also demonstrate K48-specific avidity in a different protein, ataxin-3. Using tandem UIMs, we establish the general principles governing polyUb linkage selectivity and affinity in multivalent ubiquitin receptors.
Affiliation(s)
- Joshua J Sims
- Department of Biochemistry and Molecular Biology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA
|
7
|
Chen J, Chaudhari N. Cascaded bidirectional recurrent neural networks for protein secondary structure prediction. IEEE/ACM Trans Comput Biol Bioinform 2007; 4:572-582. [PMID: 17975269 DOI: 10.1109/tcbb.2007.1055] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 05/25/2023]
Abstract
Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. Besides, we also reveal strong correlations between secondary structure (SS) elements. In order to take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against another two BRNN architectures, namely the original BRNN architecture used for speech recognition as well as Pollastri's BRNN that was proposed for PSS prediction. Our cascaded BRNN achieves an overall three state accuracy Q3 of 74.38%, and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
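The three-state accuracy Q3 quoted in this abstract is, in its usual definition, the percentage of residues whose predicted state (helix, strand, or other) matches the observed state. A minimal sketch of that per-residue measure follows; the function name `q3_accuracy`, the H/E/C letter coding, and the toy strings are assumptions for illustration. SOV is a segment-based measure and is not reproduced here.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of positions where the predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # H = helix, E = strand, C = other (assumed coding); toy eight-residue example.
    print(q3_accuracy("HHHEECCC", "HHHEEECC"))
```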
|
8
|
Lebrun M, Filée P, Galleni M, Mainil JG, Linden A, Taminiau B. Purification of the recombinant beta2 toxin (CPB2) from an enterotoxaemic bovine Clostridium perfringens strain and production of a specific immune serum. Protein Expr Purif 2007; 55:119-31. [PMID: 17562369 DOI: 10.1016/j.pep.2007.04.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 02/01/2007] [Revised: 04/20/2007] [Accepted: 04/21/2007] [Indexed: 10/23/2022]
Abstract
Overgrowth of Clostridium perfringens clones with production of one or more of its toxin(s) results in diverse digestive and systemic pathologies in humans and animals, such as cattle enterotoxaemia. The so-called beta2 toxin (CPB2) is the most recently described major toxin produced by C. perfringens. In this study, the cpb2 ORF (cpb2FM) from a cattle C. perfringens-associated enterotoxaemia was cloned and sequenced. The cpb2FM and its deduced nucleotide sequence clearly corresponded to the cpb2 allele considered as "consensus" and not to the "atypical" allele, despite its "non-porcine" origin. Expression assays of the recombinant toxin CPB2FM were performed in Escherichia coli and Bacillus subtilis with the expression vector pBLTS72, and by genomic integration by double recombination in B. subtilis. The highest level of production was obtained with the expression vector in the B. subtilis 168 strain. The recombinant CPB2FM protein was purified and a specific rabbit polyclonal antiserum was produced. Polyclonal antibodies could detect CPB2 production in supernatants of C. perfringens from enterotoxaemic cattle.
Affiliation(s)
- M Lebrun
- Department of Infectious and Parasitic Diseases-Bacteriology, Faculty of Veterinary Medicine, University of Liège, Liège 4000, Belgium.
|
9
|
Abstract
In this paper, we study the problem of computing the similarity of two protein structures by measuring their contact-map overlap. Contact-map overlap abstracts the problem of computing the similarity of two polygonal chains as a graph-theoretic problem. In $\mathbb{R}^3$, we present the first polynomial time algorithm with any guarantee on the approximation ratio for the 3-dimensional problem. More precisely, we give an algorithm for the contact-map overlap problem with an approximation ratio of $\sigma$, where $\sigma = \min\{\sigma(P_1), \sigma(P_2)\} \le O(n^{1/2})$ is a decomposition parameter depending on the input polygonal chains $P_1$ and $P_2$. In $\mathbb{R}^2$, we improve the running time of the previous best known approximation algorithm from $O(n^6)$ to $O(n^3 \log n)$ at the cost of decreasing the approximation ratio by half. We also give hardness results for the problem in three dimensions, suggesting that approximating it better than $O(n^{\epsilon})$, for some $\epsilon > 0$, is hard.
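To make the quantity being optimized concrete, the sketch below builds a contact map for a polygonal chain and counts the contacts shared by two chains under one fixed alignment. It is not the approximation algorithm from this abstract: the hard part of the problem is the search over alignments, which this code does not attempt. The names `contact_map` and `overlap`, the distance cutoff, and the exclusion of chain-adjacent vertices are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def contact_map(coords, cutoff):
    """Pairs (i, j), i < j, of non-adjacent chain vertices lying within `cutoff` of each other."""
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if j - i > 1 and dist(coords[i], coords[j]) <= cutoff
    }


def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of chain A that `alignment` (index in A -> index in B) maps onto contacts of B.

    The alignment is assumed to be order-preserving and one-to-one; this is not checked.
    """
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            u, v = alignment[i], alignment[j]
            if (min(u, v), max(u, v)) in contacts_b:
                shared += 1
    return shared


if __name__ == "__main__":
    # Toy chains: a unit square path (its ends come back into contact) and a straight line.
    square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
    line = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
    identity = {k: k for k in range(4)}
    print(overlap(contact_map(square, 1.2), contact_map(square, 1.2), identity))
    print(overlap(contact_map(square, 1.2), contact_map(line, 1.2), identity))
```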
Affiliation(s)
- Pankaj K Agarwal
- Dept. of Computer Science, Duke University, Durham, North Carolina, USA.
|
10
|
Wodrich H, Cassany A, D'Angelo MA, Guan T, Nemerow G, Gerace L. Adenovirus core protein pVII is translocated into the nucleus by multiple import receptor pathways. J Virol 2006; 80:9608-18. [PMID: 16973564 PMCID: PMC1617226 DOI: 10.1128/jvi.00850-06] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Indexed: 01/23/2023] Open
Abstract
Adenoviruses are nonenveloped viruses with an approximately 36-kb double-stranded DNA genome that replicate in the nucleus. Protein VII, an abundant structural component of the adenovirus core that is strongly associated with adenovirus DNA, is imported into the nucleus contemporaneously with the adenovirus genome shortly after virus infection and may promote DNA import. In this study, we evaluated whether protein VII uses specific receptor-mediated mechanisms for import into the nucleus. We found that it contains potent nuclear localization signal (NLS) activity by transfection of cultured cells with protein VII fusion constructs and by microinjection of cells with recombinant protein VII fusions. We identified three NLS-containing regions in protein VII by deletion mapping and determined important NLS residues by site-specific mutagenesis. We found that recombinant protein VII and its NLS-containing domains strongly and specifically bind to importin alpha, importin beta, importin 7, and transportin, which are among the most abundant cellular nuclear import receptors. Moreover, these receptors can mediate the nuclear import of protein VII fusions in vitro in permeabilized cells. Considered together, these data support the hypothesis that protein VII is a major NLS-containing adaptor for receptor-mediated import of adenovirus DNA and that multiple import pathways are utilized to promote efficient nuclear entry of the viral genome.
Affiliation(s)
- Harald Wodrich
- Institut de Génétique Moléculaire de Montpellier, UMR 5535 CNRS, 1919 Route de Mende, 34293 Montpellier Cedex 05, France.
|
11
|
Liu LX, Li ML, Tan FY, Lu MC, Wang KL, Guo YZ, Wen ZN, Jiang L. Local sequence information-based support vector machine to classify voltage-gated potassium channels. Acta Biochim Biophys Sin (Shanghai) 2006; 38:363-71. [PMID: 16761093 DOI: 10.1111/j.1745-7270.2006.00177.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 01/14/2023] Open
Abstract
In our previous work, we developed a computational tool, PreK-ClassK-ClassKv, to predict and classify potassium (K+) channels. For K+ channel prediction (PreK) and classification at family level (ClassK), this method performs well. However, it does not perform so well in classifying voltage-gated potassium (Kv) channels (ClassKv). In this paper, a new method based on the local sequence information of Kv channels is introduced to classify Kv channels. Six transmembrane domains of a Kv channel protein are used to define a protein, and the dipeptide composition technique is used to transform an amino acid sequence to a numerical sequence. A Kv channel protein is represented by a vector with 2000 elements, and a support vector machine algorithm is applied to classify Kv channels. This method shows good performance with averages of total accuracy (Acc), sensitivity (SE), specificity (SP), reliability (R) and Matthews correlation coefficient (MCC) of 98.0%, 89.9%, 100%, 0.95 and 0.94 respectively. The results indicate that the local sequence information-based method is better than the global sequence information-based method to classify Kv channels.
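The dipeptide composition technique mentioned in this abstract turns an amino acid sequence into a fixed-length numeric vector by counting adjacent residue pairs. The sketch below computes the 400-element vector for a single segment, assuming the 20 standard amino acids and normalization by the number of counted pairs. It does not reproduce the paper's 2000-element representation, whose exact construction is not given here; the function name and the toy segment are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, in a fixed order
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def dipeptide_composition(segment: str) -> list[float]:
    """Fraction of each of the 400 dipeptides among the overlapping residue pairs of `segment`."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for k in range(len(segment) - 1):
        pair = segment[k:k + 2]
        if pair in counts:  # skip pairs that contain non-standard letters
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


if __name__ == "__main__":
    vector = dipeptide_composition("AAAC")  # pairs: AA, AA, AC
    print(len(vector), vector[0], vector[1])
```

A vector of this kind, one per sequence, is the sort of input a support vector machine classifier can be trained on.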
Affiliation(s)
- Li-Xia Liu
- College of Chemistry, Sichuan University, Chengdu 610064, China
|
12
|
Chen J, Chaudhari N. Bidirectional segmented-memory recurrent neural network for protein secondary structure prediction. Soft Comput 2005. [DOI: 10.1007/s00500-005-0489-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
|
13
|
Wilson CL, Boardman PE, Doig AJ, Hubbard SJ. Improved prediction for N-termini of alpha-helices using empirical information. Proteins 2005; 57:322-30. [PMID: 15340919 DOI: 10.1002/prot.20218] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
Abstract
The prediction of the secondary structure of proteins from their amino acid sequences remains a key component of many approaches to the protein folding problem. The most abundant form of regular secondary structure in proteins is the alpha-helix, in which specific residue preferences exist at the N-terminal locations. Propensities derived from these observed amino acid frequencies in the Protein Data Bank (PDB) database correlate well with experimental free energies measured for residues at different N-terminal positions in alanine-based peptides. We report a novel method to exploit this data to improve protein secondary structure prediction through identification of the correct N-terminal sequences in alpha-helices, based on existing popular methods for secondary structure prediction. With this algorithm, the number of correctly predicted alpha-helix start positions was improved from 30% to 38%, while the overall prediction accuracy (Q3) remained the same, using cross-validated testing. Although the algorithm was developed and tested on multiple sequence alignment-based secondary structure predictions, it was also able to improve the predictions of start locations by methods that use single sequences to make their predictions. Furthermore, the residue frequencies at N-terminal positions of the improved predictions better reflect those seen at the N-terminal positions of alpha-helices in proteins. This has implications for areas such as comparative modeling, where a more accurate prediction of the N-terminal regions of alpha-helices should benefit attempts to model adjacent loop regions. The algorithm is available as a Web tool, located at http://rocky.bms.umist.ac.uk/elephant.
Affiliation(s)
- Claire L Wilson
- Department of Biomolecular Sciences, University of Manchester Institute of Science and Technology, Manchester, United Kingdom
|
14
|
Gu W, Zhou T, Ma J, Sun X, Lu Z. Folding type specific secondary structure propensities of synonymous codons. IEEE Trans Nanobioscience 2004; 2:150-7. [PMID: 15376949 DOI: 10.1109/tnb.2003.817024] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
We have proposed new amino acid secondary structure propensities in proteins with different folding types based on synonymous codons. They have been derived from 200 all alpha, all beta, alpha/beta, and alpha + beta proteins of known structures and their coding genes. The secondary structure propensities of the same codon in gene coding for different folding type proteins are not the same. For instance, amino acid Ile coded by AUU is indifferent to form the alpha unit in the alpha + beta protein class, but it is a former and a breaker for the alpha unit in the all alpha protein class and the alpha/beta class, respectively. On the other hand, the secondary structure propensities of different synonymous codons in the coding genes with the same folding type are also not all the same. As an example, CGU, CGG, and AGA, which are synonymous codons of Arg, are preferential to form the alpha unit in all alpha proteins, while CGA is an alpha unit breaker and the other two synonymous codons, CGC and AGG, are indifferent to form or break the alpha unit. As a result, protein secondary structure information contained both in mRNA sequences and in amino acid sequences has been introduced in these codon-based amino acid secondary structure propensities. These codon-based amino acid secondary structure propensities are helpful to in vitro protein design and protein secondary structure prediction.
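The abstract does not state how its codon-based propensities are calculated. One common way to define a propensity, assumed here purely for illustration, is a Chou-Fasman style ratio: the fraction of a codon's occurrences found in a given structural state, divided by the fraction of all positions in that state. Values above 1 then suggest a "former", values below 1 a "breaker", and values near 1 an "indifferent" codon. The function name `codon_propensities` and the toy data are hypothetical.

```python
from collections import Counter


def codon_propensities(codons, states, target="H"):
    """Ratio-style propensity of each codon for `target` (assumed formula, see lead-in).

    codons: codon per residue position, e.g. ["AUU", "CGA", ...]
    states: observed secondary-structure state per position, e.g. ["H", "C", ...]
    """
    if len(codons) != len(states):
        raise ValueError("codons and states must have the same length")
    in_target = sum(s == target for s in states)
    if in_target == 0:
        raise ValueError("no positions are in the target state")
    background = in_target / len(states)  # overall fraction of positions in the target state
    codon_total = Counter(codons)
    codon_in_target = Counter(c for c, s in zip(codons, states) if s == target)
    return {c: (codon_in_target[c] / n) / background for c, n in codon_total.items()}


if __name__ == "__main__":
    # Toy data: two synonymous Ile codons with different helix frequencies.
    print(codon_propensities(["AUU", "AUU", "AUC", "AUC"], ["H", "H", "H", "C"]))
```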
Affiliation(s)
- Wanjun Gu
- Key Laboratory of Molecular and Biomolecular Electronics, Southeast University, Ministry of Education, Nanjing 210096, China
|
15
|
|
16
|
Kloczkowski A, Ting KL, Jernigan RL, Garnier J. Combining the GOR V algorithm with evolutionary information for protein secondary structure prediction from amino acid sequence. Proteins 2002; 49:154-66. [PMID: 12210997 DOI: 10.1002/prot.10181] [Citation(s) in RCA: 114] [Impact Index Per Article: 5.2] [Indexed: 11/11/2022]
Abstract
We have modified and improved the GOR algorithm for the protein secondary structure prediction by using the evolutionary information provided by multiple sequence alignments, adding triplet statistics, and optimizing various parameters. We have expanded the database used to include the 513 non-redundant domains collected recently by Cuff and Barton (Proteins 1999;34:508-519; Proteins 2000;40:502-511). We have introduced a variable size window that allowed us to include sequences as short as 20-30 residues. A significant improvement over the previous versions of GOR algorithm was obtained by combining the PSI-BLAST multiple sequence alignments with the GOR method. The new algorithm will form the basis for the future GOR V release on an online prediction server. The average accuracy of the prediction of secondary structure with multiple sequence alignment and full jack-knife procedure was 73.5%. The accuracy of the prediction increases to 74.2% by limiting the prediction to 375 (of 513) sequences having at least 50 PSI-BLAST alignments. The average accuracy of the prediction of the new improved program without using multiple sequence alignments was 67.5%. This is approximately a 3% improvement over the preceding GOR IV algorithm (Garnier J, Gibrat JF, Robson B. Methods Enzymol 1996;266:540-553; Kloczkowski A, Ting K-L, Jernigan RL, Garnier J. Polymer 2002;43:441-449). We have discussed alternatives to the segment overlap (Sov) coefficient proposed by Zemla et al. (Proteins 1999;34:220-223).
Affiliation(s)
- A Kloczkowski
- Laboratory of Experimental and Computational Biology, NCI, NIH, Bethesda, Maryland, USA
|
17
|
Robson B, Mordasini T, Curioni A. Studies in the assessment of folding quality for protein modeling and structure prediction. J Proteome Res 2002; 1:115-33. [PMID: 12643532 DOI: 10.1021/pr0155228] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
A diagnostic for assessing the quality of a fold has been developed to which further criteria can be progressively added. The goal is to create a measure that can follow the status of a protein structure in a simulation or modeling process, when the answer (the experimental structure) is not known in advance, rather than simply reject deliberate misfolds. This places greater emphasis on the need to study, and calibrate against, marginal cases, i.e., unusual native structures, incomplete structures, partially erroneous X-ray structures, good models, poor models, and the effect of cofactors. The first three terms introduced in the diagnostic are appropriate core-forming properties or noncore properties of residues in relation to tertiary structure, appropriate neighboring structure density for each residue in relation to tertiary structure, and secondary structure consistency. While the method emerges as a useful simulation analysis tool, we find a need for further fine-tuning to diminish sensitivity to minor conformational changes that retain essential features of the fold, balanced against the need to obtain a more sensitive response when a conformational change involves less physically meaningful interatomic interactions. This dual utility is difficult to obtain: the investigation highlights some of the issues. Initial attempts to obtain it have led to terms in the diagnostic that are admittedly complex: simplifications must also be explored.
Affiliation(s)
- Barry Robson
- IBM Research, T. J. Watson Research Laboratory, Yorktown Heights, New York 10598, USA
|
18
|
Abstract
Using information from sequence alignments significantly improves protein secondary structure prediction. Typically, more divergent profiles yield better predictions. Recently, various groups have shown that accuracy can be improved significantly by using PSI-BLAST profiles to develop new prediction methods. Here, we focused on the influences of various alignment strategies on two 8-year-old PHD methods. The following results stood out. (i) PHD using pairwise alignments predicts about 72% of all residues correctly in one of the three states: helix, strand, and other. Using larger databases and PSI-BLAST raised accuracy to 75%. (ii) More than 60% of the improvement originated from the growth of current sequence databases; about 20% resulted from detailed changes in the alignment procedure (substitution matrix, thresholds, and gap penalties). Another 20% of the improvement resulted from carefully using iterated PSI-BLAST searches. (iii) It is of interest that we failed to improve prediction accuracy further when attempting to refine the alignment by dynamic programming (MaxHom and ClustalW). (iv) Improvement through family growth appears to saturate at some point. However, most families have not reached this saturation. Hence, we anticipate that prediction accuracy will continue to rise with database growth.
Affiliation(s)
- Dariusz Przybylski
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, USA.
|
19
|
Abstract
An ab initio method has been developed to predict helix formation for polypeptides. The approach relies on the systematic analysis of overlapping oligopeptides to determine the helical propensity for individual residues. Detailed atomistic level modeling, including entropic contributions, and solvation/ionization energies calculated through the solution of the Poisson-Boltzmann equation, is utilized. The calculation of probabilities for helix formation is based on the generation of ensembles of low energy conformers. The approach, which is easily amenable to parallelization, is shown to perform very well for several benchmark polypeptide systems, including the bovine pancreatic trypsin inhibitor, the immunoglobulin binding domain of protein G, the chymotrypsin inhibitor 2, the R69 N-terminal domain of phage 434 repressor, and the wheat germ agglutinin.
Affiliation(s)
- J L Klepeis
- Department of Chemical Engineering, Princeton University, New Jersey 08544-5263, USA
|
20
|
Kloczkowski A, Ting KL, Jernigan R, Garnier J. Protein secondary structure prediction based on the GOR algorithm incorporating multiple sequence alignment information. Polymer 2002. [DOI: 10.1016/s0032-3861(01)00425-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 10/18/2022]
|
21
|
Maraia RJ, Intine RV. La protein and its associated small nuclear and nucleolar precursor RNAs. Gene Expr 2002; 10:41-57. [PMID: 11868987 PMCID: PMC5977531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
After transcription by RNA polymerase (pol) III, nascent Pol III transcripts pass through RNA processing, modification, and transport machineries as part of their posttranscriptional maturation process. The first factor to interact with Pol III transcripts is La protein, which binds principally via its conserved N-terminal domain (NTD), to the UUU-OH motif that results from transcription termination. This review includes a sequence Logo of the most conserved region of La and its refined modeling as an RNA recognition motif (RRM). La protects RNAs from 3' exonucleolytic digestion and also contributes to their nuclear retention. The variety of modifications found on La-associated RNAs is reviewed in detail and considered in the contexts of how La may bind the termini of structured RNAs without interfering with recognition by modification enzymes, and its ability to chaperone RNAs through multiple parts of their maturation pathways. The CTD of human La recognizes the 5' end region of nascent RNA in a manner that is sensitive to serine 366 phosphorylation. Although the CTD can control pre-tRNA cleavage by RNase P, a rate-limiting step in tRNASerUGA maturation, the extent to which it acts in the maturation pathway(s) of other transcripts is unknown but considered here. Evidence that a fraction of La resides in the nucleolus together with recent findings that several Pol III transcripts pass through the nucleolus is also reviewed. An imminent goal is to understand how the bipartite RNA binding, intracellular trafficking, and signal transduction activities of La are integrated with the maturation pathways of the various RNAs with which it associates.
Affiliation(s)
- Richard J Maraia
- Laboratory of Molecular Growth Regulation, National Institute of Child Health & Human Development, National Institutes of Health, Bethesda, MD 20892-2753, USA.
|
22
|
Bultynck G, Rossi D, Callewaert G, Missiaen L, Sorrentino V, Parys JB, De Smedt H. The conserved sites for the FK506-binding proteins in ryanodine receptors and inositol 1,4,5-trisphosphate receptors are structurally and functionally different. J Biol Chem 2001; 276:47715-24. [PMID: 11598113 DOI: 10.1074/jbc.m106573200] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.7] [Indexed: 11/06/2022] Open
Abstract
We compared the interaction of the FK506-binding protein (FKBP) with the type 3 ryanodine receptor (RyR3) and with the type 1 and type 3 inositol 1,4,5-trisphosphate receptor (IP3R1 and IP3R3), using a quantitative GST-FKBP12 and GST-FKBP12.6 affinity assay. We first characterized and mapped the interaction of the FKBPs with the RyR3. GST-FKBP12 as well as GST-FKBP12.6 were able to bind approximately 30% of the solubilized RyR3. The interaction was completely abolished by FK506, strengthened by the addition of Mg2+, and weakened in the absence of Ca2+ but was not affected by the addition of cyclic ADP-ribose. By using proteolytic mapping and site-directed mutagenesis, we pinpointed Val2322, located in the central modulatory domain of the RyR3, as a critical residue for the interaction of RyR3 with FKBPs. Substitution of Val2322 for leucine (as in IP3R1) or isoleucine (as in RyR2) decreased the binding efficiency and shifted the selectivity to FKBP12.6; substitution of Val2322 for aspartate completely abolished the FKBP interaction. Importantly, the occurrence of the valylprolyl residue as alpha-helix breaker was an important determinant of FKBP binding. This secondary structure is conserved among the different RyR isoforms but not in the IP3R isoforms. A chimeric RyR3/IP3R1, containing the core of the FKBP12-binding site of IP3R1 in the RyR3 context, retained this secondary structure and was able to interact with FKBPs. In contrast, IP3Rs did not interact with the FKBP isoforms. This indicates that the primary sequence in combination with the local structural environment plays an important role in targeting the FKBPs to the intracellular Ca2+-release channels. Structural differences in the FKBP-binding site of RyRs and IP3Rs may contribute to the occurrence of a stable interaction between RyR isoforms and FKBPs and to the absence of such interaction with IP3Rs.
Affiliation(s)
- G Bultynck
- Laboratorium voor Fysiologie, K.U.Leuven Campus Gasthuisberg O/N, Herestraat 49, B-3000 Leuven, Belgium
|
23
|
Lecompte O, Thompson JD, Plewniak F, Thierry J, Poch O. Multiple alignment of complete sequences (MACS) in the post-genomic era. Gene 2001; 270:17-30. [PMID: 11403999 DOI: 10.1016/s0378-1119(01)00461-9] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
Multiple alignment, since its introduction in the early seventies, has become a cornerstone of modern molecular biology. It has traditionally been used to deduce structure / function by homology, to detect conserved motifs and in phylogenetic studies. There has recently been some renewed interest in the development of multiple alignment techniques, with current opinion moving away from a single all-encompassing algorithm to iterative and / or co-operative strategies. The exploitation of multiple alignments in genome annotation projects represents a qualitative leap in the functional analysis process, opening the way to the study of the co-evolution of validated sets of proteins and to reliable phylogenomic analysis. However, the alignment of the highly complex proteins detected by today's advanced database search methods is a daunting task. In addition, with the explosion of the sequence databases and with the establishment of numerous specialized biological databases, multiple alignment programs must evolve if they are to successfully rise to the new challenges of the post-genomic era. The way forward is clearly an integrated system bringing together sequence data, knowledge-based systems and prediction methods with their inherent unreliability. The incorporation of such heterogeneous, often non-consistent, data will require major changes to the fundamental alignment algorithms used to date. Such an integrated multiple alignment system will provide an ideal workbench for the validation, propagation and presentation of this information in a format that is concise, clear and intuitive.
Affiliation(s)
- O Lecompte
- Laboratoire de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire (CNRS/INSERM/ULP), BP 163, 67404 Cedex, Illkirch, France
|
24
|
Abstract
The relationship between the amino acid sequence and the three-dimensional structure of proteins with internal repeats is discussed. In particular, correlations between the amino acid composition and the ability to fold in a unique structure, as well as classification of the structures based on their repeat length, are described. This analysis suggests rules that can be used for the structural prediction of repeat-containing proteins. The paper is focused on prediction and modeling of solenoid-like proteins with the repeat length ranging between 5 and 40 residues. The models of leucine-rich repeat proteins and bacterial proteins with pentapeptide repeats are examined in light of the recently solved structures of the related molecules.
Affiliation(s)
- A V Kajava
- Center for Molecular Modeling, Bethesda, Maryland 20892-5626, USA
|
25
|
Zhang Y, Wang PG, Brew K. Specificity and mechanism of metal ion activation in UDP-galactose:beta-galactoside-alpha-1,3-galactosyltransferase. J Biol Chem 2001; 276:11567-74. [PMID: 11133981 DOI: 10.1074/jbc.m006530200] [Citation(s) in RCA: 43] [Impact Index Per Article: 1.9] [Indexed: 11/06/2022] Open
Abstract
UDP-galactose:beta-galactosyl-alpha1,3-galactosyltransferase (alpha3GT) catalyzes the synthesis of galactosyl-alpha-1,3-beta-galactosyl structures in mammalian glycoconjugates. In humans the gene for alpha3GT is inactivated, and its product, the alpha-Gal epitope, is the target of a large fraction of natural antibodies. alpha3GT is a member of a family of metal-dependent-retaining glycosyltransferases that includes the histo blood group A and B enzymes. Mn(2+) activates the catalytic domain of alpha3GT (alpha3GTcd), but the affinity reported for this ion is very low relative to physiological levels. Enzyme activity over a wide range of metal ion concentrations indicates a dependence on Mn(2+) binding to two sites. At physiological metal ion concentrations, Zn(2+) gives higher levels of activity and may be the natural cofactor. To determine the role of the cation, metal activation was perturbed by substituting Co(2+) and Zn(2+) for Mn(2+) and by mutagenesis of a conserved D(149)VD(151) sequence motif that is considered to act in cation binding in many glycosyltransferases. The aspartates of this motif were found to be essential for activity, and the kinetic properties of a Val(150) to Ala mutant with reduced activity were determined. The results indicate that the cofactor is involved in binding UDP-galactose and has a crucial influence on catalytic efficiency for galactose transfer and for the low endogenous UDP-galactose hydrolase activity. It may therefore interact with one or more phosphates of UDP-galactose in the Michaelis complex and in the transition state for cleavage of the UDP to galactose bond. The DXD motif conserved in many glycosyltransferases appears to have a key role in metal-mediated donor substrate binding and phosphate-sugar bond cleavage.
Affiliation(s)
- Y Zhang
- Department of Biochemistry and Molecular Biology, University of Miami School of Medicine, Miami, Florida 33101, USA
|
26
|
Maraia RJ, Intine RV. Recognition of nascent RNA by the human La antigen: conserved and divergent features of structure and function. Mol Cell Biol 2001; 21:367-79. [PMID: 11134326 PMCID: PMC86573 DOI: 10.1128/mcb.21.2.367-379.2001] [Citation(s) in RCA: 100] [Impact Index Per Article: 4.3] [Indexed: 01/17/2023] Open
Affiliation(s)
- R J Maraia
- Laboratory of Molecular Growth Regulation, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, USA.
|
27
|
Mugilan SA, Veluraja K. Generation of deviation parameters for amino acid singlets, doublets and triplets from three-dimentional structures of proteins and its implications for secondary structure prediction from amino acid sequences. J Biosci 2000; 25:81-91. [PMID: 10824202 DOI: 10.1007/bf02985185] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/21/2022]
Abstract
We present a new method, secondary structure prediction by deviation parameter (SSPDP), for predicting the secondary structure of proteins from amino acid sequence. Deviation parameters (DP) for amino acid singlets, doublets and triplets were computed with respect to secondary structural elements of proteins based on the Dictionary of Secondary Structure of Proteins (DSSP)-generated secondary structure for 408 selected non-homologous proteins. Amino acid triplets that are not found in the selected dataset are assigned a DP value of zero with respect to the secondary structural elements of proteins. The total number of parameters generated is 15,432 out of a possible 25,260. The deviation parameters are complete with respect to amino acid singlets and doublets, and partially complete with respect to amino acid triplets. These generated parameters were used to predict secondary structural elements from amino acid sequence. The secondary structure predicted by our method (SSPDP) was compared with that of single sequence (NNPREDICT) and multiple sequence (PHD) methods. The average prediction accuracy for α-helix by the SSPDP, NNPREDICT and PHD methods was found to be 57%, 44% and 69%, respectively, for the proteins in the selected dataset. For β-strand, the prediction accuracy was found to be 69%, 21% and 53%, respectively, by the SSPDP, NNPREDICT and PHD methods. This clearly indicates that secondary structure prediction by our method is as good as the PHD method and much better than the NNPREDICT method.
Affiliation(s)
- S A Mugilan
- Department of Physics, Manonmaniam Sundaranar University, Tirunelveli 627 012, Tamil Nadu, India
|
28
|
Abstract
We present a novel method for predicting the secondary structure of a protein from its amino acid sequence. Most existing methods predict each position in turn based on a local window of residues, sliding this window along the length of the sequence. In contrast, we develop a probabilistic model of protein sequence/structure relationships in terms of structural segments, and formulate secondary structure prediction as a general Bayesian inference problem. A distinctive feature of our approach is the ability to develop explicit probabilistic models for alpha-helices, beta-strands, and other classes of secondary structure, incorporating experimentally and empirically observed aspects of protein structure such as helical capping signals, side chain correlations, and segment length distributions. Our model is Markovian in the segments, permitting efficient exact calculation of the posterior probability distribution over all possible segmentations of the sequence using dynamic programming. The optimal segmentation is computed and compared to a predictor based on marginal posterior modes, and the latter is shown to provide significant improvement in predictive accuracy. The marginalization procedure provides exact secondary structure probabilities at each sequence position, which are shown to be reliable estimates of prediction uncertainty. We apply this model to a database of 452 nonhomologous structures, achieving accuracies as high as the best currently available methods. We conclude by discussing an extension of this framework to model nonlocal interactions in protein structures, providing a possible direction for future improvements in secondary structure prediction accuracy.
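The contrast drawn above between the single optimal segmentation and a predictor built from marginal posterior modes can be illustrated on a much simpler model. The sketch below is not the segment-based model of the abstract; it assumes an ordinary per-residue hidden Markov model with two hypothetical states (0 for helix, 1 for coil) and made-up probabilities, and shows only how the two decoders are computed by dynamic programming. The function names and all parameter values are illustrative assumptions.

```python
def forward_backward(init, trans, emit, obs):
    """Per-position posterior marginals P(state_t | obs) for a discrete HMM."""
    n, k = len(obs), len(init)
    # Forward pass: alpha[t][s] = P(obs[0..t], state_t = s)
    alpha = [[init[s] * emit[s][obs[0]] for s in range(k)]]
    for t in range(1, n):
        alpha.append([
            emit[s][obs[t]] * sum(alpha[t - 1][r] * trans[r][s] for r in range(k))
            for s in range(k)
        ])
    # Backward pass: beta[t][s] = P(obs[t+1..] | state_t = s)
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for s in range(k):
            beta[t][s] = sum(
                trans[s][r] * emit[r][obs[t + 1]] * beta[t + 1][r] for r in range(k)
            )
    total = sum(alpha[-1])  # likelihood of the whole observation sequence
    return [[alpha[t][s] * beta[t][s] / total for s in range(k)] for t in range(n)]


def viterbi(init, trans, emit, obs):
    """Single most probable state path (the 'optimal' labelling)."""
    k = len(init)
    score = [init[s] * emit[s][obs[0]] for s in range(k)]
    back = []
    for t in range(1, len(obs)):
        # Best predecessor for each state at position t
        prev = [max(range(k), key=lambda r: score[r] * trans[r][s]) for s in range(k)]
        score = [score[prev[s]] * trans[prev[s]][s] * emit[s][obs[t]] for s in range(k)]
        back.append(prev)
    path = [max(range(k), key=lambda s: score[s])]
    for prev in reversed(back):
        path.append(prev[path[-1]])
    return path[::-1]


def marginal_modes(init, trans, emit, obs):
    """Label each position with its most probable state under the marginals."""
    post = forward_backward(init, trans, emit, obs)
    return [max(range(len(init)), key=lambda s: row[s]) for row in post]


# Hypothetical parameters: states 0 = helix, 1 = coil; symbols 0/1 are
# helix-favouring / coil-favouring residue classes.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.2, 0.8]]
obs = [0, 0, 0, 1, 1, 1]

best_path = viterbi(init, trans, emit, obs)
modes = marginal_modes(init, trans, emit, obs)
posteriors = forward_backward(init, trans, emit, obs)
```

The per-position rows of `posteriors` double as uncertainty estimates for each label, which is the property the abstract highlights for the marginalization procedure. Probabilities are multiplied directly here, so a real implementation on long sequences would need scaling or log-space arithmetic.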
Affiliation(s)
- S C Schmidler
- Section on Medical Informatics, Stanford University School of Medicine, CA 94305, USA.
|
29
|
Jermutus L, Guez V, Bedouelle H. Disordered C-terminal domain of tyrosyl-tRNA synthetase: secondary structure prediction. Biochimie 1999; 81:235-44. [PMID: 10385005 DOI: 10.1016/s0300-9084(99)80057-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/29/2022]
Abstract
The C-terminal domain (residues 320-419) of tyrosyl-tRNA synthetase (TyrRS) from Bacillus stearothermophilus is disordered in the crystal structure and involved in the binding of the anticodon arm of tRNA(Tyr). The sequences of 11 TyrRSs of prokaryotic or mitochondrial origins were aligned and the alignment showed the existence of conserved residues in the sequences of the C-terminal domains. A consensus could be deduced from the application of five programs of secondary structure prediction to the 11 sequences of the query set. These results suggested that the sequences of the C-terminal domains determined a precise and conserved secondary structure. They predicted that the C-terminal domain would have a mixed fold (alpha/beta or alpha+beta), with the alpha-helices in the first half of the sequence and the beta-strands mainly in its second half. Several programs of fold recognition from sequence alone, by threading onto known structures, were applied but none of them identified a type of fold that would be common to the different sequences of the query set. Therefore, the fold of the C-terminal, anticodon binding domain might be novel.
Affiliation(s)
- L Jermutus
- Groupe d'Ingénierie des Protéines (CNRS URA 1129), Unité de Biochimie Cellulaire, Institut Pasteur, Paris, France
|
30
|
Deroo S, El Kasmi KC, Fournier P, Theisen D, Brons NH, Herrmann M, Desmet J, Muller CP. Enhanced antigenicity of a four-contact-residue epitope of the measles virus hemagglutinin protein by phage display libraries: evidence of a helical structure in the putative active site. Mol Immunol 1998; 35:435-43. [PMID: 9798648 DOI: 10.1016/s0161-5890(98)00057-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
Antigenicity and conformational propensities of synthetic peptides corresponding to the sequential epitope H236-255 of the measles virus hemagglutinin protein were investigated. This epitope corresponds to the neutralising and protective monoclonal antibody BH129 and includes Arg243, implicated in CD46-down-regulation and Arg253 that has been mapped to the putative enzymatic site. Fine mapping with truncation-, elongation-, Gly- and Ala-substitution analogues defined EL-QL as the critical residues of the minimal epitope S244ELSQL249. CD spectra of peptides, comparison with the 3D-structure of homologous sequences, and prediction algorithms suggested a helical structure with the contact residues E245L-QL249 located on the protein surface. Mimotopes obtained with a 6-mer phage display library contained a consensus Pro (important for binding) instead of Ser247 of the wild-type sequence (irrelevant for binding). The kink induced by Pro seemed to be essential to bring the 4 contact-residues in the mimotopes and in the corresponding short peptides together. CD analysis and prediction algorithms suggested that non-helical conformations of the phage insert and of the peptides may favourably mimic the antigenic helical turns of the wild-type sequence, resulting in an up to 135 times higher antigenicity of the mAb towards the mimotope peptides.
Affiliation(s)
- S Deroo
- Laboratoire National de Santé, Luxembourg, Luxembourg
|
31
|
Benner SA, Cannarozzi G, Gerloff D, Turcotte M, Chelvanayagam G. Bona Fide Predictions of Protein Secondary Structure Using Transparent Analyses of Multiple Sequence Alignments. Chem Rev 1997; 97:2725-2844. [PMID: 11851479 DOI: 10.1021/cr940469a] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Affiliation(s)
- Steven A. Benner
- Department of Chemistry, University of Florida, Gainesville, Florida 32611-7200
|
32
|
Di Francesco V, Garnier J, Munson PJ. Protein topology recognition from secondary structure sequences: application of the hidden Markov models to the alpha class proteins. J Mol Biol 1997; 267:446-63. [PMID: 9096237 DOI: 10.1006/jmbi.1996.0874] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.0] [Indexed: 02/04/2023]
Abstract
The three-dimensional fold of a protein is described by the organization of its secondary structure elements in 3D space, i.e. its "topology". We find that the protein topology can be recognized from the 1D sequence of secondary structure states of the residues alone. Automated recognition is facilitated by use of hidden Markov models (HMMs) to represent topology families of proteins. Such models can be trained on the experimentally observed secondary structure sequences of family members using well established algorithms. Here, we model various topology groups in the alpha class of proteins and identify, from a large database, those proteins having the topology described by each model. The correct topology family for protein secondary structure sequences could be recognized 12 out of 14 times. When the observed secondary structure sequences are replaced with predicted sequences, recognition is still achievable 8 out of 14 times. The success rate for observed sequences indicates that our approach will become increasingly useful as the accuracy of secondary structure prediction algorithms is improved. Our study indicates that the HMMs are useful for protein topology recognition even when no detectable primary amino acid sequence similarity is present. To illustrate the potential utility of our method, protein topology recognition is attempted on leptin, the obese gene product, and the human interleukin-6 sequence, for which fold predictions have been previously published.
Affiliation(s)
- V Di Francesco
- Laboratory of Structural Biology, Division of Computer Research and Technology, National Institutes of Health, Bethesda, MD 20892-5626, USA.
|
33
|
Abstract
In this study we present an accurate secondary structure prediction procedure that uses a query sequence and related sequences. The most novel aspect of our approach is its reliance on local pairwise alignment of the sequence to be predicted with each related sequence rather than utilization of a multiple alignment. The residue-by-residue accuracy of the method is 75% in three structural states after jack-knife tests. The gain in prediction accuracy compared with the existing techniques, which are at best 72%, is achieved by secondary structure propensities based on both local and long-range effects, utilization of similar sequence information in the form of carefully selected pairwise alignment fragments, and reliance on a large collection of known protein primary structures. The method is especially appropriate for large-scale sequence analysis efforts such as genome characterization, where precise and significant multiple sequence alignments are not available or achievable.
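The residue-by-residue, three-state accuracy quoted in this abstract is conventionally called Q3: the fraction of residues whose predicted state (helix, strand or coil) matches the observed one. A minimal sketch, assuming predictions and observations are given as equal-length strings over the letters H, E and C; the function name `q3_accuracy` and the example strings are illustrative and not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where the predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 8-residue example: H = helix, E = strand, C = coil.
observed = "HHHHCEEE"
predicted = "HHHCCEEC"
score = q3_accuracy(predicted, observed)
```

A jack-knife estimate of the kind mentioned above would average this score over proteins that were each held out of the training data in turn.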
Affiliation(s)
- D Frishman
- European Molecular Biology Laboratory, Heidelberg, Germany
|
34
|
|
35
|
Frishman D, Argos P. The future of protein secondary structure prediction accuracy. Folding & Design 1997; 2:159-62. [PMID: 9218953 DOI: 10.1016/s1359-0278(97)00022-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
BACKGROUND: The accuracy of secondary structure prediction for a protein from knowledge of its sequence has been significantly improved by about 7% to the 70-75% range by inclusion of information residing in sequences similar to the query sequence. The scientific literature has been inconsistent, if not negative, regarding chances for further improvement from the vast knowledge to be provided by genome sequencing efforts. RESULTS: By applying a prediction technique that is particularly sensitive to added sequence information to a standard set of query sequences with related primary structures taken from chronologically successive releases of the SWISS-PROT database, it is shown that prediction accuracy can be expected to reach 80-85% with a large 10-fold increase in present sequence knowledge. CONCLUSIONS: Even with present prediction approaches, improvement in prediction accuracy can still be expected, albeit limited to no more than 10%.
Affiliation(s)
- D Frishman
- Martinsried Institute for Protein Sequences, Max-Planck-Institute for Biochemistry, Germany.
|
36
|
|