1
|
Prescott L. SARS-CoV-2 3CLpro whole human proteome cleavage prediction and enrichment/depletion analysis. Comput Biol Chem 2022; 98:107671. [PMID: 35429835 PMCID: PMC8958254 DOI: 10.1016/j.compbiolchem.2022.107671] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/09/2021] [Revised: 03/21/2022] [Accepted: 03/25/2022] [Indexed: 12/12/2022]
Abstract
A novel coronavirus (SARS-CoV-2) has devastated the globe as a pandemic that has killed millions of people. Widespread vaccination is still uncertain, so many scientific efforts have been directed toward discovering antiviral treatments. Many drugs are being investigated to inhibit the coronavirus main protease, 3CLpro, from cleaving its viral polyprotein, but few publications have addressed this protease’s interactions with the host proteome or their probable contribution to virulence. Too few host protein cleavages have been experimentally verified to fully understand 3CLpro’s global effects on relevant cellular pathways and tissues. Here, I set out to determine this protease’s targets and corresponding potential drug targets. Using a neural network trained on cleavages from 392 coronavirus proteomes with a Matthews correlation coefficient of 0.985, I predict that a large proportion of the human proteome is vulnerable to 3CLpro, with 4898 out of approximately 20,000 human proteins containing at least one putative cleavage site. These cleavages are nonrandomly distributed and are enriched in the epithelium along the respiratory tract, brain, testis, plasma, and immune tissues and depleted in olfactory and gustatory receptors despite the prevalence of anosmia and ageusia in COVID-19 patients. Affected cellular pathways include cytoskeleton/motor/cell adhesion proteins, nuclear condensation and other epigenetics, host transcription and RNAi, ribosomal stoichiometry and nascent-chain detection and degradation, ubiquitination, pattern recognition receptors, coagulation, lipoproteins, redox, and apoptosis. This whole proteome cleavage prediction demonstrates the importance of 3CLpro in expected and nontrivial pathways affecting virulence, leads me to propose more than a dozen potential therapeutic targets against coronaviruses, and should therefore be applied to all viral proteases and subsequently experimentally verified.
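For readers unfamiliar with the Matthews correlation coefficient quoted in this abstract, the following is a minimal sketch of how it is computed from confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the paper; returning 0.0 when the denominator vanishes is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for a cleavage-site classifier (not from the paper).
    print(matthews_corrcoef(tp=90, tn=95, fp=5, fn=10))
```

The coefficient ranges from -1 (every prediction wrong) to +1 (every prediction right), and it stays informative when cleaved sites are far rarer than uncleaved ones, which is the usual situation in proteome-wide scans.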
|
2
|
Abstract
During the last three decades or so, many efforts have been made to study the protein cleavage sites of some disease-causing enzymes, such as HIV (Human Immunodeficiency Virus) protease and SARS (Severe Acute Respiratory Syndrome) coronavirus main proteinase. It has become increasingly clear via this mini-review that the motivation driving the aforementioned studies is quite wise, and that the results acquired through these studies are very rewarding, particularly for developing peptide drugs.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
|
3
|
Song J, Wang Y, Li F, Akutsu T, Rawlings ND, Webb GI, Chou KC. iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief Bioinform 2020; 20:638-658. [PMID: 29897410 PMCID: PMC6556904 DOI: 10.1093/bib/bby028] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Received: 01/02/2018] [Revised: 03/02/2018] [Indexed: 01/03/2023]
Abstract
Regulation of proteolysis plays a critical role in a myriad of important cellular processes. The key to better understanding the mechanisms that control this process is to identify the specific substrates that each protease targets. To address this, we have developed iProt-Sub, a powerful bioinformatics tool for the accurate prediction of protease-specific substrates and their cleavage sites. Importantly, iProt-Sub represents a significantly advanced version of its successful predecessor, PROSPER. It provides optimized cleavage site prediction models with better prediction performance and coverage for more species-specific proteases (4 major protease families and 38 different proteases). iProt-Sub integrates heterogeneous sequence and structural features and uses a two-step feature selection procedure to further remove redundant and irrelevant features in an effort to improve the cleavage site prediction accuracy. Features used by iProt-Sub are encoded by 11 different sequence encoding schemes, including local amino acid sequence profile, secondary structure, solvent accessibility and native disorder, which will allow a more accurate representation of the protease specificity of approximately 38 proteases and training of the prediction models. Benchmarking experiments using cross-validation and independent tests showed that iProt-Sub is able to achieve a better performance than several existing generic tools. We anticipate that iProt-Sub will be a powerful tool for proteome-wide prediction of protease-specific substrates and their cleavage sites, and will facilitate hypothesis-driven functional interrogation of protease-specific substrate cleavage and proteolytic events.
Affiliation(s)
- Jiangning Song
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Yanan Wang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Fuyi Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, 611-0011, Japan
- Neil D Rawlings
- EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, USA and Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
4
|
Some illuminating remarks on molecular genetics and genomics as well as drug development. Mol Genet Genomics 2020; 295:261-274. [PMID: 31894399 DOI: 10.1007/s00438-019-01634-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/21/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023]
Abstract
Facing the explosive growth of biological sequences unearthed in the post-genomic age, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector while still retaining considerable sequence-order information or its special pattern. To deal with such a challenging problem, the ideas of "pseudo amino acid components" and "pseudo K-tuple nucleotide composition" have been proposed. The ideas and their approaches have further stimulated the birth of the "distorted key theory" and the "wenxing diagram", substantially strengthened the power in treating multi-label systems, and led to the establishment of the famous "5-steps rule". All these logical developments are quite natural and are very useful not only for theoretical scientists but also for experimental scientists in conducting genetics/genomics analysis and drug development. Also presented in this review paper are their future perspectives; i.e., their impacts will become even more significant and profound.
|
5
|
Qiu WR, Sun BQ, Xiao X, Xu ZC, Jia JH, Chou KC. iKcr-PseEns: Identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier. Genomics 2018; 110:239-246. [DOI: 10.1016/j.ygeno.2017.10.008] [Citation(s) in RCA: 99] [Impact Index Per Article: 16.5] [Received: 10/10/2017] [Revised: 10/23/2017] [Accepted: 10/25/2017] [Indexed: 01/23/2023]
|
6
|
Song J, Wang Y, Li F, Akutsu T, Rawlings ND, Webb GI, Chou KC. iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief Bioinform 2018. [DOI: 10.1093/bib/bby028] [Epub ahead of print] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
Affiliation(s)
- Jiangning Song
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia
- Yanan Wang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Fuyi Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, 611-0011, Japan
- Neil D Rawlings
- EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, USA and Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
7
|
Jia J, Liu Z, Xiao X, Liu B, Chou KC. pSuc-Lys: Predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach. J Theor Biol 2016; 394:223-230. [DOI: 10.1016/j.jtbi.2016.01.020] [Citation(s) in RCA: 231] [Impact Index Per Article: 28.9] [Received: 11/20/2015] [Revised: 01/06/2016] [Accepted: 01/07/2016] [Indexed: 10/22/2022]
|
8
|
Chen W, Feng P, Ding H, Lin H, Chou KC. Using deformation energy to analyze nucleosome positioning in genomes. Genomics 2016; 107:69-75. [DOI: 10.1016/j.ygeno.2015.12.005] [Citation(s) in RCA: 87] [Impact Index Per Article: 10.9] [Received: 10/12/2015] [Revised: 12/06/2015] [Accepted: 12/22/2015] [Indexed: 12/28/2022]
|
9
|
Rögnvaldsson T, You L, Garwicz D. Bioinformatic approaches for modeling the substrate specificity of HIV-1 protease: an overview. Expert Rev Mol Diagn 2014; 7:435-51. [PMID: 17620050 DOI: 10.1586/14737159.7.4.435] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 11/08/2022]
Abstract
HIV-1 protease has a broad and complex substrate specificity, which hitherto has escaped a simple comprehensive definition. This, and the relatively high mutation rate of the retroviral protease, makes it challenging to design effective protease inhibitors. Several attempts have been made during the last two decades to elucidate the enigmatic cleavage specificity of HIV-1 protease and to predict cleavage of novel substrates using bioinformatic analysis methods. This review describes the methods that have been utilized to date to address this important problem and the results achieved. The data sets used are also reviewed and important aspects of these are highlighted.
Affiliation(s)
- Thorsteinn Rögnvaldsson
- Halmstad University, School of Information Science, Computer & Electrical Engineering, Halmstad, Sweden.
|
10
|
NS2B/3 proteolysis at the C-prM junction of the tick-borne encephalitis virus polyprotein is highly membrane dependent. Virus Res 2012; 168:48-55. [PMID: 22727684 PMCID: PMC3437442 DOI: 10.1016/j.virusres.2012.06.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/23/2012] [Revised: 06/11/2012] [Accepted: 06/11/2012] [Indexed: 11/21/2022]
Abstract
The replication of tick-borne encephalitis virus (TBEV), like that of all flaviviruses, is absolutely dependent on proteolytic processing. Production of the mature proteins C and prM from their common precursor requires the activity of the viral NS2B/3 protease (NS2B/3(pro)) at the C-terminus of protein C and the host signal peptidase I (SPaseI) at the N-terminus of protein prM. Recently, we have shown in cell culture that the cleavage of protein C and the subsequent production of TBEV particles can be made dependent on the activity of the foot-and-mouth disease virus 3C protease, but not on the activity of the HIV-1 protease (HIV1(pro)) (Schrauf et al., 2012). To investigate this failure, we developed an in vitro cleavage assay to assess the two cleavage reactions performed on the C-prM precursor. Accordingly, a recombinant modular NS2B/3(pro), consisting of the protease domain of NS3 linked to the core-domain of cofactor NS2B, was expressed in E. coli and purified to homogeneity. This enzyme could cleave a C-prM protein synthesised in rabbit reticulocyte lysates. However, cleavage was only specific when protein synthesis was performed in the presence of canine pancreatic microsomal membranes and required the prevention of signal peptidase I (SPaseI) activity by lengthening the h-region of the signal peptide. The presence of membranes allowed the concentration of NS2B/3(pro) used to be reduced by 10-20 fold. Substitution of the NS2B/3(pro) cleavage motif in C-prM by a HIV-1(pro) motif inhibited NS2B/3(pro) processing in the presence of microsomal membranes but allowed cleavage by HIV-1(pro) at the C-prM junction. This system shows that processing at the C-terminus of protein C by the TBEV NS2B/3(pro) is highly membrane dependent and will allow the examination of how the membrane topology of protein C affects both SPaseI and NS2B/3(pro) processing.
|
11
|
Study of Inhibitors Against SARS Coronavirus by Computational Approaches. VIRAL PROTEASES AND ANTIVIRAL PROTEASE INHIBITOR THERAPY 2009. [PMCID: PMC7122585 DOI: 10.1007/978-90-481-2348-3_1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/27/2022]
|
12
|
HIVcleave: a web-server for predicting human immunodeficiency virus protease cleavage sites in proteins. Anal Biochem 2008; 375:388-90. [PMID: 18249180 DOI: 10.1016/j.ab.2008.01.012] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Received: 11/21/2007] [Revised: 01/08/2008] [Accepted: 01/09/2008] [Indexed: 11/24/2022]
Abstract
According to the "distorted key theory" [K.C. Chou, Analytical Biochemistry, 233 (1996) 1-14], information on the sites at which proteins are cleaved by HIV (human immunodeficiency virus) protease is very useful for finding effective inhibitors against HIV, the culprit of AIDS (acquired immunodeficiency syndrome). To meet the increasing need in this regard, a web-server called HIVcleave was established at http://chou.med.harvard.edu/bioinf/HIV/. In this note we provide a step-by-step guide for how to use HIVcleave to identify the cleavage sites of a query protein sequence by HIV-1 and HIV-2 proteases, respectively.
|
13
|
Liang GZ, Li SZ. A new sequence representation as applied in better specificity elucidation for human immunodeficiency virus type 1 protease. Biopolymers 2007; 88:401-12. [PMID: 17206631 DOI: 10.1002/bip.20669] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/06/2022]
Abstract
Factor analysis scales of generalized amino acid information (FASGAI) involving hydrophobicity, alpha and turn propensities, bulky properties, compositional characteristics, local flexibility, and electronic properties were derived from 516 property parameters of the 20 coded amino acids, and were then employed to represent sequence structures of 746 peptides with 8 amino acid residues. Cleavage site prediction models for human immunodeficiency virus type 1 protease by linear discriminant analysis and support vector machine with radial basis function kernel were constructed to identify whether the peptides could be cleaved or not, and were further utilized to investigate the cleavage specificity. These diversified properties, including the bulky properties, secondary conformation characteristics, electronic properties, and hydrophobicity at the first, the second, the fourth, the fifth, and the sixth residue, are possibly important factors in determining whether HIV PR cleaves a peptide or not. In particular, maximal positive and negative influences result from the bulky properties of different sites. Further results from analysis of variance also likely reflect that the HIV PR recognizes diversified key properties of various sites in the octameric sequences. Satisfactory results show that FASGAI can not only be used to represent sequence structures of various functional peptides, but also provide a potentially feasible measure for exploring the relationship between protein motif sequences and their functions.
Affiliation(s)
- Gui Z Liang
- College of Bioengineering, Chongqing University, Chongqing 400030, People's Republic of China.
|
14
|
|
15
|
Cai YD, Liu XJ, Xu XB, Chou KC. Support Vector Machines for predicting HIV protease cleavage sites in protein. J Comput Chem 2002; 23:267-74. [PMID: 11924738 DOI: 10.1002/jcc.10017] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.4] [Indexed: 11/10/2022]
Abstract
Knowledge of the polyprotein cleavage sites by HIV protease will refine our understanding of its specificity, and the information thus acquired is useful for designing specific and efficient HIV protease inhibitors. The pace in searching for the proper inhibitors of HIV protease will be greatly expedited if one can find an accurate, robust, and rapid method for predicting the cleavage sites in proteins by HIV protease. In this article, a Support Vector Machine is applied to predict the cleavability of oligopeptides by proteases with multiple and extended specificity subsites. We selected HIV-1 protease as the subject of the study. Two hundred ninety-nine oligopeptides were chosen for the training set, while the other 63 oligopeptides were taken as a test set. Because of its high rate of self-consistency (299/299 = 100%), a good result in the jackknife test (286/299 = 95%) and correct prediction rate (55/63 = 87%), it is expected that the Support Vector Machine method can be referred to as a useful assistant technique for finding effective inhibitors of HIV protease, which is one of the targets in designing potential drugs against AIDS. The principle of the Support Vector Machine method can also be applied to analyzing the specificity of other multisubsite enzymes.
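The jackknife (leave-one-out) test reported in this abstract can be sketched as follows. This is an illustration only: a 1-nearest-neighbour rule on Hamming distance stands in for the Support Vector Machine, and the octapeptides, labels, and function names are hypothetical toy values, not data or code from the paper.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, int]  # (octapeptide, label): 1 = cleaved, 0 = not cleaved


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length peptides differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_predict(train: List[Sample], peptide: str) -> int:
    """Stand-in classifier (NOT an SVM): label of the closest training peptide."""
    return min(train, key=lambda s: hamming(s[0], peptide))[1]


def jackknife_accuracy(
    data: List[Sample],
    predict: Callable[[List[Sample], str], int],
) -> float:
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i, (peptide, label) in enumerate(data):
        remaining = data[:i] + data[i + 1:]
        if predict(remaining, peptide) == label:
            correct += 1
    return correct / len(data)


# Hypothetical toy octapeptides, chosen only to make the sketch runnable.
TOY_DATA: List[Sample] = [
    ("AAAAAAAA", 1), ("AAAAAAAC", 1), ("AAAAAACC", 1),
    ("GGGGGGGG", 0), ("GGGGGGGT", 0), ("GGGGGGTT", 0),
]

if __name__ == "__main__":
    print(jackknife_accuracy(TOY_DATA, nearest_neighbor_predict))
```

Because every sample is predicted by a model that never saw it, the jackknife rate (286/299 in the abstract) is a stricter figure than the self-consistency rate (299/299), which scores the training peptides with a model fitted on those same peptides.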
Affiliation(s)
- Yu-Dong Cai
- Shanghai Research Centre of Biotechnology, Chinese Academy of Sciences, People's Republic of China.
|
16
|
Cai YD, Yu H, Chou KC. Artificial neural network method for predicting HIV protease cleavage sites in protein. JOURNAL OF PROTEIN CHEMISTRY 1998; 17:607-15. [PMID: 9853675 DOI: 10.1007/bf02780962] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
Abstract
Knowledge of the polyprotein cleavage sites by HIV protease will refine our understanding of its specificity, and the information thus acquired will be useful for designing specific and efficient HIV protease inhibitors. The search for inhibitors of HIV protease will be greatly expedited if one can find an accurate, robust, and rapid method for predicting the cleavage sites in proteins by HIV protease. In this paper, Kohonen's self-organization model, which uses typical artificial neural networks, is applied to predict the cleavability of oligopeptides by proteases with multiple and extended specificity subsites. We selected HIV-1 protease as the subject of study. We chose 299 oligopeptides for the training set, and another 63 oligopeptides for the test set. Because of its high rate of correct prediction (58/63 = 92.06%) and stronger fault-tolerant ability, the neural network method should be a useful technique for finding effective inhibitors of HIV protease, which is one of the targets in designing potential drugs against AIDS. The principle of the artificial neural network method can also be applied to analyzing the specificity of any multisubsite enzyme.
Affiliation(s)
- Y D Cai
- Shanghai Research Centre of Biotechnology, Chinese Academy of Sciences
|
17
|
Cai YD, Yu H, Chou KC. Artificial neural network method for predicting the specificity of GalNAc-transferase. JOURNAL OF PROTEIN CHEMISTRY 1997; 16:689-700. [PMID: 9330227 DOI: 10.1023/a:1026306520790] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023]
Abstract
The specificity of GalNAc-transferase is consistent with the existence of an extended site composed of nine subsites, denoted by R4, R3, R2, R1, R0, R1', R2', R3', and R4', where the acceptor at R0 is either Ser or Thr to which the reducing monosaccharide is anchored. To predict whether a peptide will react with the enzyme to form a Ser- or Thr-conjugated glycopeptide, a neural network method, Kohonen's self-organization model, is proposed in this paper. Three hundred five oligopeptides are chosen for the training set, with another 30 oligopeptides for the test set. Because of its high correct prediction rate (26/30 = 86.7%) and stronger fault-tolerant ability, it is expected that the neural network method can be used as a technique for predicting O-glycosylation and designing effective inhibitors of GalNAc-transferase. It might also be useful for targeting drugs to specific sites in the body and for enzyme replacement therapy for the treatment of genetic disorders.
|
18
|
Sun ZR, Zhang CT, Wu FH, Peng LW. A vector projection method for predicting supersecondary motifs. JOURNAL OF PROTEIN CHEMISTRY 1996; 15:721-9. [PMID: 9008295 DOI: 10.1007/bf01887145] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 02/03/2023]
Abstract
For the 11 types of most frequently occurring supersecondary motifs, we used a new method--the vector projection method--to predict a protein's supersecondary structure. In a training set of peptides and a test set of peptides we obtained a satisfactory result, with a prediction accuracy of about 90%. The high prediction accuracy indicates that this method is reasonable for predicting the folding motifs of proteins. This work provides insight into the problem of predicting a protein's local structure accurately, and is of particular value in protein modeling, prediction, and molecule design.
Affiliation(s)
- Z R Sun
- Department of Biological Sciences and Biotechnology, Tsinghua University, Beijing, China
|
19
|
Chou KC, Tomasselli AG, Reardon IM, Heinrikson RL. Predicting human immunodeficiency virus protease cleavage sites in proteins by a discriminant function method. Proteins 1996; 24:51-72. [PMID: 8628733 DOI: 10.1002/(sici)1097-0134(199601)24:1<51::aid-prot4>3.0.co;2-r] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.4] [Indexed: 02/01/2023]
Abstract
Based on the sequence-coupled (Markov chain) model and vector-projection principle, a discriminant function method is proposed to predict sites in protein substrates that should be susceptible to cleavage by the HIV-1 protease. The discriminant function is defined by delta = phi+ - phi-, where phi+ and phi- are the cleavable and noncleavable attributes for a given peptide, and they can be derived from two complementary sets of peptides, S+ and S-, known to be cleavable and noncleavable, respectively, by the enzyme. The rates of correct prediction by the method for the 62 cleavable peptides and 239 noncleavable peptides in the training set are 100% and 96.7%, respectively. Application of the method to the 55 sequences which are outside the training set and known to be cleaved by the HIV-1 protease accurately predicted 100% of the peptides as substrates of the enzyme. The method also predicted all but one of the sites hydrolyzed by the protease in native HIV-1 and HIV-2 reverse transcriptases, where the HIV-1 protease discriminates between nearly identical sequences in a very subtle fashion. Finally, the algorithm predicts correctly all of the HIV-1 protease processing sites in the native gag and gag/pol HIV-1 polyproteins, and all of the cleavage sites identified in denatured protease and reverse transcriptase. The new predictive algorithm provides a novel route toward understanding the specificity of this important therapeutic target.
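The discriminant function delta = phi+ - phi- described in this abstract can be illustrated with a simplified sketch. The assumptions here are substantial: each attribute is taken to be the log-probability of the peptide under a position-specific first-order Markov chain with add-one smoothing, which is not necessarily the paper's exact formulation, and the training peptides and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# (first-residue counts, per-position pair counts, per-position predecessor counts, n)
Model = Tuple[Counter, List[Counter], List[Counter], int]


def fit_markov(peptides: List[str]) -> Model:
    """Count first residues and position-specific residue pairs in a peptide set."""
    length = len(peptides[0])
    first = Counter(p[0] for p in peptides)
    pairs = [Counter((p[i - 1], p[i]) for p in peptides) for i in range(1, length)]
    prev = [Counter(p[i - 1] for p in peptides) for i in range(1, length)]
    return first, pairs, prev, len(peptides)


def log_attribute(model: Model, peptide: str) -> float:
    """Assumed attribute phi: smoothed log-probability of the peptide under the chain."""
    first, pairs, prev, n = model
    k = len(AMINO_ACIDS)
    score = math.log((first[peptide[0]] + 1) / (n + k))
    for i in range(1, len(peptide)):
        a, b = peptide[i - 1], peptide[i]
        score += math.log((pairs[i - 1][(a, b)] + 1) / (prev[i - 1][a] + k))
    return score


def discriminant(pos_model: Model, neg_model: Model, peptide: str) -> float:
    """delta = phi+ - phi-; a positive value suggests the peptide is cleavable."""
    return log_attribute(pos_model, peptide) - log_attribute(neg_model, peptide)


# Hypothetical toy sets standing in for S+ (cleavable) and S- (noncleavable).
S_PLUS = ["AAAAAAAA", "AAAAAAAC"]
S_MINUS = ["GGGGGGGG", "GGGGGGGT"]

if __name__ == "__main__":
    pos, neg = fit_markov(S_PLUS), fit_markov(S_MINUS)
    print(discriminant(pos, neg, "AAAAAAAA"))
```

Conditioning each residue on its predecessor is what makes the model "sequence-coupled": the contribution of a subsite depends on its neighbour rather than being scored independently.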
Affiliation(s)
- K C Chou
- Pharmacia & Upjohn Laboratories, Kalamazoo, Michigan 49001-4940, USA
|
20
|
Chou KC. A sequence-coupled vector-projection model for predicting the specificity of GalNAc-transferase. Protein Sci 1995; 4:1365-83. [PMID: 7670379 PMCID: PMC2143175 DOI: 10.1002/pro.5560040712] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.4] [Indexed: 01/26/2023]
Abstract
The specificity of GalNAc-transferase is consistent with the existence of an extended site composed of nine subsites, denoted by R4, R3, R2, R1, R0, R1', R2', R3', and R4', where the acceptor at R0 is either Ser or Thr to which the reducing monosaccharide is being anchored. To predict whether a peptide will react with the enzyme to form a Ser- or Thr-conjugated glycopeptide, a new method has been proposed based on the vector-projection approach as well as the sequence-coupled principle. By incorporating the sequence-coupled effect among the subsites, the interaction mechanism among subsites during glycosylation can be reflected and, by using the vector projection approach, arbitrary assignment for insufficient experimental data can be avoided. The very high ratio of correct predictions versus total predictions for the data in both the training and the testing sets indicates that the method is self-consistent and efficient. It provides a rapid means for predicting O-glycosylation and designing effective inhibitors of GalNAc-transferase, which might be useful for targeting drugs to specific sites in the body and for enzyme replacement therapy for the treatment of genetic disorders.
Affiliation(s)
- K C Chou
- Upjohn Laboratories, Kalamazoo, Michigan 49001-4940, USA
|
21
|
Chou KC, Zhang CT, Kézdy FJ, Poorman RA. A vector projection method for predicting the specificity of GalNAc-transferase. Proteins 1995; 21:118-26. [PMID: 7777486 DOI: 10.1002/prot.340210205] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Indexed: 01/27/2023]
Abstract
The specificity of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase (GalNAc-transferase) is consistent with the existence of an extended site composed of nine subsites, denoted by P4, P3, P2, P1, P0, P1', P2', P3', P4', where the acceptor at P0 is either Ser or Thr. To predict whether a peptide will react with the enzyme to form a Ser- or Thr-conjugated glycopeptide, a vector projection method is proposed which uses a training set of amino acid sequences surrounding 90 Ser and 106 Thr O-glycosylation sites extracted from the National Biomedical Research Foundation Protein Database. The model postulates independent interactions of the 9 amino acid moieties with their respective binding sites. The high ratio of correct predictions vs. total predictions for the data in both the training and the testing sets indicates that the method is self-consistent and efficient. It provides a rapid means for predicting O-glycosylation and designing effective inhibitors of GalNAc-transferase.
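The nine-subsite window described in this abstract (P4 through P4', with Ser or Thr at P0) can be extracted from a protein sequence as sketched below. The function name is hypothetical, and skipping Ser/Thr residues that lie too close to either terminus to have four flanking residues is an assumption of this sketch, not something stated in the abstract.

```python
from typing import List, Tuple


def candidate_windows(sequence: str, flank: int = 4) -> List[Tuple[int, str]]:
    """Return (1-based position of P0, nonapeptide P4..P4') for each Ser/Thr.

    Residues without `flank` neighbours on both sides are skipped (assumption).
    """
    windows = []
    for i, residue in enumerate(sequence):
        has_full_flanks = i - flank >= 0 and i + flank < len(sequence)
        if residue in "ST" and has_full_flanks:
            windows.append((i + 1, sequence[i - flank:i + flank + 1]))
    return windows


if __name__ == "__main__":
    # Hypothetical toy sequence; only the central Thr has complete flanks.
    print(candidate_windows("SAAAATAAAAS"))
```

Each returned nonapeptide is the unit that a predictor of this kind would score, one candidate acceptor site at a time.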
Affiliation(s)
- K C Chou
- Upjohn Laboratories, Kalamazoo, Michigan 49007-4940, USA
|
22
|
Tomasselli AG, Sarcich JL, Barrett LJ, Reardon IM, Howe WJ, Evans DB, Sharma SK, Heinrikson RL. Human immunodeficiency virus type-1 reverse transcriptase and ribonuclease H as substrates of the viral protease. Protein Sci 1993; 2:2167-76. [PMID: 7507754 PMCID: PMC2142316 DOI: 10.1002/pro.5560021216] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.0] [Indexed: 01/25/2023]
Abstract
A study has been made of the susceptibility of recombinant constructs of reverse transcriptase (RT) and ribonuclease H (RNase H) from human immunodeficiency virus type 1 (HIV-1) to digestion by the HIV-1 protease. At neutral pH, the protease attacks a single peptide bond, Phe440-Tyr441, in one of the protomers of the folded, active RT/RNase H (p66/p66) homodimer to give a stable, active heterodimer (p66/p51) that is resistant to further hydrolysis (Chattopadhyay, D., et al., 1992, J. Biol. Chem. 267, 14227-14232). The COOH-terminal p15 fragment released in the process, however, is rapidly degraded by the protease by cleavage at Tyr483-Leu484 and Tyr532-Leu533. In marked contrast to this p15 segment, both p66/p51 and a folded RNase H construct are stable to breakdown by the protease at neutral pH. It is only at pH values around 4 that these latter proteins appear to unfold and, under these conditions, the heterodimer undergoes extensive proteolysis. RNase H is also hydrolyzed at low pH, but cleavage takes place primarily at Gly436-Ala437 and at Phe440-Tyr441, and only much more slowly at residues 483, 494, and 532. This observation can be reconciled by inspection of crystallographic models of RNase H, which show that residues 483, 494, and 532 are relatively inaccessible in comparison to Gly436 and Phe440. Our results fit a model in which the p66/p66 homodimer exists in a conformation that mirrors that of the heterodimer, but with a p15 segment on one of the protomers that is structurally disordered to the extent that all of its potential HIV protease cleavage sites are accessible for hydrolysis.
Affiliation(s)
- A G Tomasselli
- Biochemistry Unit, Upjohn Laboratories, Kalamazoo, Michigan 49001
|