Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Magnan CN, Baldi P. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics 2014;30:2592-7. [PMID: 24860169 DOI: 10.1093/bioinformatics/btu352] [Citation(s) in RCA: 239] [Impact Index Per Article: 21.7] [Indexed: 11/12/2022] Open

Number

Cited by Other Article(s)

101

Chen T. Identification and characterization of the LRR repeats in plant LRR-RLKs. BMC Mol Cell Biol 2021;22:9. [PMID: 33509084 PMCID: PMC7841916 DOI: 10.1186/s12860-021-00344-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/27/2019] [Accepted: 01/12/2021] [Indexed: 01/11/2023] Open

Abstract

Background

Leucine-rich-repeat receptor-like kinases (LRR-RLKs) play central roles in sensing various signals to regulate plant development and environmental responses. The extracellular domains (ECDs) of plant LRR-RLKs contain LRR motifs, consisting of highly conserved residues and variable residues, and are responsible for ligand perception as a receptor or co-receptor. However, there are few comprehensive studies on the ECDs of LRR-RLKs due to the difficulty in effectively identifying the divergent LRR repeats.

Results

In the current study, an efficient LRR motif prediction program, the “Phyto-LRR prediction” program, was developed based on the position-specific scoring matrix (PSSM) algorithm with some optimizations. The program was trained on 16-residue plant-specific LRR highly conserved segments (HCS) from LRR-RLKs of 17 representative land plant species, and a database containing more than 55,000 LRRs predicted by the program was constructed. Both the prediction tool and the database are freely available at http://phytolrr.com/ for website usage and at http://github.com/phytolrr for local usage. The LRR-RLKs were classified into 18 subgroups (SGs) according to the maximum-likelihood phylogenetic analysis of the kinase domains (KDs) of the sequences. Based on the database and the SGs, the characteristics of the LRR motifs in the ECDs of the LRR-RLKs were examined, such as the arrangement of the LRRs, the solvent accessibility, the variable residues, and the N-glycosylation sites, revealing a comprehensive profile of the plant LRR-RLK ectodomains.
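As a rough illustration of the PSSM idea named in this abstract, the sketch below builds a log-odds matrix from equal-length aligned segments and scans a sequence for its best-scoring window. It is not the Phyto-LRR implementation: the function names, the uniform background distribution, and the pseudocount are assumptions made for the example, and the program's own optimizations are not described in the abstract.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(segments, pseudocount=1.0):
    """Build a log-odds PSSM (list of dicts, one per position) from aligned segments.

    Assumes a uniform background frequency, which is a simplification.
    """
    length = len(segments[0])
    if any(len(s) != length for s in segments):
        raise ValueError("all segments must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(segments) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in segments)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_window(pssm, window):
    """Sum the per-position scores; unknown residues contribute 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))


def best_window(pssm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pssm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PSSM width")
    scored = [
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score
```

With 16-residue training segments, as in the abstract, `best_window` would slide a 16-residue window; a real predictor would report every window above a threshold instead of only the best one.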

Conclusion

The “Phyto-LRR prediction” program is effective in predicting the LRR segments in plant LRR-RLKs and, together with the database, will facilitate the exploration of plant LRR-RLK functions. Based on the database, comprehensive sequence characteristics of the plant LRR-RLK ectodomains were profiled and analyzed.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12860-021-00344-y.

102

Bhagwat NR, Owens SN, Ito M, Boinapalli JV, Poa P, Ditzel A, Kopparapu S, Mahalawat M, Davies OR, Collins SR, Johnson JR, Krogan NJ, Hunter N. SUMO is a pervasive regulator of meiosis. eLife 2021;10:57720. [PMID: 33502312 PMCID: PMC7924959 DOI: 10.7554/elife.57720] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 04/09/2020] [Accepted: 01/26/2021] [Indexed: 02/06/2023] Open

Abstract

Protein modification by SUMO helps orchestrate the elaborate events of meiosis to faithfully produce haploid gametes. To date, only a handful of meiotic SUMO targets have been identified. Here, we delineate a multidimensional SUMO-modified meiotic proteome in budding yeast, identifying 2747 conjugation sites in 775 targets, and defining their relative levels and dynamics. Modified sites cluster in disordered regions and only a minority match consensus motifs. Target identities and modification dynamics imply that SUMOylation regulates all levels of chromosome organization and each step of meiotic prophase I. Execution-point analysis confirms these inferences, revealing functions for SUMO in S-phase, the initiation of recombination, chromosome synapsis and crossing over. K15-linked SUMO chains become prominent as chromosomes synapse and recombine, consistent with roles in these processes. SUMO also modifies ubiquitin, forming hybrid oligomers with potential to modulate ubiquitin signaling. We conclude that SUMO plays diverse and unanticipated roles in regulating meiotic chromosome metabolism.

Most mammalian, yeast and other eukaryote cells have two sets of chromosomes, one from each parent, which contain all the cell’s DNA. Sex cells – like the sperm and egg – however, have half the number of chromosomes and are formed by a specialized type of cell division known as meiosis. At the start of meiosis, each cell replicates its chromosomes so that it has twice the amount of DNA. The cell then undergoes two rounds of division to form sex cells which each contain only one set of chromosomes. Before the cell divides, the two duplicated sets of chromosomes pair up and swap sections of their DNA. This exchange allows each new sex cell to have a unique combination of DNA, resulting in offspring that are genetically distinct from their parents.

This complex series of events is tightly regulated, in part, by a protein called the 'small ubiquitin-like modifier' (or SUMO for short), which attaches itself to other proteins and modifies their behavior. This process, known as SUMOylation, can affect a protein’s stability, where it is located in the cell and how it interacts with other proteins. However, despite SUMO being known as a key regulator of meiosis, only a handful of its protein targets have been identified.

To gain a better understanding of what SUMO does during meiosis, Bhagwat et al. set out to find which proteins are targeted by SUMO in budding yeast and to map the specific sites of modification. The experiments identified 2,747 different sites on 775 different proteins, suggesting that SUMO regulates all aspects of meiosis. Consistent with this, inactivating SUMOylation at different times revealed that SUMO plays a role at every stage of meiosis, including the replication of DNA and the exchanges between chromosomes. In-depth analysis of the targeted proteins also revealed that SUMOylation targets different groups of proteins at different stages of meiosis and interacts with other protein modifications, including the ubiquitin system, which tags proteins for destruction.

The data gathered by Bhagwat et al. provide a starting point for future research into precisely how SUMO proteins control meiosis in yeast and other organisms. In humans, errors in meiosis are the leading cause of pregnancy loss and congenital diseases. Most of the proteins identified as SUMO targets in budding yeast are also present in humans. So, this research could provide a platform for medical advances in the future. The next step is to study mammalian models, such as mice, to confirm that the regulation of meiosis by SUMO is the same in mammals as in yeast.

Affiliation(s)

Nikhil R Bhagwat: Howard Hughes Medical Institute, University of California Davis, Davis, United States; Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Shannon N Owens: Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Masaru Ito: Howard Hughes Medical Institute, University of California Davis, Davis, United States; Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Jay V Boinapalli: Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Philip Poa: Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Alexander Ditzel: Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Srujan Kopparapu: Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Meghan Mahalawat: Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Owen Richard Davies: Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne, United Kingdom
Sean R Collins: Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States
Jeffrey R Johnson: Department of Cellular & Molecular Pharmacology, University of California San Francisco, San Francisco, United States
Nevan J Krogan: Department of Cellular & Molecular Pharmacology, University of California San Francisco, San Francisco, United States
Neil Hunter: Howard Hughes Medical Institute, University of California Davis, Davis, United States; Department of Microbiology & Molecular Genetics, University of California Davis, Davis, United States; Department of Molecular & Cellular Biology, University of California Davis, Davis, United States

103

Zhang J, Chen Q, Liu B. NCBRPred: predicting nucleic acid binding residues in proteins based on multilabel learning. Brief Bioinform 2021;22:6102667. [PMID: 33454744 DOI: 10.1093/bib/bbaa397] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 10/01/2020] [Revised: 11/05/2020] [Accepted: 12/03/2020] [Indexed: 01/01/2023] Open

104

Bernier SC, Millette MA, Roy S, Cantin L, Coutinho A, Salesse C. Structural information and membrane binding of truncated RGS9-1 Anchor Protein and its C-terminal hydrophobic segment. Biochim Biophys Acta Biomembr 2021;1863:183566. [PMID: 33453187 DOI: 10.1016/j.bbamem.2021.183566] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/11/2019] [Revised: 12/22/2020] [Accepted: 01/10/2021] [Indexed: 01/19/2023]

105

Karimi M, Wu D, Wang Z, Shen Y. Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts. J Chem Inf Model 2020;61:46-66. [PMID: 33347301 DOI: 10.1021/acs.jcim.0c00866] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 12/21/2022]

Abstract

Predicting compound-protein affinity is beneficial for accelerating drug discovery. Doing so without the often-unavailable structure data is gaining interest. However, recent progress in structure-free affinity prediction, made by machine learning, focuses on accuracy but leaves much to be desired for interpretability. We define intermolecular contacts underlying affinities as a vehicle for interpretability; our large-scale interpretability assessment finds previously used attention mechanisms inadequate. We thus formulate a hierarchical multiobjective learning problem, where predicted contacts form the basis for predicted affinities. We solve the problem by embedding protein sequences (by hierarchical recurrent neural networks) and compound graphs (by graph neural networks) with joint attentions between protein residues and compound atoms. We further introduce three methodological advances to enhance interpretability: (1) structure-aware regularization of attentions using protein sequence-predicted solvent exposure and residue-residue contact maps; (2) supervision of attentions using known intermolecular contacts in training data; and (3) an intrinsically explainable architecture where atomic-level contacts or "relations" lead to molecular-level affinity prediction. The first two and all three advances result in DeepAffinity+ and DeepRelations, respectively. Our methods show generalizability in affinity prediction for molecules that are new and dissimilar to training examples. Moreover, they show superior interpretability compared to state-of-the-art interpretable methods: with similar or better affinity prediction, they boost the AUPRC of contact prediction by around 33-, 35-, 10-, and 9-fold for the default test, new-compound, new-protein, and both-new sets, respectively. We further demonstrate their potential utilities in contact-assisted docking, structure-free binding site prediction, and structure-activity relationship studies without docking. Our study represents the first model development and systematic model assessment dedicated to interpretable machine learning for structure-free compound-protein affinity prediction.

106

Phylogenomic analyses recover a clade of large-bodied decapodiform cephalopods. Mol Phylogenet Evol 2020;156:107038. [PMID: 33285289 DOI: 10.1016/j.ympev.2020.107038] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/15/2020] [Revised: 10/30/2020] [Accepted: 12/01/2020] [Indexed: 12/14/2022]

107

Enhancing protein backbone angle prediction by using simpler models of deep neural networks. Sci Rep 2020;10:19430. [PMID: 33173130 PMCID: PMC7655839 DOI: 10.1038/s41598-020-76317-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/06/2020] [Accepted: 10/23/2020] [Indexed: 11/09/2022] Open

108

Urban G, Torrisi M, Magnan CN, Pollastri G, Baldi P. Protein profiles: Biases and protocols. Comput Struct Biotechnol J 2020;18:2281-2289. [PMID: 32994887 PMCID: PMC7486441 DOI: 10.1016/j.csbj.2020.08.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/14/2020] [Revised: 08/14/2020] [Accepted: 08/15/2020] [Indexed: 11/13/2022] Open

Abstract

The use of evolutionary profiles to predict protein secondary structure, as well as other protein structural features, has been standard practice since the 1990s. Using profiles in the input of such predictors, in place of or in addition to the sequence itself, leads to significantly more accurate predictions. While profiles can enhance structural signals, their role remains somewhat surprising as proteins do not use profiles when folding in vivo. Furthermore, the same sequence-based redundancy reduction protocols initially derived to train and evaluate sequence-based predictors have been applied to train and evaluate profile-based predictors. This can lead to unfair comparisons since profiles may facilitate the bleeding of information between training and test sets. Here we use the extensively studied problem of secondary structure prediction to better evaluate the role of profiles and show that: (1) high levels of profile similarity between training and test proteins are observed when using standard sequence-based redundancy protocols; (2) the gain in accuracy for profile-based predictors, over sequence-based predictors, strongly relies on these high levels of profile similarity between training and test proteins; and (3) the overall accuracy of a profile-based predictor on a given protein dataset provides a biased measure when trying to estimate the actual accuracy of the predictor, or when comparing it to other predictors. We show, however, that this bias can be mitigated by implementing a new protocol (EVALpro) which evaluates the accuracy of profile-based predictors as a function of the profile similarity between training and test proteins. Such a protocol not only allows for a fair comparison of the predictors on equally hard or easy examples, but also reduces the impact of choosing a given similarity cutoff when selecting test proteins. The EVALpro program is available in the SCRATCH suite (www.scratch.proteomics.ics.uci.edu) and can be downloaded at: www.download.igb.uci.edu/#evalpro.
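The idea of reporting accuracy as a function of train/test profile similarity can be sketched as a simple binning step. This is only an illustration of the general approach, not the EVALpro program: the function name, the assumption that each test protein already has a similarity score in [0, 1], and the equal-width bins are all hypothetical choices.

```python
from collections import defaultdict


def accuracy_by_similarity(records, n_bins=10):
    """Average per-protein accuracy within equal-width similarity bins.

    records: iterable of (similarity, accuracy) pairs, both assumed to lie in [0, 1].
    Returns a dict mapping bin index (0 .. n_bins - 1) to the mean accuracy
    of the proteins that fall in that bin; empty bins are omitted.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    bins = defaultdict(list)
    for similarity, accuracy in records:
        if not 0.0 <= similarity <= 1.0:
            raise ValueError("similarity must lie in [0, 1]")
        # A similarity of exactly 1.0 is placed in the last bin.
        index = min(int(similarity * n_bins), n_bins - 1)
        bins[index].append(accuracy)
    return {index: sum(vals) / len(vals) for index, vals in sorted(bins.items())}
```

Two predictors can then be compared bin by bin, that is, on test proteins that are about equally similar to their respective training sets, instead of through one overall accuracy figure.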

109

de Brevern AG. Impact of protein dynamics on secondary structure prediction. Biochimie 2020;179:14-22. [PMID: 32946990 DOI: 10.1016/j.biochi.2020.09.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/21/2020] [Revised: 09/04/2020] [Accepted: 09/10/2020] [Indexed: 02/08/2023]

110

Guo Z, Hou J, Cheng J. DNSS2: Improved ab initio protein secondary structure prediction using advanced deep learning architectures. Proteins 2020;89:207-217. [PMID: 32893403 DOI: 10.1002/prot.26007] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/05/2020] [Revised: 07/07/2020] [Accepted: 09/02/2020] [Indexed: 12/27/2022]

111

Yazdani Z, Rafiei A, Yazdani M, Valadan R. Design an Efficient Multi-Epitope Peptide Vaccine Candidate Against SARS-CoV-2: An in silico Analysis. Infect Drug Resist 2020;13:3007-3022. [PMID: 32943888 PMCID: PMC7459237 DOI: 10.2147/idr.s264573] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 06/09/2020] [Accepted: 07/28/2020] [Indexed: 12/13/2022] Open

112

Azginoglu N, Aydin Z, Celik M. Structural profile matrices for predicting structural properties of proteins. J Bioinform Comput Biol 2020;18:2050022. [PMID: 32649260 DOI: 10.1142/s0219720020500225] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]

113

Sun J, Frishman D. DeepHelicon: Accurate prediction of inter-helical residue contacts in transmembrane proteins by residual neural networks. J Struct Biol 2020;212:107574. [PMID: 32663598 DOI: 10.1016/j.jsb.2020.107574] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/10/2020] [Revised: 07/03/2020] [Accepted: 07/07/2020] [Indexed: 01/16/2023]

114

Zhu M, Kuechler ER, Zhang J, Matalon O, Dubreuil B, Hofmann A, Loewen C, Levy ED, Gsponer J, Mayor T. Proteomic analysis reveals the direct recruitment of intrinsically disordered regions to stress granules in S. cerevisiae. J Cell Sci 2020;133:jcs244657. [PMID: 32503941 DOI: 10.1242/jcs.244657] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/30/2020] [Accepted: 05/15/2020] [Indexed: 01/21/2023] Open

115

Liu T, Wang Z. MASS: predict the global qualities of individual protein models using random forests and novel statistical potentials. BMC Bioinformatics 2020;21:246. [PMID: 32631256 PMCID: PMC7336608 DOI: 10.1186/s12859-020-3383-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/21/2020] [Accepted: 01/22/2020] [Indexed: 11/10/2022] Open

116

Bhatnager R, Bhasin M, Arora J, Dang AS. Epitope based peptide vaccine against SARS-COV2: an immune-informatics approach. J Biomol Struct Dyn 2020;39:5690-5705. [PMID: 32619134 DOI: 10.1080/07391102.2020.1787227] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Indexed: 12/23/2022]

117

Ayub U, Naveed H, Shahzad W. PRRAT_AM—An advanced ant-miner to extract accurate and comprehensible classification rules. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106326] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

118

Juan SH, Chen TR, Lo WC. A simple strategy to enhance the speed of protein secondary structure prediction without sacrificing accuracy. PLoS One 2020;15:e0235153. [PMID: 32603341 PMCID: PMC7326220 DOI: 10.1371/journal.pone.0235153] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/17/2019] [Accepted: 06/09/2020] [Indexed: 01/06/2023] Open

Abstract

The secondary structure prediction of proteins is a classic topic of computational structural biology with a variety of applications. During the past decade, the accuracy of prediction achieved by state-of-the-art algorithms has been >80%; meanwhile, the time cost of prediction has increased rapidly because of the exponential growth of fundamental protein sequence data. Based on literature studies and preliminary observations on the relationships between the size/homology of the fundamental protein dataset and the speed/accuracy of predictions, we raised two hypotheses that might help determine the main factors influencing the efficiency of secondary structure prediction. Experimental results of size and homology reductions of the fundamental protein dataset supported those hypotheses. They revealed that shrinking the dataset could substantially cut the time cost of prediction with a slight decrease in accuracy, whereas homology reduction of the dataset could increase accuracy. Moreover, the Shannon information entropy could be applied to explain how accuracy was influenced by the size and homology of the dataset. Based on these findings, we proposed that a proper combination of size and homology reductions of the protein dataset could speed up secondary structure prediction while preserving the high accuracy of state-of-the-art algorithms. When the proposed strategy was tested with the fundamental protein dataset of the year 2018 provided by the Universal Protein Resource, the speed of prediction was enhanced more than 20-fold while all accuracy measures remained equivalently high. These findings should help improve the efficiency of research and applications that depend on the secondary structure prediction of proteins. To make future implementations of the proposed strategy easy, we have established a database of size- and homology-reduced protein datasets at http://10.life.nctu.edu.tw/UniRefNR.

119

Shi Q, Chen W, Huang S, Jin F, Dong Y, Wang Y, Xue Z. DNN-Dom: predicting protein domain boundary from sequence alone by deep neural network. Bioinformatics 2020;35:5128-5136. [PMID: 31197306 DOI: 10.1093/bioinformatics/btz464] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/18/2019] [Revised: 05/07/2019] [Accepted: 06/05/2019] [Indexed: 11/13/2022] Open

120

Cao Z, Du W, Li G, Cao H. DEEPSMP: A deep learning model for predicting the ectodomain shedding events of membrane proteins. J Bioinform Comput Biol 2020;18:2050017. [PMID: 32576054 DOI: 10.1142/s0219720020500171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]

Abstract

Membrane proteins play essential roles in modern medicine. In recent studies, some membrane proteins involved in ectodomain shedding events have been reported as potential drug targets and biomarkers of some serious diseases. However, there are few effective tools for identifying the shedding events of membrane proteins, so it is necessary to design an effective tool for predicting them. In this study, we design an end-to-end prediction model using deep neural networks with long short-term memory (LSTM) units and an attention mechanism to predict the ectodomain shedding events of membrane proteins from sequence information alone. First, evolutionary profiles are encoded from the original sequences of these proteins by Position-Specific Iterated BLAST (PSI-BLAST) on the Uniref50 database. Then, the LSTM units, which contain memory cells, are used to hold information from past inputs to the network, and the attention mechanism is applied to detect sorting signals in proteins regardless of their position in the sequence. Finally, a fully connected dense layer and a softmax layer are used to obtain the final prediction results. Additionally, we try to reduce overfitting of the model by using dropout, L2 regularization, and bagging ensemble learning in the model training process. To ensure a fair performance comparison, we first use cross-validation on a training dataset obtained from an existing paper. The average accuracy and area under the receiver operating characteristic curve (AUC) of five-fold cross-validation are 81.19% and 0.835 using our proposed model, compared to 75% and 0.78 by a previously published tool, respectively. To better validate the proposed model, we also evaluate its performance on an independent test dataset. The accuracy, sensitivity, and specificity are 83.14%, 84.08%, and 81.63% using our proposed model, compared to 70.20%, 71.97%, and 67.35% by the existing model. The experimental results validate that the proposed model can be regarded as a general tool for predicting ectodomain shedding events of membrane proteins. The pipeline of the model and prediction results can be accessed at the following URL: http://www.csbg-jlu.info/DeepSMP/.
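For readers unfamiliar with the three metrics reported in this abstract, the sketch below shows how accuracy, sensitivity, and specificity are conventionally computed from binary labels. The function name and the 0/1 label encoding are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity (recall on positives) and specificity.

    y_true, y_pred: equal-length sequences of 0/1 labels, where 1 is the
    positive class (here, hypothetically, "undergoes ectodomain shedding").
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("label sequences must not be empty")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity is undefined without positives; report None in that case.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Specificity is undefined without negatives; report None in that case.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }
```

Reporting sensitivity and specificity alongside accuracy matters when the positive and negative classes are unbalanced, because accuracy alone can hide poor performance on the smaller class.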

121

Karimi M, Wu D, Wang Z, Shen Y. DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks. Bioinformatics 2020;35:3329-3338. [PMID: 30768156 DOI: 10.1093/bioinformatics/btz111] [Citation(s) in RCA: 250] [Impact Index Per Article: 50.0] [Received: 06/19/2018] [Revised: 12/26/2018] [Accepted: 02/12/2019] [Indexed: 02/02/2023] Open

122

Du W, Sun Y, Li G, Cao H, Pang R, Li Y. CapsNet-SSP: multilane capsule network for predicting human saliva-secretory proteins. BMC Bioinformatics 2020;21:237. [PMID: 32517646 PMCID: PMC7285745 DOI: 10.1186/s12859-020-03579-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/10/2020] [Accepted: 06/01/2020] [Indexed: 01/24/2023] Open

Abstract

Background

Compared with disease biomarkers in blood and urine, biomarkers in saliva have distinct advantages in clinical tests, as they can be conveniently examined through noninvasive sample collection. Therefore, identifying human saliva-secretory proteins and further detecting protein biomarkers in saliva have significant value in clinical medicine. There are only a few methods for predicting saliva-secretory proteins based on conventional machine learning algorithms, and all are highly dependent on annotated protein features. Unlike conventional machine learning algorithms, deep learning algorithms can automatically learn feature representations from input data and thus hold promise for predicting saliva-secretory proteins.

Results

We present a novel end-to-end deep learning model based on multilane capsule network (CapsNet) with differently sized convolution kernels to identify saliva-secretory proteins only from sequence information. The proposed model CapsNet-SSP outperforms existing methods based on conventional machine learning algorithms. Furthermore, the model performs better than other state-of-the-art deep learning architectures mostly used to analyze biological sequences. In addition, we further validate the effectiveness of CapsNet-SSP by comparison with human saliva-secretory proteins from existing studies and known salivary protein biomarkers of cancer.

Conclusions

The main contributions of this study are as follows: (1) an end-to-end model based on CapsNet is proposed to identify saliva-secretory proteins from the sequence information; (2) the proposed model achieves better performance and outperforms existing models; and (3) the saliva-secretory proteins predicted by our model are statistically significant compared with existing cancer biomarkers in saliva. In addition, a web server of CapsNet-SSP is developed for saliva-secretory protein identification, and it can be accessed at the following URL: http://www.csbg-jlu.info/CapsNet-SSP/. We believe that our model and web server will be useful for biomedical researchers who are interested in finding salivary protein biomarkers, especially when they have identified candidate proteins for analyzing diseased tissues near or distal to salivary glands using transcriptome or proteomics.

123

Hou J, Adhikari B, Tanner JJ, Cheng J. SAXSDom: Modeling multidomain protein structures using small-angle X-ray scattering data. Proteins 2020;88:775-787. [PMID: 31860156 PMCID: PMC7230021 DOI: 10.1002/prot.25865] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/15/2019] [Revised: 11/18/2019] [Accepted: 12/14/2019] [Indexed: 12/27/2022]

124

Gress A, Kalinina OV. SphereCon-a method for precise estimation of residue relative solvent accessible area from limited structural information. Bioinformatics 2020;36:3372-3378. [PMID: 32154837 DOI: 10.1093/bioinformatics/btaa159] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/15/2019] [Revised: 02/28/2020] [Accepted: 03/04/2020] [Indexed: 11/13/2022] Open

125

Wekesa JS, Meng J, Luan Y. A deep learning model for plant lncRNA-protein interaction prediction with graph attention. Mol Genet Genomics 2020;295:1091-1102. [DOI: 10.1007/s00438-020-01682-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/04/2020] [Accepted: 05/01/2020] [Indexed: 02/06/2023]

126

Wekesa JS, Meng J, Luan Y. Multi-feature fusion for deep learning to predict plant lncRNA-protein interaction. Genomics 2020;112:2928-2936. [PMID: 32437848 DOI: 10.1016/j.ygeno.2020.05.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 12/17/2019] [Revised: 04/22/2020] [Accepted: 05/05/2020] [Indexed: 12/28/2022]

127

Shapovalov M, Dunbrack RL, Vucetic S. Multifaceted analysis of training and testing convolutional neural networks for protein secondary structure prediction. PLoS One 2020;15:e0232528. [PMID: 32374785 PMCID: PMC7202669 DOI: 10.1371/journal.pone.0232528] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2019] [Accepted: 04/16/2020] [Indexed: 11/30/2022] Open

Abstract

Protein secondary structure prediction remains a vital topic with broad applications. Due to lack of a widely accepted standard in secondary structure predictor evaluation, a fair comparison of predictors is challenging. A detailed examination of factors that contribute to higher accuracy is also lacking. In this paper, we present: (1) new test sets, Test2018, Test2019, and Test2018-2019, consisting of proteins from structures released in 2018 and 2019 with less than 25% identity to any protein published before 2018; (2) a 4-layer convolutional neural network, SecNet, with an input window of ±14 amino acids which was trained on proteins ≤25% identical to proteins in Test2018 and the commonly used CB513 test set; (3) an additional test set that shares no homologous domains with the training set proteins, according to the Evolutionary Classification of Proteins (ECOD) database; (4) a detailed ablation study where we reverse one algorithmic choice at a time in SecNet and evaluate the effect on the prediction accuracy; (5) new 4- and 5-label prediction alphabets that may be more practical for tertiary structure prediction methods. The 3-label accuracy (helix, sheet, coil) of the leading predictors on both Test2018 and CB513 is 81-82%, while SecNet's accuracy is 84% for both sets. Accuracy on the non-homologous ECOD set is only 0.6 points (83.9%) lower than the results on the Test2018-2019 set (84.5%). The ablation study of features, neural network architecture, and training hyper-parameters suggests the best accuracy results are achieved with good choices for each of them while the neural network architecture is not as critical as long as it is not too simple. Protocols for generating and using unbiased test, validation, and training sets are provided. Our data sets, including input features and assigned labels, and SecNet software including third-party dependencies and databases, are downloadable from dunbrack.fccc.edu/ss and github.com/sh-maxim/ss.
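The 3-label accuracy quoted in this abstract is a per-residue agreement score. The following is a minimal sketch of how such a score can be computed, not SecNet's own evaluation code: the function names are hypothetical, and the 8-to-3 reduction table is one common convention assumed here for illustration, since the abstract does not state which reduction was used.

```python
# Assumed reduction from 8-state DSSP-style codes to 3 labels (helix, sheet, coil).
# Other conventions exist; this table is an illustrative assumption.
DSSP_TO_3 = {
    "H": "H", "G": "H", "I": "H",            # helical states -> helix
    "E": "E", "B": "E",                      # strand and bridge -> sheet
    "T": "C", "S": "C", "-": "C", "C": "C",  # everything else -> coil
}


def reduce_to_3(labels):
    """Map a string of 8-state codes to the 3-label alphabet."""
    return "".join(DSSP_TO_3[code] for code in labels)


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted 3-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy example: 16 residues, one disagreement.
observed = reduce_to_3("HHHHGGTTEEEE--SS")
predicted = "HHHHHHCCEEECCCCC"
print(q3_accuracy(predicted, observed))  # 0.9375
```

Because the score depends on the reduction table, accuracies from different predictors are only comparable when the same mapping and the same test proteins are used, which is the comparison problem the abstract raises.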

128

Mohl JE, Gerken T, Leung MY. Predicting mucin-type O-Glycosylation using enhancement value products from derived protein features. J Theor Comput Chem 2020;19:2040003. [PMID: 33208985 PMCID: PMC7671581 DOI: 10.1142/s0219633620400039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

129

León Y, Zapata L, Salas-Burgos A, Oñate A. In silico design of a vaccine candidate based on autotransporters and HSP against the causal agent of shigellosis, Shigella flexneri. Mol Immunol 2020;121:47-58. [DOI: 10.1016/j.molimm.2020.02.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2019] [Revised: 02/10/2020] [Accepted: 02/12/2020] [Indexed: 12/19/2022]

130

Pandey A, Braun EL. Phylogenetic Analyses of Sites in Different Protein Structural Environments Result in Distinct Placements of the Metazoan Root. Biology (Basel) 2020;9:E64. [PMID: 32231097 PMCID: PMC7235752 DOI: 10.3390/biology9040064] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/24/2019] [Revised: 03/09/2020] [Accepted: 03/20/2020] [Indexed: 12/23/2022]

Abstract

Phylogenomics, the use of large datasets to examine phylogeny, has revolutionized the study of evolutionary relationships. However, genome-scale data have not been able to resolve all relationships in the tree of life; this could reflect, at least in part, the poor fit of the models used to analyze heterogeneous datasets. Some of the heterogeneity may reflect the different patterns of selection on proteins based on their structures. To test that hypothesis, we developed a pipeline to divide phylogenomic protein datasets into subsets based on secondary structure and relative solvent accessibility. We then tested whether amino acids in different structural environments had distinct signals for the topology of the deepest branches in the metazoan tree. We focused on a dataset that appeared to have a mixture of signals and we found that the most striking difference in phylogenetic signal reflected relative solvent accessibility. Analyses of exposed sites (residues located on the surface of proteins) yielded a tree that placed ctenophores sister to all other animals, whereas sites buried inside proteins yielded a tree with a sponge+ctenophore clade. These differences in phylogenetic signal were not ameliorated when we conducted analyses using a set of maximum-likelihood profile mixture models. These models are very similar to the Bayesian CAT model, which has been used in many analyses of deep metazoan phylogeny. In contrast, analyses conducted after recoding amino acids to limit the impact of deviations from compositional stationarity increased the congruence in the estimates of phylogeny for exposed and buried sites; after recoding, the amino acid trees estimated using the exposed and buried sites both supported placement of ctenophores sister to all other animals.
Although the central conclusion of our analyses is that sites in different structural environments yield distinct trees when analyzed using models of protein evolution, our amino acid recoding analyses also have implications for metazoan evolution. Specifically, our results add to the evidence that ctenophores are the sister group of all other animals and they further suggest that the placozoa+cnidaria clade found in some other studies deserves more attention. Taken as a whole, these results provide striking evidence that it is necessary to achieve a better understanding of the constraints due to protein structure to improve phylogenetic estimation.
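The partitioning step described in this abstract, dividing alignment sites into exposed and buried subsets, can be illustrated with a short sketch. This is not the authors' pipeline: the function name, the per-column relative solvent accessibility (RSA) input, and the 25% threshold are all assumptions made for illustration, since the abstract does not give these details.

```python
def split_sites_by_rsa(alignment, rsa, threshold=0.25):
    """Split alignment columns into exposed and buried sub-alignments.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    rsa: one relative solvent accessibility value (0..1) per alignment column.
    threshold: assumed cutoff; columns with RSA >= threshold count as exposed.
    """
    for name, seq in alignment.items():
        if len(seq) != len(rsa):
            raise ValueError(f"{name}: sequence length does not match RSA length")

    exposed_cols = [i for i, value in enumerate(rsa) if value >= threshold]
    buried_cols = [i for i, value in enumerate(rsa) if value < threshold]

    def take(columns):
        # Keep only the selected columns for every taxon.
        return {name: "".join(seq[i] for i in columns)
                for name, seq in alignment.items()}

    return take(exposed_cols), take(buried_cols)


# Toy example with two hypothetical taxa and six columns.
alignment = {"taxonA": "MKVLAD", "taxonB": "MRVIAE"}
rsa = [0.60, 0.45, 0.05, 0.10, 0.02, 0.80]
exposed, buried = split_sites_by_rsa(alignment, rsa)
print(exposed)  # {'taxonA': 'MKD', 'taxonB': 'MRE'}
print(buried)   # {'taxonA': 'VLA', 'taxonB': 'VIA'}
```

Each sub-alignment could then be passed to a separate phylogenetic analysis, which is how the exposed-site and buried-site trees in the abstract would be compared.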

131

Smolarczyk T, Roterman-Konieczna I, Stapor K. Protein Secondary Structure Prediction: A Review of Progress and Directions. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017104639] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

132

The Order-Disorder Continuum: Linking Predictions of Protein Structure and Disorder through Molecular Simulation. Sci Rep 2020;10:2068. [PMID: 32034199 PMCID: PMC7005769 DOI: 10.1038/s41598-020-58868-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2019] [Accepted: 10/16/2019] [Indexed: 12/11/2022] Open

Abstract

Intrinsically disordered proteins (IDPs) and intrinsically disordered regions within proteins (IDRs) serve an increasingly expansive list of biological functions, including regulation of transcription and translation, protein phosphorylation, cellular signal transduction, as well as mechanical roles. The strong link between protein function and disorder motivates a deeper fundamental characterization of IDPs and IDRs for discovering new functions and relevant mechanisms. We review recent advances in experimental techniques that have improved identification of disordered regions in proteins. Yet, experimentally curated disorder information still does not currently scale to the level of experimentally determined structural information in folded protein databases, and disorder predictors rely on several different binary definitions of disorder. To link secondary structure prediction algorithms developed for folded proteins and protein disorder predictors, we conduct molecular dynamics simulations on representative proteins from the Protein Data Bank, comparing secondary structure and disorder predictions with simulation results. We find that structure predictor performance from neural networks can be leveraged for the identification of highly dynamic regions within molecules, linked to disorder. Low accuracy structure predictions suggest a lack of static structure for regions that disorder predictors fail to identify. While disorder databases continue to expand, secondary structure predictors and molecular simulations can improve disorder predictor performance, which aids discovery of novel functions of IDPs and IDRs. These observations provide a platform for the development of new, integrated structural databases and fusion of prediction tools toward protein disorder characterization in health and disease.

133

Bohra N, Sasidharan S, Raj S, Balaji SN, Saudagar P. Utilising capsid proteins of poliovirus to design a multi-epitope based subunit vaccine by immunoinformatics approach. Mol Simul 2020. [DOI: 10.1080/08927022.2020.1720916] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

134

Torrisi M, Pollastri G, Le Q. Deep learning methods in protein structure prediction. Comput Struct Biotechnol J 2020;18:1301-1310. [PMID: 32612753 PMCID: PMC7305407 DOI: 10.1016/j.csbj.2019.12.011] [Citation(s) in RCA: 132] [Impact Index Per Article: 26.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2019] [Revised: 12/19/2019] [Accepted: 12/20/2019] [Indexed: 01/01/2023] Open

135

Fukuda H, Tomii K. DeepECA: an end-to-end learning framework for protein contact prediction from a multiple sequence alignment. BMC Bioinformatics 2020;21:10. [PMID: 31918654 PMCID: PMC6953294 DOI: 10.1186/s12859-019-3190-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2019] [Accepted: 11/04/2019] [Indexed: 12/30/2022] Open

Abstract

Background

Recently developed methods of protein contact prediction, a crucially important step for protein structure prediction, depend heavily on deep neural networks (DNNs) and multiple sequence alignments (MSAs) of target proteins. Protein sequences are accumulating to an increasing degree such that abundant sequences to construct an MSA of a target protein are readily obtainable. Nevertheless, many cases present different ends of the number of sequences that can be included in an MSA used for contact prediction. The abundant sequences might degrade prediction results, but opportunities remain for a limited number of sequences to construct an MSA. To resolve these persistent issues, we strove to develop a novel framework using DNNs in an end-to-end manner for contact prediction.

Results

We developed neural network models to improve precision of both deep and shallow MSAs. Results show that higher prediction accuracy was achieved by assigning weights to sequences in a deep MSA. Moreover, for shallow MSAs, adding a few sequential features was useful to increase the prediction accuracy of long-range contacts in our model. Based on these models, we expanded our model to a multi-task model to achieve higher accuracy by incorporating predictions of secondary structures and solvent-accessible surface areas. Moreover, we demonstrated that ensemble averaging of our models can raise accuracy. Using past CASP target protein domains, we tested our models and demonstrated that our final model is superior to or equivalent to existing meta-predictors.

Conclusions

The end-to-end learning framework we built can use information derived from either deep or shallow MSAs for contact prediction. Recently, an increasing number of protein sequences have become accessible, including metagenomic sequences, which might degrade contact prediction results. Under such circumstances, our model can provide a means to reduce noise automatically. According to results of tertiary structure prediction based on contacts and secondary structures predicted by our model, more accurate three-dimensional models of a target protein are obtainable than those from existing ECA methods, starting from its MSA. DeepECA is available from https://github.com/tomiilab/DeepECA.

136

An enhanced protein secondary structure prediction using deep learning framework on hybrid profile based features. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2019.105926] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

137

IDP-LZerD: Software for Modeling Disordered Protein Interactions. Methods Mol Biol 2020;2165:231-244. [PMID: 32621228 DOI: 10.1007/978-1-0716-0708-4_13] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

138

The MULTICOM Protein Structure Prediction Server Empowered by Deep Learning and Contact Distance Prediction. Methods Mol Biol 2020;2165:13-26. [PMID: 32621217 DOI: 10.1007/978-1-0716-0708-4_2] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]

139

Shi Q, Chen W, Huang S, Wang Y, Xue Z. Deep learning for mining protein data. Brief Bioinform 2019;22:194-218. [PMID: 31867611 DOI: 10.1093/bib/bbz156] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2019] [Revised: 10/21/2019] [Accepted: 11/07/2019] [Indexed: 01/16/2023] Open

140

Pal A, Saha BK, Saha J. Comparative in silico analysis of ftsZ gene from different bacteria reveals the preference for core set of codons in coding sequence structuring and secondary structural elements determination. PLoS One 2019;14:e0219231. [PMID: 31841523 PMCID: PMC6913975 DOI: 10.1371/journal.pone.0219231] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2019] [Accepted: 11/28/2019] [Indexed: 11/19/2022] Open

141

Raimondi D, Orlando G, Vranken WF, Moreau Y. Exploring the limitations of biophysical propensity scales coupled with machine learning for protein sequence analysis. Sci Rep 2019;9:16932. [PMID: 31729443 PMCID: PMC6858301 DOI: 10.1038/s41598-019-53324-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2019] [Accepted: 10/25/2019] [Indexed: 11/21/2022] Open

142

Hong J, Luo Y, Mou M, Fu J, Zhang Y, Xue W, Xie T, Tao L, Lou Y, Zhu F. Convolutional neural network-based annotation of bacterial type IV secretion system effectors with enhanced accuracy and reduced false discovery. Brief Bioinform 2019;21:1825-1836. [PMID: 31860715 DOI: 10.1093/bib/bbz120] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2019] [Revised: 08/12/2019] [Accepted: 08/21/2019] [Indexed: 12/20/2022] Open

Abstract

The type IV bacterial secretion system (SS) is reported to be one of the most ubiquitous SSs in nature and can induce serious conditions by secreting type IV SS effectors (T4SEs) into the host cells. Recent studies mainly focus on annotating new T4SEs from the huge amount of sequencing data, and various computational tools have therefore been developed to accelerate T4SE annotation. However, these tools are reported to be heavily dependent on the selected methods, and their annotation performance needs to be further enhanced. Herein, a convolutional neural network (CNN) technique was used to annotate T4SEs by integrating multiple protein encoding strategies. First, the annotation accuracies of nine encoding strategies integrated with CNN were assessed and compared with those of the popular T4SE annotation tools based on an independent benchmark. Second, the false discovery rates of various models were systematically evaluated by (1) scanning the genome of Legionella pneumophila subsp. ATCC 33152 and (2) predicting the real-world non-T4SEs validated using published experiments. Based on the above analyses, the encoding strategies (a) position-specific scoring matrix (PSSM), (b) protein secondary structure & solvent accessibility (PSSSA) and (c) one-hot encoding scheme (Onehot) were identified as well-performing when integrated with CNN. Finally, a novel strategy that collectively considers the three well-performing models (CNN-PSSM, CNN-PSSSA and CNN-Onehot) was proposed, and a new tool (CNN-T4SE, https://idrblab.org/cnnt4se/) was constructed to facilitate T4SE annotation. All in all, this study conducted a comprehensive analysis of the performance of a collection of encoding strategies when integrated with CNN, which could facilitate the suppression of T4SS in infection and limit the spread of antimicrobial resistance.
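Of the three encoding strategies named in this abstract, one-hot encoding is the simplest to show concretely. The sketch below is a generic illustration and not the CNN-T4SE implementation: the residue ordering, the all-zero row for unknown residues, and the optional padding to a fixed length are assumptions made here.

```python
# Assumed ordering of the 20 standard amino acids (alphabetical by one-letter code).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, length=None):
    """Encode a protein sequence as a list of 20-element 0/1 rows.

    Residues outside the 20-letter alphabet become all-zero rows (an assumption).
    If length is given, the result is truncated or zero-padded to that many rows,
    which is how a fixed-size CNN input could be prepared.
    """
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in INDEX:
            row[INDEX[residue]] = 1
        rows.append(row)

    if length is not None:
        rows = rows[:length]
        rows.extend([0] * len(AMINO_ACIDS) for _ in range(length - len(rows)))
    return rows


# Toy example: three residues (one unknown) padded to four rows.
encoded = one_hot_encode("ACX", length=4)
print(len(encoded), len(encoded[0]))  # 4 20
print(encoded[0][:3])                 # [1, 0, 0]
```

Unlike a PSSM, this encoding carries no evolutionary information, which is one reason the abstract evaluates the encodings separately before combining them.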

143

Kaul T, Eswaran M, Ahmad S, Thangaraj A, Jain R, Kaul R, Raman NM, Bharti J. Probing the effect of a plus 1bp frameshift mutation in protein-DNA interface of domestication gene, NAMB1, in wheat. J Biomol Struct Dyn 2019;38:3633-3647. [PMID: 31621500 DOI: 10.1080/07391102.2019.1680435] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

144

Wozniak PP, Pelc J, Skrzypecki M, Vriend G, Kotulska M. Bio-knowledge-based filters improve residue-residue contact prediction accuracy. Bioinformatics 2019;34:3675-3683. [PMID: 29850768 DOI: 10.1093/bioinformatics/bty416] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2017] [Accepted: 05/19/2018] [Indexed: 11/13/2022] Open

145

Dhal AK, Pani A, Mahapatra RK, Yun SI. An immunoinformatics approach for design and validation of multi-subunit vaccine against Cryptosporidium parvum. Immunobiology 2019;224:747-757. [PMID: 31522782 DOI: 10.1016/j.imbio.2019.09.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2019] [Revised: 08/29/2019] [Accepted: 09/03/2019] [Indexed: 12/30/2022]

146

Abelin JG, Harjanto D, Malloy M, Suri P, Colson T, Goulding SP, Creech AL, Serrano LR, Nasir G, Nasrullah Y, McGann CD, Velez D, Ting YS, Poran A, Rothenberg DA, Chhangawala S, Rubinsteyn A, Hammerbacher J, Gaynor RB, Fritsch EF, Greshock J, Oslund RC, Barthelme D, Addona TA, Arieta CM, Rooney MS. Defining HLA-II Ligand Processing and Binding Rules with Mass Spectrometry Enhances Cancer Epitope Prediction. Immunity 2019;51:766-779.e17. [PMID: 31495665 DOI: 10.1016/j.immuni.2019.08.012] [Citation(s) in RCA: 167] [Impact Index Per Article: 27.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2019] [Revised: 06/19/2019] [Accepted: 08/15/2019] [Indexed: 12/30/2022]

147

Bahrami AA, Payandeh Z, Khalili S, Zakeri A, Bandehpour M. Immunoinformatics: In Silico Approaches and Computational Design of a Multi-epitope, Immunogenic Protein. Int Rev Immunol 2019;38:307-322. [PMID: 31478759 DOI: 10.1080/08830185.2019.1657426] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

148

Khurana S, Rawi R, Kunji K, Chuang GY, Bensmail H, Mall R. DeepSol: a deep learning framework for sequence-based protein solubility prediction. Bioinformatics 2019;34:2605-2613. [PMID: 29554211 DOI: 10.1093/bioinformatics/bty166] [Citation(s) in RCA: 114] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2017] [Accepted: 03/13/2018] [Indexed: 01/09/2023] Open

149

Torrisi M, Kaleel M, Pollastri G. Deeper Profiles and Cascaded Recurrent and Convolutional Neural Networks for state-of-the-art Protein Secondary Structure Prediction. Sci Rep 2019;9:12374. [PMID: 31451723 PMCID: PMC6710256 DOI: 10.1038/s41598-019-48786-x] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2019] [Accepted: 08/12/2019] [Indexed: 01/10/2023] Open

150

Eng CH, Backman TWH, Bailey CB, Magnan C, García Martín H, Katz L, Baldi P, Keasling JD. ClusterCAD: a computational platform for type I modular polyketide synthase design. Nucleic Acids Res 2019;46:D509-D515. [PMID: 29040649 PMCID: PMC5753242 DOI: 10.1093/nar/gkx893] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2017] [Accepted: 09/24/2017] [Indexed: 01/10/2023] Open

Affiliation(s)

Clara H Eng, Department of Chemical and Biomolecular Engineering, University of California, Berkeley, CA 94720, USA
Tyler W H Backman, Joint BioEnergy Institute, 5885 Hollis Street, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Energy Agile BioFoundry, Emeryville, CA 94608, USA
Constance B Bailey, Joint BioEnergy Institute, 5885 Hollis Street, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Christophe Magnan, Department of Computer Science, University of California, Irvine, CA 92697, USA; Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, USA
Héctor García Martín, Joint BioEnergy Institute, 5885 Hollis Street, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Energy Agile BioFoundry, Emeryville, CA 94608, USA
Leonard Katz, QB3 Institute, University of California, Berkeley, CA 94720, USA
Pierre Baldi, Department of Computer Science, University of California, Irvine, CA 92697, USA; Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, USA
Jay D Keasling, Department of Chemical and Biomolecular Engineering, University of California, Berkeley, CA 94720, USA; Joint BioEnergy Institute, 5885 Hollis Street, Emeryville, CA 94608, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Energy Agile BioFoundry, Emeryville, CA 94608, USA; QB3 Institute, University of California, Berkeley, CA 94720, USA; Department of Bioengineering, University of California, Berkeley, CA 94720, USA; Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK2970 Horsholm, Denmark
