Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For:	[Subscribe] [Scholar Register]

Number

Cited by Other Article(s)

Matinyan S, Filipcik P, Abrahams JP. Deep learning applications in protein crystallography. Acta Crystallogr A Found Adv 2024;80:1-17. [PMID: 38189437 PMCID: PMC10833361 DOI: 10.1107/s2053273323009300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2023] [Accepted: 10/24/2023] [Indexed: 01/09/2024] Open

Fakhar AZ, Liu J, Pajerowska-Mukhtar KM, Mukhtar MS. The Lost and Found: Unraveling the Functions of Orphan Genes. J Dev Biol 2023;11:27. [PMID: 37367481 PMCID: PMC10299390 DOI: 10.3390/jdb11020027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2023] [Revised: 05/19/2023] [Accepted: 05/26/2023] [Indexed: 06/28/2023] Open

Patel CN, Mall R, Bensmail H. AI-driven drug repurposing and binding pose meta dynamics identifies novel targets for Monkeypox virus. J Infect Public Health 2023;16:799-807. [PMID: 36966703 PMCID: PMC10014505 DOI: 10.1016/j.jiph.2023.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2022] [Revised: 02/28/2023] [Accepted: 03/05/2023] [Indexed: 03/17/2023] Open

Wang PH, Zhu YH, Yang X, Yu DJ. GCmapCrys: Integrating graph attention network with predicted contact map for multi-stage protein crystallization propensity prediction. Anal Biochem 2023;663:115020. [PMID: 36521558 DOI: 10.1016/j.ab.2022.115020] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2022] [Revised: 12/05/2022] [Accepted: 12/10/2022] [Indexed: 12/14/2022]

Wang S, Zhao H. SADeepcry: a deep learning framework for protein crystallization propensity prediction using self-attention and auto-encoder networks. Brief Bioinform 2022;23:6678422. [PMID: 36037090 DOI: 10.1093/bib/bbac352] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2022] [Revised: 07/15/2022] [Accepted: 07/27/2022] [Indexed: 11/14/2022] Open

Xiouras C, Cameli F, Quilló GL, Kavousanakis ME, Vlachos DG, Stefanidis GD. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem Rev 2022;122:13006-13042. [PMID: 35759465 DOI: 10.1021/acs.chemrev.2c00141] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]

Zhang X, Xuan J, Yao C, Gao Q, Wang L, Jin X, Li S. A deep learning approach for orphan gene identification in moso bamboo (Phyllostachys edulis) based on the CNN + Transformer model. BMC Bioinformatics 2022;23:162. [PMID: 35513802 PMCID: PMC9069780 DOI: 10.1186/s12859-022-04702-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Accepted: 04/28/2022] [Indexed: 12/02/2022] Open

Lin K, Quan X, Jin C, Shi Z, Yang J. An Interpretable Double-Scale Attention Model for Enzyme Protein Class Prediction Based on Transformer Encoders and Multi-Scale Convolutions. Front Genet 2022;13:885627. [PMID: 35432476 PMCID: PMC9012241 DOI: 10.3389/fgene.2022.885627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Accepted: 03/07/2022] [Indexed: 12/01/2022] Open

Jin C, Shi Z, Kang C, Lin K, Zhang H. TLCrys: Transfer Learning Based Method for Protein Crystallization Prediction. Int J Mol Sci 2022;23:972. [PMID: 35055158 PMCID: PMC8778968 DOI: 10.3390/ijms23020972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2021] [Revised: 01/05/2022] [Accepted: 01/14/2022] [Indexed: 11/17/2022] Open

Vollmar M, Evans G. Machine learning applications in macromolecular X-ray crystallography. CRYSTALLOGR REV 2021. [DOI: 10.1080/0889311x.2021.1982914] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

Ghadermarzi S, Krawczyk B, Song J, Kurgan L. XRRpred: Accurate Predictor of Crystal Structure Quality from Protein Sequence. Bioinformatics 2021;37:4366-4374. [PMID: 34247234 DOI: 10.1093/bioinformatics/btab509] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2021] [Revised: 06/10/2021] [Accepted: 07/06/2021] [Indexed: 11/12/2022] Open

Sequence-Based Prediction of Transmembrane Protein Crystallization Propensity. Interdiscip Sci 2021;13:693-702. [PMID: 34143353 DOI: 10.1007/s12539-021-00448-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2020] [Revised: 05/31/2021] [Accepted: 06/04/2021] [Indexed: 10/21/2022]

Wang Y, Li Z, Zhang Y, Ma Y, Huang Q, Chen X, Dai Z, Zou X. Performance improvement for a 2D convolutional neural network by using SSC encoding on protein-protein interaction tasks. BMC Bioinformatics 2021;22:184. [PMID: 33845759 PMCID: PMC8042949 DOI: 10.1186/s12859-021-04111-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2020] [Accepted: 03/30/2021] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

The interactions of proteins are determined by their sequences and affect the regulation of the cell cycle, signal transduction and metabolism, which is of extraordinary significance to modern proteomics research. Despite advances in experimental technology, it is still expensive, laborious, and time-consuming to determine protein-protein interactions (PPIs), and there is a strong demand for effective bioinformatics approaches to identify potential PPIs. Considering the large amount of PPI data, a high-performance processor can be utilized to enhance the capability of the deep learning method and directly predict protein sequences.

RESULTS

We propose the Sequence-Statistics-Content protein sequence encoding format (SSC) based on information extraction from the original sequence for further performance improvement of the convolutional neural network. The original protein sequences are encoded in the three-channel format by introducing statistical information (the second channel) and bigram encoding information (the third channel), which can increase the unique sequence features to enhance the performance of the deep learning model. On predicting protein-protein interaction tasks, the results using the 2D convolutional neural network (2D CNN) with the SSC encoding method are better than those of the 1D CNN with one hot encoding. The independent validation of new interactions from the HIPPIE database (version 2.1 published on July 18, 2017) and the validation of directly predicted results by applying a molecular docking tool indicate the effectiveness of the proposed protein encoding improvement in the CNN model.

CONCLUSION

The proposed protein sequence encoding method is efficient at improving the capability of the CNN model on protein sequence-related tasks and may also be effective at enhancing the capability of other machine learning or deep learning methods. Prediction accuracy and molecular docking validation showed considerable improvement compared to the existing hot encoding method, indicating that the SSC encoding method may be useful for analyzing protein sequence-related tasks. The source code of the proposed methods is freely available for academic research at https://github.com/wangy496/SSC-format/ .

Collapse

Xuan W, Liu N, Huang N, Li Y, Wang J. CLPred: a sequence-based protein crystallization predictor using BLSTM neural network. Bioinformatics 2021;36:i709-i717. [PMID: 33381840 DOI: 10.1093/bioinformatics/btaa791] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/07/2020] [Indexed: 01/05/2023] Open

Abstract

MOTIVATION

Determining the structures of proteins is a critical step to understand their biological functions. Crystallography-based X-ray diffraction technique is the main method for experimental protein structure determination. However, the underlying crystallization process, which needs multiple time-consuming and costly experimental steps, has a high attrition rate. To overcome this issue, a series of in silico methods have been developed with the primary aim of selecting the protein sequences that are promising to be crystallized. However, the predictive performance of the current methods is modest.

RESULTS

We propose a deep learning model, so-called CLPred, which uses a bidirectional recurrent neural network with long short-term memory (BLSTM) to capture the long-range interaction patterns between k-mers amino acids to predict protein crystallizability. Using sequence only information, CLPred outperforms the existing deep-learning predictors and a vast majority of sequence-based diffraction-quality crystals predictors on three independent test sets. The results highlight the effectiveness of BLSTM in capturing non-local, long-range inter-peptide interaction patterns to distinguish proteins that can result in diffraction-quality crystals from those that cannot. CLPred has been steadily improved over the previous window-based neural networks, which is able to predict crystallization propensity with high accuracy. CLPred can also be improved significantly if it incorporates additional features from pre-extracted evolutional, structural and physicochemical characteristics. The correctness of CLPred predictions is further validated by the case studies of Sox transcription factor family member proteins and Zika virus non-structural proteins.

AVAILABILITY AND IMPLEMENTATION

https://github.com/xuanwenjing/CLPred.

Collapse

Mall R, Elbasir A, Almeer H, Islam Z, Kolatkar PR, Chawla S, Ullah E. A Modelling Framework for Embedding-based Predictions for Compound-Viral Protein Activity. Bioinformatics 2021;37:2544-2555. [PMID: 33638345 PMCID: PMC8163000 DOI: 10.1093/bioinformatics/btab130] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2020] [Revised: 02/16/2021] [Accepted: 02/24/2021] [Indexed: 11/14/2022] Open

Yu L, Liu F, Li Y, Luo J, Jing R. DeepT3_4: A Hybrid Deep Neural Network Model for the Distinction Between Bacterial Type III and IV Secreted Effectors. Front Microbiol 2021;12:605782. [PMID: 33552038 PMCID: PMC7858263 DOI: 10.3389/fmicb.2021.605782] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2020] [Accepted: 01/04/2021] [Indexed: 01/17/2023] Open

Elhadd T, Mall R, Bashir M, Palotti J, Fernandez-Luque L, Farooq F, Mohanadi DA, Dabbous Z, Malik RA, Abou-Samra AB. Artificial Intelligence (AI) based machine learning models predict glucose variability and hypoglycaemia risk in patients with type 2 diabetes on a multiple drug regimen who fast during ramadan (The PROFAST - IT Ramadan study). Diabetes Res Clin Pract 2020;169:108388. [PMID: 32858096 DOI: 10.1016/j.diabres.2020.108388] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/03/2020] [Accepted: 08/19/2020] [Indexed: 10/23/2022]

Abstract

OBJECTIVE

To develop a machine-based algorithm from clinical and demographic data, physical activity and glucose variability to predict hyperglycaemic and hypoglycaemic excursions in patients with type 2 diabetes on multiple glucose lowering therapies who fast during Ramadan.

PATIENTS AND METHODS

Thirteen patients (10 males and three females) with type 2 diabetes on 3 or more anti-diabetic medications were studied with a Fitbit-2 pedometer device and Freestyle Libre (Abbott Diagnostics) 2 weeks before and 2 weeks during Ramadan. Several machine learning techniques were trained to predict blood glucose levels in a regression framework utilising physical activity and contemporaneous blood glucose levels, comparing Ramadan to non-Ramadan days.

RESULTS

The median age of participants was 51 years (IQR 49-52); median BMI was 33.2 kg/m² (IQR 33.0-35.9) and median HbA1c was 7.3% (IQR 6.7-7.8). The optimal model using physical activity achieved an R² of 0.548 and a mean absolute error (MAE) of 30.30. The addition of electronic health record (ehr) information increased R² to 0.636 and reduced MAE to 26.89 and the time of the day feature further increased R² to 0.768 and reduced MAE to 20.55. Combining all the features together resulted in an optimal XGBoost model with an R² of 0.836 and MAE of 17.47. This model accurately estimated normal glucose levels in 2584/2715 (95.2%) readings and hyperglycaemic events in 852/1031 (82.6%) readings, but fewer hypoglycaemic events (48/172 (27.9%)). The optimal XGBoost model prioritized age, gender, BMI and HbA1c followed by glucose levels and physical activity. Interestingly, the blood glucose level prediction by our model was influenced by use of SGLT2i.

CONCLUSION

XGBoost, a machine learning AI algorithm achieves high predictive performance for normal and hyperglycaemic excursions, but has limited predictive value for hypoglycaemia in patients on multiple therapies who fast during Ramadan.

Collapse

Elbasir A, Mall R, Kunji K, Rawi R, Islam Z, Chuang GY, Kolatkar PR, Bensmail H. BCrystal: an interpretable sequence-based protein crystallization predictor. Bioinformatics 2020;36:1429-1438. [PMID: 31603511 DOI: 10.1093/bioinformatics/btz762] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2019] [Revised: 09/19/2019] [Accepted: 10/08/2019] [Indexed: 02/01/2023] Open

Zhu YH, Hu J, Ge F, Li F, Song J, Zhang Y, Yu DJ. Accurate multistage prediction of protein crystallization propensity using deep-cascade forest with sequence-based features. Brief Bioinform 2020;22:5839971. [PMID: 32436937 DOI: 10.1093/bib/bbaa076] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2020] [Revised: 04/09/2020] [Accepted: 04/13/2020] [Indexed: 11/13/2022] Open

Abstract

X-ray crystallography is the major approach for determining atomic-level protein structures. Because not all proteins can be easily crystallized, accurate prediction of protein crystallization propensity provides critical help in guiding experimental design and improving the success rate of X-ray crystallography experiments. This study has developed a new machine-learning-based pipeline that uses a newly developed deep-cascade forest (DCF) model with multiple types of sequence-based features to predict protein crystallization propensity. Based on the developed pipeline, two new protein crystallization propensity predictors, denoted as DCFCrystal and MDCFCrystal, have been implemented. DCFCrystal is a multistage predictor that can estimate the success propensities of the three individual steps (production of protein material, purification and production of crystals) in the protein crystallization process. MDCFCrystal is a single-stage predictor that aims to estimate the probability that a protein will pass through the entire crystallization process. Moreover, DCFCrystal is designed for general proteins, whereas MDCFCrystal is specially designed for membrane proteins, which are notoriously difficult to crystalize. DCFCrystal and MDCFCrystal were separately tested on two benchmark datasets consisting of 12 289 and 950 proteins, respectively, with known crystallization results from various experimental records. The experimental results demonstrated that DCFCrystal and MDCFCrystal increased the value of Matthew's correlation coefficient by 199.7% and 77.8%, respectively, compared to the best of other state-of-the-art protein crystallization propensity predictors. Detailed analyses show that the major advantages of DCFCrystal and MDCFCrystal lie in the efficiency of the DCF model and the sensitivity of the sequence-based features used, especially the newly designed pseudo-predicted hybrid solvent accessibility (PsePHSA) feature, which improves crystallization recognition by incorporating sequence-order information with solvent accessibility of residues. Meanwhile, the new crystal-dataset constructions help to train the models with more comprehensive crystallization knowledge.

Collapse

Basith S, Manavalan B, Hwan Shin T, Lee G. Machine intelligence in peptide therapeutics: A next‐generation tool for rapid disease screening. Med Res Rev 2020;40:1276-1314. [DOI: 10.1002/med.21658] [Citation(s) in RCA: 139] [Impact Index Per Article: 34.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2019] [Revised: 11/26/2019] [Accepted: 12/16/2019] [Indexed: 12/12/2022]

Li F, Chen J, Leier A, Marquez-Lago T, Liu Q, Wang Y, Revote J, Smith AI, Akutsu T, Webb GI, Kurgan L, Song J. DeepCleave: a deep learning predictor for caspase and matrix metalloprotease substrates and cleavage sites. Bioinformatics 2019;36:1057-1065. [PMID: 31566664 PMCID: PMC8215920 DOI: 10.1093/bioinformatics/btz721] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2019] [Revised: 08/13/2019] [Accepted: 09/25/2019] [Indexed: 01/31/2023] Open

Park H, Mall R, Alharbi FH, Sanvito S, Tabet N, Bensmail H, El-Mellouhi F. Learn-and-Match Molecular Cations for Perovskites. J Phys Chem A 2019;123:7323-7334. [DOI: 10.1021/acs.jpca.9b06208] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Palotti J, Mall R, Aupetit M, Rueschman M, Singh M, Sathyanarayana A, Taheri S, Fernandez-Luque L. Benchmark on a large cohort for sleep-wake classification with machine learning techniques. NPJ Digit Med 2019;2:50. [PMID: 31304396 PMCID: PMC6555808 DOI: 10.1038/s41746-019-0126-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2019] [Accepted: 05/06/2019] [Indexed: 11/17/2022] Open