1
Genc AG, McGuffin LJ. Beyond AlphaFold2: The Impact of AI for the Further Improvement of Protein Structure Prediction. Methods Mol Biol 2025; 2867:121-139. [PMID: 39576578 DOI: 10.1007/978-1-0716-4196-5_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
Protein structure prediction is fundamental to molecular biology and has numerous applications in areas such as drug discovery and protein engineering. Machine learning techniques have greatly advanced protein 3D modeling in recent years, particularly with the development of AlphaFold2 (AF2), which can analyze sequences of amino acids and predict 3D structures with near experimental accuracy. Since the release of AF2, numerous studies have been conducted, either using AF2 directly for large-scale modeling or building upon the software for other use cases. Many reviews have been published discussing the impact of AF2 in the field of protein bioinformatics, particularly in relation to neural networks, which have highlighted what AF2 can and cannot do. It is evident that AF2 and similar approaches are open to further development and several new approaches have emerged, in addition to older refinement approaches, for improving the quality of predictions. Here we provide a brief overview, aimed at the general biologist, of how machine learning techniques have been used for improvement of 3D models of proteins following AF2, and we highlight the impacts of these approaches. In the most recent experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP15), the most successful groups all developed their own tools for protein structure modeling that were based at least in some part on AF2. This improvement involved employing techniques such as generative modeling, changing parameters such as dropout to generate more AF2 structures, and data-driven approaches including using alternative templates and MSAs.
Affiliation(s)
- Liam J McGuffin
- School of Biological Sciences, University of Reading, Reading, UK.
2
Khalid S, Guo J, Muhammad SA, Bai B. Designing, cloning and simulation studies of cancer/testis antigens based multi-epitope vaccine candidates against cutaneous melanoma: An immunoinformatics approach. Biochem Biophys Rep 2024; 37:101651. [PMID: 38371523 PMCID: PMC10873875 DOI: 10.1016/j.bbrep.2024.101651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 01/18/2024] [Accepted: 01/18/2024] [Indexed: 02/20/2024] Open
Abstract
Background: Melanoma is the most fatal kind of skin cancer. Among its various types, cutaneous melanoma is the most prevalent. Melanoma cells are thought to be highly immunogenic due to the presence of distinct tumor-associated antigens (TAAs), which include carcinoembryonic antigen (CEA), cancer/testis antigens (CTAs) and neo-antigens. The CTA family is a group of antigens that are only expressed in malignancies and testicular germ cells.
Methods: We used an integrative framework and systems-level analysis to predict potential vaccine candidates for cutaneous melanoma. This involved epitope prediction, molecular modeling and molecular docking to cross-validate the binding affinity and interaction between potential vaccine agents and major histocompatibility molecules (MHCs), followed by molecular dynamics simulation, immune simulation and in silico cloning.
Results: In this study, three cancer/testis antigens were targeted for immunotherapy of cutaneous melanoma. Among many CTAs that were studied for their expression in primary and malignant melanoma, NY-ESO-1, MAGE1 and SSX2 antigens are most prevalent in cutaneous melanoma. Cytotoxic and helper epitopes were predicted, and the best epitopes were shortlisted based on binding score. The vaccine construct was composed of the four epitope-rich domains of antigenic proteins, an appropriate adjuvant, a His tag and linkers. This potential multi-epitope vaccine was further evaluated in terms of antigenicity, allergenicity, toxicity and other physicochemical properties. Molecular interaction estimated through protein-protein docking revealed good interactions characterized by favorable binding energies. Molecular dynamics simulation indicated the stability of the docked complex, and the immune response predicted through immune simulation showed elevated antibody titers, cytokine and interleukin levels, and immune cell (NK, DC and MA) populations.
Conclusion: The findings indicate that the potential vaccine candidates could be effective immunotherapeutic agents that modify the treatment strategies of cutaneous melanoma.
Affiliation(s)
- Sana Khalid
- Institute of Molecular Biology and Biotechnology, Bahauddin Zakariya University Multan, Pakistan
- Jinlei Guo
- School of Intelligent Medical Engineering, Sanquan College of Xinxiang Medical University, Xinxiang, China
- Syed Aun Muhammad
- Institute of Molecular Biology and Biotechnology, Bahauddin Zakariya University Multan, Pakistan
- Baogang Bai
- School of Information and Technology, Wenzhou Business College, Wenzhou, China
- Zhejiang Province Engineering Research Center of Intelligent Medicine, Wenzhou, China
- The 1st School of Medical, School of Information and Engineering, The 1st Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
3
Rosignoli S, Lustrino E, Di Silverio I, Paiardini A. Making Use of Averaging Methods in MODELLER for Protein Structure Prediction. Int J Mol Sci 2024; 25:1731. [PMID: 38339009 PMCID: PMC10855553 DOI: 10.3390/ijms25031731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 01/23/2024] [Accepted: 01/29/2024] [Indexed: 02/12/2024] Open
Abstract
Recent advances in protein structure prediction, driven by AlphaFold 2 and machine learning, demonstrate proficiency in static structures but encounter challenges in capturing essential dynamic features crucial for understanding biological function. In this context, homology-based modeling emerges as a cost-effective and computationally efficient alternative. The MODELLER (version 10.5, accessed on 30 November 2023) algorithm can be harnessed for this purpose since it computes intermediate models during simulated annealing, enabling the exploration of attainable configurational states and energies while minimizing its objective function. There have been a few attempts to date to improve the models generated by its algorithm, and in particular, there is no literature regarding the implementation of an averaging procedure involving the intermediate models in the MODELLER algorithm. In this study, we examined MODELLER's output using 225 target-template pairs, extracting the best representatives of intermediate models. Applying an averaging procedure to the selected intermediate structures based on statistical potentials, we aimed to determine: (1) whether averaging improves the quality of structural models during the building phase; (2) whether ranking by statistical potentials reliably selects the best models, leading to improved final model quality; (3) whether using a single template versus multiple templates affects the averaging approach; (4) whether the "ensemble" nature of the MODELLER building phase can be harnessed to capture low-energy conformations in holo structure modeling. Our findings indicate that while improvements typically fall short of a few decimal points in the model evaluation metric, a notable fraction of configurations exhibit slightly higher similarity to the native structure than MODELLER's proposed final model. The averaging-building procedure proves particularly beneficial in (1) regions of low sequence identity between the target and template(s), the most challenging aspect of homology modeling; (2) the generation of holo protein conformations, an area in which MODELLER and related tools usually fall short of the expected performance.
Affiliation(s)
- Alessandro Paiardini
- Department of Biochemical Sciences, Sapienza University of Rome, 00185 Rome, Italy
4
Ng TK, Ji J, Liu Q, Yao Y, Wang WY, Cao Y, Chen CB, Lin JW, Dong G, Cen LP, Huang C, Zhang M. Evaluation of Myocilin Variant Protein Structures Modeled by AlphaFold2. Biomolecules 2023; 14:14. [PMID: 38275755 PMCID: PMC10813463 DOI: 10.3390/biom14010014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 12/12/2023] [Accepted: 12/15/2023] [Indexed: 01/27/2024] Open
Abstract
Deep neural network-based programs can be applied to protein structure modeling by inputting amino acid sequences. Here, we aimed to evaluate the AlphaFold2-modeled myocilin wild-type and variant protein structures and compare them with the experimentally determined protein structures. Molecular dynamics and ligand-binding properties of the experimentally determined and AlphaFold2-modeled protein structures were also analyzed. AlphaFold2-modeled myocilin variant protein structures showed high similarities in overall structure to the experimentally determined mutant protein structures, but the orientations and geometries of amino acid side chains were slightly different. The olfactomedin-like domain of the modeled missense variant protein structures showed fewer folding changes than the nonsense variant when compared to the predicted wild-type protein structure. Differences were also observed in molecular dynamics and ligand-binding sites between the AlphaFold2-modeled and experimentally determined structures as well as between the wild-type and variant structures. In summary, the folding of the AlphaFold2-modeled MYOC variant protein structures could be similar to that determined by the experiments but with differences in amino acid side chain orientations and geometries. Careful comparisons with experimentally determined structures are needed before in silico modeled variant protein structures are applied.
Affiliation(s)
- Tsz Kin Ng
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Jie Ji
- Network & Information Centre, Shantou University, Shantou 515041, China
- Qingping Liu
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Key Laboratory of Carbohydrate and Lipid Metabolism Research, College of Life Science and Technology, Dalian University, Dalian 116622, China
- Yao Yao
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Shantou University Medical College, Shantou 515041, China
- Wen-Ying Wang
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Shantou University Medical College, Shantou 515041, China
- Yingjie Cao
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Chong-Bo Chen
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Jian-Wei Lin
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Geng Dong
- Shantou University Medical College, Shantou 515041, China
- Ling-Ping Cen
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Chukai Huang
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
- Mingzhi Zhang
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China
5
Simpkin AJ, Mesdaghi S, Sánchez Rodríguez F, Elliott L, Murphy DL, Kryshtafovych A, Keegan RM, Rigden DJ. Tertiary structure assessment at CASP15. Proteins 2023; 91:1616-1635. [PMID: 37746927 PMCID: PMC10792517 DOI: 10.1002/prot.26593] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 05/24/2023] [Revised: 08/25/2023] [Accepted: 09/07/2023] [Indexed: 09/26/2023]
Abstract
The results of tertiary structure assessment at CASP15 are reported. For the first time, recognizing the outstanding performance of AlphaFold 2 (AF2) at CASP14, all single-chain predictions were assessed together, irrespective of whether a template was available. At CASP15, there was no single stand-out group, with most of the best-scoring groups (led by PEZYFoldings, UM-TBM, and Yang Server) employing AF2 in one way or another. Many top groups paid special attention to generating deep Multiple Sequence Alignments (MSAs) and testing variant MSAs, thereby allowing them to successfully address some of the hardest targets. Such difficult targets, as well as lacking templates, were typically proteins with few homologues. Local divergence between prediction and target correlated with localization at crystal lattice or chain interfaces, and with regions exhibiting high B-factors in crystal structure targets, and should not necessarily be considered as representing error in the prediction. However, analysis of exposed and buried side chain accuracy showed room for improvement even in the latter. Nevertheless, a majority of groups produced high-quality predictions for most targets, which are valuable for experimental structure determination, functional analysis, and many other tasks across biology. These include those applying methods similar to those used to generate major resources such as the AlphaFold Protein Structure Database and the ESM Metagenomic Atlas: the confidence estimates of the former were also notably accurate.
Affiliation(s)
- Adam J. Simpkin
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Shahram Mesdaghi
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Computational Biology Facility, MerseyBio, University of Liverpool, Liverpool, UK
- Filomeno Sánchez Rodríguez
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Oxfordshire, UK
- Department of Chemistry, York Structural Biology Laboratory, University of York, York, UK
- Luc Elliott
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- David L. Murphy
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Ronan M. Keegan
- UKRI-STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot, UK
- Daniel J. Rigden
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
6
Mahmud S, Morehead A, Cheng J. Accurate prediction of protein tertiary structural changes induced by single-site mutations with equivariant graph neural networks. bioRxiv 2023:2023.10.03.560758. [PMID: 37873289 PMCID: PMC10592624 DOI: 10.1101/2023.10.03.560758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Predicting the change of protein tertiary structure caused by single-site mutations is important for studying protein structure, function, and interaction. Even though computational protein structure prediction methods such as AlphaFold can predict the overall tertiary structures of most proteins rather accurately, they are not sensitive enough to accurately predict the structural changes induced by single-site amino acid mutations on proteins. Specialized mutation prediction methods mostly focus on predicting the overall stability or function changes caused by mutations without attempting to predict the exact mutation-induced structural changes, limiting their use in protein mutation studies. In this work, we develop the first deep learning method based on equivariant graph neural networks (EGNN) to directly predict the tertiary structural changes caused by single-site mutations and the tertiary structure of any protein mutant from the structure of its wild-type counterpart. The results show that it performs substantially better in predicting the tertiary structures of protein mutants than the widely used protein structure prediction method AlphaFold.
7
Tam C, Iwasaki W. AlphaCutter: Efficient removal of non-globular regions from predicted protein structures. Proteomics 2023; 23:e2300176. [PMID: 37309722 DOI: 10.1002/pmic.202300176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 05/24/2023] [Accepted: 05/26/2023] [Indexed: 06/14/2023]
Abstract
A huge number of high-quality predicted protein structures are now publicly available. However, many of these structures contain non-globular regions, which diminish the performance of downstream structural bioinformatic applications. In this study, we develop AlphaCutter for the removal of non-globular regions from predicted protein structures. A large-scale cleaning of 542,380 predicted SwissProt structures highlights that AlphaCutter is able to (1) remove non-globular regions that are undetectable using pLDDT scores and (2) preserve high integrity of the cleaned domain regions. As useful applications, AlphaCutter improved the folding energy scores and sequence recovery rates in the re-design of domain regions. On average, AlphaCutter takes less than 3 s to clean a protein structure, enabling efficient cleaning of the exploding number of predicted protein structures. AlphaCutter is available at https://github.com/johnnytam100/AlphaCutter. AlphaCutter-cleaned SwissProt structures are available for download at https://doi.org/10.5281/zenodo.7944483.
Affiliation(s)
- Chunlai Tam
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Chiba, Japan
- Wataru Iwasaki
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Chiba, Japan