1
Ille AM, Markosian C, Burley SK, Mathews MB, Pasqualini R, Arap W. Generative artificial intelligence performs rudimentary structural biology modeling. Sci Rep 2024; 14:19372. [PMID: 39169047 PMCID: PMC11339285 DOI: 10.1038/s41598-024-69021-2] [Received: 03/26/2024] [Accepted: 07/30/2024] [Indexed: 08/23/2024]
Abstract
Natural language-based generative artificial intelligence (AI) has become increasingly prevalent in scientific research. Intriguingly, capabilities of generative pre-trained transformer (GPT) language models beyond the scope of natural language tasks have recently been identified. Here we explored how GPT-4 might be able to perform rudimentary structural biology modeling. We prompted GPT-4 to model 3D structures for the 20 standard amino acids and an α-helical polypeptide chain, with the latter incorporating Wolfram mathematical computation. We also used GPT-4 to perform structural interaction analysis between the anti-viral nirmatrelvir and its target, the SARS-CoV-2 main protease. Geometric parameters of the generated structures typically approximated close to experimental references. However, modeling was sporadically error-prone and molecular complexity was not well tolerated. Interaction analysis further revealed the ability of GPT-4 to identify specific amino acid residues involved in ligand binding along with corresponding bond distances. Despite current limitations, we show the current capacity of natural language generative AI to perform basic structural biology modeling and interaction analysis with atomic-scale accuracy.
Affiliation(s)
- Alexander M Ille
  - School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, NJ, USA
  - Rutgers Cancer Institute, Newark, NJ, USA
  - Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, NJ, USA
- Christopher Markosian
  - School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, NJ, USA
  - Rutgers Cancer Institute, Newark, NJ, USA
  - Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, NJ, USA
- Stephen K Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
  - Rutgers Cancer Institute, New Brunswick, NJ, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California-San Diego, La Jolla, San Diego, CA, USA
- Michael B Mathews
  - School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, NJ, USA
  - Division of Infectious Disease, Department of Medicine, Rutgers New Jersey Medical School, Newark, NJ, USA
- Renata Pasqualini
  - Rutgers Cancer Institute, Newark, NJ, USA
  - Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, NJ, USA
- Wadih Arap
  - Rutgers Cancer Institute, Newark, NJ, USA
  - Division of Hematology/Oncology, Department of Medicine, Rutgers New Jersey Medical School, Newark, NJ, USA
2
Chen HO, Cui YC, Lin PC, Chiang JH. An Innovative Multi-Omics Model Integrating Latent Alignment and Attention Mechanism for Drug Response Prediction. J Pers Med 2024; 14:694. [PMID: 39063948 PMCID: PMC11277895 DOI: 10.3390/jpm14070694] [Received: 05/27/2024] [Revised: 06/18/2024] [Accepted: 06/24/2024] [Indexed: 07/28/2024]
Abstract
By using omics, we can now examine all components of biological systems simultaneously. Deep learning-based drug prediction methods have shown promise by integrating cancer-related multi-omics data. However, the complex interaction between genes poses challenges in accurately projecting multi-omics data. In this research, we present a predictive model for drug response that incorporates diverse types of omics data, comprising genetic mutation, copy number variation, methylation, and gene expression data. This study proposes latent alignment for information mismatch in integration, which is achieved through an attention module capturing interactions among diverse types of omics data. The latent alignment and attention modules significantly improve predictions, outperforming the baseline model, with MSE = 1.1333, F1-score = 0.5342, and AUROC = 0.5776. High accuracy was achieved in predicting drug responses for piplartine and tenovin-6, while the accuracy was comparatively lower for mitomycin-C and obatoclax. The latent alignment module exclusively outperforms the baseline model, enhancing the MSE by 0.2375, the F1-score by 4.84%, and the AUROC by 6.1%. Similarly, the attention module only improves these metrics by 0.1899, 2.88%, and 2.84%, respectively. In the interpretability case study, panobinostat exhibited the most effective predicted response, with a value of -4.895. We provide reliable insights for drug selection in personalized medicine by identifying crucial genetic factors influencing drug response.
Affiliation(s)
- Hui-O Chen
  - Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan
  - Institute of Medical Informatics, National Cheng Kung University, Tainan 701, Taiwan
- Yuan-Chi Cui
  - Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan
  - Institute of Medical Informatics, National Cheng Kung University, Tainan 701, Taiwan
- Peng-Chan Lin
  - Department of Oncology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
  - Department of Genomic Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
- Jung-Hsien Chiang
  - Department of Oncology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
  - Department of Genomic Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 701, Taiwan
3
Ille AM, Markosian C, Burley SK, Mathews MB, Pasqualini R, Arap W. Generative artificial intelligence performs rudimentary structural biology modeling. bioRxiv 2024:2024.01.10.575113. [PMID: 38293060 PMCID: PMC10827103 DOI: 10.1101/2024.01.10.575113] [Indexed: 02/01/2024]
Abstract
Natural language-based generative artificial intelligence (AI) has become increasingly prevalent in scientific research. Intriguingly, capabilities of generative pre-trained transformer (GPT) language models beyond the scope of natural language tasks have recently been identified. Here we explored how GPT-4 might be able to perform rudimentary structural biology modeling. We prompted GPT-4 to model 3D structures for the 20 standard amino acids and an α-helical polypeptide chain, with the latter incorporating Wolfram mathematical computation. We also used GPT-4 to perform structural interaction analysis between nirmatrelvir and its target, the SARS-CoV-2 main protease. Geometric parameters of the generated structures typically approximated close to experimental references. However, modeling was sporadically error-prone and molecular complexity was not well tolerated. Interaction analysis further revealed the ability of GPT-4 to identify specific amino acid residues involved in ligand binding along with corresponding bond distances. Despite current limitations, we show the capacity of natural language generative AI to perform basic structural biology modeling and interaction analysis with atomic-scale accuracy.
Affiliation(s)
- Alexander M. Ille
  - School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, New Jersey, USA
  - Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA
  - Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Christopher Markosian
  - School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, New Jersey, USA
  - Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA
  - Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Stephen K. Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
  - Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, USA
- Michael B. Mathews
  - School of Graduate Studies, Rutgers, The State University of New Jersey, Newark, New Jersey, USA
  - Division of Infectious Disease, Department of Medicine, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Renata Pasqualini
  - Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA
  - Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, New Jersey, USA
- Wadih Arap
  - Rutgers Cancer Institute of New Jersey, Newark, New Jersey, USA
  - Division of Hematology/Oncology, Department of Medicine, Rutgers New Jersey Medical School, Newark, New Jersey, USA