1. Guo L, Lee HK, Oh S, Koirala GR, Kim TI. Smart Bioelectronics for Real-Time Diagnosis and Therapy of Body Organ Functions. ACS Sens 2025; 10:3239-3273. [PMID: 40310273 DOI: 10.1021/acssensors.5c00024]
Abstract
Noncommunicable diseases (NCDs) associated with cardiovascular, neurological, and gastrointestinal disorders remain a leading cause of global mortality, underscoring the urgent need for better diagnostic and therapeutic solutions. Wearable and implantable biointegrated electronics offer a groundbreaking solution, combining real-time, high-resolution monitoring with innovative treatment capabilities tailored to specific organ functions. In this comprehensive review, we focus on the diseases affecting the brain, heart, gastrointestinal organs, bladder, and adrenal gland, along with their associated physiological parameters. We provide an overview of the characteristics of these parameters, explore the potential of bioelectronic devices for in situ sensing and therapeutic applications, and highlight recent advancements in their deployment across specific organs. Finally, we analyze the current challenges and prospects of implementing closed-loop feedback control systems in integrated sensor-therapy applications. By emphasizing organ-specific applications and advocating for closed-loop systems, this review highlights the potential of future bioelectronics to address physiological needs and serves as a guide for researchers navigating the interdisciplinary fields of diagnostics, therapeutics, and personalized medicine.
Affiliation(s)
- Lili Guo: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Hin Kiu Lee: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Suyoun Oh: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Gyan Raj Koirala: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea; Biomedical Institute for Convergence at SKKU (BICS), Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Tae-Il Kim: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea; Biomedical Institute for Convergence at SKKU (BICS), Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
2. Yoosefzadeh-Najafabadi M. From text to traits: exploring the role of large language models in plant breeding. Front Plant Sci 2025; 16:1583344. [PMID: 40438742 PMCID: PMC12116590 DOI: 10.3389/fpls.2025.1583344]
Abstract
Modern plant breeders regularly deal with intricate patterns within biological data in order to better understand the biological background behind a trait of interest and speed up the breeding process. Recently, Large Language Models (LLMs) have gained widespread adoption in everyday contexts, showcasing remarkable capabilities in understanding and generating human-like text. By harnessing the capabilities of LLMs, foundational models can be repurposed to uncover intricate patterns within biological data, leading to the development of robust and flexible predictive tools that provide valuable insights into complex plant breeding systems. Despite the significant progress made in utilizing LLMs in various scientific domains, their adoption within plant breeding remains largely unexplored, presenting a significant opportunity for innovation. This review paper explores how LLMs, initially designed for natural language tasks, can be adapted to address specific challenges in plant breeding, such as identifying novel genetic interactions, predicting the performance of a trait of interest, and integrating diverse datasets such as multi-omics, phenotypic, and environmental sources. Compared to conventional breeding methods, LLMs offer the potential to enhance the discovery of genetic relationships, improve trait prediction accuracy, and facilitate informed decision-making. This review aims to bridge this gap by highlighting current advancements, challenges, and future directions for integrating LLMs into plant breeding, ultimately contributing to sustainable agriculture and improved global food security.
3. Chang YC, Huang MS, Huang YH, Lin YH. The influence of prompt engineering on large language models for protein-protein interaction identification in biomedical literature. Sci Rep 2025; 15:15493. [PMID: 40319086 PMCID: PMC12049485 DOI: 10.1038/s41598-025-99290-4]
Abstract
Identifying protein-protein interactions (PPIs) is a foundational task in biomedical natural language processing. While specialized models have been developed, the potential of general-domain large language models (LLMs) in PPI extraction, particularly for researchers without computational expertise, remains unexplored. This study evaluates the effectiveness of proprietary LLMs (GPT-3.5, GPT-4, and Google Gemini) in PPI prediction through systematic prompt engineering. We designed six prompting scenarios of increasing complexity, from basic interaction queries to sophisticated entity-tagged formats, and assessed model performance across multiple benchmark datasets (LLL, IEPA, HPRD50, AIMed, BioInfer, and PEDD). Carefully designed prompts effectively guided LLMs in PPI prediction. Gemini 1.5 Pro achieved the highest performance across most datasets, with notable F1-scores in LLL (90.3%), IEPA (68.2%), HPRD50 (67.5%), and PEDD (70.2%). GPT-4 showed competitive performance, particularly in the LLL dataset (87.3%). We identified and addressed a positive prediction bias, demonstrating improved performance after evaluation refinement. While not surpassing specialized models, general-purpose LLMs with appropriate prompting strategies can effectively perform PPI prediction tasks, offering valuable tools for biomedical researchers without extensive computational expertise.
Affiliation(s)
- Yung-Chun Chang: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- Ming-Siang Huang: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan
- Yi-Hsuan Huang: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan
- Yi-Hsuan Lin: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan
4. Chen Q, Hu Y, Peng X, Xie Q, Jin Q, Gilson A, Singer MB, Ai X, Lai PT, Wang Z, Keloth VK, Raja K, Huang J, He H, Lin F, Du J, Zhang R, Zheng WJ, Adelman RA, Lu Z, Xu H. Benchmarking large language models for biomedical natural language processing applications and recommendations. Nat Commun 2025; 16:3280. [PMID: 40188094 PMCID: PMC11972378 DOI: 10.1038/s41467-025-56989-2]
Abstract
The rapid growth of biomedical literature poses challenges for manual knowledge curation and synthesis. Biomedical Natural Language Processing (BioNLP) automates this process. While Large Language Models (LLMs) have shown promise in general domains, their effectiveness in BioNLP tasks remains unclear due to limited benchmarks and practical guidelines. We perform a systematic evaluation of four LLMs (GPT and LLaMA representatives) on 12 BioNLP benchmarks across six applications. We compare their zero-shot, few-shot, and fine-tuning performance with the traditional fine-tuning of BERT or BART models. We examine inconsistencies, missing information, and hallucinations, and perform cost analysis. Here, we show that traditional fine-tuning outperforms zero- or few-shot LLMs in most tasks. However, closed-source LLMs like GPT-4 excel in reasoning-related tasks such as medical question answering. Open-source LLMs still require fine-tuning to close performance gaps. We find issues like missing information and hallucinations in LLM outputs. These results offer practical insights for applying LLMs in BioNLP.
Affiliation(s)
- Qingyu Chen: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA; National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Yan Hu: McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Xueqing Peng: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Qianqian Xie: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Qiao Jin: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Aidan Gilson: Department of Ophthalmology and Visual Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Maxwell B Singer: Department of Ophthalmology and Visual Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Xuguang Ai: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Po-Ting Lai: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Zhizheng Wang: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Vipina K Keloth: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Kalpana Raja: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Jimin Huang: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Huan He: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Fongci Lin: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Jingcheng Du: McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Rui Zhang: Division of Computational Health Sciences, Department of Surgery, Medical School, University of Minnesota, Minneapolis, MN, USA; Center for Learning Health System Sciences, University of Minnesota, Minneapolis, MN 55455, USA
- W Jim Zheng: McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Ron A Adelman: Department of Ophthalmology and Visual Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Zhiyong Lu: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Hua Xu: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
5. Rehana H, Zheng J, Yeh L, Bansal B, Çam NB, Jemiyo C, McGregor B, Özgür A, He Y, Hur J. Cancer Vaccine Adjuvant Name Recognition from Biomedical Literature using Large Language Models. arXiv 2025; arXiv:2502.09659v1. [PMID: 40196147 PMCID: PMC11975310]
Abstract
Motivation: An adjuvant is a chemical incorporated into vaccines that enhances their efficacy by improving the immune response. Identifying adjuvant names from cancer vaccine studies is essential for furthering research and enhancing immunotherapies. However, manual curation from the constantly expanding biomedical literature poses significant challenges. This study explores the automated recognition of vaccine adjuvant names using state-of-the-art Large Language Models (LLMs), specifically Generative Pretrained Transformers (GPT) and Large Language Model Meta AI (Llama). Methods: We utilized two datasets: 97 clinical trial records from AdjuvareDB and 290 PubMed abstracts annotated with the Vaccine Adjuvant Compendium (VAC). Two LLMs, GPT-4o and Llama 3.2, were employed in zero-shot and few-shot learning paradigms with up to four examples per prompt. Prompts explicitly targeted adjuvant names, testing the impact of contextual information such as substances or interventions. Outputs underwent automated and manual validation for accuracy and consistency. Results: GPT-4o consistently attained 100% precision across all settings, while also exhibiting notable gains in recall and F1-score, particularly with the incorporation of interventions. On the VAC dataset, GPT-4o achieved a maximum F1-score of 77.32% with interventions, surpassing Llama-3.2-3B by approximately 2%. On the AdjuvareDB dataset, GPT-4o reached an F1-score of 81.67% for three-shot prompting with interventions, surpassing Llama-3.2-3B's maximum F1-score of 65.62%. These results highlight the critical role of contextual information in enhancing model performance, with GPT-4o demonstrating a superior ability to leverage this enrichment. Conclusion: Our findings demonstrate that LLMs excel at accurately identifying adjuvant names, including rare and novel variations of naming representation. This study emphasizes the capability of LLMs to enhance cancer vaccine development by efficiently extracting insights from clinical trial data. Future work aims to broaden the framework to encompass a wider array of biomedical literature and enhance model generalizability across various vaccines and adjuvants. Availability: Source code is available at https://github.com/hurlab/Vaccine-Adjuvant-LLM.
Affiliation(s)
- Hasin Rehana: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA; School of Electrical Engineering & Computer Science, University of North Dakota, Grand Forks, North Dakota, 58202, USA
- Jie Zheng: Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, 48109, USA
- Leo Yeh: Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, 48109, USA
- Benu Bansal: School of Electrical Engineering & Computer Science, University of North Dakota, Grand Forks, North Dakota, 58202, USA; Department of Biomedical Engineering, University of North Dakota, Grand Forks, North Dakota, 58202, USA
- Nur Bengisu Çam: Department of Computer Engineering, Bogazici University, 34342 Istanbul, Turkey
- Christianah Jemiyo: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA
- Brett McGregor: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA
- Arzucan Özgür: Department of Computer Engineering, Bogazici University, 34342 Istanbul, Turkey
- Yongqun He: Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, 48109, USA
- Junguk Hur: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA
6. Wang J, Cheng Z, Yao Q, Liu L, Xu D, Hu G. Bioinformatics and biomedical informatics with ChatGPT: Year one review. Quant Biol 2024; 12:345-359. [PMID: 39364207 PMCID: PMC11446534 DOI: 10.1002/qub2.67]
Abstract
The year 2023 marked a significant surge in the exploration of applying large language model chatbots, notably Chat Generative Pre-trained Transformer (ChatGPT), across various disciplines. We surveyed the application of ChatGPT in bioinformatics and biomedical informatics throughout the year, covering omics, genetics, biomedical text mining, drug discovery, biomedical image understanding, bioinformatics programming, and bioinformatics education. Our survey delineates the current strengths and limitations of this chatbot in bioinformatics and offers insights into potential avenues for future developments.
Affiliation(s)
- Jinge Wang: Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, West Virginia, USA
- Zien Cheng: Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, West Virginia, USA
- Qiuming Yao: School of Computing, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
- Li Liu: College of Health Solutions, Arizona State University, Phoenix, Arizona, USA; Biodesign Institute, Arizona State University, Tempe, Arizona, USA
- Dong Xu: Department of Electrical Engineering and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
- Gangqing Hu: Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, West Virginia, USA
7. Li X, Zheng Y, Hu J, Zheng J, Wang Z, He Y. VaxLLM: Leveraging Fine-tuned Large Language Model for automated annotation of Brucella Vaccines. bioRxiv 2024; 2024.11.25.625209. [PMID: 39651132 PMCID: PMC11623542 DOI: 10.1101/2024.11.25.625209]
Abstract
Background: Vaccines play a vital role in enhancing immune defense and protecting hosts against a wide range of diseases. However, vaccine annotation remains a labor-intensive task due to the ever-increasing volume of scientific literature. This study explores the application of Large Language Models (LLMs) to automate the classification and annotation of scientific literature on vaccines, as exemplified with Brucella vaccines. Results: We developed a pipeline that automatically performs the classification and annotation of Brucella vaccine-related articles from their titles and abstracts. The pipeline includes VaxLLM (Vaccine Large Language Model), a fine-tuned Llama 3 model. VaxLLM systematically classifies articles by identifying the presence of vaccine formulations and extracts key information about vaccines, including vaccine antigen, vaccine formulation, vaccine platform, host species used as animal models, and experiments used to investigate the vaccine. The model demonstrated high performance in classification (Precision: 0.90, Recall: 1.0, F1-score: 0.95) and annotation accuracy (97.9%), significantly outperforming a corresponding non-fine-tuned Llama 3 model. The outputs from VaxLLM are presented in a structured format to facilitate integration into databases such as the VIOLIN vaccine knowledgebase. To further enhance the accuracy and depth of the Brucella vaccine data annotations, the pipeline also incorporates PubTator, enabling cross-comparison with VaxLLM annotations and supporting downstream analyses such as gene enrichment. Conclusion: VaxLLM rapidly and accurately extracted detailed, itemized vaccine information from publications, significantly outperforming traditional annotation methods in both speed and precision. VaxLLM also shows great potential for automating knowledge extraction in the domain of vaccine research. Availability: All data are available at https://github.com/xingxianli/VaxLLM, and the model was also uploaded to Hugging Face (https://huggingface.co/Xingxian123/VaxLLM).
8. Ivanisenko TV, Demenkov PS, Ivanisenko VA. An Accurate and Efficient Approach to Knowledge Extraction from Scientific Publications Using Structured Ontology Models, Graph Neural Networks, and Large Language Models. Int J Mol Sci 2024; 25:11811. [PMID: 39519363 PMCID: PMC11546091 DOI: 10.3390/ijms252111811]
Abstract
The rapid growth of biomedical literature makes it challenging for researchers to stay current. Integrating knowledge from various sources is crucial for studying complex biological systems. Traditional text-mining methods often have limited accuracy because they do not capture semantic and contextual nuances. Deep-learning models can be computationally expensive and typically have low interpretability, though efforts in explainable AI aim to mitigate this. Furthermore, transformer-based models have a tendency to produce false or fabricated information (a problem known as hallucination), which is especially prevalent in large language models (LLMs). This study proposes a hybrid approach combining text-mining techniques with graph neural networks (GNNs) and fine-tuned LLMs to extend biomedical knowledge graphs and interpret predicted edges based on published literature. An LLM is used to validate predictions and provide explanations. Evaluated on a corpus of experimentally confirmed protein interactions, the approach achieved a Matthews correlation coefficient (MCC) of 0.772. Applied to insomnia, the approach identified 25 interactions among 32 human proteins absent from known knowledge bases, including regulatory interactions between MAOA and 5-HT2C, binding between ADAM22 and 14-3-3 proteins, which is implicated in neurological diseases, and a circadian regulatory loop involving RORB and NR1D1. The hybrid GNN-LLM method efficiently analyzes biomedical literature to uncover potential molecular interactions for complex disorders. It can accelerate therapeutic target discovery by focusing expert verification on the most relevant automatically extracted information.
Affiliation(s)
- Timofey V. Ivanisenko: The Artificial Intelligence Research Center of Novosibirsk State University, Pirogova Street 1, Novosibirsk 630090, Russia; Institute of Cytology & Genetics, Siberian Branch, Russian Academy of Sciences, Prospekt Lavrentyeva 10, Novosibirsk 630090, Russia
- Pavel S. Demenkov: The Artificial Intelligence Research Center of Novosibirsk State University, Pirogova Street 1, Novosibirsk 630090, Russia; Institute of Cytology & Genetics, Siberian Branch, Russian Academy of Sciences, Prospekt Lavrentyeva 10, Novosibirsk 630090, Russia
- Vladimir A. Ivanisenko: The Artificial Intelligence Research Center of Novosibirsk State University, Pirogova Street 1, Novosibirsk 630090, Russia; Institute of Cytology & Genetics, Siberian Branch, Russian Academy of Sciences, Prospekt Lavrentyeva 10, Novosibirsk 630090, Russia