1. Guo L, Lee HK, Oh S, Koirala GR, Kim TI. Smart Bioelectronics for Real-Time Diagnosis and Therapy of Body Organ Functions. ACS Sens 2025; 10:3239-3273. [PMID: 40310273 DOI: 10.1021/acssensors.5c00024]
Abstract
Noncommunicable diseases (NCDs) associated with cardiovascular, neurological, and gastrointestinal disorders remain a leading cause of global mortality, underscoring the urgent need for better diagnostic and therapeutic solutions. Wearable and implantable biointegrated electronics offer a groundbreaking solution, combining real-time, high-resolution monitoring with innovative treatment capabilities tailored to specific organ functions. In this comprehensive review, we focus on the diseases affecting the brain, heart, gastrointestinal organs, bladder, and adrenal gland, along with their associated physiological parameters. We provide an overview of the characteristics of these parameters, explore the potential of bioelectronic devices for in situ sensing and therapeutic applications, and highlight recent advancements in their deployment across specific organs. Finally, we analyze the current challenges and prospects of implementing closed-loop feedback control systems in integrated sensor-therapy applications. By emphasizing organ-specific applications and advocating for closed-loop systems, this review highlights the potential of future bioelectronics to address physiological needs and serves as a guide for researchers navigating the interdisciplinary fields of diagnostics, therapeutics, and personalized medicine.
Affiliation(s)
- Lili Guo: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Hin Kiu Lee: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Suyoun Oh: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Gyan Raj Koirala: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea; Biomedical Institute for Convergence at SKKU (BICS), Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Tae-Il Kim: School of Chemical Engineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea; Biomedical Institute for Convergence at SKKU (BICS), Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
2. Yoosefzadeh-Najafabadi M. From text to traits: exploring the role of large language models in plant breeding. Front Plant Sci 2025; 16:1583344. [PMID: 40438742 PMCID: PMC12116590 DOI: 10.3389/fpls.2025.1583344]
Abstract
Modern plant breeders regularly deal with intricate patterns within biological data in order to better understand the biological background behind a trait of interest and speed up the breeding process. Recently, Large Language Models (LLMs) have gained widespread adoption in everyday contexts, showcasing remarkable capabilities in understanding and generating human-like text. By harnessing the capabilities of LLMs, foundational models can be repurposed to uncover intricate patterns within biological data, leading to the development of robust and flexible predictive tools that provide valuable insights into complex plant breeding systems. Despite the significant progress made in utilizing LLMs in various scientific domains, their adoption within plant breeding remains largely unexplored, presenting a significant opportunity for innovation. This review paper explores how LLMs, initially designed for natural language tasks, can be adapted to address specific challenges in plant breeding, such as identifying novel genetic interactions, predicting the performance of a trait of interest, and integrating diverse datasets such as multi-omics, phenotypic, and environmental sources. Compared to conventional breeding methods, LLMs offer the potential to enhance the discovery of genetic relationships, improve trait prediction accuracy, and facilitate informed decision-making. This review aims to bridge this gap by highlighting current advancements, challenges, and future directions for integrating LLMs into plant breeding, ultimately contributing to sustainable agriculture and improved global food security.
3. Chang YC, Huang MS, Huang YH, Lin YH. The influence of prompt engineering on large language models for protein-protein interaction identification in biomedical literature. Sci Rep 2025; 15:15493. [PMID: 40319086 PMCID: PMC12049485 DOI: 10.1038/s41598-025-99290-4]
Abstract
Identifying protein-protein interactions (PPIs) is a foundational task in biomedical natural language processing. While specialized models have been developed, the potential of general-domain large language models (LLMs) in PPI extraction, particularly for researchers without computational expertise, remains unexplored. This study evaluates the effectiveness of proprietary LLMs (GPT-3.5, GPT-4, and Google Gemini) in PPI prediction through systematic prompt engineering. We designed six prompting scenarios of increasing complexity, from basic interaction queries to sophisticated entity-tagged formats, and assessed model performance across multiple benchmark datasets (LLL, IEPA, HPRD50, AIMed, BioInfer, and PEDD). Carefully designed prompts effectively guided LLMs in PPI prediction. Gemini 1.5 Pro achieved the highest performance across most datasets, with notable F1-scores in LLL (90.3%), IEPA (68.2%), HPRD50 (67.5%), and PEDD (70.2%). GPT-4 showed competitive performance, particularly in the LLL dataset (87.3%). We identified and addressed a positive prediction bias, demonstrating improved performance after evaluation refinement. While not surpassing specialized models, general-purpose LLMs with appropriate prompting strategies can effectively perform PPI prediction tasks, offering valuable tools for biomedical researchers without extensive computational expertise.
Affiliation(s)
- Yung-Chun Chang: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- Ming-Siang Huang: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan
- Yi-Hsuan Huang: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan
- Yi-Hsuan Lin: Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan
4. Chen Q, Hu Y, Peng X, Xie Q, Jin Q, Gilson A, Singer MB, Ai X, Lai PT, Wang Z, Keloth VK, Raja K, Huang J, He H, Lin F, Du J, Zhang R, Zheng WJ, Adelman RA, Lu Z, Xu H. Benchmarking large language models for biomedical natural language processing applications and recommendations. Nat Commun 2025; 16:3280. [PMID: 40188094 PMCID: PMC11972378 DOI: 10.1038/s41467-025-56989-2]
Abstract
The rapid growth of biomedical literature poses challenges for manual knowledge curation and synthesis. Biomedical Natural Language Processing (BioNLP) automates this process. While Large Language Models (LLMs) have shown promise in general domains, their effectiveness in BioNLP tasks remains unclear due to limited benchmarks and practical guidelines. We perform a systematic evaluation of four LLMs (GPT and LLaMA representatives) on 12 BioNLP benchmarks across six applications. We compare their zero-shot, few-shot, and fine-tuning performance with the traditional fine-tuning of BERT or BART models. We examine inconsistencies, missing information, and hallucinations, and perform cost analysis. Here, we show that traditional fine-tuning outperforms zero- or few-shot LLMs in most tasks. However, closed-source LLMs like GPT-4 excel in reasoning-related tasks such as medical question answering. Open-source LLMs still require fine-tuning to close performance gaps. We find issues like missing information and hallucinations in LLM outputs. These results offer practical insights for applying LLMs in BioNLP.
Affiliation(s)
- Qingyu Chen: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA; National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Yan Hu: McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Xueqing Peng: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Qianqian Xie: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Qiao Jin: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Aidan Gilson: Department of Ophthalmology and Visual Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Maxwell B Singer: Department of Ophthalmology and Visual Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Xuguang Ai: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Po-Ting Lai: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Zhizheng Wang: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Vipina K Keloth: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Kalpana Raja: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Jimin Huang: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Huan He: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Fongci Lin: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Jingcheng Du: McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Rui Zhang: Division of Computational Health Sciences, Department of Surgery, Medical School, University of Minnesota, Minneapolis, MN, USA; Center for Learning Health System Sciences, University of Minnesota, Minneapolis, MN 55455, USA
- W Jim Zheng: McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Ron A Adelman: Department of Ophthalmology and Visual Science, Yale School of Medicine, Yale University, New Haven, CT, USA
- Zhiyong Lu: National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Hua Xu: Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA
5. Rehana H, Zheng J, Yeh L, Bansal B, Çam NB, Jemiyo C, McGregor B, Özgür A, He Y, Hur J. Cancer Vaccine Adjuvant Name Recognition from Biomedical Literature using Large Language Models. arXiv 2025; arXiv:2502.09659v1. [PMID: 40196147 PMCID: PMC11975310]
Abstract
Motivation: An adjuvant is a chemical incorporated into vaccines that enhances their efficacy by improving the immune response. Identifying adjuvant names from cancer vaccine studies is essential for furthering research and enhancing immunotherapies. However, manual curation from the constantly expanding biomedical literature poses significant challenges. This study explores the automated recognition of vaccine adjuvant names using state-of-the-art Large Language Models (LLMs), specifically Generative Pretrained Transformers (GPT) and Large Language Model Meta AI (Llama). Methods: We utilized two datasets: 97 clinical trial records from AdjuvareDB and 290 PubMed abstracts annotated with the Vaccine Adjuvant Compendium (VAC). Two LLMs, GPT-4o and Llama 3.2, were employed in zero-shot and few-shot learning paradigms with up to four examples per prompt. Prompts explicitly targeted adjuvant names, testing the impact of contextual information such as substances or interventions. Outputs underwent automated and manual validation for accuracy and consistency. Results: GPT-4o consistently attained 100% precision across all settings, while also exhibiting notable gains in recall and F1-score, particularly with the incorporation of interventions. On the VAC dataset, GPT-4o achieved a maximum F1-score of 77.32% with interventions, surpassing Llama-3.2-3B by approximately 2%. On the AdjuvareDB dataset, GPT-4o reached an F1-score of 81.67% for three-shot prompting with interventions, surpassing Llama-3.2-3B's maximum F1-score of 65.62%. These results highlight the critical role of contextual information in enhancing model performance, with GPT-4o demonstrating a superior ability to leverage this enrichment. Conclusion: Our findings demonstrate that LLMs excel at accurately identifying adjuvant names, including rare and novel variations of naming representation. This study emphasizes the capability of LLMs to enhance cancer vaccine development by efficiently extracting insights from clinical trial data. Future work aims to broaden the framework to encompass a wider array of biomedical literature and enhance model generalizability across various vaccines and adjuvants. Availability: Source code is available at https://github.com/hurlab/Vaccine-Adjuvant-LLM.
Affiliation(s)
- Hasin Rehana: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA; School of Electrical Engineering & Computer Science, University of North Dakota, Grand Forks, North Dakota, 58202, USA
- Jie Zheng: Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, 48109, USA
- Leo Yeh: Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, 48109, USA
- Benu Bansal: School of Electrical Engineering & Computer Science, University of North Dakota, Grand Forks, North Dakota, 58202, USA; Department of Biomedical Engineering, University of North Dakota, Grand Forks, North Dakota, 58202, USA
- Nur Bengisu Çam: Department of Computer Engineering, Bogazici University, 34342 Istanbul, Turkey
- Christianah Jemiyo: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA
- Brett McGregor: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA
- Arzucan Özgür: Department of Computer Engineering, Bogazici University, 34342 Istanbul, Turkey
- Yongqun He: Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, 48109, USA
- Junguk Hur: Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota, 58202, USA
6. Wang J, Cheng Z, Yao Q, Liu L, Xu D, Hu G. Bioinformatics and biomedical informatics with ChatGPT: Year one review. Quant Biol 2024; 12:345-359. [PMID: 39364207 PMCID: PMC11446534 DOI: 10.1002/qub2.67]
Abstract
The year 2023 marked a significant surge in the exploration of applying large language model chatbots, notably Chat Generative Pre-trained Transformer (ChatGPT), across various disciplines. We surveyed the application of ChatGPT in bioinformatics and biomedical informatics throughout the year, covering omics, genetics, biomedical text mining, drug discovery, biomedical image understanding, bioinformatics programming, and bioinformatics education. Our survey delineates the current strengths and limitations of this chatbot in bioinformatics and offers insights into potential avenues for future developments.
Affiliation(s)
- Jinge Wang: Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, West Virginia, USA
- Zien Cheng: Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, West Virginia, USA
- Qiuming Yao: School of Computing, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
- Li Liu: College of Health Solutions, Arizona State University, Phoenix, Arizona, USA; Biodesign Institute, Arizona State University, Tempe, Arizona, USA
- Dong Xu: Department of Electrical Engineering and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
- Gangqing Hu: Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, West Virginia, USA
7. Li X, Zheng Y, Hu J, Zheng J, Wang Z, He Y. VaxLLM: Leveraging Fine-tuned Large Language Model for automated annotation of Brucella Vaccines. bioRxiv 2024; 2024.11.25.625209. [PMID: 39651132 PMCID: PMC11623542 DOI: 10.1101/2024.11.25.625209]
Abstract
Background: Vaccines play a vital role in enhancing immune defense and protecting hosts against a wide range of diseases. However, vaccine annotation remains a labor-intensive task due to the ever-increasing volume of scientific literature. This study explores the application of Large Language Models (LLMs) to automate the classification and annotation of scientific literature on vaccines, as exemplified with Brucella vaccines. Results: We developed a pipeline that automatically performs the classification and annotation of Brucella vaccine-related articles from their titles and abstracts. The pipeline includes VaxLLM (Vaccine Large Language Model), a fine-tuned Llama 3 model. VaxLLM systematically classifies articles by identifying the presence of vaccine formulations and extracts key information about vaccines, including vaccine antigen, vaccine formulation, vaccine platform, host species used as animal models, and experiments used to investigate the vaccine. The model demonstrated high performance in classification (Precision: 0.90, Recall: 1.0, F1-score: 0.95) and annotation accuracy (97.9%), significantly outperforming a corresponding non-fine-tuned Llama 3 model. The outputs from VaxLLM are presented in a structured format to facilitate integration into databases such as the VIOLIN vaccine knowledgebase. To further enhance the accuracy and depth of the Brucella vaccine data annotations, the pipeline also incorporates PubTator, enabling cross-comparison with VaxLLM annotations and supporting downstream analyses such as gene enrichment. Conclusion: VaxLLM rapidly and accurately extracted detailed, itemized vaccine information from publications, significantly outperforming traditional annotation methods in both speed and precision. VaxLLM also shows great potential for automating knowledge extraction in the domain of vaccine research. Availability: All data are available at https://github.com/xingxianli/VaxLLM, and the model was also uploaded to Hugging Face (https://huggingface.co/Xingxian123/VaxLLM).
8. Ivanisenko TV, Demenkov PS, Ivanisenko VA. An Accurate and Efficient Approach to Knowledge Extraction from Scientific Publications Using Structured Ontology Models, Graph Neural Networks, and Large Language Models. Int J Mol Sci 2024; 25:11811. [PMID: 39519363 PMCID: PMC11546091 DOI: 10.3390/ijms252111811]
Abstract
The rapid growth of biomedical literature makes it challenging for researchers to stay current. Integrating knowledge from various sources is crucial for studying complex biological systems. Traditional text-mining methods often have limited accuracy because they do not capture semantic and contextual nuances. Deep-learning models can be computationally expensive and typically have low interpretability, though efforts in explainable AI aim to mitigate this. Furthermore, transformer-based models have a tendency to produce false or fabricated information (a problem known as hallucination), which is especially prevalent in large language models (LLMs). This study proposes a hybrid approach combining text-mining techniques with graph neural networks (GNNs) and fine-tuned LLMs to extend biomedical knowledge graphs and interpret predicted edges based on published literature. An LLM is used to validate predictions and provide explanations. Evaluated on a corpus of experimentally confirmed protein interactions, the approach achieved a Matthews correlation coefficient (MCC) of 0.772. Applied to insomnia, the approach identified 25 interactions among 32 human proteins absent from known knowledge bases, including regulatory interactions between MAOA and 5-HT2C, binding between ADAM22 and 14-3-3 proteins, which is implicated in neurological diseases, and a circadian regulatory loop involving RORB and NR1D1. The hybrid GNN-LLM method efficiently analyzes biomedical literature to uncover potential molecular interactions for complex disorders. It can accelerate therapeutic target discovery by focusing expert verification on the most relevant automatically extracted information.
Affiliation(s)
- Timofey V. Ivanisenko: The Artificial Intelligence Research Center of Novosibirsk State University, Pirogova Street 1, Novosibirsk 630090, Russia; Institute of Cytology & Genetics, Siberian Branch, Russian Academy of Sciences, Prospekt Lavrentyeva 10, Novosibirsk 630090, Russia
- Pavel S. Demenkov: The Artificial Intelligence Research Center of Novosibirsk State University, Pirogova Street 1, Novosibirsk 630090, Russia; Institute of Cytology & Genetics, Siberian Branch, Russian Academy of Sciences, Prospekt Lavrentyeva 10, Novosibirsk 630090, Russia
- Vladimir A. Ivanisenko: The Artificial Intelligence Research Center of Novosibirsk State University, Pirogova Street 1, Novosibirsk 630090, Russia; Institute of Cytology & Genetics, Siberian Branch, Russian Academy of Sciences, Prospekt Lavrentyeva 10, Novosibirsk 630090, Russia