Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

1166
(from Reference Citation Analysis)

Article PDFs (115)

Cited by > 0 (778)

Searched Name

Zhiyong Lu

Number	Citation Analysis
1	Enhanced humification of full-scale apple wood and cow manure by promoting lignocellulose degradation via biomass pretreatments. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024;929:172646. [PMID: 38653417 DOI: 10.1016/j.scitotenv.2024.172646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/04/2023] [Revised: 02/26/2024] [Accepted: 04/19/2024] [Indexed: 04/25/2024] Abstract Agroforestry waste and cow manure pollute the environment, of which, agroforestry waste is difficult to degrade. Compost is an effective way to dispose agroforestry waste; however, the low degradation efficiency of lignocellulose in agroforestry waste affects the process of composting humification. This study investigated lignocellulose degradation and composting humification in full-size apple wood and cow manure composting processes by applying different pretreatments (acidic, alkaline, and high-temperature) to apple wood. Simultaneously, physicochemical characterization and metagenome sequencing were combined to analyze the function of carbohydrate-active enzymes database (CAZy). Therefore, microbial communities and functions were linked during the composting process and the lignocellulose degradation mechanism was elaborated. The results showed that the addition of apple wood increased the compost humus (HS) yield, and pretreatment of apple wood enhanced the lignocellulose degradation during composting processes. In addition, pretreatment improved the physicochemical properties, such as temperature, pH, electric conductivity (EC), ammonium nitrogen (NH4+), and nitrate nitrogen (NO3-) in the compost, of which, acid treated apple wood compost (AcAWC) achieved the highest temperature of 58.4 °C, effectively promoting nitrification with NO3- ultimately reaching 0.127 g/kg. 
In all composts, microbial networks constructed a high proportion of positively correlated connections, and microorganisms promoted the composting process through cooperation. The proportions of glycosyltransferase (GT) and glycoside hydrolase (GH) promoted the separation and degradation of lignocellulose during composting to form HS. Notably, the adverse effects of the alkali-treated apple wood compost on bacteria were greater. AcAWC showed significant correlations between bacterial and fungal communities and both lignin and hemicellulose, and had more biomarkers associated with lignocellulose degradation and humification. The lignin degradation rate was 24.57 % and the HS yield increased by 27.49 %. Therefore, AcAWC has been confirmed to enhance lignocellulose degradation and promote compost humification by altering the properties of the apple wood and establishing a richer microbial community. Collapse Key Words CAZy Composting Humification Lignocellulose Metagenome Collapse MESH Headings Lignin/metabolism Manure Malus Animals Wood Composting Cattle Biomass Humic Substances Biodegradation, Environmental Collapse Grants Collapse
2	Detection of abdominopelvic lymph nodes in multi-parametric MRI. Comput Med Imaging Graph 2024;114:102363. [PMID: 38447381 PMCID: PMC10981570 DOI: 10.1016/j.compmedimag.2024.102363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2023] [Revised: 02/16/2024] [Accepted: 02/17/2024] [Indexed: 03/08/2024] Abstract Reliable localization of lymph nodes (LNs) in multi-parametric MRI (mpMRI) studies plays a major role in the assessment of lymphadenopathy and staging of metastatic disease. Radiologists routinely measure the nodal size in order to distinguish benign from malignant nodes, which require subsequent cancer staging. However, identification of lymph nodes is a cumbersome task due to their myriad appearances in mpMRI studies. Multiple sequences are acquired in mpMRI studies, including T2 fat suppressed (T2FS) and diffusion weighted imaging (DWI) sequences among others; consequently, the sizing of LNs is rendered challenging due to the variety of signal intensities in these sequences. Furthermore, radiologists can miss potentially metastatic LNs during a busy clinical day. To lighten these imaging and workflow challenges, we propose a computer-aided detection (CAD) pipeline to detect both benign and malignant LNs in the body for their subsequent measurement. We employed the recently proposed Dynamic Head (DyHead) neural network to detect LNs in mpMRI studies that were acquired using a variety of scanners and exam protocols. The T2FS and DWI series were co-registered, and a selective augmentation technique called Intra-Label LISA (ILL) was used to blend the two volumes with the interpolation factor drawn from a Beta distribution. 
In this way, ILL diversified the samples that the model encountered during the training phase, while the requirement for both sequences to be present at test time was nullified. Our results showed a mean average precision (mAP) of 53.5% and a sensitivity of ∼78% with ILL at 4 FP/vol. This corresponded to an improvement of ≥10% in mAP and ≥12% in sensitivity at 4 FP/vol, respectively (p < 0.05), over current LN detection approaches evaluated on the same dataset. We also established the out-of-distribution robustness of the DyHead model by training it on data acquired by a Siemens Aera scanner and testing it on data from the Siemens Verio, Siemens Biograph mMR, and Philips Achieva scanners. Our pilot work represents an important first step towards automated detection, segmentation, and classification of lymph nodes in mpMRI. Key Words: DWI; Deep learning; Detection; Lymph nodes; MRI; Multi-parametric MRI; T2. MESH Headings: Humans; Multiparametric Magnetic Resonance Imaging; Lymphatic Metastasis/diagnostic imaging; Lymphatic Metastasis/pathology; Diffusion Magnetic Resonance Imaging/methods; Lymph Nodes/diagnostic imaging; Neoplasm Staging. Grants: Z01 CL040004 Intramural NIH HHS.
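The blending step described in the abstract above (two co-registered volumes mixed with an interpolation factor drawn from a Beta distribution) can be illustrated with a minimal sketch. This is a generic mixup-style blend, not the authors' ILL implementation: the function name `blend_volumes`, the flattened-list representation of a volume, and the shape parameter `alpha=1.0` are all assumptions, since the abstract does not state the Beta parameters.

```python
import random


def blend_volumes(t2fs, dwi, alpha=1.0, rng=None):
    """Blend two co-registered volumes voxel-wise.

    t2fs, dwi: flattened voxel intensities of equal length (hypothetical layout).
    alpha: shape parameter of the symmetric Beta(alpha, alpha) distribution
           (assumed value; the abstract does not give it).
    Returns the blended voxels and the interpolation factor that was drawn.
    """
    rng = rng or random.Random()
    if len(t2fs) != len(dwi):
        raise ValueError("volumes must have the same number of voxels")
    lam = rng.betavariate(alpha, alpha)  # interpolation factor in [0, 1]
    blended = [lam * a + (1.0 - lam) * b for a, b in zip(t2fs, dwi)]
    return blended, lam
```

Because both volumes image the same anatomy, the lymph node annotations would stay unchanged for the blended sample; only the intensities are interpolated.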
3	Three-dimensional pseudo-continuous arterial spin-labelled perfusion imaging for diagnosing upper cervical lymph node metastasis in patients with nasopharyngeal carcinoma: a whole-node histogram analysis. Clin Radiol 2024;79:e736-e743. [PMID: 38341343 DOI: 10.1016/j.crad.2024.01.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2023] [Revised: 01/11/2024] [Accepted: 01/16/2024] [Indexed: 02/12/2024] Abstract AIM To evaluate whole-node histogram parameters of blood flow (BF) maps derived from three-dimensional pseudo-continuous arterial spin-labelled (3D pCASL) imaging in discriminating metastatic from benign upper cervical lymph nodes (UCLNs) for nasopharyngeal carcinoma (NPC) patients. MATERIALS AND METHODS Eighty NPC patients with a total of 170 histologically confirmed UCLNs (67 benign and 103 metastatic) were included retrospectively. Pre-treatment 3D pCASL imaging was performed and whole-node histogram analysis was then applied. Histogram parameters and morphological features, such as minimum axis diameter (MinAD), maximum axis diameter (MaxAD), and location of UCLNs, were assessed and compared between benign and metastatic lesions. Predictors were identified and further applied to establish a combined model by multivariate logistic regression in predicting the probability of metastatic UCLNs. Receiver operating characteristic (ROC) curves were used to analyse the diagnostic performance. RESULTS Metastatic UCLNs had larger MinAD and MinAD/MaxAD ratio, greater energy and entropy values, and higher incidence of level II (upper jugular group), but lower BF10th value than benign nodes (all p<0.05). MinAD, BF10th, energy, and entropy were validated as independent predictors in diagnosing metastatic UCLNs. 
The combined model yielded an area under the curve (AUC) of 0.932, accuracy of 84.42%, sensitivity of 80.6%, and specificity of 90.29%. CONCLUSIONS Whole-node histogram analysis on BF maps is a feasible tool to differentiate metastatic from benign UCLNs in NPC patients, and the combined model can further improve the diagnostic efficacy. MESH Headings: Humans; Nasopharyngeal Carcinoma/diagnostic imaging; Nasopharyngeal Carcinoma/pathology; Lymphatic Metastasis/diagnostic imaging; Lymphatic Metastasis/pathology; Retrospective Studies; Nasopharyngeal Neoplasms/pathology; Perfusion Imaging; Lymph Nodes/diagnostic imaging; Lymph Nodes/pathology.
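For readers less familiar with the diagnostic measures reported above, the following sketch shows how accuracy, sensitivity, and specificity follow from confusion-matrix counts. The function name and the example counts are hypothetical; the abstract does not report the underlying counts, so nothing here reproduces the study's figures.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: metastatic nodes classified correctly / incorrectly.
    tn/fp: benign nodes classified correctly / incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / positives
    specificity = tn / negatives
    accuracy = (tp + tn) / (positives + negatives)
    return accuracy, sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, sens, spec = diagnostic_metrics(tp=8, fn=2, tn=9, fp=1)
```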
4	Leveraging generative AI for clinical evidence synthesis needs to ensure trustworthiness. J Biomed Inform 2024;153:104640. [PMID: 38608915 DOI: 10.1016/j.jbi.2024.104640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2023] [Revised: 04/08/2024] [Accepted: 04/09/2024] [Indexed: 04/14/2024] Abstract Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence. The rapid growth of medical evidence, which can be obtained from various sources, poses a challenge in collecting, appraising, and synthesizing the evidential information. Recent advancements in generative AI, exemplified by large language models, hold promise in facilitating the arduous task. However, developing accountable, fair, and inclusive models remains a complicated undertaking. In this perspective, we discuss the trustworthiness of generative AI in the context of automated summarization of medical evidence. Collapse Key Words Evidence-based medicine Large language models Medical evidence summarization Trustworthy generative AI Collapse MESH Headings Evidence-Based Medicine Humans Artificial Intelligence Trust Natural Language Processing Collapse Grants Collapse
5	Matching Patients to Clinical Trials with Large Language Models. ARXIV 2024:arXiv:2307.15051v4. [PMID: 37576126 PMCID: PMC10418514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 08/15/2023] Abstract Clinical trials are often hindered by the challenge of patient recruitment. In this work, we introduce TrialGPT, a first-of-its-kind large language model (LLM) framework to assist patient-to-trial matching. Given a patient note, TrialGPT predicts the patient's eligibility on a criterion-by-criterion basis and then consolidates these predictions to assess the patient's eligibility for the target trial. We evaluate the trial-level prediction performance of TrialGPT on three publicly available cohorts of 184 patients with over 18,000 trial annotations. We also engaged three physicians to label over 1,000 patient-criterion pairs to assess its criterion-level prediction accuracy. Experimental results show that TrialGPT achieves a criterion-level accuracy of 87.3% with faithful explanations, close to the expert performance (88.7%-90.0%). The aggregated TrialGPT scores are highly correlated with human eligibility judgments, and they outperform the best-competing models by 32.6% to 57.2% in ranking and excluding clinical trials. Furthermore, our user study reveals that TrialGPT can significantly reduce the screening time (by 42.6%) in a real-life clinical trial matching task. These results and analyses have demonstrated promising opportunities for clinical trial matching with LLMs such as TrialGPT. Collapse Key Words Collapse MESH Headings Collapse Grants Collapse
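The two-stage idea in the abstract above (criterion-by-criterion eligibility predictions that are then consolidated into a trial-level assessment) might be sketched as follows. The scoring rule used here, the share of inclusion criteria met minus the share of exclusion criteria met, is an assumption chosen for illustration; the abstract does not describe how TrialGPT aggregates its predictions, and every name below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CriterionPrediction:
    kind: str   # "inclusion" or "exclusion"
    met: bool   # whether the patient is predicted to meet the criterion


def trial_score(preds):
    """Hypothetical aggregation: inclusion share met minus exclusion share met."""
    inclusion = [p for p in preds if p.kind == "inclusion"]
    exclusion = [p for p in preds if p.kind == "exclusion"]
    inc = sum(p.met for p in inclusion) / len(inclusion) if inclusion else 0.0
    exc = sum(p.met for p in exclusion) / len(exclusion) if exclusion else 0.0
    return inc - exc


def rank_trials(trials):
    """Order trial identifiers from most to least suitable for the patient."""
    return sorted(trials, key=lambda t: trial_score(trials[t]), reverse=True)
```

A trial where the patient meets every inclusion criterion and no exclusion criterion scores 1.0 under this rule, so it is ranked first.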
6	A survey of recent methods for addressing AI fairness and bias in biomedicine. J Biomed Inform 2024;154:104646. [PMID: 38677633 DOI: 10.1016/j.jbi.2024.104646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2024] [Accepted: 04/17/2024] [Indexed: 04/29/2024] Abstract OBJECTIVES Artificial intelligence (AI) systems have the potential to revolutionize clinical practices, including improving diagnostic accuracy and surgical decision-making, while also reducing costs and manpower. However, it is important to recognize that these systems may perpetuate social inequities or demonstrate biases, such as those based on race or gender. Such biases can occur before, during, or after the development of AI models, making it critical to understand and address potential biases to enable the accurate and reliable application of AI models in clinical settings. To mitigate bias concerns during model development, we surveyed recent publications on different debiasing methods in the fields of biomedical natural language processing (NLP) or computer vision (CV). Then we discussed the methods, such as data perturbation and adversarial learning, that have been applied in the biomedical domain to address bias. METHODS We performed our literature search on PubMed, ACM digital library, and IEEE Xplore of relevant articles published between January 2018 and December 2023 using multiple combinations of keywords. We then filtered the result of 10,041 articles automatically with loose constraints, and manually inspected the abstracts of the remaining 890 articles to identify the 55 articles included in this review. Additional articles in the references are also included in this review. We discuss each method and compare its strengths and weaknesses. 
Finally, we review other potential methods from the general domain that could be applied to biomedicine to address bias and improve fairness. RESULTS The bias of AIs in biomedicine can originate from multiple sources such as insufficient data, sampling bias and the use of health-irrelevant features or race-adjusted algorithms. Existing debiasing methods that focus on algorithms can be categorized into distributional or algorithmic. Distributional methods include data augmentation, data perturbation, data reweighting methods, and federated learning. Algorithmic approaches include unsupervised representation learning, adversarial learning, disentangled representation learning, loss-based methods and causality-based methods. Collapse Key Words AI Bias Biomedicine Fairness Collapse MESH Headings Collapse Grants Collapse
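Among the distributional methods listed above, data reweighting is the simplest to illustrate. The sketch below assigns each sample a weight inversely proportional to the size of its group, so that every group contributes equally to a weighted loss. This is one common reweighting scheme, given as an assumption rather than a method taken from any specific paper in the survey; the function name is hypothetical.

```python
from collections import Counter


def group_balanced_weights(groups):
    """Return one weight per sample so that each group has equal total weight.

    groups: a sequence of group labels (for example a demographic attribute),
            one per training sample.
    """
    counts = Counter(groups)
    n_samples, n_groups = len(groups), len(counts)
    # Each group's weights sum to n_samples / n_groups.
    return [n_samples / (n_groups * counts[g]) for g in groups]
```

With labels `["a", "a", "a", "b"]`, the single `"b"` sample receives weight 2.0 and each `"a"` sample receives 2/3, so both groups carry a total weight of 2.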
7	An Updated Simplified Severity Scale for Age-Related Macular Degeneration, Incorporating Reticular Pseudodrusen: Age-Related Eye Disease Study Report No. 42. Ophthalmology 2024:S0161-6420(24)00263-X. [PMID: 38657840 DOI: 10.1016/j.ophtha.2024.04.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2024] [Revised: 03/25/2024] [Accepted: 04/15/2024] [Indexed: 04/26/2024] Open Abstract PURPOSE To update the Age-Related Eye Disease Study (AREDS) Simplified Severity Scale for risk of late age-related macular degeneration (AMD), including incorporation of reticular pseudodrusen (RPD), and to perform external validation on the AREDS2. DESIGN Post hoc analysis of two clinical trial cohorts: AREDS and AREDS2. PARTICIPANTS Participants with no late AMD in either eye at baseline in AREDS (n=2719) and AREDS2 (n=1472). METHODS Five-year rates of progression to late AMD were calculated according to levels 0-4 on the Simplified Severity Scale, following two updates: (i) non-central GA considered part of the outcome rather than a risk feature, and (ii) scale separation according to RPD status (determined by validated deep learning grading of color fundus photographs). MAIN OUTCOME MEASURES Five-year rate of progression to late AMD (defined as neovascular AMD or any GA). RESULTS In the AREDS, following the first scale update, the five-year rates of progression to late AMD for levels 0-4 were 0.3%, 4.5%, 12.9%, 32.2%, and 55.6%, respectively. Following both updates, the proportion progressing to late AMD by five years was 8.4% in participants without RPD and 40.6% in those with RPD. As the final Simplified Severity Scale, the five-year progression rates for levels 0-4, respectively, were 0.3%, 4.3%, 11.6%, 26.7%, and 50.0%, for participants without RPD at baseline, and 2.8%, 8.0%, 29.0%, 58.7%, and 72.2%, for participants with RPD at baseline. 
In external validation on the AREDS2, for levels 2-4, the progression rates were similar, at 15.0%, 27.7%, and 45.7% (RPD absent) and 26.2%, 46.0%, and 73.0% (RPD present), respectively. CONCLUSIONS The AREDS AMD Simplified Severity Scale has been modernized with two important updates. The new scale for individuals without RPD has five-year progression rates of ∼0.5%, 4%, 12%, ∼25%, and 50%, such that the rates on the original scale remain accurate. The new scale for individuals with RPD has five-year progression rates of 3%, 8%, ∼30%, ∼60%, and ∼70%, i.e., approximately double for most levels. This scale fits updated definitions of late AMD, has increased prognostic accuracy, appears generalizable to similar populations, but remains simple for broad risk categorization. Collapse Key Words Collapse MESH Headings Collapse Grants Collapse
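The final scale reported in the abstract above amounts to a lookup from severity level and RPD status to a five-year progression rate. A minimal sketch of that lookup, using only the AREDS percentages quoted in the abstract, is shown below; the function and variable names are hypothetical, and the sketch is for illustration rather than clinical use.

```python
# Five-year rates of progression to late AMD (%), levels 0-4, as quoted in the
# abstract for AREDS participants without and with RPD at baseline.
FIVE_YEAR_RATE_PCT = {
    False: (0.3, 4.3, 11.6, 26.7, 50.0),   # RPD absent
    True: (2.8, 8.0, 29.0, 58.7, 72.2),    # RPD present
}


def five_year_progression_rate(level, rpd_present):
    """Look up the reported five-year progression rate (%) for a scale level."""
    if level not in range(5):
        raise ValueError("level must be an integer from 0 to 4")
    return FIVE_YEAR_RATE_PCT[bool(rpd_present)][level]
```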
8	Hidden Flaws Behind Expert-Level Accuracy of Multimodal GPT-4 Vision in Medicine. ARXIV 2024:arXiv:2401.08396v3. [PMID: 38410646 PMCID: PMC10896362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Subscribe] [Scholar Register] [Indexed: 02/28/2024] Abstract Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges - an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V performs comparatively to human physicians regarding multi-choice accuracy (81.6% vs. 77.8%). GPT-4V also performs well in cases where physicians incorrectly answer, with over 78% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominent in image comprehension (27.2%). Regardless of GPT-4V's high accuracy in multi-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows. Collapse Key Words Collapse MESH Headings Collapse Grants R01 LM014344 NLM NIH HHS Collapse
9	Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study. J Med Internet Res 2024;26:e56655. [PMID: 38630520 PMCID: PMC11063893 DOI: 10.2196/56655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2024] [Revised: 02/17/2024] [Accepted: 03/06/2024] [Indexed: 04/19/2024] Open Abstract BACKGROUND Although patients have easy access to their electronic health records and laboratory test result data through patient portals, laboratory test results are often confusing and hard to understand. Many patients turn to web-based forums or question-and-answer (Q&A) sites to seek advice from their peers. The quality of answers from social Q&A sites on health-related questions varies significantly, and not all responses are accurate or reliable. Large language models (LLMs) such as ChatGPT have opened a promising avenue for patients to have their questions answered. OBJECTIVE We aimed to assess the feasibility of using LLMs to generate relevant, accurate, helpful, and unharmful responses to laboratory test-related questions asked by patients and identify potential issues that can be mitigated using augmentation approaches. METHODS We collected laboratory test result-related Q&A data from Yahoo! Answers and selected 53 Q&A pairs for this study. Using the LangChain framework and ChatGPT web portal, we generated responses to the 53 questions from 5 LLMs: GPT-4, GPT-3.5, LLaMA 2, MedAlpaca, and ORCA_mini. We assessed the similarity of their answers using standard Q&A similarity-based evaluation metrics, including Recall-Oriented Understudy for Gisting Evaluation, Bilingual Evaluation Understudy, Metric for Evaluation of Translation With Explicit Ordering, and Bidirectional Encoder Representations from Transformers Score. 
We used an LLM-based evaluator to judge whether a target model had higher quality in terms of relevance, correctness, helpfulness, and safety than the baseline model. We performed a manual evaluation with medical experts for all the responses to 7 selected questions on the same 4 aspects. RESULTS With the GPT-4 output used as the reference answer, the responses from GPT-3.5 were the most similar among the other 4 LLMs, followed by those from LLaMA 2, ORCA_mini, and MedAlpaca. Human answers from Yahoo data were scored the lowest and, thus, as the least similar to GPT-4-generated answers. The results of the win rate and medical expert evaluation both showed that GPT-4's responses achieved better scores than all the other LLM responses and human responses on all 4 aspects (relevance, correctness, helpfulness, and safety). LLM responses occasionally also suffered from lack of interpretation in one's medical context, incorrect statements, and lack of references. CONCLUSIONS By evaluating LLMs in generating responses to patients' laboratory test result-related questions, we found that, compared to the other 4 LLMs and human answers from a Q&A website, GPT-4's responses were more accurate, helpful, relevant, and safer. There were cases in which GPT-4 responses were inaccurate and not individualized. We identified a number of ways to improve the quality of LLM responses, including prompt engineering, prompt augmentation, retrieval-augmented generation, and response evaluation. Key Words: ChatGPT; generative AI; generative artificial intelligence; laboratory test results; large language models; natural language processing; patient education. MESH Headings: Humans; Animals; Camelids, New World; Benchmarking; Electronic Health Records; Engineering; Language. Grants: R21 HS029969 AHRQ HHS; UL1 TR001427 NCATS NIH HHS.
10	PubTator 3.0: an AI-powered literature resource for unlocking biomedical knowledge. Nucleic Acids Res 2024:gkae235. [PMID: 38572754 DOI: 10.1093/nar/gkae235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2024] [Revised: 03/02/2024] [Accepted: 03/21/2024] [Indexed: 04/05/2024] Open Abstract PubTator 3.0 (https://www.ncbi.nlm.nih.gov/research/pubtator3/) is a biomedical literature resource using state-of-the-art AI techniques to offer semantic and relation searches for key concepts like proteins, genetic variants, diseases and chemicals. It currently provides over one billion entity and relation annotations across approximately 36 million PubMed abstracts and 6 million full-text articles from the PMC open access subset, updated weekly. PubTator 3.0's online interface and API utilize these precomputed entity relations and synonyms to provide advanced search capabilities and enable large-scale analyses, streamlining many complex information needs. We showcase the retrieval quality of PubTator 3.0 using a series of entity pair queries, demonstrating that PubTator 3.0 retrieves a greater number of articles than either PubMed or Google Scholar, with higher precision in the top 20 results. We further show that integrating ChatGPT (GPT-4) with PubTator APIs dramatically improves the factuality and verifiability of its responses. In summary, PubTator 3.0 offers a comprehensive set of features and tools that allow researchers to navigate the ever-expanding wealth of biomedical literature, expediting research and unlocking valuable insights for scientific discovery. Collapse Key Words Collapse MESH Headings Collapse Grants NIH HHS Collapse
11	Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge. ARXIV 2024:arXiv:2310.16112v2. [PMID: 37986726 PMCID: PMC10659524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 11/22/2023] Abstract Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification. Collapse Key Words Chest X-ray Computer-aided diagnosis Long-tailed learning Collapse MESH Headings Collapse Grants R01 LM014306 NLM NIH HHS Collapse
12	Advancing entity recognition in biomedicine via instruction tuning of large language models. Bioinformatics 2024;40:btae163. [PMID: 38514400 PMCID: PMC11001490 DOI: 10.1093/bioinformatics/btae163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2023] [Revised: 02/18/2024] [Accepted: 03/19/2024] [Indexed: 03/23/2024] Open Abstract MOTIVATION Large Language Models (LLMs) have the potential to revolutionize the field of Natural Language Processing, excelling not only in text generation and reasoning tasks but also in their ability for zero/few-shot learning, swiftly adapting to new tasks with minimal fine-tuning. LLMs have also demonstrated great promise in biomedical and healthcare applications. However, when it comes to Named Entity Recognition (NER), particularly within the biomedical domain, LLMs fall short of the effectiveness exhibited by fine-tuned domain-specific models. One key reason is that NER is typically conceptualized as a sequence labeling task, whereas LLMs are optimized for text generation and reasoning tasks. RESULTS We developed an instruction-based learning paradigm that transforms biomedical NER from a sequence labeling task into a generation task. This paradigm is end-to-end and streamlines the training and evaluation process by automatically repurposing pre-existing biomedical NER datasets. We further developed BioNER-LLaMA using the proposed paradigm with LLaMA-7B as the foundational LLM. We conducted extensive testing on BioNER-LLaMA across three widely recognized biomedical NER datasets, consisting of entities related to diseases, chemicals, and genes. The results revealed that BioNER-LLaMA consistently achieved higher F1-scores ranging from 5% to 30% compared to the few-shot learning capabilities of GPT-4 on datasets with different biomedical entities. 
We show that a general-domain LLM can match the performance of rigorously fine-tuned PubMedBERT models and PMC-LLaMA, a biomedical-specific language model. Our findings underscore the potential of our proposed paradigm in developing general-domain LLMs that can rival SOTA performances in multi-task, multi-domain scenarios in biomedical and health applications. AVAILABILITY AND IMPLEMENTATION Datasets and other resources are available at https://github.com/BIDS-Xu-Lab/BioNER-LLaMA. MESH Headings: Animals; Camelids, New World; Deep Learning; Language; Natural Language Processing. Grants: R01AG078154 NIH HHS National Institutes of Health Intramural Research Program of the National Library of Medicine.
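The central idea of the abstract above, recasting sequence labeling as text generation by repurposing existing NER datasets, can be sketched as a conversion from BIO-tagged tokens to an instruction and response pair. The prompt wording, the dictionary keys, and the function names are assumptions for illustration; the abstract does not give the actual BioNER-LLaMA template.

```python
def bio_to_entities(tokens, tags):
    """Decode BIO tags into (mention, type) pairs."""
    entities, current, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag closes any open mention.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities


def to_instruction_example(tokens, tags, entity_type):
    """Build a generation-style training example (hypothetical template)."""
    mentions = [m for m, t in bio_to_entities(tokens, tags) if t == entity_type]
    return {
        "instruction": f"List all {entity_type} entities in the text.",
        "input": " ".join(tokens),
        "output": "; ".join(mentions) if mentions else "None",
    }
```

The model is then trained to generate the `output` string given the instruction and input, instead of predicting one tag per token.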
13	Atomically Precise Single-Site Catalysts via Exsolution in a Polyoxometalate-Metal-Organic-Framework Architecture. J Am Chem Soc 2024;146:7950-7955. [PMID: 38483267 DOI: 10.1021/jacs.4c00523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/28/2024] Abstract Single-site catalysts (SSCs) achieve a high catalytic performance through atomically dispersed active sites. A challenge facing the development of SSCs is aggregation of active catalytic species. Reducing the loading of these sites to very low levels is a common strategy to mitigate aggregation and sintering; however, this limits the tools that can be used to characterize the SSCs. Here we report a sintering-resistant SSC with high loading that is achieved by incorporating Anderson-Evans polyoxometalate clusters (POMs, MMo6O24, M = Rh/Pt) within NU-1000, a Zr-based metal-organic framework (MOF). The dual confinement provided by isolating the active site within the POM, then isolating the POMs within the MOF, facilitates the formation of isolated noble metal sites with low coordination numbers via exsolution from the POM during activation. The high loading (up to 3.2 wt %) that can be achieved without sintering allowed the local structure transformation in the POM cluster and the surrounding MOF to be evaluated using in situ X-ray scattering with pair distribution function (PDF) analysis. Notably, the Rh/Pt···Mo distance in the active catalyst is shorter than the M···M bond lengths in the respective bulk metals. Models of the active cluster structure were identified based on the PDF data with complementary computation and X-ray absorption spectroscopy analysis. Collapse Key Words Collapse MESH Headings Collapse Grants Collapse
14	Response to Letter to Editor 'Timely need for navigating the potential and downsides of LLMs in healthcare and biomedicine'. Brief Bioinform 2024;25:bbae211. [PMID: 38706322 PMCID: PMC11070722 DOI: 10.1093/bib/bbae211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Accepted: 04/22/2024] [Indexed: 05/07/2024] MESH Headings: Humans; Delivery of Health Care; Biomedical Research. Grants: NLM NIH HHS NIH Intramural Research Program National Library of Medicine.
15	A survey of recent methods for addressing AI fairness and bias in biomedicine. ARXIV 2024:arXiv:2402.08250v1. [PMID: 38529077 PMCID: PMC10962742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 03/27/2024] Abstract Objectives Artificial intelligence (AI) systems have the potential to revolutionize clinical practices, including improving diagnostic accuracy and surgical decision-making, while also reducing costs and manpower. However, it is important to recognize that these systems may perpetuate social inequities or demonstrate biases, such as those based on race or gender. Such biases can occur before, during, or after the development of AI models, making it critical to understand and address potential biases to enable the accurate and reliable application of AI models in clinical settings. To mitigate bias concerns during model development, we surveyed recent publications on different debiasing methods in the fields of biomedical natural language processing (NLP) or computer vision (CV). Then we discussed the methods, such as data perturbation and adversarial learning, that have been applied in the biomedical domain to address bias. Methods We performed our literature search on PubMed, ACM digital library, and IEEE Xplore of relevant articles published between January 2018 and December 2023 using multiple combinations of keywords. We then filtered the result of 10,041 articles automatically with loose constraints, and manually inspected the abstracts of the remaining 890 articles to identify the 55 articles included in this review. Additional articles in the references are also included in this review. We discuss each method and compare its strengths and weaknesses. Finally, we review other potential methods from the general domain that could be applied to biomedicine to address bias and improve fairness. 
Results The bias of AIs in biomedicine can originate from multiple sources such as insufficient data, sampling bias and the use of health-irrelevant features or race-adjusted algorithms. Existing debiasing methods that focus on algorithms can be categorized into distributional or algorithmic. Distributional methods include data augmentation, data perturbation, data reweighting methods, and federated learning. Algorithmic approaches include unsupervised representation learning, adversarial learning, disentangled representation learning, loss-based methods and causality-based methods. [Key Words: AI bias biomedicine fairness]
16	GeneGPT: augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics 2024;40:btae075. [PMID: 38341654 PMCID: PMC10904143 DOI: 10.1093/bioinformatics/btae075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2023] [Revised: 01/08/2024] [Accepted: 02/08/2024] [Indexed: 02/12/2024] Open Abstract MOTIVATION While large language models (LLMs) have been successfully applied to various tasks, they still face challenges with hallucinations. Augmenting LLMs with domain-specific tools such as database utilities can facilitate easier and more precise access to specialized knowledge. In this article, we present GeneGPT, a novel method for teaching LLMs to use the Web APIs of the National Center for Biotechnology Information (NCBI) for answering genomics questions. Specifically, we prompt Codex to solve the GeneTuring tests with NCBI Web APIs by in-context learning and an augmented decoding algorithm that can detect and execute API calls. RESULTS Experimental results show that GeneGPT achieves state-of-the-art performance on eight tasks in the GeneTuring benchmark with an average score of 0.83, largely surpassing retrieval-augmented LLMs such as the new Bing (0.44), biomedical LLMs such as BioMedLM (0.08) and BioGPT (0.04), as well as GPT-3 (0.16) and ChatGPT (0.12). Our further analyses suggest that: First, API demonstrations have good cross-task generalizability and are more useful than documentations for in-context learning; second, GeneGPT can generalize to longer chains of API calls and answer multi-hop questions in GeneHop, a novel dataset introduced in this work; finally, different types of errors are enriched in different tasks, providing valuable insights for future improvements. 
AVAILABILITY AND IMPLEMENTATION The GeneGPT code and data are publicly available at https://github.com/ncbi/GeneGPT. [MESH Headings: Algorithms Benchmarking Databases, Factual Documentation Language] [Grants: NLM NIH HHS NIH National Library of Medicine]
17	PubMed and beyond: biomedical literature search in the age of artificial intelligence. EBioMedicine 2024;100:104988. [PMID: 38306900 PMCID: PMC10850402 DOI: 10.1016/j.ebiom.2024.104988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2023] [Revised: 01/14/2024] [Accepted: 01/15/2024] [Indexed: 02/04/2024] Open Abstract Biomedical research yields vast information, much of which is only accessible through the literature. Consequently, literature search is crucial for healthcare and biomedicine. Recent improvements in artificial intelligence (AI) have expanded functionality beyond keywords, but they might be unfamiliar to clinicians and researchers. In response, we present an overview of over 30 literature search tools tailored to common biomedical use cases, aiming at helping readers efficiently fulfill their information needs. We first discuss recent improvements and continued challenges of the widely used PubMed. Then, we describe AI-based literature search tools catering to five specific information needs: 1. Evidence-based medicine. 2. Precision medicine and genomics. 3. Searching by meaning, including questions. 4. Finding related articles with literature recommendation. 5. Discovering hidden associations through literature mining. Finally, we discuss the impacts of recent developments of large language models such as ChatGPT on biomedical information seeking. [Key Words: Artificial intelligence Biomedical literature search] [MESH Headings: Humans Artificial Intelligence Data Mining PubMed Delivery of Health Care Biomedical Research]
18	Improving large language models for clinical named entity recognition via prompt engineering. J Am Med Inform Assoc 2024:ocad259. [PMID: 38281112 DOI: 10.1093/jamia/ocad259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2023] [Revised: 12/15/2023] [Accepted: 12/26/2023] [Indexed: 01/29/2024] Open Abstract IMPORTANCE The study highlights the potential of large language models, specifically GPT-3.5 and GPT-4, in processing complex clinical data and extracting meaningful information with minimal training data. By developing and refining prompt-based strategies, we can significantly enhance the models' performance, making them viable tools for clinical NER tasks and possibly reducing the reliance on extensive annotated datasets. OBJECTIVES This study quantifies the capabilities of GPT-3.5 and GPT-4 for clinical named entity recognition (NER) tasks and proposes task-specific prompts to improve their performance. MATERIALS AND METHODS We evaluated these models on 2 clinical NER tasks: (1) to extract medical problems, treatments, and tests from clinical notes in the MTSamples corpus, following the 2010 i2b2 concept extraction shared task, and (2) to identify nervous system disorder-related adverse events from safety reports in the vaccine adverse event reporting system (VAERS). To improve the GPT models' performance, we developed a clinical task-specific prompt framework that includes (1) baseline prompts with task description and format specification, (2) annotation guideline-based prompts, (3) error analysis-based instructions, and (4) annotated samples for few-shot learning. We assessed each prompt's effectiveness and compared the models to BioClinicalBERT. RESULTS Using baseline prompts, GPT-3.5 and GPT-4 achieved relaxed F1 scores of 0.634, 0.804 for MTSamples and 0.301, 0.593 for VAERS. 
Additional prompt components consistently improved model performance. When all 4 components were used, GPT-3.5 and GPT-4 achieved relaxed F1 scores of 0.794, 0.861 for MTSamples and 0.676, 0.736 for VAERS, demonstrating the effectiveness of our prompt framework. Although these results trail BioClinicalBERT (F1 of 0.901 for the MTSamples dataset and 0.802 for the VAERS), they are very promising considering that few training samples are needed. DISCUSSION The study's findings suggest a promising direction in leveraging LLMs for clinical NER tasks. However, while the performance of GPT models improved with task-specific prompts, there's a need for further development and refinement. LLMs like GPT-4 show potential in achieving close performance to state-of-the-art models like BioClinicalBERT, but they still require careful prompt engineering and understanding of task-specific knowledge. The study also underscores the importance of evaluation schemas that accurately reflect the capabilities and performance of LLMs in clinical settings. CONCLUSION While direct application of GPT models to clinical NER tasks falls short of optimal performance, our task-specific prompt framework, incorporating medical knowledge and training samples, significantly enhances GPT models' feasibility for potential clinical applications. [Key Words: GPT-3.5 GPT-4 clinical named entity recognition large language models prompt engineering] [Grants: R21EB029575 NIH HHS NIA NIH HHS]
19	Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation. ARXIV 2024:arXiv:2401.13867v1. [PMID: 38410650 PMCID: PMC10896353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 02/28/2024] Abstract Large language models like GPT-3.5-turbo and GPT-4 hold promise for healthcare professionals, but they may inadvertently inherit biases during their training, potentially affecting their utility in medical applications. Despite few attempts in the past, the precise impact and extent of these biases remain uncertain. Through both qualitative and quantitative analyses, we find that these models tend to project higher costs and longer hospitalizations for White populations and exhibit optimistic views in challenging medical scenarios with much higher survival rates. These biases, which mirror real-world healthcare disparities, are evident in the generation of patient backgrounds, the association of specific diseases with certain races, and disparities in treatment recommendations, etc. Our findings underscore the critical need for future research to address and mitigate biases in language models, especially in critical healthcare applications, to ensure fair and accurate outcomes for all patients.
20	Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study. ARXIV 2024:arXiv:2402.01693v1. [PMID: 38529075 PMCID: PMC10962749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 03/27/2024] Abstract Lab results are often confusing and hard to understand. Large language models (LLMs) such as ChatGPT have opened a promising avenue for patients to get their questions answered. We aim to assess the feasibility of using LLMs to generate relevant, accurate, helpful, and unharmful responses to lab test-related questions asked by patients and to identify potential issues that can be mitigated with augmentation approaches. We first collected lab test results related question and answer data from Yahoo! Answers and selected 53 QA pairs for this study. Using the LangChain framework and ChatGPT web portal, we generated responses to the 53 questions from four LLMs including GPT-4, Meta LLaMA 2, MedAlpaca, and ORCA_mini. We first assessed the similarity of their answers using standard QA similarity-based evaluation metrics including ROUGE, BLEU, METEOR, BERTScore. We also utilized an LLM-based evaluator to judge whether a target model has higher quality in terms of relevance, correctness, helpfulness, and safety than the baseline model. Finally, we performed a manual evaluation with medical experts for all the responses to seven selected questions on the same four aspects. The results of Win Rate and medical expert evaluation both showed that GPT-4's responses achieved better scores than all the other LLM responses and human responses on all four aspects (relevance, correctness, helpfulness, and safety). However, LLM responses occasionally also suffer from a lack of interpretation in one's medical context, incorrect statements, and lack of references. 
We find that, compared to the other three LLMs and the human answers from the Q&A website, GPT-4's responses are more accurate, helpful, relevant, and safer. However, there are cases in which GPT-4 responses are inaccurate and not individualized. We identified a number of ways to improve the quality of LLM responses.
21	PubTator 3.0: an AI-powered Literature Resource for Unlocking Biomedical Knowledge. ARXIV 2024:arXiv:2401.11048v1. [PMID: 38410657 PMCID: PMC10896359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 02/28/2024] Abstract PubTator 3.0 (https://www.ncbi.nlm.nih.gov/research/pubtator3/) is a biomedical literature resource using state-of-the-art AI techniques to offer semantic and relation searches for key concepts like proteins, genetic variants, diseases, and chemicals. It currently provides over one billion entity and relation annotations across approximately 36 million PubMed abstracts and 6 million full-text articles from the PMC open access subset, updated weekly. PubTator 3.0's online interface and API utilize these precomputed entity relations and synonyms to provide advanced search capabilities and enable large-scale analyses, streamlining many complex information needs. We showcase the retrieval quality of PubTator 3.0 using a series of entity pair queries, demonstrating that PubTator 3.0 retrieves a greater number of articles than either PubMed or Google Scholar, with higher precision in the top 20 results. We further show that integrating ChatGPT (GPT-4) with PubTator APIs dramatically improves the factuality and verifiability of its responses. In summary, PubTator 3.0 offers a comprehensive set of features and tools that allow researchers to navigate the ever-expanding wealth of biomedical literature, expediting research and unlocking valuable insights for scientific discovery.
22	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2024;52:D33-D43. [PMID: 37994677 PMCID: PMC10767890 DOI: 10.1093/nar/gkad1044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/24/2023] Open Abstract The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov. [MESH Headings: Biotechnology/instrumentation Databases, Genetic Databases, Nucleic Acid Internet National Library of Medicine (U.S.) United States] [Grants: NIH HHS NIH HHS National Institutes of Health]
23	Universal detection and segmentation of lymph nodes in multi-parametric MRI. Int J Comput Assist Radiol Surg 2024;19:163-170. [PMID: 37326816 PMCID: PMC11072433 DOI: 10.1007/s11548-023-02954-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2023] [Accepted: 05/05/2023] [Indexed: 06/17/2023] Abstract PURPOSE Reliable measurement of lymph nodes (LNs) in multi-parametric MRI (mpMRI) studies of the body plays a major role in the assessment of lymphadenopathy and staging of metastatic disease. Previous approaches do not adequately exploit the complementary sequences in mpMRI to universally detect and segment lymph nodes, and they have shown fairly limited performance. METHODS We propose a computer-aided detection and segmentation pipeline to leverage the T2 fat-suppressed (T2FS) and diffusion-weighted imaging (DWI) series from a mpMRI study. The T2FS and DWI series in 38 studies (38 patients) were co-registered and blended together using a selective data augmentation technique, such that traits of both series were visible in the same volume. A mask RCNN model was subsequently trained for universal detection and segmentation of 3D LNs. RESULTS Experiments on 18 test mpMRI studies revealed that the proposed pipeline achieved a precision of [Formula: see text]%, sensitivity of [Formula: see text]% at 4 false positives (FP) per volume, and dice score of [Formula: see text]%. This represented an improvement of [Formula: see text]% in precision, [Formula: see text]% in sensitivity at 4 FP/volume, and [Formula: see text]% in dice score, respectively, over current approaches evaluated on the same dataset. CONCLUSION Our pipeline universally detected and segmented both metastatic and non-metastatic nodes in mpMRI studies. 
At test time, the input data used by the trained model could either be the T2FS series alone or a blend of co-registered T2FS and DWI series. Contrary to prior work, this eliminated the reliance on both the T2FS and DWI series in a mpMRI study. [Key Words: Detection Lymph node MRI Segmentation deep learning T2] [MESH Headings: Humans Multiparametric Magnetic Resonance Imaging Diffusion Magnetic Resonance Imaging/methods Lung Mediastinum Lymph Nodes/diagnostic imaging Lymph Nodes/pathology] [Grants: Z01 CL040004 Intramural NIH HHS Z99 CL999999 Intramural NIH HHS]
24	One-Pot Synthesis of MOF@MOF: Structural Incompatibility Leads to Core-Shell Structure and Adaptability Control Makes the Sequence. SMALL (WEINHEIM AN DER BERGSTRASSE, GERMANY) 2024;20:e2305881. [PMID: 37670528 DOI: 10.1002/smll.202305881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Indexed: 09/07/2023] Abstract Core-shell metal-organic frameworks (MOF@MOF) are promising materials with sophisticated structures that can not only enhance the properties of MOFs but also endow them with new functions. The growth of isotopic core-shell MOFs is mostly limited to inconvenient stepwise seeding strategies with strict requirements, and one-pot synthesis remains a great challenge due to the interference of different components. Through two pairs of isoreticular MOFs, this work reveals that structural incompatibility is a prerequisite for the formation of MOFs@MOFs by one-pot synthesis, as illustrated by PMOF-3@HHU-9. It further unveils that the adaptability of the shell-MOF is a more important factor for nucleation kinetic control. MOFs with flexible linkers have comparably slower nucleation than MOFs with rigid linkers (forming PMOF-3@NJU-Bai21), and structural-flexible MOFs built by flexible linkers show the lowest nucleation and the most adaptability (affording NJU-Bai21@HHU-9). This degree of adaptability variation controls the sequence and further facilitates the synthesis of a first triple-layered core-shell MOF (PMOF-3@NJU-Bai21@HHU-9) by one-pot synthesis. The insight gained from this study will aid in the rational design and synthesis of other multi-shelled structures by one-pot synthesis and the further expansion of their applications. 
[Key Words: core-shell structures linker flexibility metal-organic frameworks nucleation kinetics structural adaptability] [Grants: BK20221498 Natural Science Fund of Jiangsu Province 21601047 National Natural Science Foundation of China 2023A1515030234 Youth Enhancement Program of Guangdong Basic and Applied Research]
25	LARGE LANGUAGE MODELS (LLMS) AND CHATGPT FOR BIOMEDICINE. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2024;29:641-644. [PMID: 38160312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 01/03/2024] Abstract Large Language Models (LLMs) are a type of artificial intelligence that has been revolutionizing various fields, including biomedicine. They have the capability to process and analyze large amounts of data, understand natural language, and generate new content, making them highly desirable in many biomedical applications and beyond. In this workshop, we aim to introduce the attendees to an in-depth understanding of the rise of LLMs in biomedicine, and how they are being used to drive innovation and improve outcomes in the field, along with associated challenges and pitfalls. [MESH Headings: Humans Artificial Intelligence Computational Biology Language]
26	A deep network DeepOpacityNet for detection of cataracts from color fundus photographs. COMMUNICATIONS MEDICINE 2023;3:184. [PMID: 38104223 PMCID: PMC10725427 DOI: 10.1038/s43856-023-00410-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2022] [Accepted: 11/21/2023] [Indexed: 12/19/2023] Open Abstract BACKGROUND Cataract diagnosis typically requires in-person evaluation by an ophthalmologist. However, color fundus photography (CFP) is widely performed outside ophthalmology clinics, which could be exploited to increase the accessibility of cataract screening by automated detection. METHODS DeepOpacityNet was developed to detect cataracts from CFP and highlight the most relevant CFP features associated with cataracts. We used 17,514 CFPs from 2573 AREDS2 participants curated from the Age-Related Eye Diseases Study 2 (AREDS2) dataset, of which 8681 CFPs were labeled with cataracts. The ground truth labels were transferred from slit-lamp examination of nuclear cataracts and reading center grading of anterior segment photographs for cortical and posterior subcapsular cataracts. DeepOpacityNet was internally validated on an independent test set (20%), compared to three ophthalmologists on a subset of the test set (100 CFPs), externally validated on three datasets obtained from the Singapore Epidemiology of Eye Diseases study (SEED), and visualized to highlight important features. RESULTS Internally, DeepOpacityNet achieved a superior accuracy of 0.66 (95% confidence interval (CI): 0.64-0.68) and an area under the curve (AUC) of 0.72 (95% CI: 0.70-0.74), compared to that of other state-of-the-art methods. DeepOpacityNet achieved an accuracy of 0.75, compared to an accuracy of 0.67 for the ophthalmologist with the highest performance. 
Externally, DeepOpacityNet achieved AUC scores of 0.86, 0.88, and 0.89 on SEED datasets, demonstrating the generalizability of our proposed method. Visualizations show that the visibility of blood vessels could be characteristic of cataract absence while blurred regions could be characteristic of cataract presence. CONCLUSIONS DeepOpacityNet could detect cataracts from CFPs in AREDS2 with performance superior to that of ophthalmologists and generate interpretable results. The code and models are available at https://github.com/ncbi/DeepOpacityNet (https://doi.org/10.5281/zenodo.10127002). [Key Words: lens diseases pathology] [Grants: K99 LM014024 NLM NIH HHS]
27	Term-BLAST-like alignment tool for concept recognition in noisy clinical texts. Bioinformatics 2023;39:btad716. [PMID: 38001031 PMCID: PMC10710372 DOI: 10.1093/bioinformatics/btad716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2023] [Revised: 10/20/2023] [Accepted: 11/23/2023] [Indexed: 11/26/2023] Open Abstract MOTIVATION Methods for concept recognition (CR) in clinical texts have largely been tested on abstracts or articles from the medical literature. However, texts from electronic health records (EHRs) frequently contain spelling errors, abbreviations, and other nonstandard ways of representing clinical concepts. RESULTS Here, we present a method inspired by the BLAST algorithm for biosequence alignment that screens texts for potential matches on the basis of matching k-mer counts and scores candidates based on conformance to typical patterns of spelling errors derived from 2.9 million clinical notes. Our method, the Term-BLAST-like alignment tool (TBLAT) leverages a gold standard corpus for typographical errors to implement a sequence alignment-inspired method for efficient entity linkage. We present a comprehensive experimental comparison of TBLAT with five widely used tools. Experimental results show an increase of 10% in recall on scientific publications and 20% increase in recall on EHR records (when compared against the next best method), hence supporting a significant enhancement of the entity linking task. The method can be used stand-alone or as a complement to existing approaches. AVAILABILITY AND IMPLEMENTATION Fenominal is a Java library that implements TBLAT for named CR of Human Phenotype Ontology terms and is available at https://github.com/monarch-initiative/fenominal under the GNU General Public License v3.0. 
[MESH Headings: Humans Algorithms Sequence Alignment Language Electronic Health Records Publications] [Grants: R24 OD011883 NIH HHS U24 HG011449 NHGRI NIH HHS Shriners Children’s NIH NHGRI]
28	Deep-GA-Net for Accurate and Explainable Detection of Geographic Atrophy on OCT Scans. OPHTHALMOLOGY SCIENCE 2023;3:100311. [PMID: 37304045 PMCID: PMC10251072 DOI: 10.1016/j.xops.2023.100311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/09/2022] [Revised: 04/06/2023] [Accepted: 04/07/2023] [Indexed: 06/13/2023] Abstract Objective To propose Deep-GA-Net, a 3-dimensional (3D) deep learning network with 3D attention layer, for the detection of geographic atrophy (GA) on spectral domain OCT (SD-OCT) scans, explain its decision making, and compare it with existing methods. Design Deep learning model development. Participants Three hundred eleven participants from the Age-Related Eye Disease Study 2 Ancillary SD-OCT Study. Methods A dataset of 1284 SD-OCT scans from 311 participants was used to develop Deep-GA-Net. Cross-validation was used to evaluate Deep-GA-Net, where each testing set contained no participant from the corresponding training set. En face heatmaps and important regions at the B-scan level were used to visualize the outputs of Deep-GA-Net, and 3 ophthalmologists graded the presence or absence of GA in them to assess the explainability (i.e., understandability and interpretability) of its detections. Main Outcome Measures Accuracy, area under receiver operating characteristic curve (AUC), area under precision-recall curve (APR). Results Compared with other networks, Deep-GA-Net achieved the best metrics, with accuracy of 0.93, AUC of 0.94, and APR of 0.91, and received the best gradings of 0.98 and 0.68 on the en face heatmap and B-scan grading tasks, respectively. Conclusions Deep-GA-Net was able to detect GA accurately from SD-OCT scans. The visualizations of Deep-GA-Net were more explainable, as suggested by 3 ophthalmologists. 
The code and pretrained models are publicly available at https://github.com/ncbi/Deep-GA-Net. Financial Disclosures The author(s) have no proprietary or commercial interest in any materials discussed in this article. [Key Words: Deep Learning Detection Explainability GA Geographic atrophy OCT]
29	Maternal iodine intake and adherence to iodine supplement recommendations in a group of Chinese women: the results from the WIN cohort study - CORRIGENDUM. Proc Nutr Soc 2023;82:492. [PMID: 37078399 DOI: 10.1017/s0029665123002768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/21/2023] Abstract [MESH Headings: Humans Female Cohort Studies Nutritional Status Dietary Supplements Iodine China]
30	Opportunities and challenges for ChatGPT and large language models in biomedicine and health. Brief Bioinform 2023;25:bbad493. [PMID: 38168838 PMCID: PMC10762511 DOI: 10.1093/bib/bbad493] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2023] [Revised: 11/15/2023] [Accepted: 12/06/2023] [Indexed: 01/05/2024] Open Abstract ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction and medical education and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of the biomedical domain present unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in its generated responses, as well as legal and privacy concerns associated with sensitive patient data. 
We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health. Key Words: ChatGPT; biomedicine and health; generative AI; large language model; opportunities and challenges. MESH Headings: Humans; Information Storage and Retrieval; Language; Privacy; Research Personnel. Grants: K99 LM014024 NLM NIH HHS; 1K99LM014024 NIH HHS; NIH Intramural Research Program National Library of Medicine National Institutes of Health.
31	A bibliometric analysis of intra-articular injection therapy for knee osteoarthritis from 2012 to 2022. Medicine (Baltimore) 2023;102:e36105. [PMID: 37986287 PMCID: PMC10659632 DOI: 10.1097/md.0000000000036105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/23/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/22/2023] Open Abstract Knee osteoarthritis (KOA) is the most common joint disease worldwide and, with the progression of an aging population, is one of the most important causes of disability worldwide. Its main symptoms include articular cartilage damage, periarticular pain, swelling, and stiffness. Intra-articular (IA) injections offer many advantages over systemic administration and surgical treatment, including direct action on the target joint to improve local bioavailability, reduce systemic toxicity, and lower costs. This study analyzed KOA intra-articular injection treatment and its hot literature and research horizons using bibliometric methodologies and graphical tools to aid future research. We performed a bibliometric analysis of 2360 publications in the Web of Science core collection using CiteSpace software. The United States (28.26% of publications) and China (18%) had the biggest publications. Rush University was the most active institution, but Boston University had the greatest citation/publication rate (65.77), suggesting a high literature standard. The majority of publications were in Osteoarthritis and cartilage. Bannuru RR was the most referenced author, while Filardo, Giuseppe was the most productive author. Studies in platelet-rich plasma (PRP), mesenchymal stem cells (MSCs), and microsphere formulation are likely to be future research hotspots. 
The current scientometric study provides an overview of KOA intra-articular injection therapy studies from 2012 to 2022. This study outlines the current research hotspots and potential future research hotspots in the field of intra-articular injection treatment for KOA and may serve as a resource for researchers interested in this topic. Key Words: bibliometric analysis; citespace; intra articular injection; knee osteoarthritis. MESH Headings: Humans; Aged; Osteoarthritis, Knee/drug therapy; Osteoarthritis, Knee/surgery; Injections, Intra-Articular; Cartilage, Articular; Bibliometrics; Platelet-Rich Plasma; Treatment Outcome.
32	[Effect of recombinant human thrombin for hemostasis in liver resection: a randomized controlled phase Ⅲ clinical trial]. ZHONGHUA YI XUE ZA ZHI 2023;103:3416-3423. [PMID: 37963740 DOI: 10.3760/cma.j.cn112137-20230911-00438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 11/16/2023] Abstract Objective: To evaluate the hemostatic efficacy, safety and immunogenicity of recombinant human thrombin in the treatment of liver wounds that still ooze after conventional surgical hemostasis. Methods: This was a multicenter, stratified randomized, double-blind, placebo-controlled phase Ⅲ trial with a planned enrollment of 510 subjects at 33 centers and 2:1 randomization to the thrombin group versus the placebo group. An interim analysis was planned after approximately 70% of the subjects had completed the observation period. The primary efficacy endpoint was the rate of hemostasis within 6 minutes at the point of bleeding that could be evaluated. Safety analysis was performed one month after surgery, and the positive rates of anti-drug antibody (ADA) and neutralizing antibody were evaluated. Results: At the interim analysis, a total of 348 subjects had been randomized and received the study drug (215 were male and 133 were female). They were aged 19-69 (52.9±10.9) years. Among them, 232 were in the thrombin group and 116 were in the placebo group, with balanced and comparable demographics and baseline characteristics between the two groups. The hemostasis rate at 6 minutes was 71.6% (95% CI: 65.75%-77.36%) in the thrombin group and 44.0% (95% CI: 34.93%-53.00%) in the placebo group (P<0.001). No grade ≥3 drug-related adverse events and no drug-related deaths were reported from the study. No recombinant human thrombin-induced immunologically-enhanced ADA or immunologically-induced ADA was detected after topical use in subjects. 
Conclusion: Recombinant human thrombin has shown significant hemostatic efficacy and good safety in controlling bleeding during liver resection surgery, while also demonstrating low immunogenicity characteristics. MESH Headings: Humans; Male; Female; Thrombin/adverse effects; Hemostatics/therapeutic use; Hemostatics/adverse effects; Liver; Hemostasis; Treatment Outcome.
33	MedCPT: Contrastive Pre-trained Transformers with large-scale PubMed search logs for zero-shot biomedical information retrieval. Bioinformatics 2023;39:btad651. [PMID: 37930897 PMCID: PMC10627406 DOI: 10.1093/bioinformatics/btad651] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2023] [Revised: 09/29/2023] [Indexed: 11/08/2023] Open Abstract MOTIVATION Information retrieval (IR) is essential in biomedical knowledge acquisition and clinical decision support. While recent progress has shown that language model encoders perform better semantic retrieval, training such models requires abundant query-article annotations that are difficult to obtain in biomedicine. As a result, most biomedical IR systems only conduct lexical matching. In response, we introduce MedCPT, a first-of-its-kind Contrastively Pre-trained Transformer model for zero-shot semantic IR in biomedicine. RESULTS To train MedCPT, we collected an unprecedented scale of 255 million user click logs from PubMed. With such data, we use contrastive learning to train a pair of closely integrated retriever and re-ranker. Experimental results show that MedCPT sets new state-of-the-art performance on six biomedical IR tasks, outperforming various baselines including much larger models, such as GPT-3-sized cpt-text-XL. In addition, MedCPT also generates better biomedical article and sentence representations for semantic evaluations. As such, MedCPT can be readily applied to various real-world biomedical IR tasks. AVAILABILITY AND IMPLEMENTATION The MedCPT code and model are available at https://github.com/ncbi/MedCPT. 
MESH Headings: Information Storage and Retrieval; Language; Natural Language Processing; PubMed; Semantics; Review Literature as Topic. Grants: NIH Intramural Research Program, National Library of Medicine.
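The MedCPT abstract says only that contrastive learning on query-article click pairs was used; it does not spell out the loss. As a rough illustration of what such an objective can look like, the following is a minimal sketch of a generic in-batch-negative contrastive loss. The function names, the toy two-dimensional embeddings, and the temperature parameter are all hypothetical and are not taken from the MedCPT code.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_contrastive_loss(queries, docs, temperature=1.0):
    """Mean cross-entropy in which docs[i] is the clicked (positive) article
    for queries[i] and every other doc in the batch acts as a negative.
    This is a generic formulation assumed for illustration only."""
    losses = []
    for i, q in enumerate(queries):
        logits = [dot(q, d) / temperature for d in docs]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical): aligned pairs vs. deliberately swapped pairs.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[1.0, 0.0], [0.0, 1.0]]
swapped_docs = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = in_batch_contrastive_loss(queries, aligned_docs)
swapped_loss = in_batch_contrastive_loss(queries, swapped_docs)
print(aligned_loss < swapped_loss)
```

The loss is lower when each query scores its own article above the other articles in the batch, which is the behaviour a retriever trained on click pairs is meant to learn.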
34	Measurement of the Positive Muon Anomalous Magnetic Moment to 0.20 ppm. PHYSICAL REVIEW LETTERS 2023;131:161802. [PMID: 37925710 DOI: 10.1103/physrevlett.131.161802] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/10/2023] [Accepted: 09/05/2023] [Indexed: 11/07/2023] Abstract We present a new measurement of the positive muon magnetic anomaly, $a_\mu \equiv (g_\mu - 2)/2$, from the Fermilab Muon g-2 Experiment using data collected in 2019 and 2020. We have analyzed more than 4 times the number of positrons from muon decay than in our previous result from 2018 data. The systematic error is reduced by more than a factor of 2 due to better running conditions, a more stable beam, and improved knowledge of the magnetic field weighted by the muon distribution, $\tilde{\omega}'_p$, and of the anomalous precession frequency corrected for beam dynamics effects, $\omega_a$. From the ratio $\omega_a/\tilde{\omega}'_p$, together with precisely determined external parameters, we determine $a_\mu = 116\,592\,057(25) \times 10^{-11}$ (0.21 ppm). Combining this result with our previous result from the 2018 data, we obtain $a_\mu(\mathrm{FNAL}) = 116\,592\,055(24) \times 10^{-11}$ (0.20 ppm). The new experimental world average is $a_\mu(\mathrm{exp}) = 116\,592\,059(22) \times 10^{-11}$ (0.19 ppm), which represents a factor of 2 improvement in precision.
35	A scoping review on multimodal deep learning in biomedical images and texts. ARXIV 2023:arXiv:2307.07362v3. [PMID: 37576120 PMCID: PMC10418520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Subscribe] [Scholar Register] [Indexed: 08/15/2023] Abstract Computer-assisted diagnostic and prognostic systems of the future should be capable of simultaneously processing multimodal data. Multimodal deep learning (MDL), which involves the integration of multiple sources of data, such as images and text, has the potential to revolutionize the analysis and interpretation of biomedical data. However, it only caught researchers' attention recently. To this end, there is a critical need to conduct a systematic review on this topic, identify the limitations of current work, and explore future directions. In this scoping review, we aim to provide a comprehensive overview of the current state of the field and identify key concepts, types of studies, and research gaps with a focus on biomedical images and texts joint learning, mainly because these two were the most commonly available data types in MDL research. This study reviewed the current uses of multimodal deep learning on five tasks: (1) Report generation, (2) Visual question answering, (3) Cross-modal retrieval, (4) Computer-aided diagnosis, and (5) Semantic segmentation. Our results highlight the diverse applications and potential of MDL and suggest directions for future research in the field. We hope our review will facilitate the collaboration of natural language processing (NLP) and medical imaging communities and support the next generation of decision-making and computer-assisted diagnostic system development. Collapse Key Words Collapse MESH Headings Collapse Grants Collapse
36	Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health. ARXIV 2023:arXiv:2306.10070v2. [PMID: 37904734 PMCID: PMC10614979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 11/01/2023] Abstract ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction, and medical education, and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of biomedical domain presents unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in its generated responses, as well as legal and privacy concerns associated with sensitive patient data. 
We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health. Key Words: ChatGPT; biomedicine and health; generative AI; large language model; opportunities and challenges. Grants: K99 LM014024 NLM NIH HHS.
37	Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports. ARXIV 2023:arXiv:2306.08749v2. [PMID: 37502627 PMCID: PMC10370215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 07/29/2023] Abstract Despite the reduction in turn-around times in radiology reporting with the use of speech recognition software, persistent communication errors can significantly impact the interpretation of radiology reports. Pre-filling a radiology report holds promise in mitigating reporting errors, and despite multiple efforts in literature to generate comprehensive medical reports, there lacks approaches that exploit the longitudinal nature of patient visit records in the MIMIC-CXR dataset. To address this gap, we propose to use longitudinal multi-modal data, i.e., previous patient visit CXR, current visit CXR, and the previous visit report, to pre-fill the "findings" section of the patient's current visit. We first gathered the longitudinal visit information for 26,625 patients from the MIMIC-CXR dataset, and created a new dataset called Longitudinal-MIMIC. With this new dataset, a transformer-based model was trained to capture the multi-modal longitudinal information from patient visit records (CXR images + reports) via a cross-attention-based multi-modal fusion module and a hierarchical memory-driven decoder. In contrast to previous works that only uses current visit data as input to train a model, our work exploits the longitudinal information available to pre-fill the "findings" section of radiology reports. Experiments show that our approach outperforms several recent approaches. Code will be published at https://github.com/CelestialShine/Longitudinal-Chest-X-Ray. Collapse Key Words Chest X-Rays Longitudinal data Radiology reports Report Generation Report Pre-Filling Collapse MESH Headings Collapse Grants R00 LM013001 NLM NIH HHS Collapse
38	Improving model fairness in image-based computer-aided diagnosis. Nat Commun 2023;14:6261. [PMID: 37803009 PMCID: PMC10558498 DOI: 10.1038/s41467-023-41974-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2023] [Accepted: 09/25/2023] [Indexed: 10/08/2023] Open Abstract Deep learning has become a popular tool for computer-aided diagnosis using medical images, sometimes matching or exceeding the performance of clinicians. However, these models can also reflect and amplify human bias, potentially resulting in inaccurate missed diagnoses. Despite this concern, the problem of improving model fairness in medical image classification by deep learning has yet to be fully studied. To address this issue, we propose an algorithm that leverages the marginal pairwise equal opportunity to reduce bias in medical image classification. Our evaluations across four tasks using four independent large-scale cohorts demonstrate that our proposed algorithm not only improves fairness in individual and intersectional subgroups but also maintains overall performance. Specifically, the relative change in pairwise fairness difference between our proposed model and the baseline model was reduced by over 35%, while the relative change in AUC value was typically within 1%. By reducing the bias generated by deep learning models, our proposed approach can potentially alleviate concerns about the fairness and reliability of image-based computer-aided diagnosis. Key Words: diagnosis; optic nerve diseases; macular degeneration; glaucoma; radiography. MESH Headings: Humans; Reproducibility of Results; Diagnosis, Computer-Assisted/methods; Algorithms; Computers. Grants: R00 LM013001 NLM NIH HHS; R21 EY035296 NEI NIH HHS.
39	GNorm2: an improved gene name recognition and normalization system. Bioinformatics 2023;39:btad599. [PMID: 37878810 PMCID: PMC10612401 DOI: 10.1093/bioinformatics/btad599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2023] [Revised: 09/06/2023] [Accepted: 10/23/2023] [Indexed: 10/27/2023] Open Abstract MOTIVATION Gene name normalization is an important yet highly complex task in biomedical text mining research, as gene names can be highly ambiguous and may refer to different genes in different species or share similar names with other bioconcepts. This poses a challenge for accurately identifying and linking gene mentions to their corresponding entries in databases such as NCBI Gene or UniProt. While there has been a body of literature on the gene normalization task, few have addressed all of these challenges or make their solutions publicly available to the scientific community. RESULTS Building on the success of GNormPlus, we have created GNorm2: a more advanced tool with optimized functions and improved performance. GNorm2 integrates a range of advanced deep learning-based methods, resulting in the highest levels of accuracy and efficiency for gene recognition and normalization to date. Our tool is freely available for download. AVAILABILITY AND IMPLEMENTATION https://github.com/ncbi/GNorm2. Collapse Key Words Collapse MESH Headings Data Mining/methods Databases, Factual Collapse Grants NLM NIH HHS NIH HHS National Library of Medicine National Institutes of Health Fundamental Research Funds for the Central Universities Collapse
40	The Role of Radiation Therapy for Metastatic Cervical Cancer. Int J Radiat Oncol Biol Phys 2023;117:e555. [PMID: 37785704 DOI: 10.1016/j.ijrobp.2023.06.1865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/04/2023] Abstract PURPOSE/OBJECTIVE(S) Survival rates for women with metastatic cervical cancer (CC) are low, with limited management options. Radiation therapy (RT) for metastatic disease has led to prolonged survival in other malignancies, however, the data are scarce in CC. Herein, we evaluated the effect of RT for metastatic CC. MATERIALS/METHODS A total of 58 patients with metastatic CC between September 2019 and January 2023 were retrospectively analyzed. All the patients were treated with platinum-based chemotherapy combined with targeted therapy or immunotherapy followed with or without RT (NRT). The recent efficacy, survival status and prognostic factors were analyzed statistically. RESULTS Objective response rate (ORR) was 63.6% with one complete and twenty partial responses in RT group (n = 33) and 40.0% with two complete and eight partial responses in NRT group (n = 25), respectively (p = 0.074). Disease control rate (DCR) of the RT and NRT groups were 79.4% vs 80.0%, respectively (p = 0.861). Median follow-up time was 17 months (3-39months). In RT group, 11(33.3%) patients experienced local regional or distant failure and 9 (27.3%) patients were dead. In NRT group, 15(60%) patients had progression and 8 (32%) patients dead. There was no significant difference between the two groups in overall survival (OS); however, RT group displayed superior progression-free survival (PFS) (1-year OS: 72.7% vs. 68.0%, p = 0.460; 1-year PFS: 66.7% vs. 40.0%, p = 0.039). The multivariate analysis showed that RT, immunotherapy, lymph node metastasis only relevant predictor of superior PFS but not OS. 
In subgroup analysis, patients treated with RT appeared to have a better PFS in some specific cohorts, such as age >45 years (72.0% vs 36.4%, P = 0.015), squamous carcinoma histology (71.0% vs 40.9%, P = 0.017), metastatic at diagnosis (75.0% vs 47.6%, P = 0.012), and non-targeted therapy (72.4% vs 43.8%, P = 0.040). No significant increase in treatment-related toxicity was observed in the RT group compared with the NRT group. CONCLUSION RT provided superior PFS in metastatic CC patients compared to NRT and was well tolerated. Moreover, RT, immunotherapy, and lymph node-only metastasis were independent significant prognostic factors for PFS. Subgroup analysis showed that the combination of RT and chemotherapy obtained favorable PFS in metastatic CC patients aged >45 years, with squamous carcinoma histology, with metastatic disease at diagnosis, or without targeted therapy. Studies with a larger sample size and longer follow-up are warranted.
41	BioREx: Improving biomedical relation extraction by leveraging heterogeneous datasets. J Biomed Inform 2023;146:104487. [PMID: 37673376 DOI: 10.1016/j.jbi.2023.104487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2023] [Revised: 08/18/2023] [Accepted: 09/02/2023] [Indexed: 09/08/2023] Abstract Biomedical relation extraction (RE) is the task of automatically identifying and characterizing relations between biomedical concepts from free text. RE is a central task in biomedical natural language processing (NLP) research and plays a critical role in many downstream applications, such as literature-based discovery and knowledge graph construction. State-of-the-art methods were used primarily to train machine learning models on individual RE datasets, such as protein-protein interaction and chemical-induced disease relation. Manual dataset annotation, however, is highly expensive and time-consuming, as it requires domain knowledge. Existing RE datasets are usually domain-specific or small, which limits the development of generalized and high-performing RE models. In this work, we present a novel framework for systematically addressing the data heterogeneity of individual datasets and combining them into a large dataset. Based on the framework and dataset, we report on BioREx, a data-centric approach for extracting relations. Our evaluation shows that BioREx achieves significantly higher performance than the benchmark system trained on the individual dataset, setting a new SOTA from 74.4% to 79.6% in F-1 measure on the recently released BioRED corpus. We further demonstrate that the combined dataset can improve performance for five different RE tasks. In addition, we show that on average BioREx compares favorably to current best-performing methods such as transfer learning and multi-task learning. 
Finally, we demonstrate BioREx's robustness and generalizability in two independent RE tasks not previously seen in training data: drug-drug N-ary combination and document-level gene-disease RE. The integrated dataset and optimized method have been packaged as a stand-alone tool available at https://github.com/ncbi/BioREx. Key Words: Biomedical dataset; Biomedical natural language processing; Multi-task learning; Transfer learning; Transformers.
42	A scoping review on multimodal deep learning in biomedical images and texts. J Biomed Inform 2023;146:104482. [PMID: 37652343 PMCID: PMC10591890 DOI: 10.1016/j.jbi.2023.104482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Revised: 07/18/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023] Abstract OBJECTIVE Computer-assisted diagnostic and prognostic systems of the future should be capable of simultaneously processing multimodal data. Multimodal deep learning (MDL), which involves the integration of multiple sources of data, such as images and text, has the potential to revolutionize the analysis and interpretation of biomedical data. However, it only caught researchers' attention recently. To this end, there is a critical need to conduct a systematic review on this topic, identify the limitations of current work, and explore future directions. METHODS In this scoping review, we aim to provide a comprehensive overview of the current state of the field and identify key concepts, types of studies, and research gaps with a focus on biomedical images and texts joint learning, mainly because these two were the most commonly available data types in MDL research. RESULT This study reviewed the current uses of multimodal deep learning on five tasks: (1) Report generation, (2) Visual question answering, (3) Cross-modal retrieval, (4) Computer-aided diagnosis, and (5) Semantic segmentation. CONCLUSION Our results highlight the diverse applications and potential of MDL and suggest directions for future research in the field. We hope our review will facilitate the collaboration of natural language processing (NLP) and medical imaging communities and support the next generation of decision-making and computer-assisted diagnostic system development. 
Key Words: Clinical notes; Medical images; Multimodal learning; Scoping review. MESH Headings: Deep Learning; Diagnostic Imaging; Semantics; Natural Language Processing; Diagnosis, Computer-Assisted. Grants: R00 LM013001 NLM NIH HHS.
43	Radiation Induced Lung Injury in Rats after Pre-Oxygenation Radiotherapy. Int J Radiat Oncol Biol Phys 2023;117:e279-e280. [PMID: 37785046 DOI: 10.1016/j.ijrobp.2023.06.1260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/04/2023] Abstract PURPOSE/OBJECTIVE(S) Deep inspiratory breath holding (DIBH) has been widely used during the radiotherapy of thoracic tumors. The main disadvantage of voluntary DIBH is the short duration of each breath hold. The hypocapnia induced by hyperoxia (oxygen concentration > 50%) pre-oxygenation (PreO2) combined with mechanical hyperventilation has been reported to prolong the duration of single breath hold, but its safety remains controversial, especially the sensitivity of lung tissue to radiation damage under hyperoxia exposure has not been elucidated. In this study, we aim to investigate the changes of radiation induced lung injury in rats after PreO2 radiation. MATERIALS/METHODS We evaluated the lung tissue of rats at different time points (48h, 2w, 4w, 8w, 12w) after thoracic radiation (15Gy single fraction to the right lung), and sequenced the transcriptome of lung tissue at 48 hours after irradiation. Rat cohorts (n = 7/group): 1. Control (Con); 2. Radiation group (RT); 3. Pre-oxygenation (oxygen concentration > 90%) for 8 hours before thoracic radiation (PreO2). RESULTS The inflammatory exudation emerged in the pulmonary interstitium at 48 hours, and reached the most serious alveolitis after four weeks of irradiation (the comparison of alveolitis scores in RT4w vs Con4w and PreO2(4w) vs Con4w, P<0.001) on hematoxylin-eosin staining. While the alveolitis scores in RT group and PreO2 group were not statistically different at each time point. 
Masson staining showed that the pulmonary fibrosis in the RT group and the PreO2 group reached an obvious pathological change at 12 weeks after irradiation, but the difference between the two groups was not significant. Transcriptome sequencing showed that the number of differential genes in PreO2 vs Con was 559 (302 up-regulated genes and 257 down-regulated genes). The GO enrichment analysis indicated that chromosome segregation was the most significant functional term by P value in the comparative analysis, and the KEGG enrichment analysis suggested that cell division was the most significant enrichment pathway of these differential genes. By contrast, there were only a small number of differential genes in PreO2 vs RT (3 up-regulated genes and 12 down-regulated genes). Pentose and glucuronate conversions was the most significant enrichment pathway of these differential genes. CONCLUSION This study demonstrated that PreO2 radiotherapy did not increase the severity of radiation induced lung injury in rats compared to conventional radiotherapy. Further study should be conducted to confirm these results and to investigate the regulatory mechanism of pneumonia caused by PreO2 radiotherapy.
44	Deep Learning-Based Multi-Modality Segmentation of Primary Gross Tumor Volume in CT and MRI for Nasopharyngeal Carcinoma. Int J Radiat Oncol Biol Phys 2023;117:e498. [PMID: 37785566 DOI: 10.1016/j.ijrobp.2023.06.1739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/04/2023] Abstract PURPOSE/OBJECTIVE(S) The delineation of primary gross tumor volume (GTV) of nasopharyngeal carcinoma (NPC) is an essential step for radiotherapy planning. In clinical practice, radiation oncologists manually delineate the GTV in planning CT with the help of diagnostic MRI. This is because NPC tumors are closely adjacent to many important anatomic structures, and CT and MRI provide complementary strength to accurately determine the tumor extension boundary. Manual delineation is time-consuming, and potential registration errors between MRI and CT decrease the delineation accuracy. In this study, we propose a fully automated GTV segmentation method based on CT and MRI by first aligning MRI to CT, and then segmenting the GTV using a multi-modality deep learning model. MATERIALS/METHODS We collected 104 nasopharyngeal carcinoma patients with both planning CT and diagnostic MRI scans (T1 & T2 phases). An experienced radiation oncologist manually delineated the GTV, which was further examined by another senior radiation oncologist. Then, a coarse to fine cross-modality registration from MRI to CT was conducted as follows: (1) A rigid transformation was performed on MRI to roughly align MRI to CT with similar anatomic position. (2) Then, the regions of interest (RoIs) on both CT and rigid-transformed MRI were cropped. (3) A leading cross-modality deformable registration algorithm, named DEEDS, was applied on the cropped MRI and CT RoIs to find an accurate local alignment. 
Next, using CT and registered MRI as the combined input, a multi-modality deep segmentation network based on nnUNet was trained to generate the GTV prediction. Twenty percent of patients were randomly selected as the unseen testing set to quantitatively evaluate the performance. RESULTS The quantitative NPC GTV segmentation performance is summarized in Table 1. The deep segmentation model using CT alone achieved reasonably high performance, with a 76.6% Dice score and 1.34 mm average surface distance (ASD). When both CT and registered MRI were used, the segmentation model further improved the performance by a 0.9 percentage point Dice score increase and an 11% relative ASD error reduction, demonstrating the complementary strength of CT and MRI in determining NPC GTV. Notably, the 77.5% Dice score and 1.19 mm ASD achieved by the multimodality model are among the top-performing results reported in recent automatic NPC GTV segmentation using either CT or MRI modality. CONCLUSION We developed a fully automated multi-modal deep-learning model for NPC GTV segmentation. The developed model can segment the NPC GTV with high accuracy. With further optimization and validation, this automated model has the potential to standardize NPC GTV segmentation and significantly decrease the workload of radiation oncologists in clinical practice.
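The figures quoted in the abstract above can be sanity-checked with a short calculation. The following Python sketch is an illustration only, not the authors' evaluation code: it computes a Dice score on two small made-up binary masks and reproduces the relative ASD reduction from the reported 1.34 mm and 1.19 mm values. The function names and the toy masks are hypothetical.

```python
def dice_score(pred, truth):
    """Dice = 2*|A ∩ B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to its baseline."""
    return (baseline - improved) / baseline


# Toy flattened masks (hypothetical): 3 overlapping voxels, 4 voxels in each mask.
pred = [1, 1, 1, 0, 0, 1, 0, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
toy_dice = dice_score(pred, truth)

# Values reported in the abstract: ASD 1.34 mm (CT only) vs 1.19 mm (CT + MRI).
asd_reduction = relative_reduction(1.34, 1.19)

print(f"toy Dice: {toy_dice:.2f}")
print(f"relative ASD reduction: {asd_reduction:.1%}")
```

The ASD reduction works out to roughly 11%, consistent with the abstract, and 76.6% plus 0.9 percentage points gives the reported 77.5% Dice score.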
45	Utilizing Longitudinal Chest X-Rays and Reports to Pre-fill Radiology Reports. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2023;14224:189-198. [PMID: 38501075 PMCID: PMC10947431 DOI: 10.1007/978-3-031-43904-9_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024] Abstract Despite the reduction in turn-around times in radiology reporting with the use of speech recognition software, persistent communication errors can significantly impact the interpretation of radiology reports. Pre-filling a radiology report holds promise in mitigating reporting errors, and despite multiple efforts in the literature to generate comprehensive medical reports, approaches that exploit the longitudinal nature of patient visit records in the MIMIC-CXR dataset are lacking. To address this gap, we propose to use longitudinal multi-modal data, i.e., previous patient visit CXR, current visit CXR, and the previous visit report, to pre-fill the "findings" section of the patient's current visit. We first gathered the longitudinal visit information for 26,625 patients from the MIMIC-CXR dataset, and created a new dataset called Longitudinal-MIMIC. With this new dataset, a transformer-based model was trained to capture the multi-modal longitudinal information from patient visit records (CXR images + reports) via a cross-attention-based multi-modal fusion module and a hierarchical memory-driven decoder. In contrast to previous works that use only current visit data as input to train a model, our work exploits the longitudinal information available to pre-fill the "findings" section of radiology reports. Experiments show that our approach outperforms several recent approaches by ≥3% on F1 score, and by ≥2% on BLEU-4, METEOR, and ROUGE-L. 
Code will be published at https://github.com/CelestialShine/Longitudinal-Chest-X-Ray. Key Words: Chest X-Rays; Longitudinal data; Radiology reports; Report Generation; Report Pre-Filling. Grants: R00 LM013001 NLM NIH HHS; R01 LM014306 NLM NIH HHS.
46	Retrieve, Summarize, and Verify: How Will ChatGPT Affect Information Seeking from the Medical Literature? J Am Soc Nephrol 2023;34:1302-1304. [PMID: 37254254 PMCID: PMC10400098 DOI: 10.1681/asn.0000000000000166] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 03/29/2023] [Accepted: 05/24/2023] [Indexed: 06/01/2023] Key Words: aki; covid-19; kidney; ethics; medical education. MESH Headings: Information Seeking Behavior; Artificial Intelligence.
47	Realizing Nanolime Aqueous Dispersion via Ionic Liquid Surface Modification to Consolidate Stone Relics. LANGMUIR : THE ACS JOURNAL OF SURFACES AND COLLOIDS 2023. [PMID: 37422798 DOI: 10.1021/acs.langmuir.3c01147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023] Abstract After decades of research in the conservation of cultural heritage, nanolime (NL) has emerged as a potential alternative inorganic material to the frequently used organic materials. However, its poor kinetic stability in water has been a major challenge that restricted its penetration depth through cultural relics and resulted in unsatisfactory conservation outcomes. Here, for the first time, we realize NL water dispersion by modification with an ionic liquid (1-butyl-3-methylimidazolium tetrafluoroborate) via a sample aqueous solution deposit method. Our findings indicate that the cation of the ionic liquid (IL) binds strongly to the surface of NL particles (IL-NL) by forming hydrogen bonds with Ca(OH)₂ facets. The absorption of IL causes an unexpected significant alteration in the morphology of NL particles and results in a drastic reduction in NL's size. More importantly, this absorption endows NL with excellent kinetic stability when dispersed in water and achieves NL water dispersion, which is a breakthrough given the extremely poor kinetic stability of as-synthesized NL and commercial NL in water. The mechanism driving IL-NL water dispersion is explained by Stern theory. In the context of consolidating weathered stone, the presence of IL may delay carbonation of NL, but the penetration depth of IL-NL through stone samples is three times deeper than that of as-synthesized and commercial NLs. Additionally, the consolidation strength of IL-NL is similar to that of as-synthesized NL and commercial NL. 
Moreover, IL-NL has no significant impact on the permeability, pore size, and microstructure of consolidated stone relics. Our research contributes to the field of NL-related materials and will enhance the dissemination and utilization of NL-based materials in the preservation of water-insensitive cultural heritage.
48	Context-aware deep network for coronary artery stenosis classification in coronary CT angiography. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023;2023:1-4. [PMID: 38083399 DOI: 10.1109/embc40787.2023.10340650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023] Abstract Automatic coronary artery stenosis grading plays an important role in the diagnosis of coronary artery disease. Because informative features are difficult to learn from varying grades of stenosis, identifying coronary artery stenosis from coronary CT angiography (CCTA) remains a challenging task. In this paper, we propose a context-aware deep network (CADN) for coronary artery stenosis classification. The proposed method integrates a 3D CNN with a Transformer to improve the feature representation of coronary artery stenosis in CCTA. We evaluate the proposed method on a multicenter dataset (APOLLO study with NCT05509010). Experimental results show that our proposed method achieves accuracies of 0.84, 0.83, and 0.86 for stenosis diagnosis on the lesion, artery, and patient levels, respectively. MESH Headings: Humans; Computed Tomography Angiography/methods; Constriction, Pathologic; Coronary Angiography/methods; Tomography, X-Ray Computed; Coronary Stenosis/diagnostic imaging.
49	BioREx: Improving Biomedical Relation Extraction by Leveraging Heterogeneous Datasets. ARXIV 2023:arXiv:2306.11189v1. [PMID: 37502629 PMCID: PMC10370213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023] Abstract Biomedical relation extraction (RE) is the task of automatically identifying and characterizing relations between biomedical concepts from free text. RE is a central task in biomedical natural language processing (NLP) research and plays a critical role in many downstream applications, such as literature-based discovery and knowledge graph construction. State-of-the-art methods were used primarily to train machine learning models on individual RE datasets, such as protein-protein interaction and chemical-induced disease relation. Manual dataset annotation, however, is highly expensive and time-consuming, as it requires domain knowledge. Existing RE datasets are usually domain-specific or small, which limits the development of generalized and high-performing RE models. In this work, we present a novel framework for systematically addressing the data heterogeneity of individual datasets and combining them into a large dataset. Based on the framework and dataset, we report on BioREx, a data-centric approach for extracting relations. Our evaluation shows that BioREx achieves significantly higher performance than the benchmark system trained on the individual dataset, setting a new SOTA from 74.4% to 79.6% in F-1 measure on the recently released BioRED corpus. We further demonstrate that the combined dataset can improve performance for five different RE tasks. In addition, we show that on average BioREx compares favorably to current best-performing methods such as transfer learning and multi-task learning. 
Finally, we demonstrate BioREx's robustness and generalizability in two independent RE tasks not previously seen in training data: drug-drug N-ary combination and document-level gene-disease RE. The integrated dataset and optimized method have been packaged as a stand-alone tool available at https://github.com/ncbi/BioREx. Key Words: biomedical natural language processing; biomedical dataset; transformers; transfer learning; multi-task learning. Grants: K99 LM014024 NLM NIH HHS.
50	From function to translation: Decoding genetic susceptibility to human diseases via artificial intelligence. CELL GENOMICS 2023;3:100320. [PMID: 37388909 PMCID: PMC10300605 DOI: 10.1016/j.xgen.2023.100320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023] Abstract While genome-wide association studies (GWAS) have discovered thousands of disease-associated loci, molecular mechanisms for a considerable fraction of the loci remain to be explored. The logical next steps for post-GWAS are interpreting these genetic associations to understand disease etiology (GWAS functional studies) and translating this knowledge into clinical benefits for the patients (GWAS translational studies). Although various datasets and approaches using functional genomics have been developed to facilitate these studies, significant challenges remain due to data heterogeneity, multiplicity, and high dimensionality. To address these challenges, artificial intelligence (AI) technology has demonstrated considerable promise in decoding complex functional datasets and providing novel biological insights into GWAS findings. This perspective first describes the landmark progress driven by AI in interpreting and translating GWAS findings and then outlines specific challenges followed by actionable recommendations related to data availability, model optimization, and interpretation, as well as ethical concerns. Key Words: artificial intelligence; functional genomics; genome-wide association studies; translational genomics.