1
Musella L, Afonso Castro A, Lai X, Widmann M, Vera J. ENQUIRE automatically reconstructs, expands, and drives enrichment analysis of gene and Mesh co-occurrence networks from context-specific biomedical literature. PLoS Comput Biol 2025; 21:e1012745. [PMID: 39932993 PMCID: PMC11844901 DOI: 10.1371/journal.pcbi.1012745] [Received: 03/22/2024] [Revised: 02/21/2025] [Accepted: 12/20/2024] [Indexed: 02/13/2025] Open
Abstract
The accelerating growth of scientific literature overwhelms our capacity to manually distil complex phenomena like molecular networks linked to diseases. Moreover, biases in biomedical research and database annotation limit our interpretation of facts and generation of hypotheses. ENQUIRE (Expanding Networks by Querying Unexpectedly Inter-Related Entities) offers a time- and resource-efficient alternative to manual literature curation and database mining. ENQUIRE reconstructs and expands co-occurrence networks of genes and biomedical ontologies from user-selected input corpora and network-inferred PubMed queries. Its modest resource usage and its integration of text mining, automatic querying, and network-based statistics that mitigate literature biases make ENQUIRE unique in its broad-scope applications. For example, ENQUIRE can generate co-occurrence gene networks that reflect high-confidence, functional networks. When tested on case studies spanning cancer, cell differentiation, and immunity, ENQUIRE identified interlinked genes and enriched pathways unique to each topic, thereby preserving their underlying context specificity. ENQUIRE supports biomedical researchers by easing literature annotation, boosting hypothesis formulation, and facilitating the identification of molecular targets for subsequent experimentation.
Affiliation(s)
- Luca Musella
  - Laboratory of Systems Tumor Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Deutsches Zentrum Immuntherapie, BZKF, and Uniklinikum Erlangen, Erlangen, Germany
- Alejandro Afonso Castro
  - Laboratory of Systems Tumor Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Deutsches Zentrum Immuntherapie, BZKF, and Uniklinikum Erlangen, Erlangen, Germany
- Xin Lai
  - Laboratory of Systems Tumor Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Deutsches Zentrum Immuntherapie, BZKF, and Uniklinikum Erlangen, Erlangen, Germany
  - Faculty of Medicine and Health Technology, Systems and Network Medicine Lab, Biomedicine Unit, Tampere University, Tampere, Finland
- Max Widmann
  - Laboratory of Systems Tumor Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Deutsches Zentrum Immuntherapie, BZKF, and Uniklinikum Erlangen, Erlangen, Germany
  - University of Konstanz, Konstanz, Germany
- Julio Vera
  - Laboratory of Systems Tumor Immunology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Deutsches Zentrum Immuntherapie, BZKF, and Uniklinikum Erlangen, Erlangen, Germany
2
Page J, Moore N, Broderick G. A Computational Protocol for the Knowledge-Based Assessment and Capture of Pathologies. Methods Mol Biol 2025; 2868:265-284. [PMID: 39546235 DOI: 10.1007/978-1-0716-4200-9_14] [Indexed: 11/17/2024]
Abstract
We propose that one of the main hurdles in delivering comprehensively informed care results from the challenges surrounding the extraction, representation, and retention of prior clinical experience and basic medical knowledge, as well as its translation into time- and context-informed actionable interventions. While emerging applications in artificial intelligence-based techniques, for example, large language models, offer impressive pattern association capabilities, they often fall short in producing human-readable explanations crucial to their integration into clinical care. Moreover, they require large well-defined and well-integrated data sets that typically conflict with the availability of such data in all but a few areas of medicine, for example, medical imaging and neuroimaging, noninvasive monitoring of bio-electrical activity, etc. In this chapter, we argue that approximate reasoning rooted in the knowledge that is explainable to the human clinician may offer attractive avenues for the introduction of such knowledge in a systematic way that supports formal retention, sharing, and reuse of new clinical and basic medical experience. We outline a conceptual protocol that targets the use of sparse and disparate data of different types and from different sources, seamlessly drawing on our collective experience and that of others. We illustrate the utility of such an integrative approach by applying the latter to the assessment and reconciliation of data from different experimental models, human and animal, in the example use case of a complex health condition.
Affiliation(s)
- Jeffrey Page
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, USA
- Nadia Moore
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, USA
- Gordon Broderick
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, USA
  - Vaccine and Infectious Disease Organization (VIDO-InterVac), University of Saskatchewan, Saskatoon, SK, Canada
3
Degnan DJ, Strauch CW, Obiri MY, VonKaenel ED, Kim GS, Kershaw JD, Novelli DL, Pazdernik KT, Bramer LM. Protein-Protein Interaction Networks Derived from Classical and Machine Learning-Based Natural Language Processing Tools. J Proteome Res 2024; 23:5395-5404. [PMID: 39526844 DOI: 10.1021/acs.jproteome.4c00535] [Indexed: 11/16/2024]
Abstract
The study of protein-protein interactions (PPIs) provides insight into various biological mechanisms, including the binding of antibodies to antigens, enzymes to inhibitors or promoters, and receptors to ligands. Recent studies of PPIs have led to significant biological breakthroughs. For example, the study of PPIs involved in the human:SARS-CoV-2 viral infection mechanism aided in the development of SARS-CoV-2 vaccines. Though several databases exist for the manual curation of PPI networks, text mining methods have been routinely demonstrated as useful alternatives for newly studied or understudied species, where databases are incomplete. Here, the relationship extraction performance of several open-source classical text processing, machine learning (ML)-based natural language processing (NLP), and large language model (LLM)-based NLP tools was compared. Overall, our results indicated that networks derived from classical methods tend to have high true positive rates at the expense of having overconnected networks, ML-based NLP methods have lower true positive rates but networks with the closest structures to the target network, and LLM-based NLP methods tend to exist between the two other approaches, with variable performances. The selection of a specific NLP approach should be tied to the needs of a study and text availability, as models varied in performance due to the amount of text provided.
Affiliation(s)
- David J Degnan
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
- Clayton W Strauch
  - AI & Data Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
- Moses Y Obiri
  - Earth Systems Science Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
- Erik D VonKaenel
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
- Grace S Kim
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
- James D Kershaw
  - Earth Systems Science Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
- David L Novelli
  - AI & Data Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
- Karl Tl Pazdernik
  - AI & Data Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, United States
- Lisa M Bramer
  - Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington 99354, United States
4
Savosina P, Druzhilovskiy D, Filimonov D, Poroikov V. WWAD: the most comprehensive small molecule World Wide Approved Drug database of therapeutics. Front Pharmacol 2024; 15:1473279. [PMID: 39359251 PMCID: PMC11444997 DOI: 10.3389/fphar.2024.1473279] [Received: 07/30/2024] [Accepted: 08/28/2024] [Indexed: 10/04/2024] Open
Affiliation(s)
- Polina Savosina
  - Laboratory of Structure-Function Based Drug Design, Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia
5
Mohammad-Taheri S, Navada PP, Hoyt CT, Zucker J, Sachs K, Gyori BM, Vitek O. Eliater: a Python package for estimating outcomes of perturbations in biomolecular networks. Bioinformatics 2024; 40:btae527. [PMID: 39187941 PMCID: PMC11410922 DOI: 10.1093/bioinformatics/btae527] [Received: 05/16/2024] [Revised: 05/16/2024] [Accepted: 08/23/2024] [Indexed: 08/28/2024]
Abstract
SUMMARY: We introduce Eliater, a Python package for estimating the effect of perturbation of an upstream molecule on a downstream molecule in a biomolecular network. The estimation takes as input a biomolecular network, observational biomolecular data, and a perturbation of interest, and outputs an estimated quantitative effect of the perturbation. We showcase the functionalities of Eliater in a case study of the Escherichia coli transcriptional regulatory network. AVAILABILITY AND IMPLEMENTATION: The code, the documentation, and several case studies are available open source at https://github.com/y0-causal-inference/eliater.
Affiliation(s)
- Sara Mohammad-Taheri
  - Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, 02115, United States
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, United States
- Pruthvi Prakash Navada
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, United States
- Charles Tapley Hoyt
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, United States
- Jeremy Zucker
  - Pacific Northwest National Laboratory, Richland, WA, 99354, United States
- Karen Sachs
  - Next Generation Analytics, Palo Alto, CA, United States
  - Modulo Bio, Inc., Los Altos, CA, 92121, United States
- Benjamin M Gyori
  - Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, 02115, United States
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, United States
  - Department of Bioengineering, Northeastern University, Boston, MA, 02115, United States
- Olga Vitek
  - Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, 02115, United States
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, United States
6
Gyori BM, Vitek O. Beyond protein lists: AI-assisted interpretation of proteomic investigations in the context of evolving scientific knowledge. Nat Methods 2024; 21:1387-1389. [PMID: 39122950 DOI: 10.1038/s41592-024-02324-4] [Indexed: 08/12/2024]
Affiliation(s)
- Benjamin M Gyori
  - Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
  - Department of Bioengineering, College of Engineering, Northeastern University, Boston, MA, USA
- Olga Vitek
  - Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA, USA
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
7
Jain A, Gyori BM, Hakim S, Jain A, Sun L, Petrova V, Bhuiyan SA, Zhen S, Wang Q, Kawaguchi R, Bunga S, Taub DG, Ruiz-Cantero MC, Tong-Li C, Andrews N, Kotoda M, Renthal W, Sorger PK, Woolf CJ. Nociceptor-immune interactomes reveal insult-specific immune signatures of pain. Nat Immunol 2024; 25:1296-1305. [PMID: 38806708 PMCID: PMC11224023 DOI: 10.1038/s41590-024-01857-2] [Received: 06/05/2023] [Accepted: 04/25/2024] [Indexed: 05/30/2024]
Abstract
Inflammatory pain results from the heightened sensitivity and reduced threshold of nociceptor sensory neurons due to exposure to inflammatory mediators. However, the cellular and transcriptional diversity of immune cell and sensory neuron types makes it challenging to decipher the immune mechanisms underlying pain. Here we used single-cell transcriptomics to determine the immune gene signatures associated with pain development in three skin inflammatory pain models in mice: zymosan injection, skin incision and ultraviolet burn. We found that macrophage and neutrophil recruitment closely mirrored the kinetics of pain development and identified cell-type-specific transcriptional programs associated with pain and its resolution. Using a comprehensive list of potential interactions mediated by receptors, ligands, ion channels and metabolites to generate injury-specific neuroimmune interactomes, we also uncovered that thrombospondin-1 upregulated by immune cells upon injury inhibited nociceptor sensitization. This study lays the groundwork for identifying the neuroimmune axes that modulate pain in diverse disease contexts.
Affiliation(s)
- Aakanksha Jain
  - F. M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Benjamin M Gyori
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
  - Department of Bioengineering, College of Engineering, Northeastern University, Boston, MA, USA
- Sara Hakim
  - F. M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
  - Department of Neurobiology, Harvard Medical School, Boston, MA, USA
- Ashish Jain
  - Research Computing, Department of Information Technology, Boston Children's Hospital, Boston, MA, USA
- Liang Sun
  - Research Computing, Department of Information Technology, Boston Children's Hospital, Boston, MA, USA
- Veselina Petrova
  - F. M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Shamsuddin A Bhuiyan
  - Department of Neurology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Shannon Zhen
  - F. M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Qing Wang
  - Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Riki Kawaguchi
  - Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA
- Samuel Bunga
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
- Daniel G Taub
  - F. M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- M Carmen Ruiz-Cantero
  - Department of Pharmacology and Neurosciences Institute (Biomedical Research Center) and Biosanitary Research Institute ibs.GRANADA, University of Granada, Granada, Spain
- Candace Tong-Li
  - F. M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
- Masakazu Kotoda
  - Department of Anesthesiology, Faculty of Medicine, University of Yamanashi, Chuo, Yamanashi, Japan
- William Renthal
  - Department of Neurology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Peter K Sorger
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Clifford J Woolf
  - F. M. Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA, USA
  - Department of Neurobiology, Harvard Medical School, Boston, MA, USA
8
Inoue Y, Lee H, Fu T, Luna A. drGAT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network. arXiv 2024:arXiv:2405.08979v1. [PMID: 38800657 PMCID: PMC11118660] [Indexed: 05/29/2024]
Abstract
Drug development is a lengthy process with a high failure rate. Increasingly, machine learning is utilized to facilitate the drug development processes. These models aim to enhance our understanding of drug characteristics, including their activity in biological contexts. However, a major challenge in drug response (DR) prediction is model interpretability as it aids in the validation of findings. This is important in biomedicine, where models need to be understandable in comparison with established knowledge of drug interactions with proteins. drGAT, a graph deep learning model, leverages a heterogeneous graph composed of relationships between proteins, cell lines, and drugs. drGAT is designed with two objectives: DR prediction as a binary sensitivity prediction and elucidation of drug mechanism from attention coefficients. drGAT has demonstrated superior performance over existing models, achieving 78% accuracy (and precision), and 76% F1 score for 269 DNA-damaging compounds of the NCI60 drug response dataset. To assess the model's interpretability, we conducted a review of drug-gene co-occurrences in PubMed abstracts in comparison to the top 5 genes with the highest attention coefficients for each drug. We also examined whether known relationships were retained in the model by inspecting the neighborhoods of topoisomerase-related drugs. For example, our model retained TOP1 as a highly weighted predictive feature for irinotecan and topotecan, in addition to other genes that could potentially be regulators of the drugs. Our method can be used to accurately predict sensitivity to drugs and may be useful in the identification of biomarkers relating to the treatment of cancer patients.
Affiliation(s)
- Yoshitaka Inoue
  - Department of Computer Science and Engineering, University of Minnesota
  - Computational Biology Branch, National Library of Medicine
- Hunmin Lee
  - Department of Computer Science and Engineering, University of Minnesota
- Tianfan Fu
  - Computer Science Department, Rensselaer Polytechnic Institute
- Augustin Luna
  - Computational Biology Branch, National Library of Medicine
  - Developmental Therapeutics Branch, National Cancer Institute
9
Arakane K, Imoto H, Ormersbach F, Okada M. Extending BioMASS to construct mathematical models from external knowledge. Bioinformatics Advances 2024; 4:vbae042. [PMID: 38606187 PMCID: PMC11007111 DOI: 10.1093/bioadv/vbae042] [Received: 10/02/2023] [Revised: 02/13/2024] [Accepted: 04/03/2024] [Indexed: 04/13/2024]
Abstract
Motivation: Mechanistic modeling based on ordinary differential equations has led to numerous findings in systems biology by integrating prior knowledge and experimental data. However, the manual curation of knowledge necessary when constructing models poses a bottleneck. As the speed of knowledge accumulation continues to grow, there is a demand for a scalable means of constructing executable models. Results: We previously introduced BioMASS (an open-source, Python-based framework) to construct, simulate, and analyze mechanistic models of signaling networks. With one of its features, Text2Model, BioMASS allows users to define models in a natural language-like format, thereby facilitating the construction of large-scale models. We demonstrate that Text2Model can serve as a tool for integrating external knowledge for mathematical modeling by generating Text2Model files from a pathway database or through the use of a large language model, and simulating its dynamics through BioMASS. Our findings reveal the tool's capabilities to encourage exploration from prior knowledge and pave the way for a fully data-driven approach to constructing mathematical models. Availability and implementation: The code and documentation for BioMASS are available at https://github.com/biomass-dev/biomass and https://biomass-core.readthedocs.io, respectively. The code used in this article is available at https://github.com/okadalabipr/text2model-from-knowledge.
Affiliation(s)
- Kiwamu Arakane
  - Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Hiroaki Imoto
  - Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Mariko Okada
  - Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
  - Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka 565-0871, Japan
10
Vega C, Ostaszewski M, Grouès V, Schneider R, Satagopam V. BioKC: a collaborative platform for curation and annotation of molecular interactions. Database (Oxford) 2024; 2024:baae013. [PMID: 38537198 PMCID: PMC10972550 DOI: 10.1093/database/baae013] [Received: 05/01/2023] [Revised: 01/30/2024] [Accepted: 02/19/2024] [Indexed: 03/23/2025]
Abstract
Curation of biomedical knowledge into systems biology diagrammatic or computational models is essential for studying complex biological processes. However, systems-level curation is a laborious manual process, especially when facing ever-increasing growth of domain literature. New findings demonstrating elaborate relationships between multiple molecules, pathways and cells have to be represented in a format suitable for systems biology applications. Importantly, curation should capture the complexity of molecular interactions in such a format together with annotations of the involved elements and support stable identifiers and versioning. This challenge calls for novel collaborative tools and platforms that improve the quality and the output of the curation process. In particular, community-based curation, an important source of curated knowledge, requires support in role management, reviewing features and versioning. Here, we present Biological Knowledge Curation (BioKC), a web-based collaborative platform for the curation and annotation of biomedical knowledge following the standard data model from Systems Biology Markup Language (SBML). BioKC offers a graphical user interface for curation of complex molecular interactions and their annotation with stable identifiers and supporting sentences. With the support of collaborative curation and review, it allows the construction of building blocks for systems biology diagrams and computational models. These building blocks can be published under stable identifiers, versioned, and used as annotations, supporting knowledge building for modelling activities.
Affiliation(s)
- Carlos Vega
  - Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, 7 Avenue des Hauts Fourneaux, Esch-sur-Alzette 4362, Luxembourg
- Marek Ostaszewski
  - Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, 7 Avenue des Hauts Fourneaux, Esch-sur-Alzette 4362, Luxembourg
- Valentin Grouès
  - Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, 7 Avenue des Hauts Fourneaux, Esch-sur-Alzette 4362, Luxembourg
- Reinhard Schneider
  - Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, 7 Avenue des Hauts Fourneaux, Esch-sur-Alzette 4362, Luxembourg
- Venkata Satagopam
  - Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, 7 Avenue des Hauts Fourneaux, Esch-sur-Alzette 4362, Luxembourg
11
Gomez SM, Axtman AD, Willson TM, Major MB, Townsend RR, Sorger PK, Johnson GL. Illuminating function of the understudied druggable kinome. Drug Discov Today 2024; 29:103881. [PMID: 38218213 PMCID: PMC11262466 DOI: 10.1016/j.drudis.2024.103881] [Received: 10/01/2023] [Revised: 12/21/2023] [Accepted: 01/09/2024] [Indexed: 01/15/2024]
Abstract
The human kinome, with more than 500 proteins, is crucial for cell signaling and disease. Yet, about one-third of kinases lack in-depth study. The Data and Resource Generating Center for Understudied Kinases has developed multiple resources to address this challenge including creation of a heavy amino acid peptide library for parallel reaction monitoring and quantitation of protein kinase expression, use of understudied kinases tagged with a miniTurbo-biotin ligase to determine interaction networks by proximity-dependent protein biotinylation, NanoBRET probe development for screening chemical tool target specificity in live cells, characterization of small molecule chemical tools inhibiting understudied kinases, and computational tools for defining kinome architecture. These resources are available through the Dark Kinase Knowledgebase, supporting further research into these understudied protein kinases.
Affiliation(s)
- Shawn M Gomez
  - University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Alison D Axtman
  - University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Timothy M Willson
  - University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Michael B Major
  - Washington University School of Medicine in St. Louis, MO, USA
- Reid R Townsend
  - Washington University School of Medicine in St. Louis, MO, USA
- Gary L Johnson
  - University of North Carolina School of Medicine, Chapel Hill, NC, USA
12
Venugopal V, Olivetti E. MatKG: An autonomously generated knowledge graph in Material Science. Sci Data 2024; 11:217. [PMID: 38368452 PMCID: PMC10874416 DOI: 10.1038/s41597-024-03039-z] [Received: 11/21/2023] [Accepted: 02/01/2024] [Indexed: 02/19/2024] Open
Abstract
In this paper, we present MatKG, a knowledge graph in materials science that offers a repository of entities and relationships extracted from scientific literature. Using advanced natural language processing techniques, MatKG includes an array of entities, including materials, properties, applications, characterization and synthesis methods, descriptors, and symmetry phase labels. The graph is formulated based on statistical metrics, encompassing over 70,000 entities and 5.4 million unique triples. To enhance accessibility and utility, we have serialized MatKG in both CSV and RDF formats and made these, along with the code base, available to the research community. As the largest knowledge graph in materials science to date, MatKG provides structured organization of domain-specific data. Its deployment holds promise for various applications, including material discovery, recommendation systems, and advanced analytics.
Affiliation(s)
- Vineeth Venugopal
  - Massachusetts Institute of Technology (MIT), Department of Material Science and Engineering, Boston, 02139, USA
- Elsa Olivetti
  - Massachusetts Institute of Technology (MIT), Department of Material Science and Engineering, Boston, 02139, USA
13
Wu X, Zeng Y, Das A, Jo S, Zhang T, Patel P, Zhang J, Gao SJ, Pratt D, Chiu YC, Huang Y. reguloGPT: Harnessing GPT for Knowledge Graph Construction of Molecular Regulatory Pathways. bioRxiv 2024:2024.01.27.577521. [PMID: 38313267 PMCID: PMC10836076 DOI: 10.1101/2024.01.27.577521] [Indexed: 02/06/2024]
Abstract
Motivation: Molecular Regulatory Pathways (MRPs) are crucial for understanding biological functions. Knowledge Graphs (KGs) have become vital in organizing and analyzing MRPs, providing structured representations of complex biological interactions. Current tools for mining KGs from biomedical literature are inadequate in capturing complex, hierarchical relationships and contextual information about MRPs. Large Language Models (LLMs) like GPT-4 offer a promising solution, with advanced capabilities to decipher the intricate nuances of language. However, their potential for end-to-end KG construction, particularly for MRPs, remains largely unexplored. Results: We present reguloGPT, a novel GPT-4 based in-context learning prompt, designed for end-to-end joint named entity recognition, N-ary relationship extraction, and context prediction from a sentence that describes regulatory interactions with MRPs. Our reguloGPT approach introduces a context-aware relational graph that effectively embodies the hierarchical structure of MRPs and resolves semantic inconsistencies by embedding context directly within relational edges. We created a benchmark dataset including 400 annotated PubMed titles on N6-methyladenosine (m6A) regulations. Rigorous evaluation of reguloGPT on the benchmark dataset demonstrated marked improvement over existing algorithms. We further developed a novel G-Eval scheme, leveraging GPT-4 for annotation-free performance evaluation, and demonstrated its agreement with traditional annotation-based evaluations. Utilizing reguloGPT predictions on m6A-related titles, we constructed the m6A-KG and demonstrated its utility in elucidating m6A's regulatory mechanisms in cancer phenotypes across various cancers. These results underscore reguloGPT's transformative potential for extracting biological knowledge from the literature.
Availability and implementation: The source code of reguloGPT, the m6A title and benchmark datasets, and m6A-KG are available at: https://github.com/Huang-AI4Medicine-Lab/reguloGPT.
Affiliation(s)
- Xidong Wu
  - Electrical and Computer Engineering, University of Pittsburgh
- Yiming Zeng
  - Hillman Cancer Center, University of Pittsburgh Medical Center
- Arun Das
  - Hillman Cancer Center, University of Pittsburgh Medical Center
  - Division of Hematology/Oncology, Department of Medicine, University of Pittsburgh
- Sumin Jo
  - Electrical and Computer Engineering, University of Pittsburgh
  - Hillman Cancer Center, University of Pittsburgh Medical Center
- Tinghe Zhang
  - Hillman Cancer Center, University of Pittsburgh Medical Center
  - Division of Hematology/Oncology, Department of Medicine, University of Pittsburgh
- Parth Patel
  - Department of Electrical and Computer Engineering, The University of Texas at San Antonio
- Jianqiu Zhang
  - Department of Electrical and Computer Engineering, The University of Texas at San Antonio
- Shou-Jiang Gao
  - Hillman Cancer Center, University of Pittsburgh Medical Center
  - Department of Microbiology and Molecular Genetics, University of Pittsburgh
- Yu-Chiao Chiu
  - Hillman Cancer Center, University of Pittsburgh Medical Center
  - Division of Hematology/Oncology, Department of Medicine, University of Pittsburgh
- Yufei Huang
  - Hillman Cancer Center, University of Pittsburgh Medical Center
  - Division of Hematology/Oncology, Department of Medicine, University of Pittsburgh
|
14
|
Abstract
Knowledge graphs represent information in the form of entities and relationships between those entities. Such a representation has multiple potential applications in drug discovery, including democratizing access to biomedical data, contextualizing or visualizing that data, and generating novel insights through the application of machine learning approaches. Knowledge graphs put data into context and therefore offer the opportunity to generate explainable predictions, which is a key topic in contemporary artificial intelligence. In this chapter, we outline some of the factors that need to be considered when constructing biomedical knowledge graphs, examine recent advances in mining such systems to gain insights for drug discovery, and identify potential future areas for further development.
Affiliation(s)
- Tim James
  - Evotec (UK) Ltd., Abingdon, Oxfordshire, UK.
|
15
|
Georgouli K, Yeom JS, Blake RC, Navid A. Multi-scale models of whole cells: progress and challenges. Front Cell Dev Biol 2023; 11:1260507. [PMID: 38020904 PMCID: PMC10661945 DOI: 10.3389/fcell.2023.1260507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 10/19/2023] [Indexed: 12/01/2023] Open
Abstract
Whole-cell modeling is "the ultimate goal" of computational systems biology and "a grand challenge for 21st century" (Tomita, Trends in Biotechnology, 2001, 19(6), 205-10). These complex, highly detailed models account for the activity of every molecule in a cell and serve as comprehensive knowledgebases for the modeled system. Their scope and utility far surpass those of other systems models. In fact, whole-cell models (WCMs) are an amalgam of several types of "system" models. They are simulated using a hybrid method in which each biological process is simulated with the mathematical approach appropriate to it. Given their complexity, developing and curating these models is labor-intensive, and to date only a handful have been developed. While whole-cell models have provided valuable biological insights and identified some novel biological phenomena, their most important contribution has been to highlight the discrepancy between available data and observations that are used for the parametrization and validation of complex biological models. Another realization has been that current whole-cell simulators are slow: running models that mimic more complex (e.g., multi-cellular) biosystems will require accelerated execution on high-performance computing platforms. In this manuscript, we review the progress of whole-cell modeling to date and discuss some of the ways in which these models can be improved.
Affiliation(s)
- Konstantia Georgouli
  - Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Jae-Seung Yeom
  - Center for Applied Scientific Computing, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Robert C. Blake
  - Center for Applied Scientific Computing, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
- Ali Navid
  - Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, United States
|
16
|
Huang Q, Zhang H, Zhang L, Xu B. Bacterial microbiota in different types of processed meat products: diversity, adaptation, and co-occurrence. Crit Rev Food Sci Nutr 2023; 65:287-302. [PMID: 37905560 DOI: 10.1080/10408398.2023.2272770] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/02/2023]
Abstract
Bacterial microbes are a double-edged sword: some can improve the quality and shelf life of meat products, while others are mainly responsible for the deterioration of their safety and quality. This review aims to present a landscape of the bacterial microbiota in different types of processed meat products. After presenting a panoramic view of the bacterial genera in meat products, the diversity of bacterial microbiota is evaluated in two dimensions, namely different types of processed meat products and different meats. Then, the influence of environmental factors on bacterial communities is evaluated according to storage temperature, packaging conditions, and sterilization methods. Furthermore, microbes are not independent. To explore interactions among those genera, co-occurrence patterns are examined. In these respects, this review highlights recent advances in the fundamental principles that underlie environmental adaptation strategies and explain why some species tend to occur together frequently, such as metabolic cross-feeding, co-aggregation at the microscale, and intercellular signaling systems. Further investigations are required to unveil the underlying molecular mechanisms that govern microbial community systems, ultimately contributing to the development of new strategies to harness beneficial microorganisms and control harmful ones.
Affiliation(s)
- Qianli Huang
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China
- Huijuan Zhang
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China
- Li Zhang
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China
- Baocai Xu
  - Engineering Research Center of Bio-process, Ministry of Education, Hefei University of Technology, Hefei, China
  - School of Food and Biological Engineering, Hefei University of Technology, Hefei, China
|
17
|
Chacko TP, Toole JT, Morris MC, Page J, Forsten RD, Barrett JP, Reinhard MJ, Brewster RC, Costanzo ME, Broderick G. A regulatory pathway model of neuropsychological disruption in Havana syndrome. Front Psychiatry 2023; 14:1180929. [PMID: 37965360 PMCID: PMC10642174 DOI: 10.3389/fpsyt.2023.1180929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 09/29/2023] [Indexed: 11/16/2023] Open
Abstract
Introduction In 2016, diplomatic personnel serving in Havana, Cuba, began reporting audible sensory phenomena paired with the onset of complex and persistent neurological symptoms consistent with brain injury. The etiology of these Anomalous Health Incidents (AHI) and subsequent symptoms remains unknown. This report investigates putative exposure-symptom pathology by assembling a network model of published bio-behavioral pathways and assessing how dysregulation of such pathways might explain loss of function in these subjects using data available in the published literature. Given similarities in presentation with mild traumatic brain injury (mTBI), we used the latter as a clinically relevant means of evaluating whether the neuropsychological profiles observed in Havana Syndrome might be explained at least in part by a dysregulation of neurotransmission, neuro-inflammation, or both.
Method Automated text-mining of >9,000 publications produced a network consisting of 273 documented regulatory interactions linking 29 neuro-chemical markers with 9 neuropsychological constructs from the Brief Mood Survey, PTSD Checklist, and the Frontal Systems Behavior Scale. Analysis of information flow through this network produced a set of regulatory rules reconciling known mechanistic pathways with neuropsychological profiles in N = 6 subjects to within a 6% departure.
Results Predicted expression of neuro-chemical markers that jointly satisfy documented pathways and observed symptom profiles displays characteristically elevated IL-1B, IL-10, NGF, and norepinephrine levels in the context of depressed BDNF, GDNF, IGF1, and glutamate expression (FDR < 5%). Elevations in CRH and IL-6 were also predicted unanimously across all subjects. Furthermore, simulations of neurological regulatory dynamics reveal that subjects do not appear to be "locked in" persistent illness but rather appear to be engaged in a slow recovery trajectory.
Discussion This computational analysis of measured neuropsychological symptoms in Havana-based diplomats proposes that these AHI symptoms may be supported in part by disruption of known neuroimmune and neurotransmission regulatory mechanisms also associated with mTBI.
Affiliation(s)
- Thomas P. Chacko
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, United States
- J. Tory Toole
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, United States
- Matthew C. Morris
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, United States
- Jeffrey Page
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, United States
- Robert D. Forsten
  - War Related Illness and Injury Study Center (WRIISC), Department of Veterans Affairs, Washington, DC, United States
- John P. Barrett
  - War Related Illness and Injury Study Center (WRIISC), Department of Veterans Affairs, Washington, DC, United States
  - Department of Preventive Medicine and Biostatistics, Uniformed Services University, Bethesda, MD, United States
- Matthew J. Reinhard
  - War Related Illness and Injury Study Center (WRIISC), Department of Veterans Affairs, Washington, DC, United States
  - Complex Exposures Threats Center, Department of Veterans Affairs, Washington, DC, United States
- Ryan C. Brewster
  - War Related Illness and Injury Study Center (WRIISC), Department of Veterans Affairs, Washington, DC, United States
- Michelle E. Costanzo
  - War Related Illness and Injury Study Center (WRIISC), Department of Veterans Affairs, Washington, DC, United States
  - Complex Exposures Threats Center, Department of Veterans Affairs, Washington, DC, United States
  - Department of Medicine, Uniformed Services University, Bethesda, MD, United States
- Gordon Broderick
  - Center for Clinical Systems Biology, Rochester General Hospital, Rochester, NY, United States
  - Complex Exposures Threats Center, Department of Veterans Affairs, Washington, DC, United States
|