1
Vinken M, Grimm D, Baatout S, Baselet B, Beheshti A, Braun M, Carstens AC, Casaletto JA, Cools B, Costes SV, De Meulemeester P, Doruk B, Eyal S, Ferreira MJS, Miranda S, Hahn C, Helvacıoğlu Akyüz S, Herbert S, Krepkiy D, Lichterfeld Y, Liemersdorf C, Krüger M, Marchal S, Ritz J, Schmakeit T, Stenuit H, Tabury K, Trittel T, Wehland M, Zhang YS, Putt KS, Zhang ZY, Tagle DA. Taking the 3Rs to a higher level: replacement and reduction of animal testing in life sciences in space research. Biotechnol Adv 2025; 81:108574. [PMID: 40180136 PMCID: PMC12048243 DOI: 10.1016/j.biotechadv.2025.108574] [Received: 01/02/2025] [Revised: 03/28/2025] [Accepted: 03/29/2025] [Indexed: 04/05/2025]
Abstract
Human settlements on the Moon, crewed missions to Mars and space tourism will become a reality in the next few decades. Human presence in space, especially for extended periods of time, will therefore steeply increase. However, despite more than 60 years of spaceflight, the mechanisms underlying the effects of the space environment on human physiology are still not fully understood. Animals, ranging in complexity from flies to monkeys, have played a pioneering role in understanding the (patho)physiological outcome of critical environmental factors in space, in particular altered gravity and cosmic radiation. The use of animals in biomedical research is increasingly being criticized because of ethical reasons and limited human relevance. Driven by the 3Rs concept, calling for replacement, reduction and refinement of animal experimentation, major efforts have been focused in the past decades on the development of alternative methods that fully bypass animal testing or so-called new approach methodologies. These new approach methodologies range from simple monolayer cultures of individual primary or stem cells all up to bioprinted 3D organoids and microfluidic chips that recapitulate the complex cellular architecture of organs. Other approaches applied in life sciences in space research contribute to the reduction of animal experimentation. These include methods to mimic space conditions on Earth, such as microgravity and radiation simulators, as well as tools to support the processing, analysis or application of testing results obtained in life sciences in space research, including systems biology, live-cell, high-content and real-time analysis, high-throughput analysis, artificial intelligence and digital twins. The present paper provides an in-depth overview of such methods to replace or reduce animal testing in life sciences in space research.
Affiliation(s)
- Mathieu Vinken
- Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Brussels, Belgium.
- Daniela Grimm
- Department of Microgravity and Translational Regenerative Medicine, Otto-von-Guericke-University, Magdeburg, Germany; Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Sarah Baatout
- Nuclear Medical Applications Institute, Belgian Nuclear Research Centre, Mol, Belgium; Department of Molecular Biotechnology, Gent University, Gent, Belgium
- Bjorn Baselet
- Nuclear Medical Applications Institute, Belgian Nuclear Research Centre, Mol, Belgium
- Afshin Beheshti
- Center of Space Biomedicine, McGowan Institute for Regenerative Medicine, and Department of Surgery, University of Pittsburgh, Pittsburgh, PA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Markus Braun
- German Space Agency, German Aerospace Center, Bonn, Germany
- James A Casaletto
- Blue Marble Space Institute of Science, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
- Ben Cools
- Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Brussels, Belgium; Nuclear Medical Applications Institute, Belgian Nuclear Research Centre, Mol, Belgium
- Sylvain V Costes
- Blue Marble Space Institute of Science, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA; Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
- Phoebe De Meulemeester
- Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Bartu Doruk
- Space Applications Services NV/SA, Sint-Stevens-Woluwe, Belgium; Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Sara Eyal
- Institute for Drug Research, School of Pharmacy, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Silvana Miranda
- Nuclear Medical Applications Institute, Belgian Nuclear Research Centre, Mol, Belgium; Department of Molecular Biotechnology, Gent University, Gent, Belgium
- Christiane Hahn
- European Space Agency, Human and Robotic Exploration Programmes, Human Exploration Science team, Noordwijk, the Netherlands
- Sinem Helvacıoğlu Akyüz
- Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Stefan Herbert
- Space Systems, Airbus Defence and Space, Immenstaad am Bodensee, Germany
- Dmitriy Krepkiy
- Office of Special Initiatives, National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA
- Yannick Lichterfeld
- Department of Applied Aerospace Biology, Institute of Aerospace Medicine, German Aerospace Center, Cologne, Germany
- Christian Liemersdorf
- Department of Applied Aerospace Biology, Institute of Aerospace Medicine, German Aerospace Center, Cologne, Germany
- Marcus Krüger
- Department of Microgravity and Translational Regenerative Medicine, Otto-von-Guericke-University, Magdeburg, Germany
- Shannon Marchal
- Department of Microgravity and Translational Regenerative Medicine, Otto-von-Guericke-University, Magdeburg, Germany
- Jette Ritz
- Department of Pharmaceutical and Pharmacological Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Theresa Schmakeit
- Department of Applied Aerospace Biology, Institute of Aerospace Medicine, German Aerospace Center, Cologne, Germany
- Hilde Stenuit
- Space Applications Services NV/SA, Sint-Stevens-Woluwe, Belgium
- Kevin Tabury
- Nuclear Medical Applications Institute, Belgian Nuclear Research Centre, Mol, Belgium
- Torsten Trittel
- Department of Microgravity and Translational Regenerative Medicine, Otto-von-Guericke-University, Magdeburg, Germany; Department of Engineering, Brandenburg University of Applied Sciences, Brandenburg an der Havel, Germany
- Markus Wehland
- Department of Microgravity and Translational Regenerative Medicine, Otto-von-Guericke-University, Magdeburg, Germany
- Yu Shrike Zhang
- Division of Engineering, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Harvard Stem Cell Institute, Harvard University, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Karson S Putt
- Institute for Drug Discovery, Purdue University, West Lafayette, IN, USA
- Zhong-Yin Zhang
- Institute for Drug Discovery, Purdue University, West Lafayette, IN, USA; Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN, USA
- Danilo A Tagle
- Office of Special Initiatives, National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA
2
Liu S, McCoy AB, Wright A. Improving large language model applications in biomedicine with retrieval-augmented generation: a systematic review, meta-analysis, and clinical development guidelines. J Am Med Inform Assoc 2025; 32:605-615. [PMID: 39812777 PMCID: PMC12005634 DOI: 10.1093/jamia/ocaf008] [Received: 11/19/2024] [Revised: 12/17/2024] [Accepted: 01/03/2025] [Indexed: 01/16/2025] Open
Abstract
OBJECTIVE: The objectives of this study are to synthesize findings from recent research of retrieval-augmented generation (RAG) and large language models (LLMs) in biomedicine and provide clinical development guidelines to improve effectiveness.
MATERIALS AND METHODS: We conducted a systematic literature review and a meta-analysis. The report was created in adherence to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 analysis. Searches were performed in 3 databases (PubMed, Embase, PsycINFO) using terms related to "retrieval augmented generation" and "large language model," for articles published in 2023 and 2024. We selected studies that compared baseline LLM performance with RAG performance. We developed a random-effect meta-analysis model, using odds ratio as the effect size.
RESULTS: Among 335 studies, 20 were included in this literature review. The pooled effect size was 1.35, with a 95% confidence interval of 1.19-1.53, indicating a statistically significant effect (P = .001). We reported clinical tasks, baseline LLMs, retrieval sources and strategies, as well as evaluation methods.
DISCUSSION: Building on our literature review, we developed Guidelines for Unified Implementation and Development of Enhanced LLM Applications with RAG in Clinical Settings to inform clinical applications using RAG.
CONCLUSION: Overall, RAG implementation showed a 1.35 odds ratio increase in performance compared to baseline LLMs. Future research should focus on (1) system-level enhancement: the combination of RAG and agent, (2) knowledge-level enhancement: deep integration of knowledge into LLM, and (3) integration-level enhancement: integrating RAG systems within electronic health records.
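For readers unfamiliar with how a pooled odds ratio and its confidence interval arise from a random-effects model, the following is a minimal sketch using the DerSimonian-Laird estimator. The abstract does not say which estimator the authors used, so this choice is an assumption, and the function name `pool_random_effects` and the per-study numbers are hypothetical illustrations, not data from the review.

```python
import math

def pool_random_effects(log_ors, variances):
    """Pool per-study log odds ratios with the DerSimonian-Laird estimator.

    Returns (pooled OR, 95% CI lower, 95% CI upper, tau^2).
    """
    k = len(log_ors)
    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    lower, upper = pooled - 1.96 * se, pooled + 1.96 * se
    return math.exp(pooled), math.exp(lower), math.exp(upper), tau2

# Hypothetical studies: odds ratios and variances of their log odds ratios.
example_ors = [1.2, 1.5, 1.1, 1.8]
example_vars = [0.02, 0.05, 0.03, 0.08]
pooled_or, ci_low, ci_high, tau2 = pool_random_effects(
    [math.log(o) for o in example_ors], example_vars
)
print(f"pooled OR = {pooled_or:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}, tau^2 = {tau2:.3f}")
```

Pooling is done on the log scale because the log odds ratio is approximately normally distributed; the result is exponentiated only at the end.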
Affiliation(s)
- Siru Liu
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Department of Computer Science, Vanderbilt University, Nashville, TN 37212, United States
- Allison B McCoy
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Adam Wright
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37212, United States
3
Baranzini SE. The Barancik award lecture: Multi-disciplinary research will be the key to stop, restore, and end MS. Mult Scler 2025; 31:384-391. [PMID: 39871711 PMCID: PMC11956383 DOI: 10.1177/13524585251314756] [Received: 01/02/2025] [Accepted: 01/02/2025] [Indexed: 01/29/2025]
Abstract
The past 25 years have brought extraordinary advances in our understanding of MS pathogenesis and the subsequent development of effective therapies. Collaborative genetics efforts have uncovered the association of 236 common DNA variants with disease susceptibility and the first association with disease severity, paving the way to more effective therapies, particularly for progressive forms of the disease. In parallel, and in addition to established environmental disease triggers or modifiers, new collaborative work has revealed new associations with components of the gut microbiome. This research opened a new and exciting prospect for exploring the gut-brain axis, with the potential to also provide new pharmacologic targets and diet-based therapies. Finally, with the availability of massive amounts of information and unprecedented computer power, a new wave of artificial intelligence (AI)-based research is sprawling. These investigations will result in statistically powerful predictive models to identify individuals at risk even years before the disease is clinically apparent. Furthermore, using approaches like semantic representation and causal inference, some of these approaches will be explainable in biomedical terms, thus making them trusted and facilitating their implementation in the clinical setting. The common thread that characterizes all of these advances is multi-disciplinary collaboration among scientists in the form of formal consortia, working groups, or ad hoc partnerships. This may be the "secret sauce" of modern science and the best strategy to stop, restore, and end MS.
Affiliation(s)
- Sergio E Baranzini
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
4
Zagare A, Balaur I, Rougny A, Saraiva C, Gobin M, Monzel AS, Ghosh S, Satagopam VP, Schwamborn JC. Deciphering shared molecular dysregulation across Parkinson's disease variants using a multi-modal network-based data integration and analysis. NPJ Parkinsons Dis 2025; 11:63. [PMID: 40164620 PMCID: PMC11958823 DOI: 10.1038/s41531-025-00914-3] [Received: 10/11/2024] [Accepted: 03/13/2025] [Indexed: 04/02/2025] Open
Abstract
Parkinson's disease (PD) is a progressive neurodegenerative disorder with no effective treatment. Advances in neuroscience and systems biomedicine now enable the use of complex patient-specific in vitro disease models and cutting-edge computational tools for data integration, enhancing our understanding of complex PD mechanisms. To explore common biomedical features across monogenic PD forms, we developed a knowledge graph (KG) by integrating previously published high-content imaging and RNA sequencing data of PD patient-specific midbrain organoids harbouring LRRK2-G2019S, SNCA triplication, GBA-N370S or MIRO1-R272Q mutations with publicly available biological data. Furthermore, we generated a single-cell RNA sequencing dataset of midbrain organoids derived from idiopathic PD patients (IPD) to stratify IPD patients within the spectrum of monogenic forms of PD. Despite the high degree of PD heterogeneity, we found that common transcriptomic dysregulation in monogenic PD forms is reflected in glial cells of IPD patient midbrain organoids. In addition, dysregulation in ROBO signalling might be involved in shared pathophysiology between monogenic PD and IPD cases.
Affiliation(s)
- Alise Zagare
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg.
- Irina Balaur
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Adrien Rougny
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Claudia Saraiva
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Matthieu Gobin
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Anna S Monzel
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Soumyabrata Ghosh
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Venkata P Satagopam
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg.
- Jens C Schwamborn
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
5
Jiang W, Ye W, Tan X, Bao YJ. Network-based multi-omics integrative analysis methods in drug discovery: a systematic review. BioData Min 2025; 18:27. [PMID: 40155979 PMCID: PMC11954193 DOI: 10.1186/s13040-025-00442-z] [Received: 09/25/2024] [Accepted: 03/17/2025] [Indexed: 04/01/2025] Open
Abstract
The integration of multi-omics data from diverse high-throughput technologies has revolutionized drug discovery. While various network-based methods have been developed to integrate multi-omics data, systematic evaluation and comparison of these methods remain challenging. This review aims to analyze network-based approaches for multi-omics integration and evaluate their applications in drug discovery. We conducted a comprehensive review of literature (2015-2024) on network-based multi-omics integration methods in drug discovery, and categorized methods into four primary types: network propagation/diffusion, similarity-based approaches, graph neural networks, and network inference models. We also discussed the applications of the methods in three scenarios of drug discovery, including drug target identification, drug response prediction, and drug repurposing, and finally evaluated the performance of the methods by highlighting their advantages and limitations in specific applications. While network-based multi-omics integration has shown promise in drug discovery, challenges remain in computational scalability, data integration, and biological interpretation. Future developments should focus on incorporating temporal and spatial dynamics, improving model interpretability, and establishing standardized evaluation frameworks.
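As an illustration of the first category named in the abstract, network propagation/diffusion, the sketch below implements a random walk with restart on a small undirected graph. This is a generic textbook instance and not a method taken from the review; the function name `random_walk_with_restart`, the node labels, and the restart probability are hypothetical, and the sketch assumes every node has at least one neighbor.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Propagate scores from seed nodes over an undirected graph.

    adjacency maps each node to a list of its neighbors (assumed non-empty).
    Returns a dict of steady-state visiting probabilities.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        updated = {n: restart * start[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for node in nodes:
            neighbors = adjacency[node]
            share = (1.0 - restart) * scores[node] / len(neighbors)
            for neighbor in neighbors:
                updated[neighbor] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores

# Hypothetical four-node path graph: A - B - C - D, with A as the seed.
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_scores = random_walk_with_restart(toy_graph, seeds={"A"})
for name in sorted(toy_scores, key=toy_scores.get, reverse=True):
    print(name, round(toy_scores[name], 4))
```

In a drug-discovery setting the seed nodes might be known disease genes and the ranked scores would prioritize nearby candidates; the toy graph here only shows that scores decay with distance from the seed.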
Affiliation(s)
- Wei Jiang
- School of Life Sciences, Hubei University, Wuhan, China
- Weicai Ye
- School of Computer Science and Engineering, Guangdong Province Key Laboratory of Computational Science, National Engineering Laboratory for Big Data Analysis and Application, Sun Yat-sen University, Guangzhou, China
- Xiaoming Tan
- School of Life Sciences, Hubei University, Wuhan, China
- Yun-Juan Bao
- School of Life Sciences, Hubei University, Wuhan, China.
- , No.368 Youyi Avenue, Wuhan, 430062, China.
6
Bueckle A, Herr BW, Hardi J, Quardokus EM, Musen MA, Börner K. Construction, Deployment, and Usage of the Human Reference Atlas Knowledge Graph for Linked Open Data. bioRxiv: the preprint server for biology 2025:2024.12.22.630006. [PMID: 39764040 PMCID: PMC11703146 DOI: 10.1101/2024.12.22.630006] [Indexed: 01/14/2025]
Abstract
The Human Reference Atlas (HRA) for the healthy, adult body is being developed by a team of international, interdisciplinary experts across 20+ consortia. It provides standard terminologies and data structures for describing specimens, biological structures, and spatial positions of experimental datasets and ontology-linked reference anatomical structures (AS), cell types (CT), and biomarkers (B). We introduce the HRA Knowledge Graph (KG) as central data resource for HRA v2.2, supporting cross-scale, biological queries to Resource Description Framework graphs using SPARQL. In February 2025, the HRA KG covered 71 organs with 5,800 AS, 2,268 CT, 2,531 B; it had 10,064,033 nodes, 171,250,177 edges, and a size of 125.84 GB. The HRA KG comprises 13 types of Digital Objects (DOs) using the Common Coordinate Framework Ontology to standardize core concepts and relationships across DOs. We (1) provide data and code for HRA KG construction; (2) detail HRA KG deployment by Linked Open Data principles; and (3) illustrate HRA KG usage via application programming interfaces, user interfaces, data products. A companion website is at https://cns-iu.github.io/hra-kg-supporting-information.
Affiliation(s)
- Andreas Bueckle
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
- Bruce W Herr
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
- Josef Hardi
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Ellen M Quardokus
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
- Mark A Musen
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Katy Börner
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
7
Casaletto JA, Scott RT, Myrick M, Mackintosh G, Chok H, Saravia-Butler A, Hoarfrost A, Galazka JM, Sanders LM, Costes SV. Analyzing the relationship between gene expression and phenotype in space-flown mice using a causal inference machine learning ensemble. Sci Rep 2025; 15:2363. [PMID: 39824847 PMCID: PMC11748630 DOI: 10.1038/s41598-024-81394-y] [Received: 04/12/2024] [Accepted: 11/26/2024] [Indexed: 01/20/2025] Open
Abstract
Spaceflight has several detrimental effects on human and rodent health. For example, liver dysfunction is a common phenotype observed in space-flown rodents, and this dysfunction is partially reflected in transcriptomic changes. Studies linking transcriptomics with liver dysfunction rely on tools which exploit correlation, but these tools make no attempt to disambiguate true correlations from spurious ones. In this work, we use a machine learning ensemble of causal inference methods called the Causal Research and Inference Search Platform (CRISP) which was developed to predict causal features of a binary response variable from high-dimensional input. We used CRISP to identify genes robustly correlated with a lipid density phenotype using transcriptomic and histological data from the NASA Open Science Data Repository (OSDR). Our approach identified genes and molecular targets not predicted by previous traditional differential gene expression analyses. These genes are likely to play a pivotal role in the liver dysfunction observed in space-flown rodents, and this work opens the door to identifying novel countermeasures for space travel.
Affiliation(s)
- James A Casaletto
- Blue Marble Space Institute of Science, NASA Ames, Mountain View, USA.
- Makenna Myrick
- Department of Chemistry, University of Florida, Gainesville, USA
- Hamed Chok
- Blue Marble Space Institute of Science, NASA Ames, Mountain View, USA
8
Wu Y, Xie X, Zhu J, Guan L, Li M. Overview and Prospects of DNA Sequence Visualization. Int J Mol Sci 2025; 26:477. [PMID: 39859192 PMCID: PMC11764684 DOI: 10.3390/ijms26020477] [Received: 11/19/2024] [Revised: 12/30/2024] [Accepted: 01/04/2025] [Indexed: 01/27/2025] Open
Abstract
Due to advances in big data technology, deep learning, and knowledge engineering, biological sequence visualization has been extensively explored. In the post-genome era, biological sequence visualization enables the visual representation of both structured and unstructured biological sequence data. However, a universal visualization method for all types of sequences has not been reported. Biological sequence data are expanding exponentially, and the acquisition, extraction, fusion, and inference of knowledge from biological sequences are critical supporting technologies for visualization research. These areas are important and require in-depth exploration. This paper presents a comprehensive overview of visualization methods for DNA sequences from four different perspectives (two-dimensional, three-dimensional, four-dimensional, and dynamic visualization approaches) and discusses the strengths and limitations of each method in detail. Furthermore, this paper proposes two potential future research directions for biological sequence visualization in response to the challenges of inefficient graphical feature extraction and knowledge association network generation in existing methods. The first direction is the construction of knowledge graphs for biological sequence big data, and the second direction is the cross-modal visualization of biological sequences using machine learning methods. This review is anticipated to provide valuable insights and contributions to computational biology, bioinformatics, genomic computing, genetic breeding, evolutionary analysis, and other related disciplines in the fields of biology, medicine, chemistry, statistics, and computing. It has an important reference value in biological sequence recommendation systems and knowledge question answering systems.
Affiliation(s)
- Mengshan Li
- School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, China; (Y.W.); (X.X.); (J.Z.); (L.G.)
9
Gebre SG, Scott RT, Saravia-Butler AM, Lopez DK, Sanders LM, Costes SV. NASA open science data repository: open science for life in space. Nucleic Acids Res 2025; 53:D1697-D1710. [PMID: 39558178 PMCID: PMC11701653 DOI: 10.1093/nar/gkae1116] [Received: 09/06/2024] [Revised: 10/11/2024] [Accepted: 10/28/2024] [Indexed: 11/20/2024] Open
Abstract
Space biology and health data are critical for the success of deep space missions and sustainable human presence off-world. At the core of effectively managing biomedical risks is the commitment to open science principles, which ensure that data are findable, accessible, interoperable, reusable, reproducible and maximally open. The 2021 integration of the Ames Life Sciences Data Archive with GeneLab to establish the NASA Open Science Data Repository significantly enhanced access to a wide range of life sciences, biomedical-clinical and mission telemetry data alongside existing 'omics data from GeneLab. This paper describes the new database, its architecture and new data streams supporting diverse data types and enhancing data submission, retrieval and analysis. Features include the biological data management environment for improved data submission, a new user interface, controlled data access, an enhanced API and comprehensive public visualization tools for environmental telemetry, radiation dosimetry data and 'omics analyses. By fostering global collaboration through its analysis working groups and training programs, the open science data repository promotes widespread engagement in space biology, ensuring transparency and inclusivity in research. It supports the global scientific community in advancing our understanding of spaceflight's impact on biological systems, ensuring humans will thrive in future deep space missions.
Affiliation(s)
- Samrawit G Gebre
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Ryan T Scott
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Danielle K Lopez
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Lauren M Sanders
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Sylvain V Costes
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
10
Johnson R, Gottlieb U, Shaham G, Eisen L, Waxman J, Devons-Sberro S, Ginder CR, Hong P, Sayeed R, Reis BY, Balicer RD, Dagan N, Zitnik M. Unified Clinical Vocabulary Embeddings for Advancing Precision Medicine. medRxiv: the preprint server for health sciences 2024:2024.12.03.24318322. [PMID: 39677476 PMCID: PMC11643188 DOI: 10.1101/2024.12.03.24318322] [Indexed: 12/17/2024]
Abstract
Integrating clinical knowledge into AI remains challenging despite numerous medical guidelines and vocabularies. Medical codes, central to healthcare systems, often reflect operational patterns shaped by geographic factors, national policies, insurance frameworks, and physician practices rather than the precise representation of clinical knowledge. This disconnect hampers AI in representing clinical relationships, raising concerns about bias, transparency, and generalizability. Here, we developed a resource of 67,124 clinical vocabulary embeddings derived from a clinical knowledge graph tailored to electronic health record vocabularies, spanning over 1.3 million edges. Using graph transformer neural networks, we generated clinical vocabulary embeddings that provide a new representation of clinical knowledge by unifying seven medical vocabularies. These embeddings were validated through a phenotype risk score analysis involving 4.57 million patients from Clalit Healthcare Services, effectively stratifying individuals based on survival outcomes. Inter-institutional panels of clinicians evaluated the embeddings for alignment with clinical knowledge across 90 diseases and 3,000 clinical codes, confirming their robustness and transferability. This resource addresses gaps in integrating clinical vocabularies into AI models and training datasets, paving the way for knowledge-grounded population and patient-level models.
Affiliation(s)
- Ruth Johnson
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Uri Gottlieb
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Galit Shaham
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Lihi Eisen
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Jacob Waxman
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Stav Devons-Sberro
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Curtis R. Ginder
- Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Peter Hong
- Division of General Pediatrics, Department of Pediatrics, Boston Children’s Hospital, Boston, MA, USA
- Information Technology, Enterprise Data Analytics and Reporting, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Raheel Sayeed
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Ben Y. Reis
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Predictive Medicine Group, Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
- Ran D. Balicer
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Faculty of Health Sciences, School of Public Health, Ben Gurion University of the Negev, Be’er Sheva, Israel
- Noa Dagan
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Software and Information Systems Engineering, Ben Gurion University, Be’er Sheva, Israel
- Marinka Zitnik
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA
|
11
|
Bailey RL, MacFarlane AJ, Field MS, Tagkopoulos I, Baranzini SE, Edwards KM, Rose CJ, Schork NJ, Singhal A, Wallace BC, Fisher KP, Markakis K, Stover PJ. Artificial intelligence in food and nutrition evidence: The challenges and opportunities. PNAS Nexus 2024; 3:pgae461. [PMID: 39677367 PMCID: PMC11638775 DOI: 10.1093/pnasnexus/pgae461] [Received: 06/12/2024] [Accepted: 10/02/2024] [Indexed: 12/17/2024]
Abstract
Science-informed decisions are best guided by the objective synthesis of the totality of evidence around a particular question and by assessing its trustworthiness through systematic processes. However, there are major barriers and challenges that limit science-informed food and nutrition policy, practice, and guidance. First, evidence is insufficient, primarily because of the cost of generating high-quality data and the complexity of the diet-disease relationship. Furthermore, the sheer number of systematic reviews needed across the entire agriculture and food value chain, and the cost and time required to conduct them, can delay the translation of science to policy. Artificial intelligence offers the opportunity to (i) better understand the complex etiology of diet-related chronic diseases, (ii) bring more precision to our understanding of the variation among individuals in the diet-chronic disease relationship, (iii) provide new types of computed data related to the efficacy and effectiveness of nutrition/food interventions in health promotion, and (iv) automate the generation of systematic reviews that support timely decisions. These advances include the acquisition and synthesis of heterogeneous and multimodal datasets. This perspective summarizes a meeting convened at the National Academies of Sciences, Engineering, and Medicine. The purpose of the meeting was to examine the current state and future potential of artificial intelligence in generating new types of computed data as well as automating the generation of systematic reviews to support evidence-based food and nutrition policy, practice, and guidance.
Affiliation(s)
- Regan L Bailey
- Department of Nutrition, Texas A&M University, Cater-Mattil Hall, 373 Olsen Blvd Room 130, College Station, TX 77843, USA
- Institute for Advancing Health Through Agriculture, Texas A&M University, Borlaug Building, College Station, TX 77843, USA
| | - Amanda J MacFarlane
- Department of Nutrition, Texas A&M University, Cater-Mattil Hall, 373 Olsen Blvd Room 130, College Station, TX 77843, USA
- Texas A&M Agriculture, Food, and Nutrition Evidence Center, 801 Cherry Street, Fort Worth, TX 76102, USA
| | - Martha S Field
- Division of Nutritional Sciences, Cornell University, Savage Hall, Ithaca, NY 14850, USA
| | - Ilias Tagkopoulos
- Department of Computer Science and Genome Center, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
- USDA/NSF AI Institute for Next Generation Food Systems, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
| | - Sergio E Baranzini
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, 1651 4th St, San Francisco, CA 94158, USA
| | - Kristen M Edwards
- Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Christopher J Rose
- Cluster for Reviews and Health Technology Assessments, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Centre for Epidemic Interventions Research, Norwegian Institute of Public Health, Lovisenberggata 8 0456, 0213 Oslo, Norway
| | - Nicholas J Schork
- Translational Genomics Research Institute, City of Hope National Medical Center, 445 N. Fifth Street, Phoenix, AZ 85004, USA
| | - Akshat Singhal
- Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, San Diego, CA 92093, USA
| | - Byron C Wallace
- Khoury College of Computer Sciences, Northeastern University, #202, West Village Residence Complex H, 440 Huntington Ave, Boston, MA 02115, USA
| | - Kelly P Fisher
- Institute for Advancing Health Through Agriculture, Texas A&M University, Borlaug Building, College Station, TX 77843, USA
| | - Konstantinos Markakis
- Department of Computer Science and Genome Center, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
| | - Patrick J Stover
- Department of Nutrition, Texas A&M University, Cater-Mattil Hall, 373 Olsen Blvd Room 130, College Station, TX 77843, USA
|
12
|
Yang Y, Zheng Z, Xu Y, Wei H, Yan W. BioGSF: a graph-driven semantic feature integration framework for biomedical relation extraction. Brief Bioinform 2024; 26:bbaf025. [PMID: 39853110 PMCID: PMC11759886 DOI: 10.1093/bib/bbaf025] [Received: 09/04/2024] [Revised: 12/24/2024] [Accepted: 01/09/2025] [Indexed: 01/26/2025]
Abstract
The automatic and accurate extraction of diverse biomedical relations from literature is a core element in building medical knowledge graphs, which are indispensable for healthcare artificial intelligence. Currently, fine-tuning by stacking various neural networks on pre-trained language models (PLMs) is a common framework for end-to-end resolution of the biomedical relation extraction (RE) problem. Nevertheless, sequence-based PLMs, to a certain extent, fail to fully exploit the connections between semantics and the topological features formed by these connections. In this study, we present a graph-driven framework named BioGSF for RE from the literature that integrates shortest dependency paths (SDP) with an entity-pair graph through a graph neural network model. First, we leveraged dependency relationships to obtain the SDP between entities and incorporated this information into the entity-pair graph. Next, a graph attention network was used to acquire the topological information of the entity-pair graph. Finally, the obtained topological information was combined with the semantic features of the contextual information for relation classification. Our method was evaluated on two distinct datasets, namely S4 and BioRED. The results show that BioGSF not only outperforms previous models, with micro-F1 scores of 96.68% (S4) and 96.03% (BioRED), but also requires the shortest running time. BioGSF emerges as an efficient framework for biomedical RE.
Affiliation(s)
- Yang Yang
- Computing Science and Artificial Intelligence College, Suzhou City University, No. 1188 Wuzhong Avenue, Wuzhong District Suzhou, Suzhou 215004, China
- Suzhou Key Lab of Multi-modal Data Fusion and Intelligent Healthcare, No. 1188 Wuzhong Avenue, Wuzhong District Suzhou, Suzhou 215004, China
- School of Computer Science & Technology, Soochow University, No. 1 Shizi Street, Suzhou 215000, China
| | - Zixuan Zheng
- School of Computer Science & Technology, Soochow University, No. 1 Shizi Street, Suzhou 215000, China
| | - Yuyang Xu
- School of Computer Science & Technology, Soochow University, No. 1 Shizi Street, Suzhou 215000, China
| | - Huifang Wei
- School of Basic Medical Sciences, Suzhou Medical College of Soochow University, No. 199 Renai Road, SIP, Suzhou 215123, China
| | - Wenying Yan
- Suzhou Key Lab of Multi-modal Data Fusion and Intelligent Healthcare, No. 1188 Wuzhong Avenue, Wuzhong District Suzhou, Suzhou 215004, China
- School of Basic Medical Sciences, Suzhou Medical College of Soochow University, No. 199 Renai Road, SIP, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, No. 199 Renai Road, SIP, Suzhou 215123, China
|
13
|
Mazein I, Rougny A, Mazein A, Henkel R, Gütebier L, Michaelis L, Ostaszewski M, Schneider R, Satagopam V, Jensen LJ, Waltemath D, Wodke JAH, Balaur I. Graph databases in systems biology: a systematic review. Brief Bioinform 2024; 25:bbae561. [PMID: 39565895 PMCID: PMC11578065 DOI: 10.1093/bib/bbae561] [Received: 04/18/2024] [Revised: 09/28/2024] [Accepted: 10/21/2024] [Indexed: 11/22/2024]
Abstract
Graph databases are becoming increasingly popular across scientific disciplines, being highly suitable for storing and connecting complex heterogeneous data. In systems biology, they are used as a backend solution for biological data repositories, ontologies, networks, pathways, and knowledge graph databases. In this review, we analyse all publications using or mentioning graph databases retrieved from PubMed and PubMed Central full-text search, focusing on the top 16 available graph databases. Publications are categorized according to their domain and application, focusing on pathway and network biology and relevant ontologies and tools. We detail different approaches and highlight the advantages of outstanding resources, such as UniProtKB, Disease Ontology, and Reactome, which provide graph-based solutions. We discuss ongoing efforts of the systems biology community to standardize and harmonize knowledge graph creation and the maintenance of integrated resources. Outlining prospects, including the use of graph databases as a way of communication between biological data repositories, we conclude that efficient design, querying, and maintenance of graph databases will be key for knowledge generation in systems biology and other research fields with heterogeneous data.
Affiliation(s)
- Ilya Mazein
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Adrien Rougny
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Alexander Mazein
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Ron Henkel
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Lea Gütebier
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Lea Michaelis
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Marek Ostaszewski
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Venkata Satagopam
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Lars Juhl Jensen
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 15, 1870 Frederiksberg C, Denmark
| | - Dagmar Waltemath
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Judith A H Wodke
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Irina Balaur
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
|
14
|
Qin G, Zhang Y, Tyner JW, Kemp CJ, Shmulevich I. Knowledge graphs facilitate prediction of drug response for acute myeloid leukemia. iScience 2024; 27:110755. [PMID: 39280607 PMCID: PMC11401200 DOI: 10.1016/j.isci.2024.110755] [Received: 11/20/2023] [Revised: 05/04/2024] [Accepted: 08/14/2024] [Indexed: 09/18/2024]
Abstract
Acute myeloid leukemia (AML) is a highly aggressive and heterogeneous disease, underscoring the need for improved therapeutic options and methods to optimally predict responses. With the wealth of available data resources, including clinical features, multiomics analysis, and ex vivo drug screening from AML patients, development of drug response prediction models has become feasible. Knowledge graphs (KGs) embed the relationships between different entities or features, allowing for explanation of a wide breadth of drug sensitivity and resistance mechanisms. We designed AML drug response prediction models guided by KGs. Our models included engineered features, relative gene expression between marker genes for each drug and regulators (e.g., transcription factors). We identified relative gene expression of FGD4-MIR4519, NPC2-GATA2, and BCL2-NFKB2 as predictive features for venetoclax ex vivo drug response. The KG-guided models provided high accuracy in independent test sets, overcame potential platform batch effects, and provided candidate drug sensitivity biomarkers for further validation.
Affiliation(s)
- Guangrong Qin
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Yue Zhang
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Jeffrey W. Tyner
- Knight Cancer Institute, Oregon Health & Science University, Portland, OR 97239, USA
|
15
|
Soman K, Rose PW, Morris JH, Akbas RE, Smith B, Peetoom B, Villouta-Reyes C, Cerono G, Shi Y, Rizk-Jackson A, Israni S, Nelson CA, Huang S, Baranzini SE. Biomedical knowledge graph-optimized prompt generation for large language models. Bioinformatics 2024; 40:btae560. [PMID: 39288310 PMCID: PMC11441322 DOI: 10.1093/bioinformatics/btae560] [Received: 05/15/2024] [Revised: 08/29/2024] [Accepted: 09/15/2024] [Indexed: 09/19/2024]
Abstract
MOTIVATION: Large language models (LLMs) are being adopted at an unprecedented rate, yet still face challenges in knowledge-intensive domains such as biomedicine. Solutions such as pretraining and domain-specific fine-tuning add substantial computational overhead, requiring further domain-expertise. Here, we introduce a token-optimized and robust Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging a massive biomedical KG (SPOKE) with LLMs such as Llama-2-13b, GPT-3.5-Turbo, and GPT-4, to generate meaningful biomedical text rooted in established knowledge.
RESULTS: Compared to the existing RAG technique for Knowledge Graphs, the proposed method utilizes minimal graph schema for context extraction and uses embedding methods for context pruning. This optimization in context extraction results in more than 50% reduction in token consumption without compromising the accuracy, making a cost-effective and robust RAG implementation on proprietary LLMs. KG-RAG consistently enhanced the performance of LLMs across diverse biomedical prompts by generating responses rooted in established knowledge, accompanied by accurate provenance and statistical evidence (if available) to substantiate the claims. Further benchmarking on human curated datasets, such as biomedical true/false and multiple-choice questions (MCQ), showed a remarkable 71% boost in the performance of the Llama-2 model on the challenging MCQ dataset, demonstrating the framework's capacity to empower open-source models with fewer parameters for domain-specific questions. Furthermore, KG-RAG enhanced the performance of proprietary GPT models, such as GPT-3.5 and GPT-4. In summary, the proposed framework combines explicit and implicit knowledge of KG and LLM in a token optimized fashion, thus enhancing the adaptability of general-purpose LLMs to tackle domain-specific questions in a cost-effective fashion.
AVAILABILITY AND IMPLEMENTATION: SPOKE KG can be accessed at https://spoke.rbvi.ucsf.edu/neighborhood.html. It can also be accessed using REST-API (https://spoke.rbvi.ucsf.edu/swagger/). KG-RAG code is made available at https://github.com/BaranziniLab/KG_RAG. Biomedical benchmark datasets used in this study are made available to the research community in the same GitHub repository.
Affiliation(s)
- Karthik Soman
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
| | - Peter W Rose
- San Diego Supercomputer Center, University of California, San Diego, CA 92093, United States
| | - John H Morris
- Department of Pharmaceutical Chemistry, School of Pharmacy, University of California, San Francisco, CA 94158, United States
| | - Rabia E Akbas
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
| | - Brett Smith
- Institute for Systems Biology, Seattle, WA 98109, United States
| | - Braian Peetoom
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
| | - Catalina Villouta-Reyes
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
| | - Gabriel Cerono
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
| | - Yongmei Shi
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158, United States
| | - Angela Rizk-Jackson
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158, United States
| | - Sharat Israni
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158, United States
| | - Charlotte A Nelson
- Mate Bioservices, Inc. Swallowtail Ct., Brisbane, CA 94005, United States
| | - Sui Huang
- Institute for Systems Biology, Seattle, WA 98109, United States
| | - Sergio E Baranzini
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
|
16
|
Noecker C, Turnbaugh PJ. Emerging tools and best practices for studying gut microbial community metabolism. Nat Metab 2024; 6:1225-1236. [PMID: 38961185 DOI: 10.1038/s42255-024-01074-z] [Received: 12/21/2023] [Accepted: 05/30/2024] [Indexed: 07/05/2024]
Abstract
The human gut microbiome vastly extends the set of metabolic reactions catalysed by our own cells, with far-reaching consequences for host health and disease. However, our knowledge of gut microbial metabolism relies on a handful of model organisms, limiting our ability to interpret and predict the metabolism of complex microbial communities. In this Perspective, we discuss emerging tools for analysing and modelling the metabolism of gut microorganisms and for linking microorganisms, pathways and metabolites at the ecosystem level, highlighting promising best practices for researchers. Continued progress in this area will also require infrastructure development to facilitate cross-disciplinary synthesis of scientific findings. Collectively, these efforts can enable a broader and deeper understanding of the workings of the gut ecosystem and open new possibilities for microbiome manipulation and therapy.
Affiliation(s)
- Cecilia Noecker
- Department of Biological Sciences, Minnesota State University, Mankato, Mankato, MN, USA
- Department of Microbiology & Immunology, University of California, San Francisco, San Francisco, CA, USA
| | - Peter J Turnbaugh
- Department of Microbiology & Immunology, University of California, San Francisco, San Francisco, CA, USA
- Chan Zuckerberg Biohub-San Francisco, San Francisco, CA, USA
|
17
|
Siew K, Nestler KA, Nelson C, D'Ambrosio V, Zhong C, Li Z, Grillo A, Wan ER, Patel V, Overbey E, Kim J, Yun S, Vaughan MB, Cheshire C, Cubitt L, Broni-Tabi J, Al-Jaber MY, Boyko V, Meydan C, Barker P, Arif S, Afsari F, Allen N, Al-Maadheed M, Altinok S, Bah N, Border S, Brown AL, Burling K, Cheng-Campbell M, Colón LM, Degoricija L, Figg N, Finch R, Foox J, Faridi P, French A, Gebre S, Gordon P, Houerbi N, Valipour Kahrood H, Kiffer FC, Klosinska AS, Kubik A, Lee HC, Li Y, Lucarelli N, Marullo AL, Matei I, McCann CM, Mimar S, Naglah A, Nicod J, O'Shaughnessy KM, Oliveira LCD, Oswalt L, Patras LI, Lai Polo SH, Rodríguez-Lopez M, Roufosse C, Sadeghi-Alavijeh O, Sanchez-Hodge R, Paul AS, Schittenhelm RB, Schweickart A, Scott RT, Choy Lim Kam Sian TC, da Silveira WA, Slawinski H, Snell D, Sosa J, Saravia-Butler AM, Tabetah M, Tanuwidjaya E, Walker-Samuel S, Yang X, Yasmin, Zhang H, Godovac-Zimmermann J, Sarder P, Sanders LM, Costes SV, Campbell RAA, Karouia F, Mohamed-Alis V, Rodriques S, Lynham S, Steele JR, Baranzini S, Fazelinia H, Dai Z, Uruno A, Shiba D, Yamamoto M, A C Almeida E, Blaber E, Schisler JC, Eisch AJ, Muratani M, Zwart SR, Smith SM, Galazka JM, Mason CE, Beheshti A, Walsh SB. Cosmic kidney disease: an integrated pan-omic, physiological and morphological study into spaceflight-induced renal dysfunction. Nat Commun 2024; 15:4923. [PMID: 38862484 PMCID: PMC11167060 DOI: 10.1038/s41467-024-49212-1] [Received: 01/20/2024] [Accepted: 05/28/2024] [Indexed: 06/13/2024]
Abstract
Missions into Deep Space are planned this decade. Yet the health consequences of exposure to microgravity and galactic cosmic radiation (GCR) over years-long missions on indispensable visceral organs such as the kidney are largely unexplored. We performed biomolecular (epigenomic, transcriptomic, proteomic, epiproteomic, metabolomic, metagenomic), clinical chemistry (electrolytes, endocrinology, biochemistry) and morphometry (histology, 3D imaging, miRNA-ISH, tissue weights) analyses using samples and datasets available from 11 spaceflight-exposed mouse and 5 human, 1 simulated microgravity rat and 4 simulated GCR-exposed mouse missions. We found that spaceflight induces: 1) renal transporter dephosphorylation which may indicate astronauts' increased risk of nephrolithiasis is in part a primary renal phenomenon rather than solely a secondary consequence of bone loss; 2) remodelling of the nephron that results in expansion of distal convoluted tubule size but loss of overall tubule density; 3) renal damage and dysfunction when exposed to a Mars roundtrip dose-equivalent of simulated GCR.
Affiliation(s)
- Keith Siew
- London Tubular Centre, Department of Renal Medicine, University College London, London, UK
| | - Kevin A Nestler
- The Institute for Biomedical Sciences (IBS), The George Washington University, Washington, DC, USA
| | - Charlotte Nelson
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Viola D'Ambrosio
- London Tubular Centre, Department of Renal Medicine, University College London, London, UK
- Department of Experimental and Translational Medicine, Università Cattolica del Sacro Cuore di Roma, Rome, Italy
| | - Chutong Zhong
- London Tubular Centre, Department of Renal Medicine, University College London, London, UK
| | - Zhongwang Li
- London Tubular Centre, Department of Renal Medicine, University College London, London, UK
- Centre for Advanced Biomedical Imaging, University College London, London, UK
- Centre for Computational Medicine, University College London, London, UK
| | - Alessandra Grillo
- London Tubular Centre, Department of Renal Medicine, University College London, London, UK
| | - Elizabeth R Wan
- London Tubular Centre, Department of Renal Medicine, University College London, London, UK
| | - Vaksha Patel
- Department of Renal Medicine, University College London, London, UK
| | - Eliah Overbey
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, USA
| | - JangKeun Kim
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, USA
| | - Sanghee Yun
- University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Michael B Vaughan
- School of Medicine, College of Medicine and Health, University College Cork, Cork, Ireland
- Tissue Engineering and Biomaterials Group, Ghent University, Ghent, Belgium
- Center for Medical Genetics, Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Chris Cheshire
- Bioinformatics and Computational Biology Laboratory, The Francis Crick Institute, London, UK
| | - Laura Cubitt
- Applied Biotechnology Laboratory, The Francis Crick Institute, London, UK
| | - Jessica Broni-Tabi
- Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK
| | - Valery Boyko
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Cem Meydan
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, USA
| | - Peter Barker
- MRC MDU Mouse Biochemistry Laboratory, University of Cambridge, Cambridge, UK
| | - Shehbeel Arif
- Center for Data Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Fatemeh Afsari
- Department of Medicine-Nephrology & Intelligent Critical Care Center, University of Florida, Gainesville, FL, USA
| | - Noah Allen
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
| | - Mohammed Al-Maadheed
- Anti-Doping Laboratory Qatar, Doha, Qatar
- Centre of Metabolism and Inflammation, University College London, London, UK
| | - Selin Altinok
- School of Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Nourdine Bah
- Applied Biotechnology Laboratory, The Francis Crick Institute, London, UK
| | - Samuel Border
- Department of Medicine-Nephrology & Intelligent Critical Care Center, University of Florida, Gainesville, FL, USA
| | - Amanda L Brown
- Pharmacology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Keith Burling
- MRC MDU Mouse Biochemistry Laboratory, University of Cambridge, Cambridge, UK
| | - Margareth Cheng-Campbell
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Blue Marble Space Institute of Science, Seattle, WA, USA
| | - Lorianna M Colón
- Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia Research Institute, Philadelphia, PA, USA
| | - Lovorka Degoricija
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Nichola Figg
- Department of Medicine, University of Cambridge, Cambridge, UK
| | - Rebecca Finch
- School of Health, Science and Wellbeing, Staffordshire University, Stoke-on-Trent, UK
| | - Jonathan Foox
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, USA
| | - Pouya Faridi
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Alison French
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Samrawit Gebre
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Peter Gordon
- Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK
| | - Nadia Houerbi
- Physiology, Biophysics & Systems Biology, Weill Cornell Medical College, New York, NY, USA
| | - Hossein Valipour Kahrood
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
- Monash Bioinformatics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Frederico C Kiffer
- Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Aleksandra S Klosinska
- Division of Experimental Medicine & Immunotherapeutics (EMIT), Department of Medicine, University of Cambridge, Cambridge, UK
| | - Angela Kubik
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
| | - Han-Chung Lee
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Yinghui Li
- State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, Beijing, China
| | - Nicholas Lucarelli
- Department of Medicine-Nephrology & Intelligent Critical Care Center, University of Florida, Gainesville, FL, USA
| | - Anthony L Marullo
- School of Medicine, College of Medicine and Health, University College Cork, Cork, Ireland
| | - Irina Matei
- Cornell Center for Immunology, Cornell University, Ithaca, NY, USA
- Children's Cancer and Blood Foundation Laboratories, Departments of Pediatrics and Cell and Developmental Biology, Drukier Institute for Children's Health, Meyer Cancer Center, Weill Cornell Medical College, New York, NY, USA
| | - Colleen M McCann
- Pharmacology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Sayat Mimar
- Department of Medicine-Nephrology & Intelligent Critical Care Center, University of Florida, Gainesville, FL, USA
| | - Ahmed Naglah
- Department of Medicine-Nephrology & Intelligent Critical Care Center, University of Florida, Gainesville, FL, USA
| | - Jérôme Nicod
- Advanced Sequencing Facility, The Francis Crick Institute, London, UK
| | - Kevin M O'Shaughnessy
- Division of Experimental Medicine & Immunotherapeutics (EMIT), Department of Medicine, University of Cambridge, Cambridge, UK
| | | | - Leah Oswalt
- Pharmacology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | | | - San-Huei Lai Polo
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | | | - Candice Roufosse
- Department of Immunology and Inflammation, Imperial College London, London, UK
| | | | | | - Anindya S Paul
- Department of Medicine-Nephrology & Intelligent Critical Care Center, University of Florida, Gainesville, FL, USA
| | - Ralf Bernd Schittenhelm
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Annalise Schweickart
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, USA
- Englander Institute for Precision Medicine, Weill Cornell Medical College, New York, NY, USA
| | - Ryan T Scott
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Terry Chin Choy Lim Kam Sian
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Willian A da Silveira
- School of Health, Science and Wellbeing, Staffordshire University, Stoke-on-Trent, UK
- International Space University, 67400, Illkirch-Graffenstaden, France
| | - Hubert Slawinski
- Advanced Sequencing Facility, The Francis Crick Institute, London, UK
| | - Daniel Snell
- Advanced Sequencing Facility, The Francis Crick Institute, London, UK
| | - Julio Sosa
- University Health Network, Toronto, ON, Canada
| | | | - Marshall Tabetah
- Department of Agricultural and Biological Engineering, Purdue University, West Lafayette, IN, USA
| | - Erwin Tanuwidjaya
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Simon Walker-Samuel
- Centre for Advanced Biomedical Imaging, University College London, London, UK
- Centre for Computational Medicine, University College London, London, UK
| | | | - Yasmin
- Division of Experimental Medicine & Immunotherapeutics (EMIT), Department of Medicine, University of Cambridge, Cambridge, UK
| | - Haijian Zhang
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | | | - Pinaki Sarder
- Department of Medicine-Quantitative Health Section, University of Florida, Gainesville, FL, USA
- Departments of Biomedical Engineering and Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
| | - Lauren M Sanders
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
- Blue Marble Space Institute of Science, Seattle, WA, USA
| | - Sylvain V Costes
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Robert A A Campbell
- Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK
| | - Fathi Karouia
- Blue Marble Space Institute of Science, Seattle, WA, USA
- Space Research Within Reach, San Francisco, CA, USA
- Center for Space Medicine, Baylor College of Medicine, Houston, TX, USA
| | - Vidya Mohamed-Alis
- Anti-Doping Laboratory Qatar, Doha, Qatar
- Centre of Metabolism and Inflammation, University College London, London, UK
| | - Samuel Rodriques
- Applied Biotechnology Laboratory, The Francis Crick Institute, London, UK
| | | | - Joel Ricky Steele
- Monash Proteomics and Metabolomics Platform, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
| | - Sergio Baranzini
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Hossein Fazelinia
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia Research Institute, Philadelphia, PA, USA
| | - Zhongquan Dai
- State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, Beijing, China
| | - Akira Uruno
- Department of Integrative Genomics, Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi, Japan
| | - Dai Shiba
- Mouse Epigenetics Project, ISS/Kibo experiment, Japan Aerospace Exploration Agency (JAXA), Tsukuba, Ibaraki, Japan
- JEM Utilization Center, Human Spaceflight Technology Directorate, Japan Aerospace Exploration Agency (JAXA), Tsukuba, Ibaraki, Japan
| | - Masayuki Yamamoto
- Department of Integrative Genomics, Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi, Japan
- Department of Medical Biochemistry, Graduate School of Medicine, Tohoku University, Sendai, Miyagi, Japan
| | - Eduardo A C Almeida
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Elizabeth Blaber
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Center for Biotechnology & Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY, USA
- Stanley Center for Psychiatric Research, Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA
| | - Jonathan C Schisler
- Pharmacology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Amelia J Eisch
- Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Neuroscience, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Masafumi Muratani
- Institute of Medicine, University of Tsukuba, Tsukuba, Ibaraki, Japan
| | - Sara R Zwart
- Department of Preventative Medicine and Community Health, University of Texas Medical Branch, Galveston, TX, USA
| | | | - Jonathan M Galazka
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY, USA
- The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medical College, New York, NY, USA
- The Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, New York, NY, USA
| | - Afshin Beheshti
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
- Broad Institute, Cambridge, MA, USA
- Space Biosciences Division, Universities Space Research Association (USRA), Washington, DC, USA
| | - Stephen B Walsh
- London Tubular Centre, Department of Renal Medicine, University College London, London, UK.
|
18
|
Sanders LM, Grigorev KA, Scott RT, Saravia-Butler AM, Polo SHL, Gilbert R, Overbey EG, Kim J, Mason CE, Costes SV. Inspiration4 data access through the NASA Open Science Data Repository. NPJ Microgravity 2024; 10:56. [PMID: 38744887 PMCID: PMC11094041 DOI: 10.1038/s41526-024-00393-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 04/03/2024] [Indexed: 05/16/2024] Open
Abstract
The increasing accessibility of commercial and private space travel necessitates a profound understanding of its impact on human health. The NASA Open Science Data Repository (OSDR) provides transparent and FAIR access to biological studies, notably the SpaceX Inspiration4 (I4) mission, which amassed extensive data from civilian astronauts. This dataset encompasses omics and clinical assays, facilitating comprehensive research on space-induced biological responses. These data allow for multi-modal, longitudinal assessments, bridging the gap between human and model organism studies. Crucially, community-driven data standards established by NASA's OSDR Analysis Working Groups empower artificial intelligence and machine learning to glean invaluable insights, guiding future mission planning and health risk mitigation. This article presents a concise guide to access and analyze I4 data in OSDR, including programmatic access through GLOpenAPI. This pioneering effort establishes a precedent for post-mission health monitoring programs within space agencies, propelling research in the burgeoning field of commercial space travel's impact on human physiology.
Affiliation(s)
- Lauren M Sanders
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
- Blue Marble Space, Seattle, WA, USA
| | - Kirill A Grigorev
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
- Blue Marble Space, Seattle, WA, USA
| | - Ryan T Scott
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
- KBR, Houston, TX, USA
| | - Amanda M Saravia-Butler
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
- KBR, Houston, TX, USA
| | - San-Huei Lai Polo
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
- KBR, Houston, TX, USA
| | - Rachel Gilbert
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
- KBR, Houston, TX, USA
| | - Eliah G Overbey
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Center for STEM, University of Austin, Austin, TX, USA
| | - JangKeun Kim
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
| | - Sylvain V Costes
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA.
|
19
|
Di Maria A, Bellomo L, Billeci F, Cardillo A, Alaimo S, Ferragina P, Ferro A, Pulvirenti A. NetMe 2.0: a web-based platform for extracting and modeling knowledge from biomedical literature as a labeled graph. Bioinformatics 2024; 40:btae194. [PMID: 38597890 PMCID: PMC11074003 DOI: 10.1093/bioinformatics/btae194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/29/2024] [Accepted: 04/08/2024] [Indexed: 04/11/2024] Open
Abstract
MOTIVATION: The rapid growth of the biomedical literature makes it increasingly hard for scientists to keep pace with the discoveries on which they build their studies. Computational tools have therefore become more widespread, and among them network analysis plays a crucial role in several life-science contexts. Nevertheless, building correct and complete networks on user-defined biomedical topics from the available literature is still challenging. RESULTS: We introduce NetMe 2.0, a web-based platform that automatically extracts relevant biomedical entities and their relations from a set of input texts (full texts or abstracts of PubMed Central papers, free text, or PDFs uploaded by users) and models them as a BioMedical Knowledge Graph (BKG). NetMe 2.0 also implements an innovative Retrieval Augmented Generation module (Graph-RAG) that works on top of the relationships modeled by the BKG and distills well-formed sentences that explain their content. The experimental results show that NetMe 2.0 can infer comprehensive and reliable biological networks with significant Precision-Recall metrics when compared to state-of-the-art approaches. AVAILABILITY AND IMPLEMENTATION: https://netme.click/.
Affiliation(s)
- Antonio Di Maria
- Department of Clinical and Experimental Medicine, University of Catania, Catania, 95125, Italy
| | | | - Fabrizio Billeci
- Department of Computer Science, University of Catania, Catania, 95125, Italy
| | - Alfio Cardillo
- Department of Computer Science, University of Catania, Catania, 95125, Italy
| | - Salvatore Alaimo
- Department of Clinical and Experimental Medicine, University of Catania, Catania, 95125, Italy
| | - Paolo Ferragina
- Department of Computer Science, University of Pisa, Pisa, 56126, Italy
| | - Alfredo Ferro
- Department of Clinical and Experimental Medicine, University of Catania, Catania, 95125, Italy
| | - Alfredo Pulvirenti
- Department of Clinical and Experimental Medicine, University of Catania, Catania, 95125, Italy
|
20
|
Oskotsky TT, Bhoja A, Bunis D, Le BL, Tang AS, Kosti I, Li C, Houshdaran S, Sen S, Vallvé-Juanico J, Wang W, Arthurs E, Govil A, Mahoney L, Lang L, Gaudilliere B, Stevenson DK, Irwin JC, Giudice LC, McAllister SL, Sirota M. Identifying therapeutic candidates for endometriosis through a transcriptomics-based drug repositioning approach. iScience 2024; 27:109388. [PMID: 38510116 PMCID: PMC10952035 DOI: 10.1016/j.isci.2024.109388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 12/29/2023] [Accepted: 02/28/2024] [Indexed: 03/22/2024] Open
Abstract
Existing medical treatments for endometriosis-related pain are often ineffective, underscoring the need for new therapeutic strategies. In this study, we applied a computational drug repurposing pipeline to stratified and unstratified disease signatures based on endometrial gene expression data to identify potential therapeutics from existing drugs, based on expression reversal. Of 3,131 unique genes differentially expressed in at least one of six endometriosis signatures, only 308 (9.8%) were in common; however, 221 of the 299 drugs identified (73.9%) were shared. We selected fenoprofen, an uncommonly prescribed NSAID that was the top therapeutic candidate, for further investigation. When tested in an established rat model of endometriosis, fenoprofen successfully alleviated endometriosis-associated vaginal hyperalgesia, a surrogate marker for endometriosis-related pain. These findings validate fenoprofen as a therapeutic that could be utilized more frequently for endometriosis and suggest the utility of this computational drug repurposing approach for endometriosis.
Affiliation(s)
- Tomiko T. Oskotsky
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| | - Arohee Bhoja
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Carnegie Mellon University, Pittsburgh, PA, USA
| | - Daniel Bunis
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| | - Brian L. Le
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| | - Alice S. Tang
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| | - Idit Kosti
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| | - Christine Li
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
| | - Sahar Houshdaran
- Department of Obstetrics, Gynecology and Reproductive Sciences, UCSF, San Francisco, CA, USA
| | - Sushmita Sen
- Department of Obstetrics, Gynecology and Reproductive Sciences, UCSF, San Francisco, CA, USA
| | - Júlia Vallvé-Juanico
- Department of Obstetrics, Gynecology and Reproductive Sciences, UCSF, San Francisco, CA, USA
| | - Wanxin Wang
- Department of Obstetrics, Gynecology and Reproductive Sciences, UCSF, San Francisco, CA, USA
| | - Erin Arthurs
- Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
| | - Arpita Govil
- Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
| | - Lauren Mahoney
- Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
| | - Lindsey Lang
- Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
| | - Brice Gaudilliere
- Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, CA, USA
| | | | - Juan C. Irwin
- Department of Obstetrics, Gynecology and Reproductive Sciences, UCSF, San Francisco, CA, USA
| | - Linda C. Giudice
- Department of Obstetrics, Gynecology and Reproductive Sciences, UCSF, San Francisco, CA, USA
| | | | - Marina Sirota
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
|
21
|
Scrivner O, Nguyen T, Ginda M, Simon K, Börner K. Interactive network visualization of opioid crisis research: a tool for reinforcing data linkage skills for public health policy researchers. Front Artif Intell 2024; 7:1208874. [PMID: 38646414 PMCID: PMC11026550 DOI: 10.3389/frai.2024.1208874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Accepted: 01/29/2024] [Indexed: 04/23/2024] Open
Abstract
Background: Public health policy researchers face a persistent challenge in identifying and integrating relevant data, particularly in the context of the U.S. opioid crisis, where a comprehensive approach is crucial. Purpose: To meet this new workforce demand, health policy and health economics programs are increasingly introducing data analysis and data visualization skills. Such skills facilitate data integration and discovery by linking multiple resources. Common linking strategies include individual or aggregate level linking (e.g., patient identifiers) in primary clinical data and conceptual linking (e.g., healthcare workforce, state funding, burnout rates) in secondary data. Often, the combination of primary and secondary datasets is sought, requiring additional skills, for example, understanding metadata and constructing interlinkages. Methods: To help improve those skills, we developed a 2-step process using a scoping method to discover data and network visualization to interlink metadata. Results: We show how these new skills enable the discovery of relationships among data sources pertinent to public policy research related to the opioid overdose crisis and facilitate inquiry across heterogeneous data resources. In addition, our interactive network visualization introduces (1) a conceptual approach, drawing from recent systematic review studies and linked by the publications, and (2) an aggregate approach, constructed using publicly available datasets and linked through crosswalks. Conclusions: These novel metadata visualization techniques can be used as a teaching tool or a discovery method and can also be extended to other public policy domains.
Affiliation(s)
- Olga Scrivner
- Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, United States
- Rose-Hulman Institute of Technology, Terre Haute, IN, United States
| | - Thuy Nguyen
- School of Public Health, University of Michigan, Ann Arbor, MI, United States
| | - Michael Ginda
- Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, United States
| | - Kosali Simon
- O'Neill School of Public and Environmental Affairs, Indiana University, Bloomington, IN, United States
- National Bureau of Economic Research, Cambridge, MA, United States
| | - Katy Börner
- Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, United States
|
22
|
Ma C, Liu S, Koslicki D. MetagenomicKG: a knowledge graph for metagenomic applications. bioRxiv 2024:2024.03.14.585056. [PMID: 38559251 PMCID: PMC10980061 DOI: 10.1101/2024.03.14.585056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Motivation: The sheer volume and variety of genomic content within microbial communities make metagenomics a field rich in biomedical knowledge. To traverse these complex communities and their vast unknowns, metagenomic studies often depend on distinct reference databases, such as the Genome Taxonomy Database (GTDB), the Kyoto Encyclopedia of Genes and Genomes (KEGG), and the Bacterial and Viral Bioinformatics Resource Center (BV-BRC), for various analytical purposes. These databases are crucial for genetic and functional annotation of microbial communities. Nevertheless, the inconsistent nomenclature and identifiers of these databases present challenges for effective integration, representation, and utilization. Knowledge graphs (KGs) offer an appropriate solution by organizing biological entities and their interrelations into a cohesive network. The graph structure not only facilitates the unveiling of hidden patterns but also enriches our biological understanding with deeper insights. Although KGs have shown potential in various biomedical fields, their application in metagenomics remains underexplored. Results: We present MetagenomicKG, a novel knowledge graph specifically tailored for metagenomic analysis. MetagenomicKG integrates taxonomic, functional, and pathogenesis-related information from widely used databases, and further links these with established biomedical knowledge graphs to expand biological connections. Through several use cases, we demonstrate its utility in enabling hypothesis generation regarding the relationships between microbes and diseases, generating sample-specific graph embeddings, and providing robust pathogen prediction. Availability and Implementation: The source code and technical details for constructing MetagenomicKG and reproducing all analyses are available on GitHub: https://github.com/KoslickiLab/MetagenomicKG. We also host a Neo4j instance, http://mkg.cse.psu.edu:7474, for accessing and querying this graph.
Affiliation(s)
- Chunyu Ma
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
| | - Shaopeng Liu
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
| | - David Koslicki
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania, USA
- Department of Biology, Pennsylvania State University, State College, Pennsylvania, USA
- The One Health Microbiome Center, Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
|
23
|
Tang AS, Rankin KP, Cerono G, Miramontes S, Mills H, Roger J, Zeng B, Nelson C, Soman K, Woldemariam S, Li Y, Lee A, Bove R, Glymour M, Aghaeepour N, Oskotsky TT, Miller Z, Allen IE, Sanders SJ, Baranzini S, Sirota M. Leveraging electronic health records and knowledge networks for Alzheimer's disease prediction and sex-specific biological insights. Nat Aging 2024; 4:379-395. [PMID: 38383858 PMCID: PMC10950787 DOI: 10.1038/s43587-024-00573-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 01/19/2024] [Indexed: 02/23/2024]
Abstract
Identification of Alzheimer's disease (AD) onset risk can facilitate interventions before irreversible disease progression. We demonstrate that electronic health records from the University of California, San Francisco, followed by knowledge networks (for example, SPOKE), allow for (1) prediction of AD onset, (2) prioritization of biological hypotheses, and (3) contextualization of sex dimorphism. We trained random forest models and predicted AD onset on a cohort of 749 individuals with AD and 250,545 controls, with a mean area under the receiver operating characteristic curve of 0.72 (7 years prior) to 0.81 (1 day prior). We further harnessed matched cohort models to identify conditions with predictive power before AD onset. Knowledge networks highlight shared genes between multiple top predictors and AD (for example, APOE, ACTB, IL6 and INS). Genetic colocalization analysis supports AD association with hyperlipidemia at the APOE locus, as well as a stronger female AD association with osteoporosis at a locus near MS4A6A. We therefore show how clinical data can be utilized for early AD prediction and identification of personalized biological hypotheses.
Affiliation(s)
- Alice S Tang
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA.
- Graduate Program in Bioengineering, University of California, San Francisco and University of California, Berkeley, San Francisco and Berkeley, CA, USA.
| | - Katherine P Rankin
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Memory and Aging Center, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Gabriel Cerono
- Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Silvia Miramontes
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Hunter Mills
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Jacquelyn Roger
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Billy Zeng
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Charlotte Nelson
- Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Karthik Soman
- Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Sarah Woldemariam
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Yaqiao Li
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Albert Lee
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Riley Bove
- Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Maria Glymour
- Department of Anesthesiology, Pain, and Perioperative Medicine, Stanford University, Palo Alto, CA, USA
| | - Nima Aghaeepour
- Department of Anesthesiology, Pain, and Perioperative Medicine, Stanford University, Palo Alto, CA, USA
- Department of Pediatrics, Stanford University, Palo Alto, CA, USA
- Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA
| | - Tomiko T Oskotsky
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
| | - Zachary Miller
- Memory and Aging Center, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Isabel E Allen
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
| | - Stephan J Sanders
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Institute of Developmental and Regenerative Medicine, Department of Paediatrics, University of Oxford, Oxford, UK
- Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
| | - Sergio Baranzini
- Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Marina Sirota
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA.
- Department of Pediatrics, University of California, San Francisco, CA, USA.
|
24
|
Disease insights from medical data using interpretable risk prediction models. Nat Aging 2024; 4:293-294. [PMID: 38383859 DOI: 10.1038/s43587-024-00585-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
|
25
|
Zhang Y, Qin G, Aguilar B, Rappaport N, Yurkovich JT, Pflieger L, Huang S, Hood L, Shmulevich I. A framework towards digital twins for type 2 diabetes. Front Digit Health 2024; 6:1336050. [PMID: 38343907 PMCID: PMC10853398 DOI: 10.3389/fdgth.2024.1336050] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/09/2023] [Accepted: 01/15/2024] [Indexed: 11/16/2024] Open
Abstract
Introduction: A digital twin is a virtual representation of a patient's disease, facilitating real-time monitoring, analysis, and simulation. This enables the prediction of disease progression, optimization of care delivery, and improvement of outcomes. Methods: Here, we introduce a digital twin framework for type 2 diabetes (T2D) that integrates machine learning with multiomic data, knowledge graphs, and mechanistic models. By analyzing a substantial multiomic and clinical dataset, we constructed predictive machine learning models to forecast disease progression. Furthermore, knowledge graphs were employed to elucidate and contextualize multiomic-disease relationships. Results and discussion: Our findings not only reaffirm known targetable disease components but also spotlight novel ones, unveiled through this integrated approach. The versatile components presented in this study can be incorporated into a digital twin system, enhancing our grasp of diseases and propelling the advancement of precision medicine.
Affiliation(s)
- Yue Zhang
- Institute for Systems Biology, Seattle, WA, United States
- Guangrong Qin
- Institute for Systems Biology, Seattle, WA, United States
- Boris Aguilar
- Institute for Systems Biology, Seattle, WA, United States
- Noa Rappaport
- Institute for Systems Biology, Seattle, WA, United States
- Center for Phenomic Health, Buck Institute for Research on Aging, Novato, CA, United States
- James T. Yurkovich
- Center for Phenomic Health, Buck Institute for Research on Aging, Novato, CA, United States
- Phenome Health, Seattle, WA, United States
- Lance Pflieger
- Center for Phenomic Health, Buck Institute for Research on Aging, Novato, CA, United States
- Phenome Health, Seattle, WA, United States
- Sui Huang
- Institute for Systems Biology, Seattle, WA, United States
- Leroy Hood
- Institute for Systems Biology, Seattle, WA, United States
- Center for Phenomic Health, Buck Institute for Research on Aging, Novato, CA, United States
- Phenome Health, Seattle, WA, United States
|
26
|
Woodman RJ, Koczwara B, Mangoni AA. Applying precision medicine principles to the management of multimorbidity: the utility of comorbidity networks, graph machine learning, and knowledge graphs. Front Med (Lausanne) 2024; 10:1302844. [PMID: 38404463 PMCID: PMC10885565 DOI: 10.3389/fmed.2023.1302844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 12/22/2023] [Indexed: 02/27/2024] Open
Abstract
The current management of patients with multimorbidity is suboptimal, with either a single-disease approach to care or treatment guideline adaptations that result in poor adherence due to their complexity. Although this has resulted in calls for more holistic and personalized approaches to prescribing, progress toward these goals has remained slow. With the rapid advancement of machine learning (ML) methods, promising approaches now also exist to accelerate the advance of precision medicine in multimorbidity. These include analyzing disease comorbidity networks, using knowledge graphs that integrate knowledge from different medical domains, and applying network analysis and graph ML. Multimorbidity disease networks have been used to improve disease diagnosis, treatment recommendations, and patient prognosis. Knowledge graphs that combine different medical entities connected by multiple relationship types integrate data from different sources, allowing for complex interactions and creating a continuous flow of information. Network analysis and graph ML can then extract the topology and structure of networks and reveal hidden properties, including disease phenotypes, network hubs, and pathways; predict drugs for repurposing; and determine safe and more holistic treatments. In this article, we describe the basic concepts of creating bipartite and unipartite disease and patient networks and review the use of knowledge graphs, graph algorithms, graph embedding methods, and graph ML within the context of multimorbidity. Specifically, we provide an overview of the application of graph theory for studying multimorbidity, the methods employed to extract knowledge from graphs, and examples of the application of disease networks for determining the structure and pathways of multimorbidity, identifying disease phenotypes, predicting health outcomes, and selecting safe and effective treatments. 
In today's modern data-hungry, ML-focused world, such network-based techniques are likely to be at the forefront of developing robust clinical decision support tools for safer and more holistic approaches to treating older patients with multimorbidity.
Affiliation(s)
- Richard John Woodman
- Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
- Bogda Koczwara
- Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
- Department of Medical Oncology, Flinders Medical Centre, Southern Adelaide Local Health Network, Adelaide, SA, Australia
- Arduino Aleksander Mangoni
- Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
- Department of Clinical Pharmacology, Flinders Medical Centre, Southern Adelaide Local Health Network, Adelaide, SA, Australia
|
27
|
Altenhoff A, Bairoch A, Bansal P, Baratin D, Bastian F, Bolleman J, Bridge A, Burdet F, Crameri K, Dauvillier J, Dessimoz C, Gehant S, Glover N, Gnodtke K, Hayes C, Ibberson M, Kriventseva E, Kuznetsov D, Frédérique L, Mehl F, Mendes de Farias T, Michel PA, Moretti S, Morgat A, Österle S, Pagni M, Redaschi N, Robinson-Rechavi M, Samarasinghe K, Sima AC, Szklarczyk D, Topalov O, Touré V, Unni D, von Mering C, Wollbrett J, Zahn-Zabal M, Zdobnov E. The SIB Swiss Institute of Bioinformatics Semantic Web of data. Nucleic Acids Res 2024; 52:D44-D51. [PMID: 37878411 PMCID: PMC10767860 DOI: 10.1093/nar/gkad902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 10/02/2023] [Accepted: 10/05/2023] [Indexed: 10/27/2023] Open
Abstract
The SIB Swiss Institute of Bioinformatics (https://www.sib.swiss/) is a federation of bioinformatics research and service groups. The international life science community in academia and industry has been accessing the freely available databases provided by SIB since its inception in 1998. In this paper we present the 11 databases which currently offer semantically enriched data in accordance with the FAIR principles (Findable, Accessible, Interoperable, Reusable), as well as the Swiss Personalized Health Network initiative (SPHN) which also employs this enrichment. The semantic enrichment facilitates the manipulation of large data sets from public databases and private data sets. Examples are provided to illustrate that the data from the SIB databases can not only be queried using precise criteria individually, but also across multiple databases, including a variety of non-SIB databases. Data manipulation, be it exploration, extraction, annotation, combination, and publication, is possible using the SPARQL query language. Providing documentation, tutorials and sample queries makes it easier to navigate this web of semantic data. Through this paper, the reader will discover how the existing SIB knowledge graphs can be leveraged to tackle the complex biological or clinical questions that are being addressed today.
|
28
|
Yang X, Huang K, Yang D, Zhao W, Zhou X. Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review. Global Challenges (Hoboken, NJ) 2024; 8:2300163. [PMID: 38223896 PMCID: PMC10784210 DOI: 10.1002/gch2.202300163] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/02/2023] [Revised: 09/20/2023] [Indexed: 01/16/2024]
Abstract
The explosive growth of biomedical Big Data presents both significant opportunities and challenges in the realm of knowledge discovery and translational applications within precision medicine. Efficient management, analysis, and interpretation of big data can pave the way for groundbreaking advancements in precision medicine. However, the unprecedented strides in the automated collection of large-scale molecular and clinical data have also introduced formidable challenges in terms of data analysis and interpretation, necessitating the development of novel computational approaches. Some potential challenges include the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues. This overview article focuses on the recent progress and breakthroughs in the application of big data within precision medicine. Key aspects are summarized, including content, data sources, technologies, tools, challenges, and existing gaps. Nine fields are discussed: data warehouse and data management, electronic medical record, biomedical imaging informatics, artificial intelligence-aided surgical design and surgery optimization, omics data, health monitoring data, knowledge graph, public health informatics, and security and privacy.
Affiliation(s)
- Xue Yang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Kexin Huang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Dewei Yang
- College of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing, Chongqing 400000, China
- Weiling Zhao
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
|
29
|
Woodman RJ, Mangoni AA. A comprehensive review of machine learning algorithms and their application in geriatric medicine: present and future. Aging Clin Exp Res 2023; 35:2363-2397. [PMID: 37682491 PMCID: PMC10627901 DOI: 10.1007/s40520-023-02552-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/13/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023]
Abstract
The increasing access to health data worldwide is driving a resurgence in machine learning research, including data-hungry deep learning algorithms. More computationally efficient algorithms now offer unique opportunities to enhance diagnosis, risk stratification, and individualised approaches to patient management. Such opportunities are particularly relevant for the management of older patients, a group that is characterised by complex multimorbidity patterns and significant interindividual variability in homeostatic capacity, organ function, and response to treatment. Clinical tools that utilise machine learning algorithms to determine the optimal choice of treatment are slowly gaining the necessary approval from governing bodies and being implemented into healthcare, with significant implications for virtually all medical disciplines during the next phase of digital medicine. Beyond obtaining regulatory approval, a crucial element in implementing these tools is the trust and support of the people that use them. In this context, an increased understanding by clinicians of artificial intelligence and machine learning algorithms provides an appreciation of the possible benefits, risks, and uncertainties, and improves the chances for successful adoption. This review provides a broad taxonomy of machine learning algorithms, followed by a more detailed description of each algorithm class, their purpose and capabilities, and examples of their applications, particularly in geriatric medicine. Additional focus is given on the clinical implications and challenges involved in relying on devices with reduced interpretability and the progress made in counteracting the latter via the development of explainable machine learning.
Affiliation(s)
- Richard J Woodman
- Centre of Epidemiology and Biostatistics, College of Medicine and Public Health, Flinders University, GPO Box 2100, Adelaide, SA, 5001, Australia
- Arduino A Mangoni
- Discipline of Clinical Pharmacology, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
- Department of Clinical Pharmacology, Flinders Medical Centre, Southern Adelaide Local Health Network, Adelaide, SA, Australia
|
30
|
Callaghan J, Xu CH, Xin J, Cano MA, Riutta A, Zhou E, Juneja R, Yao Y, Narayan M, Hanspers K, Agrawal A, Pico AR, Wu C, Su AI. BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs. Bioinformatics 2023; 39:7273783. [PMID: 37707514 PMCID: PMC11015316 DOI: 10.1093/bioinformatics/btad570] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/18/2023] [Revised: 08/18/2023] [Accepted: 09/12/2023] [Indexed: 09/15/2023] Open
Abstract
SUMMARY: Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. AVAILABILITY AND IMPLEMENTATION: More information can be found at https://explorer.biothings.io and code is available at https://github.com/biothings/biothings_explorer.
Affiliation(s)
- Jackson Callaghan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Colleen H Xu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Jiwen Xin
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Marco Alvarado Cano
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Anders Riutta
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Eric Zhou
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Rohan Juneja
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Yao Yao
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Madhumita Narayan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Kristina Hanspers
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Ayushi Agrawal
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Alexander R Pico
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
|
31
|
Evangelista JE, Xie Z, Marino GB, Nguyen N, Clarke DB, Ma’ayan A. Enrichr-KG: bridging enrichment analysis across multiple libraries. Nucleic Acids Res 2023; 51:W168-W179. [PMID: 37166973 PMCID: PMC10320098 DOI: 10.1093/nar/gkad393] [Citation(s) in RCA: 74] [Impact Index Per Article: 37.0] [Received: 03/06/2023] [Revised: 04/23/2023] [Accepted: 05/02/2023] [Indexed: 05/12/2023] Open
Abstract
Gene and protein set enrichment analysis is a critical step in the analysis of data collected from omics experiments. Enrichr is a popular gene set enrichment analysis web-server search engine that contains hundreds of thousands of annotated gene sets. While Enrichr has been useful in providing enrichment analysis with many gene set libraries from different categories, integrating enrichment results across libraries and domains of knowledge can further hypothesis generation. To this end, Enrichr-KG is a knowledge graph database and a web-server application that combines selected gene set libraries from Enrichr for integrative enrichment analysis and visualization. The enrichment results are presented as subgraphs made of nodes and links that connect genes to their enriched terms. In addition, users of Enrichr-KG can add gene-gene links, as well as predicted genes to the subgraphs. This graphical representation of cross-library results with enriched and predicted genes can illuminate hidden associations between genes and annotated enriched terms from across datasets and resources. Enrichr-KG currently serves 26 gene set libraries from different categories that include transcription, pathways, ontologies, diseases/drugs, and cell types. To demonstrate the utility of Enrichr-KG we provide several case studies. Enrichr-KG is freely available at: https://maayanlab.cloud/enrichr-kg.
Affiliation(s)
- John Erol Evangelista
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, NY, NY, USA
- Zhuorui Xie
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, NY, NY, USA
- Giacomo B Marino
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, NY, NY, USA
- Nhi Nguyen
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, NY, NY, USA
- Daniel J B Clarke
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, NY, NY, USA
- Avi Ma’ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, NY, NY, USA
|
32
|
Callaghan J, Xu CH, Xin J, Cano MA, Riutta A, Zhou E, Juneja R, Yao Y, Narayan M, Hanspers K, Agrawal A, Pico AR, Wu C, Su AI. BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs. arXiv 2023:arXiv:2304.09344v1. [PMID: 37131885 PMCID: PMC10153288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. More information can be found at https://explorer.biothings.io, and code is available at https://github.com/biothings/biothings_explorer.
Affiliation(s)
- Jackson Callaghan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Colleen H Xu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Jiwen Xin
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Marco Alvarado Cano
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Anders Riutta
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Eric Zhou
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Rohan Juneja
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Yao Yao
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Madhumita Narayan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Kristina Hanspers
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Ayushi Agrawal
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Alexander R Pico
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
- Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
- Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute
|
33
|
Mendes de Farias T, Wollbrett J, Robinson-Rechavi M, Bastian F. Lessons learned to boost a bioinformatics knowledge base reusability, the Bgee experience. Gigascience 2022; 12:giad058. [PMID: 37589308 PMCID: PMC10433096 DOI: 10.1093/gigascience/giad058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 05/30/2023] [Accepted: 07/07/2023] [Indexed: 08/18/2023] Open
Abstract
BACKGROUND: Enhancing interoperability of bioinformatics knowledge bases is a high-priority requirement to maximize data reusability and thus increase their utility such as the return on investment for biomedical research. A knowledge base may provide useful information for life scientists and other knowledge bases, but it only acquires exchange value once the knowledge base is (re)used, and without interoperability, the utility lies dormant. RESULTS: In this article, we discuss several approaches to boost interoperability depending on the interoperable parts. The findings are driven by several real-world scenario examples that were mostly implemented by Bgee, a well-established gene expression knowledge base. To better justify the findings are transferable, for each Bgee interoperability experience, we also highlight similar implementations by major bioinformatics knowledge bases. Moreover, we discuss ten general main lessons learned. These lessons can be applied in the context of any bioinformatics knowledge base to foster data reusability. CONCLUSIONS: This work provides pragmatic methods and transferable skills to promote reusability of bioinformatics knowledge bases by focusing on interoperability.
Affiliation(s)
- Tarcisio Mendes de Farias
- SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Julien Wollbrett
- SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Marc Robinson-Rechavi
- SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Frederic Bastian
- SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
|