1
Matentzoglu N, Bello SM, Stefancsik R, Alghamdi SM, Anagnostopoulos AV, Balhoff JP, Balk MA, Bradford YM, Bridges Y, Callahan TJ, Caufield H, Cuzick A, Carmody LC, Caron AR, de Souza V, Engel SR, Fey P, Fisher M, Gehrke S, Grove C, Hansen P, Harris NL, Harris MA, Harris L, Ibrahim A, Jacobsen JOB, Köhler S, McMurry JA, Munoz-Fuentes V, Munoz-Torres MC, Parkinson H, Pendlington ZM, Pilgrim C, Robb SMC, Robinson PN, Seager J, Segerdell E, Smedley D, Sollis E, Toro S, Vasilevsky N, Wood V, Haendel MA, Mungall CJ, McLaughlin JA, Osumi-Sutherland D. The Unified Phenotype Ontology: a framework for cross-species integrative phenomics. Genetics 2025; 229:iyaf027. [PMID: 40048704 PMCID: PMC11912833 DOI: 10.1093/genetics/iyaf027] [Received: 09/19/2024] [Accepted: 01/30/2025] [Indexed: 03/12/2025]
Abstract
Phenotypic data are critical for understanding biological mechanisms and consequences of genomic variation, and are pivotal for clinical use cases such as disease diagnostics and treatment development. For over a century, vast quantities of phenotype data have been collected in many different contexts covering a variety of organisms. The emerging field of phenomics focuses on integrating and interpreting these data to inform biological hypotheses. A major impediment in phenomics is the wide range of distinct and disconnected approaches to recording the observable characteristics of an organism. Phenotype data are collected and curated using free text, single terms or combinations of terms, using multiple vocabularies, terminologies, or ontologies. Integrating these heterogeneous and often siloed data enables the application of biological knowledge both within and across species. Existing integration efforts are typically limited to mappings between pairs of terminologies; a generic knowledge representation that captures the full range of cross-species phenomics data is much needed. We have developed the Unified Phenotype Ontology (uPheno) framework, a community effort to provide an integration layer over domain-specific phenotype ontologies, as a single, unified, logical representation. uPheno comprises (1) a system for consistent computational definition of phenotype terms using ontology design patterns, maintained as a community library; (2) a hierarchical vocabulary of species-neutral phenotype terms under which their species-specific counterparts are grouped; and (3) mapping tables between species-specific ontologies. This harmonized representation supports use cases such as cross-species integration of genotype-phenotype associations from different organisms and cross-species informed variant prioritization.
Affiliation(s)
- Ray Stefancsik
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Sarah M Alghamdi
  - King Abdullah University of Science and Technology, Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, Thuwal, 23955-6900, Saudi Arabia
- James P Balhoff
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27517, USA
- Meghan A Balk
  - Natural History Museum, University of Oslo, Oslo 0562, Norway
- Yvonne M Bradford
  - The Institute of Neuroscience, University of Oregon, 5291 University of Oregon, Eugene, OR 97403-5291, USA
- Yasemin Bridges
  - William Harvey Research Institute, Queen Mary University of London, London, E1 4NS, UK
- Tiffany J Callahan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, Columbia University, New York, NY 10032, USA
- Harry Caufield
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Alayne Cuzick
  - Department of Biointeractions and Crop Protection, Rothamsted Research, West Common, Harpenden, AL5 2JQ, UK
- Anita R Caron
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Vinicius de Souza
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Stacia R Engel
  - Department of Genetics, Stanford University, Palo Alto, CA 94304, USA
- Petra Fey
  - Center for Genetic Medicine, Northwestern University, Chicago, IL 60611, USA
- Malcolm Fisher
  - Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Sarah Gehrke
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
- Christian Grove
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Peter Hansen
  - Universitätsmedizin Berlin, Berlin Institute of Health at Charité, Anna-Louisa-Karsch-Straße 2, Berlin 10178, Germany
- Nomi L Harris
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Midori A Harris
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1TN, UK
- Laura Harris
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Arwa Ibrahim
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Julius O B Jacobsen
  - William Harvey Research Institute, Queen Mary University of London, London, E1 4NS, UK
- Julie A McMurry
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
- Monica C Munoz-Torres
  - Department of Biomedical Informatics, University of Colorado, Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
- Helen Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Zoë M Pendlington
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Clare Pilgrim
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1TN, UK
- Sofia M C Robb
  - Stowers Institute for Medical Research, Kansas City, MO 64110, USA
- Peter N Robinson
  - Universitätsmedizin Berlin, Berlin Institute of Health at Charité, Anna-Louisa-Karsch-Straße 2, Berlin 10178, Germany
- James Seager
  - Department of Biointeractions and Crop Protection, Rothamsted Research, West Common, Harpenden, AL5 2JQ, UK
- Erik Segerdell
  - Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Damian Smedley
  - William Harvey Research Institute, Queen Mary University of London, London, E1 4NS, UK
- Elliot Sollis
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Sabrina Toro
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
- Valerie Wood
  - Department of Biochemistry, University of Cambridge, Cambridge, CB2 1TN, UK
- Melissa A Haendel
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
- Christopher J Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- James A McLaughlin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
2
Smith JR, Tutaj MA, Thota J, Lamers L, Gibson AC, Kundurthi A, Gollapally VR, Brodie KC, Zacher S, Laulederkind SJF, Hayman GT, Wang SJ, Tutaj M, Kaldunski ML, Vedi M, Demos WM, De Pons JL, Dwinell MR, Kwitek AE. Standardized pipelines support and facilitate integration of diverse datasets at the Rat Genome Database. Database (Oxford) 2025; 2025:baae132. [PMID: 39841812 PMCID: PMC11753291 DOI: 10.1093/database/baae132] [Received: 05/14/2024] [Revised: 11/01/2024] [Accepted: 12/30/2024] [Indexed: 01/24/2025]
Abstract
The Rat Genome Database (RGD) is a multispecies knowledgebase which integrates genetic, multiomic, phenotypic, and disease data across 10 mammalian species. To support cross-species, multiomics studies and to enhance and expand on data manually extracted from the biomedical literature by the RGD team of expert curators, RGD imports and integrates data from multiple sources. These include major databases and a substantial number of domain-specific resources, as well as direct submissions by individual researchers. The incorporation of these diverse datatypes is handled by a growing list of automated import, export, data processing, and quality control pipelines. This article outlines the development over time of a standardized infrastructure for automated RGD pipelines with a summary of key design decisions and a focus on lessons learned.
Affiliation(s)
- Jennifer R Smith
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Marek A Tutaj
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Jyothi Thota
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Logan Lamers
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Adam C Gibson
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Akhilanand Kundurthi
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Varun Reddy Gollapally
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Kent C Brodie
  - Clinical and Translational Science Institute, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Stacy Zacher
  - Finance and Administration, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Stanley J F Laulederkind
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- G Thomas Hayman
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Shur-Jen Wang
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Monika Tutaj
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Mary L Kaldunski
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Mahima Vedi
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Wendy M Demos
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Jeffrey L De Pons
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Melinda R Dwinell
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
- Anne E Kwitek
  - Rat Genome Database, Department of Physiology, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226, United States
3
Abdulla S, Aevermann B, Assis P, Badajoz S, Bell SM, Bezzi E, Cakir B, Chaffer J, Chambers S, Cherry J, Chi T, Chien J, Dorman L, Garcia-Nieto P, Gloria N, Hastie M, Hegeman D, Hilton J, Huang T, Infeld A, Istrate AM, Jelic I, Katsuya K, Kim YJ, Liang K, Lin M, Lombardo M, Marshall B, Martin B, McDade F, Megill C, Patel N, Predeus A, Raymor B, Robatmili B, Rogers D, Rutherford E, Sadgat D, Shin A, Small C, Smith T, Sridharan P, Tarashansky A, Tavares N, Thomas H, Tolopko A, Urisko M, Yan J, Yeretssian G, Zamanian J, Mani A, Cool J, Carr A. CZ CELLxGENE Discover: a single-cell data platform for scalable exploration, analysis and modeling of aggregated data. Nucleic Acids Res 2025; 53:D886-D900. [PMID: 39607691 PMCID: PMC11701654 DOI: 10.1093/nar/gkae1142] [Received: 10/05/2024] [Revised: 10/28/2024] [Accepted: 11/01/2024] [Indexed: 11/29/2024]
Abstract
Hundreds of millions of single cells have been analyzed using high-throughput transcriptomic methods. The cumulative knowledge within these datasets provides an exciting opportunity for unlocking insights into health and disease at the level of single cells. Meta-analyses that span diverse datasets building on recent advances in large language models and other machine-learning approaches pose exciting new directions to model and extract insight from single-cell data. Despite the promise of these and emerging analytical tools for analyzing large amounts of data, the sheer number of datasets, data models and accessibility remains a challenge. Here, we present CZ CELLxGENE Discover (cellxgene.cziscience.com), a data platform that provides curated and interoperable single-cell data. Available via a free-to-use online data portal, CZ CELLxGENE hosts a growing corpus of community-contributed data of over 93 million unique cells. Curated, standardized and associated with consistent cell-level metadata, this collection of single-cell transcriptomic data is the largest of its kind and growing rapidly via community contributions. A suite of tools and features enables accessibility and reusability of the data via both computational and visual interfaces to allow researchers to explore individual datasets, perform cross-corpus analysis, and run meta-analyses of tens of millions of cells across studies and tissues at the resolution of single cells.
Affiliation(s)
- Shibla Abdulla
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Brian Aevermann
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Pedro Assis
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Seve Badajoz
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Sidney M Bell
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Emanuele Bezzi
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Batuhan Cakir
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Jim Chaffer
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Signe Chambers
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- J Michael Cherry
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Tiffany Chi
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jennifer Chien
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Leah Dorman
  - Chan Zuckerberg, Biohub, SF, 499 Illinois St, San Francisco, CA 94158, USA
- Pablo Garcia-Nieto
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Nayib Gloria
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Mim Hastie
  - Clever Canary, 850 Front St. #1491, Santa Cruz, CA, USA
- Daniel Hegeman
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jason Hilton
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Timmy Huang
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Amanda Infeld
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Ana-Maria Istrate
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Ivana Jelic
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Kuni Katsuya
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Yang Joon Kim
  - Chan Zuckerberg, Biohub, SF, 499 Illinois St, San Francisco, CA 94158, USA
- Karen Liang
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Mike Lin
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Bailey Marshall
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Bruce Martin
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Fran McDade
  - Clever Canary, 850 Front St. #1491, Santa Cruz, CA, USA
- Colin Megill
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Nikhil Patel
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Alexander Predeus
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Brian Raymor
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Behnam Robatmili
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Dave Rogers
  - Clever Canary, 850 Front St. #1491, Santa Cruz, CA, USA
- Erica Rutherford
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Dana Sadgat
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Andrew Shin
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Corinn Small
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Trent Smith
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Prathap Sridharan
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Norbert Tavares
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Harley Thomas
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Andrew Tolopko
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Meghan Urisko
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Joyce Yan
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Garabet Yeretssian
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jennifer Zamanian
  - Department of Genetics, Stanford University School of Medicine, 291 Campus Drive, Li Ka Shing Building, Stanford, CA 94305, USA
- Arathi Mani
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Jonah Cool
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
- Ambrose Carr
  - Chan Zuckerberg Initiative, 1180 Main Street, Redwood City, CA 94063, USA
4
Bastian F, Cammarata A, Carsanaro S, Detering H, Huang WT, Joye S, Niknejad A, Nyamari M, Mendes de Farias T, Moretti S, Tzivanopoulou M, Wollbrett J, Robinson-Rechavi M. Bgee in 2024: focus on curated single-cell RNA-seq datasets, and query tools. Nucleic Acids Res 2025; 53:D878-D885. [PMID: 39656924 PMCID: PMC11701651 DOI: 10.1093/nar/gkae1118] [Received: 09/15/2024] [Revised: 10/25/2024] [Accepted: 10/28/2024] [Indexed: 12/17/2024]
Abstract
Bgee (https://www.bgee.org/) is a database to retrieve and compare gene expression patterns in multiple animal species. Expression data are integrated and made comparable between species thanks to consistent data annotation and processing. In the past years, we have integrated single-cell RNA-sequencing expression data into Bgee through careful curation of public datasets in multiple species. We have fully integrated this new technology along with the wealth of other data existing in Bgee. As a result, Bgee can now provide one definitive answer all the way to the cell resolution about a gene's expression pattern, comparable between species. We have updated our programmatic access tools to adapt to these changes accordingly. We have introduced a new web interface, providing detailed access to our annotations and expression data. It enables users to retrieve data, e.g. for specific organs, cell types or developmental stages, and leverages ontology reasoning to build powerful queries. Finally, we have expanded our species count from 29 to 52, emphasizing fish species critical for vertebrate genome studies, species of agronomic and veterinary importance and nonhuman primates.
Affiliation(s)
- Frederic B Bastian
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Alessandro Brandulas Cammarata
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Sara Carsanaro
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Harald Detering
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Wan-Ting Huang
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Sagane Joye
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Anne Niknejad
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Marion Nyamari
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Tarcisio Mendes de Farias
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Sébastien Moretti
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Marianna Tzivanopoulou
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Julien Wollbrett
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
- Marc Robinson-Rechavi
  - Evolutionary Bioinformatics, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, Lausanne, 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge, Bâtiment Biophore, Lausanne, 1015, Switzerland
5
Caron AR, Puig-Barbe A, Quardokus EM, Balhoff JP, Belfiore J, Chipampe NJ, Hardi J, Herr BW, Kir H, Roncaglia P, Musen MA, McLaughlin JA, Börner K, Osumi-Sutherland D. A general strategy for generating expert-guided, simplified views of ontologies. bioRxiv 2024:2024.12.13.628309. [PMID: 39763856 PMCID: PMC11702530 DOI: 10.1101/2024.12.13.628309] [Indexed: 01/11/2025]
Abstract
Annotation with widely used, well-structured ontologies, combined with the use of ontology-aware software tools, ensures data and analyses are Findable, Accessible, Interoperable and Reusable (FAIR). Standardized terms with synonyms support lexical search. Ontology structure supports biologically meaningful grouping of annotations (typically by location and type). However, there are significant barriers to the adoption and use of ontologies by researchers and resource developers. One barrier is complexity. Ontologies serving diverse communities are often more complex than needed for individual applications. It is common for atlases to attempt their own simplifications by manually constructing hierarchies of terms linked to ontologies, but these typically include relationship types that are not suitable for grouping annotations. Here, we present a suite of tools for validating user hierarchies against ontology structure, using them to generate graphical reports for discussion and ontology views tailored to the needs of the HuBMAP Human Reference Atlas, and the Human Developmental Cell Atlas. In both cases, validation is a source of corrections and content for both ontologies and user hierarchies.
Affiliation(s)
- Anita R Caron
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Aleix Puig-Barbe
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ellen M Quardokus
  - Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA
- James P Balhoff
  - RENCI, University of North Carolina, Chapel Hill, NC 27517, USA
- Jasmine Belfiore
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Nana-Jane Chipampe
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Josef Hardi
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305, USA
- Bruce W Herr
  - Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA
- Huseyin Kir
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Paola Roncaglia
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mark A Musen
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305, USA
- James A McLaughlin
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Katy Börner
  - Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA
- David Osumi-Sutherland
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
6
Matentzoglu N, Bello SM, Stefancsik R, Alghamdi SM, Anagnostopoulos AV, Balhoff JP, Balk MA, Bradford YM, Bridges Y, Callahan TJ, Caufield H, Cuzick A, Carmody LC, Caron AR, de Souza V, Engel SR, Fey P, Fisher M, Gehrke S, Grove C, Hansen P, Harris NL, Harris MA, Harris L, Ibrahim A, Jacobsen JO, Köhler S, McMurry JA, Munoz-Fuentes V, Munoz-Torres MC, Parkinson H, Pendlington ZM, Pilgrim C, Robb SMC, Robinson PN, Seager J, Segerdell E, Smedley D, Sollis E, Toro S, Vasilevsky N, Wood V, Haendel MA, Mungall CJ, McLaughlin JA, Osumi-Sutherland D. The Unified Phenotype Ontology (uPheno): A framework for cross-species integrative phenomics. bioRxiv 2024:2024.09.18.613276. [PMID: 39345458 PMCID: PMC11429889 DOI: 10.1101/2024.09.18.613276] [Indexed: 10/01/2024]
Abstract
Phenotypic data are critical for understanding biological mechanisms and consequences of genomic variation, and are pivotal for clinical use cases such as disease diagnostics and treatment development. For over a century, vast quantities of phenotype data have been collected in many different contexts covering a variety of organisms. The emerging field of phenomics focuses on integrating and interpreting these data to inform biological hypotheses. A major impediment in phenomics is the wide range of distinct and disconnected approaches to recording the observable characteristics of an organism. Phenotype data are collected and curated using free text, single terms or combinations of terms, using multiple vocabularies, terminologies, or ontologies. Integrating these heterogeneous and often siloed data enables the application of biological knowledge both within and across species. Existing integration efforts are typically limited to mappings between pairs of terminologies; a generic knowledge representation that captures the full range of cross-species phenomics data is much needed. We have developed the Unified Phenotype Ontology (uPheno) framework, a community effort to provide an integration layer over domain-specific phenotype ontologies, as a single, unified, logical representation. uPheno comprises (1) a system for consistent computational definition of phenotype terms using ontology design patterns, maintained as a community library; (2) a hierarchical vocabulary of species-neutral phenotype terms under which their species-specific counterparts are grouped; and (3) mapping tables between species-specific ontologies. This harmonized representation supports use cases such as cross-species integration of genotype-phenotype associations from different organisms and cross-species informed variant prioritization.
Affiliation(s)
| | | | | | | | | | - James P. Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC USA
| | - Meghan A. Balk
- Natural History Museum, University of Oslo, Oslo, Norway
| | | | | | - Tiffany J. Callahan
- Department of Biomedical Informatics, Columbia University Irving Medical Center
| | - Harry Caufield
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | | | | | | | | | | | | | - Malcolm Fisher
- Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, US
| | | | | | | | - Nomi L. Harris
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Erik Segerdell
- Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, US
|
7
|
Boudinot BE, van de Kamp T, Peters P, Knöllinger K. Male genitalia, hierarchical homology, and the anatomy of the bullet ant (Paraponera clavata; Hymenoptera, Formicidae). J Morphol 2024; 285:e21757. [PMID: 39192511 DOI: 10.1002/jmor.21757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/19/2024] [Accepted: 07/23/2024] [Indexed: 08/29/2024]
Abstract
The male genitalia of insects are among the most variable, complex, and informative character systems for evolutionary analysis and taxonomic purposes. Because of these general properties, many generations of systematists have struggled to develop a theory of homology and alignment of parts. This struggle continues to the present day, where fundamentally different models and nomenclatures for the male genitalia of Hymenoptera, for example, are applied. Here, we take a multimodal approach to digitalize and comprehensively document the genital skeletomuscular anatomy of the bullet ant (Paraponera clavata; Hymenoptera: Formicidae), including hand dissection, synchrotron radiation microcomputed tomography, microphotography, scanning electron microscopy, confocal laser scanning microscopy, and 3D-printing. Through this work, we generate several new concepts for the structure and form of the male genitalia of Hymenoptera, such as for the endophallic sclerite (=fibula ducti), which we were able to evaluate in detail for the first time for any species. Based on this phenomic anatomical study and comparison with other Holometabola and Hexapoda, we reconsider the homologies of insect genitalia more broadly, and propose a series of clarifications in support of the penis-gonopod theory of male genital identity. Specifically, we use the male genitalia of Paraponera and insects more broadly as an empirical case for hierarchical homology by applying and refining the 5-category classification of serial homologs from DiFrisco et al. (2023) (DLW23) to all of our formalized concepts. 
Through this, we find that: (1) geometry is a critical attribute to account for in ontology, especially as all individually identifiable attributes are positionally indexed hence can be recognized as homomorphic; (2) the definition of "structure" proposed by DLW23 is difficult to apply, and likely heterogeneous; and (3) formative elements, or spatially defined foldings or in- or evaginations of the epidermis and cuticle, are an important yet overlooked class of homomorphs. We propose a morphogenetic model for male and female insect genitalia, and a model analogous to gene-tree species-tree mappings for the hierarchical homology of male genitalia specifically. For all of the structures evaluated in the present study, we provide 3D-printable models - with and without musculature, and in various states of digital dissection - to facilitate the development of a tactile understanding. Our treatment of the male genitalia of P. clavata serves as a basic template for future phenomic studies of male insect genitalia, which will be substantially improved with the development of automation and collections-based data processing pipelines, that is, collectomics. The Hymenoptera Anatomy Ontology will be a critical resource to include in this effort, and in best practice concepts should be linked.
Affiliation(s)
- Brendon E Boudinot
- Department of Terrestrial Zoology, Entomology II, Senckenberg Research Institute and Natural History Museum, Frankfurt am Main, Germany
| | - Thomas van de Kamp
- Institute for Photon Science and Synchrotron Radiation (IPS), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Laboratory for Applications of Synchrotron Radiation (LAS), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
| | - Patricia Peters
- Department of Terrestrial Zoology, Entomology II, Senckenberg Research Institute and Natural History Museum, Frankfurt am Main, Germany
| | - Katja Knöllinger
- Department of Terrestrial Zoology, Entomology II, Senckenberg Research Institute and Natural History Museum, Frankfurt am Main, Germany
- Zurich University of the Arts, Zurich, Switzerland
|
8
|
Han S, Lee JE, Kang S, So M, Jin H, Lee JH, Baek S, Jun H, Kim TY, Lee YS. Standigm ASK™: knowledge graph and artificial intelligence platform applied to target discovery in idiopathic pulmonary fibrosis. Brief Bioinform 2024; 25:bbae035. [PMID: 38349059 PMCID: PMC10862655 DOI: 10.1093/bib/bbae035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 12/28/2023] [Indexed: 02/15/2024] Open
Abstract
Standigm ASK™ revolutionizes healthcare by addressing the critical challenge of identifying pivotal target genes in disease mechanisms-a fundamental aspect of drug development success. Standigm ASK™ integrates a unique combination of a heterogeneous knowledge graph (KG) database and an attention-based neural network model, providing interpretable subgraph evidence. Empowering users through an interactive interface, Standigm ASK™ facilitates the exploration of predicted results. Applying Standigm ASK™ to idiopathic pulmonary fibrosis (IPF), a complex lung disease, we focused on genes (AMFR, MDFIC and NR5A2) identified through KG evidence. In vitro experiments demonstrated their relevance, as TGFβ treatment induced gene expression changes associated with epithelial-mesenchymal transition characteristics. Gene knockdown reversed these changes, identifying AMFR, MDFIC and NR5A2 as potential therapeutic targets for IPF. In summary, Standigm ASK™ emerges as an innovative KG and artificial intelligence platform driving insights in drug target discovery, exemplified by the identification and validation of therapeutic targets for IPF.
Affiliation(s)
- Seokjin Han
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Ji Eun Lee
- College of Pharmacy, Ewha Womans University, Ewhayeodae-gil, 03760, Seoul, Republic of Korea
| | - Seolhee Kang
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Minyoung So
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Hee Jin
- College of Pharmacy, Ewha Womans University, Ewhayeodae-gil, 03760, Seoul, Republic of Korea
| | - Jang Ho Lee
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Sunghyeob Baek
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Hyungjin Jun
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Tae Yong Kim
- Standigm Inc., Nonhyeon-ro 85-gil, 06234, Seoul, Republic of Korea
| | - Yun-Sil Lee
- College of Pharmacy, Ewha Womans University, Ewhayeodae-gil, 03760, Seoul, Republic of Korea
|
9
|
Milacic M, Beavers D, Conley P, Gong C, Gillespie M, Griss J, Haw R, Jassal B, Matthews L, May B, Petryszak R, Ragueneau E, Rothfels K, Sevilla C, Shamovsky V, Stephan R, Tiwari K, Varusai T, Weiser J, Wright A, Wu G, Stein L, Hermjakob H, D’Eustachio P. The Reactome Pathway Knowledgebase 2024. Nucleic Acids Res 2024; 52:D672-D678. [PMID: 37941124 PMCID: PMC10767911 DOI: 10.1093/nar/gkad1025] [Citation(s) in RCA: 280] [Impact Index Per Article: 280.0] [Received: 09/29/2023] [Revised: 10/14/2023] [Accepted: 10/20/2023] [Indexed: 11/10/2023] Open
Abstract
The Reactome Knowledgebase (https://reactome.org), an Elixir and GCBR core biological data resource, provides manually curated molecular details of a broad range of normal and disease-related biological processes. Processes are annotated as an ordered network of molecular transformations in a single consistent data model. Reactome thus functions both as a digital archive of manually curated human biological processes and as a tool for discovering functional relationships in data such as gene expression profiles or somatic mutation catalogs from tumor cells. Here we review progress towards annotation of the entire human proteome, targeted annotation of disease-causing genetic variants of proteins and of small-molecule drugs in a pathway context, and towards supporting explicit annotation of cell- and tissue-specific pathways. Finally, we briefly discuss issues involved in making Reactome more fully interoperable with other related resources such as the Gene Ontology and maintaining the resulting community resource network.
Affiliation(s)
- Marija Milacic
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Deidre Beavers
- Oregon Health and Science University, Portland, OR 97239, USA
| | - Patrick Conley
- Oregon Health and Science University, Portland, OR 97239, USA
| | - Chuqiao Gong
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Marc Gillespie
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- College of Pharmacy and Health Sciences, St. John's University, Queens, NY 11439, USA
| | - Johannes Griss
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Department of Dermatology, Medical University of Vienna, 1090 Vienna, Austria
| | - Robin Haw
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Bijay Jassal
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Lisa Matthews
- NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Bruce May
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | | | - Eliot Ragueneau
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Karen Rothfels
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Cristoffer Sevilla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | | | - Ralf Stephan
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Institute for Globally Distributed Open Research and Education (IGDORE)
| | - Krishna Tiwari
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Thawfeek Varusai
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Joel Weiser
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Adam Wright
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Guanming Wu
- Oregon Health and Science University, Portland, OR 97239, USA
| | - Lincoln Stein
- Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A1, Canada
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
|
10
|
Carmody LC, Gargano MA, Toro S, Vasilevsky NA, Adam MP, Blau H, Chan LE, Gomez-Andres D, Horvath R, Kraus ML, Ladewig MS, Lewis-Smith D, Lochmüller H, Matentzoglu NA, Munoz-Torres MC, Schuetz C, Seitz B, Similuk MN, Sparks TN, Strauss T, Swietlik EM, Thompson R, Zhang XA, Mungall CJ, Haendel MA, Robinson PN. The Medical Action Ontology: A tool for annotating and analyzing treatments and clinical management of human disease. MED 2023; 4:913-927.e3. [PMID: 37963467 PMCID: PMC10842845 DOI: 10.1016/j.medj.2023.10.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/13/2023] [Revised: 08/31/2023] [Accepted: 10/14/2023] [Indexed: 11/16/2023]
Abstract
BACKGROUND: Navigating the clinical literature to determine the optimal clinical management for rare diseases presents significant challenges. We introduce the Medical Action Ontology (MAxO), an ontology specifically designed to organize medical procedures, therapies, and interventions.
METHODS: MAxO incorporates logical structures that link MAxO terms to numerous other ontologies within the OBO Foundry. Term development involves a blend of manual and semi-automated processes. Additionally, we have generated annotations detailing diagnostic modalities for specific phenotypic abnormalities defined by the Human Phenotype Ontology (HPO). We introduce a web application, POET, that facilitates MAxO annotations for specific medical actions for diseases using the Mondo Disease Ontology.
FINDINGS: MAxO encompasses 1,757 terms spanning a wide range of biomedical domains, from human anatomy and investigations to the chemical and protein entities involved in biological processes. These terms annotate phenotypic features associated with specific disease (using HPO and Mondo). Presently, there are over 16,000 MAxO diagnostic annotations that target HPO terms. Through POET, we have created 413 MAxO annotations specifying treatments for 189 rare diseases.
CONCLUSIONS: MAxO offers a computational representation of treatments and other actions taken for the clinical management of patients. Its development is closely coupled to Mondo and HPO, broadening the scope of our computational modeling of diseases and phenotypic features. We invite the community to contribute disease annotations using POET (https://poet.jax.org/). MAxO is available under the open-source CC-BY 4.0 license (https://github.com/monarch-initiative/MAxO).
FUNDING: NHGRI 1U24HG011449-01A1 and NHGRI 5RM1HG010860-04.
Affiliation(s)
- Leigh C Carmody
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - Sabrina Toro
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | | | - Margaret P Adam
- University of Washington School of Medicine, Seattle, WA, USA
| | - Hannah Blau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - David Gomez-Andres
- Pediatric Neurology, Vall d'Hebron Institut de Recerca (VHIR), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus, Passeig Vall d'Hebron 119-129, 08035 Barcelona, Spain
| | - Rita Horvath
- Department of Clinical Neurosciences, University of Cambridge, Robinson Way, Cambridge CB2 0PY, UK
| | - Megan L Kraus
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Markus S Ladewig
- Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Germany
| | - David Lewis-Smith
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
| | - Hanns Lochmüller
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada; Division of Neurology, Department of Medicine, The Ottawa Hospital, Ottawa, Canada; Brain and Mind Research Institute, University of Ottawa, Ottawa, Canada; Department of Neuropediatrics and Muscle Disorders, Medical Center - University of Freiburg, Faculty of Medicine, Freiburg, Germany; Centro Nacional de Análisis Genómico, Barcelona, Spain
| | | | | | - Catharina Schuetz
- Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Berthold Seitz
- Department of Ophthalmology, Saarland University Medical Center UKS, Homburg, Saar, Germany
| | - Morgan N Similuk
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Teresa N Sparks
- Department of Obstetrics, Gynecology, & Reproductive Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Timmy Strauss
- Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Emilia M Swietlik
- Department of Medicine, University of Cambridge, Heart and Lung Research Institute, Cambridge CB2 0BB, UK
| | - Rachel Thompson
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada
| | | | | | | | - Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
|
11
|
Vilela J, Martiniano H, Marques AR, Santos JX, Asif M, Rasga C, Oliveira G, Vicente AM. Identification of Neurotransmission and Synaptic Biological Processes Disrupted in Autism Spectrum Disorder Using Interaction Networks and Community Detection Analysis. Biomedicines 2023; 11:2971. [PMID: 38001974 PMCID: PMC10668950 DOI: 10.3390/biomedicines11112971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 10/26/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023] Open
Abstract
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by communication deficits and repetitive behavioral patterns. Hundreds of candidate genes have been implicated in ASD, including neurotransmission and synaptic (NS) genes; however, the genetic architecture of this disease is far from clear. In this study, we seek to clarify the biological processes affected by NS gene variants identified in individuals with ASD and the global networks that link those processes together. For a curated list of 1216 NS candidate genes, identified in multiple databases and the literature, we searched for ultra-rare (UR) loss-of-function (LoF) variants in the whole-exome sequencing dataset from the Autism Sequencing Consortium (N = 3938 cases). Filtering for population frequency was carried out using gnomAD (N = 60,146 controls). NS genes with UR LoF variants were used to construct a network of protein-protein interactions, and the network's biological communities were identified by applying the Leiden algorithm. We further explored the expression enrichment of network genes in specific brain regions. We identified 356 variants in 208 genes, with a preponderance of UR LoF variants in the PDE11A and SYTL3 genes. Expression enrichment analysis highlighted several subcortical structures, particularly the basal ganglia. The interaction network defined seven network communities, clustering synaptic and neurotransmitter pathways with several ubiquitous processes that occur in multiple organs and systems. This approach also uncovered biological pathways that are not usually associated with ASD, such as brain cytochromes P450 and brain mitochondrial metabolism. Overall, the community analysis suggests that ASD involves the disruption of synaptic and neurotransmitter pathways but also ubiquitous, but less frequently implicated, biological processes.
Affiliation(s)
- Joana Vilela
- Departamento de Promoção da Saúde e Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Avenida Padre Cruz, 1649-016 Lisboa, Portugal
- BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8, 1749-016 Lisboa, Portugal
| | - Hugo Martiniano
- Departamento de Promoção da Saúde e Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Avenida Padre Cruz, 1649-016 Lisboa, Portugal
- BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8, 1749-016 Lisboa, Portugal
| | - Ana Rita Marques
- Departamento de Promoção da Saúde e Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Avenida Padre Cruz, 1649-016 Lisboa, Portugal
- BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8, 1749-016 Lisboa, Portugal
| | - João Xavier Santos
- Departamento de Promoção da Saúde e Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Avenida Padre Cruz, 1649-016 Lisboa, Portugal
- BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8, 1749-016 Lisboa, Portugal
| | - Muhammad Asif
- Departamento de Promoção da Saúde e Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Avenida Padre Cruz, 1649-016 Lisboa, Portugal
- BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8, 1749-016 Lisboa, Portugal
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Faisalabad 38000, Pakistan
| | - Célia Rasga
- Departamento de Promoção da Saúde e Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Avenida Padre Cruz, 1649-016 Lisboa, Portugal
- BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8, 1749-016 Lisboa, Portugal
| | - Guiomar Oliveira
- Unidade de Neurodesenvolvimento e Autismo, Serviço do Centro de Desenvolvimento da Criança, Centro de Investigação e Formação Clínica, Hospital Pediátrico, Centro Hospitalar e Universitário de Coimbra (CHUC), 3000-602 Coimbra, Portugal
- Coimbra Institute for Biomedical Imaging and Translational Research, University Clinic of Pediatrics, Faculty of Medicine, University of Coimbra, 3000-602 Coimbra, Portugal
| | - Astrid Moura Vicente
- Departamento de Promoção da Saúde e Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Avenida Padre Cruz, 1649-016 Lisboa, Portugal
- BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Campo Grande, C8, 1749-016 Lisboa, Portugal
|
12
|
Quan F, Liang X, Cheng M, Yang H, Liu K, He S, Sun S, Deng M, He Y, Liu W, Wang S, Zhao S, Deng L, Hou X, Zhang X, Xiao Y. Annotation of cell types (ACT): a convenient web server for cell type annotation. Genome Med 2023; 15:91. [PMID: 37924118 PMCID: PMC10623726 DOI: 10.1186/s13073-023-01249-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 10/18/2023] [Indexed: 11/06/2023] Open
Abstract
BACKGROUND: The advancement of single-cell sequencing has progressed our ability to solve biological questions. Cell type annotation is of vital importance to this process, allowing for the analysis and interpretation of enormous single-cell datasets. At present, however, manual cell annotation which is the predominant approach remains limited by both speed and the requirement of expert knowledge.
METHODS: To address these challenges, we constructed a hierarchically organized marker map through manually curating over 26,000 cell marker entries from about 7000 publications. We then developed WISE, a weighted and integrated gene set enrichment method, to integrate the prevalence of canonical markers and ordered differentially expressed genes of specific cell types in the marker map. Benchmarking analysis suggested that our method outperformed state-of-the-art methods.
RESULTS: By integrating the marker map and WISE, we developed a user-friendly and convenient web server, ACT (http://xteam.xbio.top/ACT/ or http://biocc.hrbmu.edu.cn/ACT/), which only takes a simple list of upregulated genes as input and provides interactive hierarchy maps, together with well-designed charts and statistical information, to accelerate the assignment of cell identities and made the results comparable to expert manual annotation. Besides, a pan-tissue marker map was constructed to assist in cell assignments in less-studied tissues. Applying ACT to three case studies showed that all cell clusters were quickly and accurately annotated, and multi-level and more refined cell types were identified.
CONCLUSIONS: We developed a knowledge-based resource and a corresponding method, together with an intuitive graphical web interface, for cell type annotation. We believe that ACT, emerging as a powerful tool for cell type annotation, would be widely used in single-cell research and considerably accelerate the process of cell type identification.
Affiliation(s)
- Fei Quan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Xin Liang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Mingjiang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Huan Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Kun Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Shengyuan He
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Shangqin Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Menglan Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Yanzhen He
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Wei Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Shuai Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Shuxiang Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Lantian Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Xiaobo Hou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Xinxin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
| | - Yun Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China
|
13
|
Triant DA, Walsh AT, Hartley GA, Petry B, Stegemiller MR, Nelson BM, McKendrick MM, Fuller EP, Cockett NE, Koltes JE, McKay SD, Green JA, Murdoch BM, Hagen DE, Elsik CG. AgAnimalGenomes: browsers for viewing and manually annotating farm animal genomes. Mamm Genome 2023; 34:418-436. [PMID: 37460664 PMCID: PMC10382368 DOI: 10.1007/s00335-023-10008-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023]
Abstract
Current genome sequencing technologies have made it possible to generate highly contiguous genome assemblies for non-model animal species. Despite advances in genome assembly methods, there is still room for improvement in the delineation of specific gene features in the genomes. Here we present genome visualization and annotation tools to support seven livestock species (bovine, chicken, goat, horse, pig, sheep, and water buffalo), available in a new resource called AgAnimalGenomes. In addition to supporting the manual refinement of gene models, these browsers provide visualization tracks for hundreds of RNAseq experiments, as well as data generated by the Functional Annotation of Animal Genomes (FAANG) Consortium. For species with predicted gene sets from both Ensembl and RefSeq, the browsers provide special tracks showing the thousands of protein-coding genes that disagree across the two gene sources, serving as a valuable resource to alert researchers to gene model issues that may affect data interpretation. We describe the data and search methods available in the new genome browsers and how to use the provided tools to edit and create new gene models.
Affiliation(s)
- Deborah A Triant
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
| | - Amy T Walsh
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
| | - Gabrielle A Hartley
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, 06269, USA
| | - Bruna Petry
- Department of Animal Science, Iowa State University, Ames, IA, 50011, USA
| | - Morgan R Stegemiller
- Department of Animal, Veterinary and Food Sciences, University of Idaho, Moscow, ID, 83844, USA
| | - Benjamin M Nelson
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
| | - Makenna M McKendrick
- Department of Animal and Food Sciences, Oklahoma State University, Stillwater, OK, 74078, USA
| | - Emily P Fuller
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, 06269, USA
| | - Noelle E Cockett
- Department of Animal, Dairy, and Veterinary Sciences, Utah State University, Logan, UT, 84322, USA
| | - James E Koltes
- Department of Animal Science, Iowa State University, Ames, IA, 50011, USA
| | - Stephanie D McKay
- Department of Animal and Veterinary Sciences, University of Vermont, Burlington, VT, 05405, USA
| | - Jonathan A Green
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
| | - Brenda M Murdoch
- Department of Animal, Veterinary and Food Sciences, University of Idaho, Moscow, ID, 83844, USA
| | - Darren E Hagen
- Department of Animal and Food Sciences, Oklahoma State University, Stillwater, OK, 74078, USA
| | - Christine G Elsik
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
- Division of Plant Science & Technology, University of Missouri, Columbia, MO, 65211, USA
- Institute for Data Science & Informatics, University of Missouri, Columbia, MO, 65211, USA
|
14
|
Boppana A, Lee S, Malhotra R, Halushka M, Gustilo KS, Quardokus EM, Herr BW, Börner K, Weber GM. Anatomical structures, cell types, and biomarkers of the healthy human blood vasculature. Sci Data 2023; 10:452. [PMID: 37468503 PMCID: PMC10356915 DOI: 10.1038/s41597-023-02018-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 01/30/2023] [Indexed: 07/21/2023] Open
Abstract
More than 150 scientists from 17 consortia are collaborating on an international project to build a Human Reference Atlas, which maps all 37 trillion cells in the healthy adult human body. The initial release of this atlas provided hierarchical lists of the anatomical structures, cell types, and biomarkers in 11 organs. Here, we describe the methods we used as part of this initiative to build the first open, computer-readable, and comprehensive database of the adult human blood vasculature, called the Human Reference Atlas-Vasculature Common Coordinate Framework (HRA-VCCF). It includes 993 vessels and their branching connections, 10 cell types, and 10 biomarkers. With this paper we are releasing additional details on vessel types and subtypes, branching sequence, anastomoses, portal systems, microvasculature, functional tissue units, mappings to regions vessels supply or drain, geometric properties of vessels, and links to 3D reference objects. Future versions will add variants and connections to the lymph vasculature, and the database will be iteratively expanded and improved as additional experimental data become available through the participating consortia.
Affiliation(s)
| | - Sujin Lee
- Department of Surgery, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Rajeev Malhotra
- Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Marc Halushka
- Department of Pathology, Johns Hopkins Medicine, Baltimore, Maryland, USA
| | - Katherine S Gustilo
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, USA
| | - Ellen M Quardokus
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, USA
| | - Bruce W Herr
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, USA
| | - Katy Börner
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, USA
| | - Griffin M Weber
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA.
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA.
|
15
|
Carmody LC, Gargano MA, Toro S, Vasilevsky NA, Adam MP, Blau H, Chan LE, Gomez-Andres D, Horvath R, Kraus ML, Ladewig MS, Lewis-Smith D, Lochmüller H, Matentzoglu NA, Munoz-Torres MC, Schuetz C, Seitz B, Similuk MN, Sparks TN, Strauss T, Swietlik EM, Thompson R, Zhang XA, Mungall CJ, Haendel MA, Robinson PN. The Medical Action Ontology: A Tool for Annotating and Analyzing Treatments and Clinical Management of Human Disease. medRxiv 2023:2023.07.13.23292612. [PMID: 37503136 PMCID: PMC10370244 DOI: 10.1101/2023.07.13.23292612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Navigating the vast landscape of clinical literature to find optimal treatments and management strategies can be a challenging task, especially for rare diseases. To address this task, we introduce the Medical Action Ontology (MAxO), the first ontology specifically designed to organize medical procedures, therapies, and interventions in a structured way. Currently, MAxO contains 1757 medical action terms added through a combination of manual and semi-automated processes. MAxO was developed with logical structures that make it compatible with several other ontologies within the Open Biological and Biomedical Ontologies (OBO) Foundry. These cover a wide range of biomedical domains, from human anatomy and investigations to the chemical and protein entities involved in biological processes. We have created a database of over 16000 annotations that describe diagnostic modalities for specific phenotypic abnormalities as defined by the Human Phenotype Ontology (HPO). Additionally, 413 annotations are provided for medical actions for 189 rare diseases. We have developed a web application called POET (https://poet.jax.org/) for the community to use to contribute MAxO annotations. MAxO provides a computational representation of treatments and other actions taken for the clinical management of patients. The development of MAxO is closely coupled to the Mondo Disease Ontology (Mondo) and the Human Phenotype Ontology (HPO) and expands the scope of our computational modeling of diseases and phenotypic features to include diagnostics and therapeutic actions. MAxO is available under the open-source CC-BY 4.0 license (https://github.com/monarch-initiative/MAxO).
Affiliation(s)
- Leigh C Carmody
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States
| | - Michael A Gargano
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States
| | - Sabrina Toro
- University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| | | | - Margaret P Adam
- University of Washington School of Medicine, Seattle, WA, United States
| | - Hannah Blau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States
| | | | - David Gomez-Andres
- Pediatric Neurology, Vall d'Hebron Institut de Recerca (VHIR), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus, Passeig Vall d'Hebron 119-129, 08035 Barcelona, Spain
| | - Rita Horvath
- Department of Clinical Neurosciences, University of Cambridge, Robinson Way CB2 0PY, Cambridge, UK
| | - Megan L Kraus
- University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| | - Markus S Ladewig
- Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Germany
| | - David Lewis-Smith
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, NE2 4HH, United Kingdom
| | | | | | | | - Catharina Schuetz
- Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Berthold Seitz
- Department of Ophthalmology, Saarland University Hospital UKS, Homburg/Saar, Germany
| | - Morgan N Similuk
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, United States
| | - Teresa N Sparks
- Department of Obstetrics, Gynecology, & Reproductive Sciences, University of California, San Francisco, San Francisco, CA 94143
| | - Timmy Strauss
- Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Emilia M Swietlik
- Department of Medicine, University of Cambridge, Heart and Lung Research Institute, CB2 0BB, Cambridge, UK
| | | | | | | | - Melissa A Haendel
- University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| | - Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States
|
16
|
Ruberte J, Schofield PN, Sundberg JP, Rodriguez-Baeza A, Carretero A, McKerlie C. Bridging mouse and human anatomies; a knowledge-based approach to comparative anatomy for disease model phenotyping. Mamm Genome 2023:10.1007/s00335-023-10005-4. [PMID: 37421464 PMCID: PMC10382392 DOI: 10.1007/s00335-023-10005-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Accepted: 06/13/2023] [Indexed: 07/10/2023]
Abstract
The laboratory mouse is the foremost mammalian model used for studying human diseases and is closely anatomically related to humans. Whilst knowledge about human anatomy has been collected throughout the history of mankind, the first comprehensive study of the mouse anatomy was published less than 60 years ago. This has been followed by the more recent publication of several books and resources on mouse anatomy. Nevertheless, to date, our understanding and knowledge of mouse anatomy is far from being at the same level as that of humans. In addition, the alignment between current mouse and human anatomy nomenclatures is far from being as developed as those existing between other species, such as domestic animals and humans. To close this gap, more in-depth mouse anatomical research is needed and it will be necessary to extend and refine the current vocabulary of mouse anatomical terms.
Affiliation(s)
- Jesús Ruberte
- Center for Animal Biotechnology and Gene Therapy, Universitat Autònoma de Barcelona, Barcelona, Spain.
- Department of Animal Health and Anatomy, Universitat Autònoma de Barcelona, Barcelona, Spain.
| | - Paul N Schofield
- The Jackson Laboratory, Bar Harbor, ME, USA
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
| | - John P Sundberg
- The Jackson Laboratory, Bar Harbor, ME, USA
- Department of Dermatology, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Ana Carretero
- Center for Animal Biotechnology and Gene Therapy, Universitat Autònoma de Barcelona, Barcelona, Spain
- Department of Animal Health and Anatomy, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Colin McKerlie
- The Hospital for Sick Children, Toronto, Canada
- Department of Lab Medicine and Pathobiology, Faculty of Medicine, University of Toronto, Toronto, Canada
|
17
|
Aleksander SA, Balhoff J, Carbon S, Cherry JM, Drabkin HJ, Ebert D, Feuermann M, Gaudet P, Harris NL, Hill DP, Lee R, Mi H, Moxon S, Mungall CJ, Muruganugan A, Mushayahama T, Sternberg PW, Thomas PD, Van Auken K, Ramsey J, Siegele DA, Chisholm RL, Fey P, Aspromonte MC, Nugnes MV, Quaglia F, Tosatto S, Giglio M, Nadendla S, Antonazzo G, Attrill H, Dos Santos G, Marygold S, Strelets V, Tabone CJ, Thurmond J, Zhou P, Ahmed SH, Asanitthong P, Luna Buitrago D, Erdol MN, Gage MC, Ali Kadhum M, Li KYC, Long M, Michalak A, Pesala A, Pritazahra A, Saverimuttu SCC, Su R, Thurlow KE, Lovering RC, Logie C, Oliferenko S, Blake J, Christie K, Corbani L, Dolan ME, Drabkin HJ, Hill DP, Ni L, Sitnikov D, Smith C, Cuzick A, Seager J, Cooper L, Elser J, Jaiswal P, Gupta P, Jaiswal P, Naithani S, Lera-Ramirez M, Rutherford K, Wood V, De Pons JL, Dwinell MR, Hayman GT, Kaldunski ML, Kwitek AE, Laulederkind SJF, Tutaj MA, Vedi M, Wang SJ, D'Eustachio P, Aimo L, Axelsen K, Bridge A, Hyka-Nouspikel N, Morgat A, Aleksander SA, Cherry JM, Engel SR, Karra K, Miyasato SR, Nash RS, Skrzypek MS, Weng S, Wong ED, Bakker E, Berardini TZ, Reiser L, Auchincloss A, Axelsen K, Argoud-Puy G, Blatter MC, Boutet E, Breuza L, Bridge A, Casals-Casas C, Coudert E, Estreicher A, Livia Famiglietti M, Feuermann M, Gos A, Gruaz-Gumowski N, Hulo C, Hyka-Nouspikel N, Jungo F, Le Mercier P, Lieberherr D, Masson P, Morgat A, Pedruzzi I, Pourcel L, Poux S, Rivoire C, Sundaram S, Bateman A, Bowler-Barnett E, Bye-A-Jee H, Denny P, Ignatchenko A, Ishtiaq R, Lock A, Lussi Y, Magrane M, Martin MJ, Orchard S, Raposo P, Speretta E, Tyagi N, Warner K, Zaru R, Diehl AD, Lee R, Chan J, Diamantakis S, Raciti D, Zarowiecki M, Fisher M, James-Zorn C, Ponferrada V, Zorn A, Ramachandran S, Ruzicka L, Westerfield M. The Gene Ontology knowledgebase in 2023. Genetics 2023; 224:iyad031. [PMID: 36866529 PMCID: PMC10158837 DOI: 10.1093/genetics/iyad031] [Citation(s) in RCA: 808] [Impact Index Per Article: 404.0] [Received: 12/13/2022] [Revised: 02/10/2023] [Accepted: 02/11/2023] [Indexed: 03/04/2023]
Abstract
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains, and updates the GO knowledgebase. The GO knowledgebase consists of three components: (1) the GO, a computational knowledge structure describing the functional characteristics of genes; (2) GO annotations, evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and (3) GO Causal Activity Models (GO-CAMs), mechanistic models of molecular "pathways" (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised, and updated in response to newly published discoveries and receives extensive QA checks, reviews, and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, and guidance on how users can best make use of the data that we provide. We conclude with future directions for the project.
|
18
|
Herr BW, Hardi J, Quardokus EM, Bueckle A, Chen L, Wang F, Caron AR, Osumi-Sutherland D, Musen MA, Börner K. Specimen, biological structure, and spatial ontologies in support of a Human Reference Atlas. Sci Data 2023; 10:171. [PMID: 36973309 PMCID: PMC10043028 DOI: 10.1038/s41597-023-01993-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 01/30/2023] [Indexed: 03/29/2023]
Abstract
The Human Reference Atlas (HRA) is defined as a comprehensive, three-dimensional (3D) atlas of all the cells in the healthy human body. It is compiled by an international team of experts who develop standard terminologies that they link to 3D reference objects, describing anatomical structures. The third HRA release (v1.2) covers spatial reference data and ontology annotations for 26 organs. Experts access the HRA annotations via spreadsheets and view reference object models in 3D editing tools. This paper introduces the Common Coordinate Framework (CCF) Ontology v2.0.1 that interlinks specimen, biological structure, and spatial data, together with the CCF API that makes the HRA programmatically accessible and interoperable with Linked Open Data (LOD). We detail how real-world user needs and experimental data guide CCF Ontology design and implementation, present CCF Ontology classes and properties together with exemplary usage, and report on validation methods. The CCF Ontology graph database and API are used in the HuBMAP portal, HRA Organ Gallery, and other applications that support data queries across multiple, heterogeneous sources.
Affiliation(s)
- Bruce W Herr
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, 47408, USA
| | - Josef Hardi
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, 94305, USA
| | - Ellen M Quardokus
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, 47408, USA
| | - Andreas Bueckle
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, 47408, USA.
| | - Lu Chen
- Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA
| | - Fusheng Wang
- Department of Computer Science, Stony Brook University, Stony Brook, NY, 11794, USA
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, 11794, USA
| | - Anita R Caron
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | | | - Mark A Musen
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, 94305, USA
| | - Katy Börner
- Department of Intelligent Systems Engineering, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, 47408, USA.
|
19
|
Burger A, Baldock RA, Adams DJ, Din S, Papatheodorou I, Glinka M, Hill B, Houghton D, Sharghi M, Wicks M, Arends MJ. Towards a clinically-based common coordinate framework for the human gut cell atlas: the gut models. BMC Med Inform Decis Mak 2023; 23:36. [PMID: 36793076 PMCID: PMC9933383 DOI: 10.1186/s12911-023-02111-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 01/13/2023] [Indexed: 02/17/2023]
Abstract
BACKGROUND The Human Cell Atlas resource will deliver single cell transcriptome data spatially organised in terms of gross anatomy, tissue location and with images of cellular histology. This will enable the application of bioinformatics analysis, machine learning and data mining revealing an atlas of cell types, sub-types, varying states and ultimately cellular changes related to disease conditions. To further develop the understanding of specific pathological and histopathological phenotypes with their spatial relationships and dependencies, a more sophisticated spatial descriptive framework is required to enable integration and analysis in spatial terms. METHODS We describe a conceptual coordinate model for the Gut Cell Atlas (small and large intestines). Here, we focus on a Gut Linear Model (1-dimensional representation based on the centreline of the gut) that represents the location semantics as typically used by clinicians and pathologists when describing location in the gut. This knowledge representation is based on a set of standardised gut anatomy ontology terms describing regions in situ, such as ileum or transverse colon, and landmarks, such as ileo-caecal valve or hepatic flexure, together with relative or absolute distance measures. We show how locations in the 1D model can be mapped to and from points and regions in both a 2D model and 3D models, such as a patient's CT scan where the gut has been segmented. RESULTS The outputs of this work include 1D, 2D and 3D models of the human gut, delivered through publicly accessible JSON and image files. We also illustrate the mappings between models using a demonstrator tool that allows the user to explore the anatomical space of the gut. All data and software are fully open-source and available online. CONCLUSIONS Small and large intestines have a natural "gut coordinate" system best represented as a 1D centreline through the gut tube, reflecting functional differences. Such a 1D centreline model with landmarks, visualised using viewer software, allows interoperable translation to both a 2D anatomogram model and multiple 3D models of the intestines. This permits users to accurately locate samples for data comparison.
Affiliation(s)
- Albert Burger
- Department of Computer Science, School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK.
| | - Richard A Baldock
- Division of Pathology, Centre for Comparative Pathology, Edinburgh Cancer Research Centre, Institute of Cancer and Genetics, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XU, UK
| | - David J Adams
- Experimental Cancer Genetics, Wellcome Sanger Institute, Hinxton, Cambridge, UK
| | - Shahida Din
- Edinburgh IBD Unit Western General Hospital, NHS Lothian, Edinburgh, UK
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Hinxton, Cambridge, UK
| | - Michael Glinka
- Division of Pathology, Centre for Comparative Pathology, Edinburgh Cancer Research Centre, Institute of Cancer and Genetics, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XU, UK
| | - Bill Hill
- Department of Computer Science, School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
| | - Derek Houghton
- Department of Computer Science, School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
| | - Mehran Sharghi
- Department of Computer Science, School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
| | - Michael Wicks
- Division of Pathology, Centre for Comparative Pathology, Edinburgh Cancer Research Centre, Institute of Cancer and Genetics, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XU, UK
| | - Mark J Arends
- Division of Pathology, Centre for Comparative Pathology, Edinburgh Cancer Research Centre, Institute of Cancer and Genetics, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XU, UK.
|
20
|
Burger A, Baldock R, Adams DJ, Din S, Papatheodorou I, Glinka M, Hill B, Houghton D, Sharghi M, Wicks M, Arends MJ. Towards a Clinically-based Common Coordinate Framework for the Human Gut Cell Atlas - The Gut Models. [DOI: 10.1101/2022.12.08.519665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2024]
Abstract
BACKGROUND The Human Cell Atlas resource will deliver single cell transcriptome data spatially organised in terms of gross anatomy, tissue location and with images of cellular histology. This will enable the application of bioinformatics analysis, machine learning and data mining revealing an atlas of cell types, sub-types, varying states and ultimately cellular changes related to disease conditions. To further develop the understanding of specific pathological and histopathological phenotypes with their spatial relationships and dependencies, a more sophisticated spatial descriptive framework is required to enable integration and analysis in spatial terms. METHODS We describe a conceptual coordinate model for the Gut Cell Atlas (small and large intestines). Here, we focus on a Gut Linear Model (1-dimensional representation based on the centreline of the gut) that represents the location semantics as typically used by clinicians and pathologists when describing location in the gut. This knowledge representation is based on a set of standardised gut anatomy ontology terms describing regions in situ, such as ileum or transverse colon, and landmarks, such as ileo-caecal valve or hepatic flexure, together with relative or absolute distance measures. We show how locations in the 1D model can be mapped to and from points and regions in both a 2D model and 3D models, such as a patient’s CT scan where the gut has been segmented. RESULTS The outputs of this work include 1D, 2D and 3D models of the human gut, delivered through publicly accessible JSON and image files. We also illustrate the mappings between models using a demonstrator tool that allows the user to explore the anatomical space of the gut. All data and software are fully open-source and available online. CONCLUSIONS Small and large intestines have a natural “gut coordinate” system best represented as a 1D centreline through the gut tube, reflecting functional differences. Such a 1D centreline model with landmarks, visualised using viewer software, allows interoperable translation to both a 2D anatomogram model and multiple 3D models of the intestines. This permits users to accurately locate samples for data comparison.
|
21
|
Lyman DF, Bell A, Black A, Dingerdissen H, Cauley E, Gogate N, Liu D, Joseph A, Kahsay R, Crichton DJ, Mehta A, Mazumder R. Modeling and integration of N-glycan biomarkers in a comprehensive biomarker data model. Glycobiology 2022; 32:855-870. [PMID: 35925813 PMCID: PMC9487899 DOI: 10.1093/glycob/cwac046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2021] [Revised: 06/30/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022]
Abstract
Molecular biomarkers measure discrete components of biological processes that can contribute to disorders when impaired. Great interest exists in discovering early cancer biomarkers to improve outcomes. Biomarkers represented in a standardized data model, integrated with multi-omics data, may improve understanding and use of novel biomarkers such as glycans and glycoconjugates. Among altered components in tumorigenesis, N-glycans exhibit substantial biomarker potential, when analyzed with their protein carriers. However, such data are distributed across publications and databases of diverse formats, which hampers their use in research and clinical application. Mass spectrometry measures of fifty N-glycans, on seven serum proteins in liver disease, were integrated (as a panel) into a cancer biomarker data model, providing a unique identifier, standard nomenclature, links to glycan resources, and accession and ontology annotations to standard protein, gene, disease, and biomarker information. Data provenance was documented with a standardized FDA-supported BioCompute Object. Using the biomarker data model allows capture of granular information, such as glycans with different levels of abundance in cirrhosis, hepatocellular carcinoma, and transplant groups. Such representation in a standardized data model harmonizes glycomics data in a unified framework, making glycan-protein biomarker data exploration more available to investigators and to other data resources. The biomarker data model we describe can be used by researchers to describe their novel glycan and glycoconjugate biomarkers, can integrate N-glycan biomarker data with multi-source biomedical data, and can foster discovery and insight within a unified data framework for glycan biomarker representation thereby making the data FAIR (Findable, Accessible, Interoperable, Reusable) (https://www.go-fair.org/fair-principles/).
Affiliation(s)
- Daniel F Lyman
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America
| | - Amanda Bell
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America
| | - Alyson Black
- The Department of Cell & Molecular Pharmacology, The Medical University of South Carolina, Charleston, SC, 29403, United States of America
| | - Hayley Dingerdissen
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America
| | - Edmund Cauley
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America; The McCormick Genomic and Proteomic Center, The George Washington University, Washington, DC 20037, United States of America
| | - Nikhita Gogate
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America
| | - David Liu
- NASA Jet Propulsion Laboratory, Pasadena, CA 91109, United States of America
| | - Ashia Joseph
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America
| | - Robel Kahsay
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America
| | - Daniel J Crichton
- NASA Jet Propulsion Laboratory, Pasadena, CA 91109, United States of America
| | - Anand Mehta
- The Department of Cell & Molecular Pharmacology, The Medical University of South Carolina, Charleston, SC, 29403, United States of America
| | - Raja Mazumder
- The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC 20037, United States of America; The McCormick Genomic and Proteomic Center, The George Washington University, Washington, DC 20037, United States of America
|
22
|
de Bono B, Gillespie T, Surles-Zeigler MC, Kokash N, Grethe JS, Martone M. Representing Normal and Abnormal Physiology as Routes of Flow in ApiNATOMY. Front Physiol 2022; 13:795303. [PMID: 35547570 PMCID: PMC9083405 DOI: 10.3389/fphys.2022.795303] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/14/2021] [Accepted: 02/07/2022] [Indexed: 01/04/2023]
Abstract
We present (i) the ApiNATOMY workflow to build knowledge models of biological connectivity, as well as (ii) the ApiNATOMY TOO map, a topological scaffold to organize and visually inspect these connectivity models in the context of a canonical architecture of body compartments. In this work, we outline the implementation of ApiNATOMY's knowledge representation in the context of a large-scale effort, SPARC, to map the autonomic nervous system. Within SPARC, the ApiNATOMY modeling effort has generated the SCKAN knowledge graph that combines connectivity models and TOO map. This knowledge graph models flow routes for a number of normal and disease scenarios in physiology. Calculations over SCKAN to infer routes are being leveraged to classify, navigate and search for semantically-linked metadata of multimodal experimental datasets for a number of cross-scale, cross-disciplinary projects.
Affiliation(s)
- Bernard de Bono
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
| | - Tom Gillespie
- Department of Neuroscience, University of California, San Diego, San Diego, CA, United States
| | | | - Natallia Kokash
- Faculty of Humanities, University of Amsterdam, Amsterdam, Netherlands
| | - Jeff S. Grethe
- Department of Neuroscience, University of California, San Diego, San Diego, CA, United States
| | - Maryann Martone
- Department of Neuroscience, University of California, San Diego, San Diego, CA, United States
|
23
|
Agapite J, Albou LP, Aleksander SA, Alexander M, Anagnostopoulos AV, Antonazzo G, Argasinska J, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blake JA, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Michael Cherry J, Cho J, Christie KR, Crosby MA, Davis P, da Veiga Beltrame E, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, dos Santos G, Douglass E, Dunn B, Eagle A, Ebert D, Engel SR, Fashena D, Foley S, Frazer K, Gao S, Gibson AC, Gondwe F, Goodman J, Sian Gramates L, Grove CA, Hale P, Harris T, Thomas Hayman G, Hill DP, Howe DG, Howe KL, Hu Y, Jha S, Kadin JA, Kaufman TC, Kalita P, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, MacPherson KA, Martin R, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nalabolu HS, Nash RS, Ng P, Nuin P, Paddock H, Paulini M, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schaper K, Schindelman G, Shimoyama M, Simison M, Shaw DR, Shrivatsav A, Singer A, Skrzypek M, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Toro S, Tomczuk M, Trovisco V, Tutaj MA, Tutaj M, Urbano JM, Van Auken K, Van Slyke CE, Wang Q, Wang SJ, Weng S, Westerfield M, Williams G, Wilming LG, Wong ED, Wright A, Yook K, Zarowiecki M, Zhou P, Zytkovicz M. Harmonizing model organism data in the Alliance of Genome Resources. Genetics 2022; 220:iyac022. [PMID: 35380658 PMCID: PMC8982023 DOI: 10.1093/genetics/iyac022] [Citation(s) in RCA: 59] [Impact Index Per Article: 19.7] [Received: 12/12/2021] [Accepted: 01/26/2022] [Indexed: 02/06/2023]
Abstract
The Alliance of Genome Resources (the Alliance) is a combined effort of 7 knowledgebase projects: Saccharomyces Genome Database, WormBase, FlyBase, Mouse Genome Database, the Zebrafish Information Network, Rat Genome Database, and the Gene Ontology Resource. The Alliance seeks to provide several benefits: better service to the various communities served by these projects; a harmonized view of data for all biomedical researchers, bioinformaticians, clinicians, and students; and a more sustainable infrastructure. The Alliance has harmonized cross-organism data to provide useful comparative views of gene function, gene expression, and human disease relevance. The basis of the comparative views is shared calls of orthology relationships and the use of common ontologies. The key types of data are alleles and variants, gene function based on gene ontology annotations, phenotypes, association to human disease, gene expression, protein-protein and genetic interactions, and participation in pathways. The information is presented on uniform gene pages that allow facile summarization of information about each gene in each of the 7 organisms covered (budding yeast, roundworm Caenorhabditis elegans, fruit fly, house mouse, zebrafish, brown rat, and human). The harmonized knowledge is freely available on the alliancegenome.org portal, as downloadable files, and by APIs. We expect other existing and emerging knowledge bases to join in the effort to provide the union of useful data and features that each knowledge base currently provides.
|
24
|
Porto DS, Dahdul WM, Lapp H, Balhoff JP, Vision TJ, Mabee PM, Uyeda J. Assessing Bayesian Phylogenetic Information Content of Morphological Data Using Knowledge from Anatomy Ontologies. Syst Biol 2022; 71:1290-1306. [PMID: 35285502 PMCID: PMC9558846 DOI: 10.1093/sysbio/syac022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/03/2021] [Revised: 02/09/2022] [Accepted: 03/05/2022] [Indexed: 11/18/2022] Open
Abstract
Morphology remains a primary source of phylogenetic information for many groups of organisms, and the only one for most fossil taxa. Organismal anatomy is not a collection of randomly assembled and independent “parts”, but instead a set of dependent and hierarchically nested entities resulting from ontogeny and phylogeny. How do we make sense of these dependent and at times redundant characters? One promising approach is using ontologies—structured controlled vocabularies that summarize knowledge about different properties of anatomical entities, including developmental and structural dependencies. Here, we assess whether evolutionary patterns can explain the proximity of ontology-annotated characters within an ontology. To do so, we measure phylogenetic information across characters and evaluate if it matches the hierarchical structure given by ontological knowledge—in much the same way as across-species diversity structure is given by phylogeny. We implement an approach to evaluate the Bayesian phylogenetic information (BPI) content and phylogenetic dissonance among ontology-annotated anatomical data subsets. We applied this to data sets representing two disparate animal groups: bees (Hexapoda: Hymenoptera: Apoidea, 209 chars) and characiform fishes (Actinopterygii: Ostariophysi: Characiformes, 463 chars). For bees, we find that BPI is not substantially explained by anatomy since dissonance is often high among morphologically related anatomical entities. For fishes, we find substantial information for two clusters of anatomical entities instantiating concepts from the jaws and branchial arch bones, but among-subset information decreases and dissonance increases substantially moving to higher-level subsets in the ontology. We further applied our approach to address particular evolutionary hypotheses with an example of morphological evolution in miniature fishes. 
While we show that phylogenetic information does match ontology structure for some anatomical entities, additional relationships and processes, such as convergence, likely play a substantial role in explaining BPI and dissonance, and merit future investigation. Our work demonstrates how complex morphological data sets can be interrogated with ontologies by allowing one to access how information is spread hierarchically across anatomical concepts, how congruent this information is, and what sorts of processes may play a role in explaining it: phylogeny, development, or convergence. [Apidae; Bayesian phylogenetic information; Ostariophysi; Phenoscape; phylogenetic dissonance; semantic similarity.]
Affiliation(s)
- Diego S Porto
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, 926 West Campus Drive, Blacksburg, VA 24061, USA
| | - Wasila M Dahdul
- UCI Libraries, University of California, Irvine, Irvine, CA 92623, USA
- Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
| | - Hilmar Lapp
- Center for Genomic and Computational Biology, Duke University, 101 Science Drive, Durham, NC 27708, USA
| | - James P Balhoff
- Renaissance Computing Institute, University of North Carolina, 100 Europa Drive, Suite 540, Chapel Hill, NC 27517, USA
| | - Todd J Vision
- Department of Biology and School of Information and Library Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Paula M Mabee
- Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
- Battelle, National Ecological Observatory Network, Boulder, CO 80301, USA
| | - Josef Uyeda
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, 926 West Campus Drive, Blacksburg, VA 24061, USA
| |
|
25
|
Chloe Li KY, Cook AC, Lovering RC. GOing Forward With the Cardiac Conduction System Using Gene Ontology. Front Genet 2022; 13:802393. [PMID: 35309148 PMCID: PMC8924464 DOI: 10.3389/fgene.2022.802393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Accepted: 02/09/2022] [Indexed: 02/03/2023] Open
Abstract
The cardiac conduction system (CCS) comprises critical components responsible for the initiation, propagation, and coordination of the action potential. Aberrant CCS development can cause conduction abnormalities, including sick sinus syndrome, accessory pathways, and atrioventricular and bundle branch blocks. Gene Ontology (GO; http://geneontology.org/) is an invaluable global bioinformatics resource which provides structured, computable knowledge describing the functions of gene products. Many gene products are known to be involved in CCS development; however, this information is not comprehensively captured by GO. To address the needs of the heart development research community, this study aimed to describe the specific roles of proteins reported in the literature to be involved with CCS development and/or function. 14 proteins were prioritized for GO annotation which led to the curation of 15 peer-reviewed primary experimental articles using carefully selected GO terms. 152 descriptive GO annotations, including those describing sinoatrial node and atrioventricular node development were created and submitted to the GO Consortium database. A functional enrichment analysis of 35 key CCS development proteins confirmed that this work has improved the in-silico interpretation of this CCS dataset. This work may improve future investigations of the CCS with application of high-throughput methods such as genome-wide association studies analysis, proteomics, and transcriptomics.
Affiliation(s)
- Kan Yan Chloe Li
- Department of Preclinical and Fundamental Science, Institute of Cardiovascular Science, Functional Gene Annotation, University College London, London, United Kingdom, Department of Children’s Cardiovascular Disease, Centre for Morphology and Structural Heart Disease, Institute of Cardiovascular Science, University College London, London, United Kingdom, *Correspondence: Kan Yan Chloe Li,
| | - Andrew C Cook
- Department of Children’s Cardiovascular Disease, Centre for Morphology and Structural Heart Disease, Institute of Cardiovascular Science, University College London, London, United Kingdom
| | - Ruth C Lovering
- Department of Preclinical and Fundamental Science, Institute of Cardiovascular Science, Functional Gene Annotation, University College London, London, United Kingdom
| |
|
26
|
Newmaster KT, Kronman FA, Wu YT, Kim Y. Seeing the Forest and Its Trees Together: Implementing 3D Light Microscopy Pipelines for Cell Type Mapping in the Mouse Brain. Front Neuroanat 2022; 15:787601. [PMID: 35095432 PMCID: PMC8794814 DOI: 10.3389/fnana.2021.787601] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/01/2021] [Accepted: 12/02/2021] [Indexed: 12/14/2022] Open
Abstract
The brain is composed of diverse neuronal and non-neuronal cell types with complex regional connectivity patterns that create the anatomical infrastructure underlying cognition. Remarkable advances in neuroscience techniques enable labeling and imaging of these individual cell types and their interactions throughout intact mammalian brains at a cellular resolution allowing neuroscientists to examine microscopic details in macroscopic brain circuits. Nevertheless, implementing these tools is fraught with many technical and analytical challenges with a need for high-level data analysis. Here we review key technical considerations for implementing a brain mapping pipeline using the mouse brain as a primary model system. Specifically, we provide practical details for choosing methods including cell type specific labeling, sample preparation (e.g., tissue clearing), microscopy modalities, image processing, and data analysis (e.g., image registration to standard atlases). We also highlight the need to develop better 3D atlases with standardized anatomical labels and nomenclature across species and developmental time points to extend the mapping to other species including humans and to facilitate data sharing, confederation, and integrative analysis. In summary, this review provides key elements and currently available resources to consider while developing and implementing high-resolution mapping methods.
Affiliation(s)
- Kyra T Newmaster
- Department of Neural and Behavioral Sciences, The Pennsylvania State University, Hershey, PA, United States
| | - Fae A Kronman
- Department of Neural and Behavioral Sciences, The Pennsylvania State University, Hershey, PA, United States
| | - Yuan-Ting Wu
- Department of Neural and Behavioral Sciences, The Pennsylvania State University, Hershey, PA, United States
| | - Yongsoo Kim
- Department of Neural and Behavioral Sciences, The Pennsylvania State University, Hershey, PA, United States
| |
|
27
|
Zancolli G, Reijnders M, Waterhouse RM, Robinson-Rechavi M. Convergent evolution of venom gland transcriptomes across Metazoa. Proc Natl Acad Sci U S A 2022; 119:e2111392119. [PMID: 34983844 PMCID: PMC8740685 DOI: 10.1073/pnas.2111392119] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Accepted: 11/10/2021] [Indexed: 12/13/2022] Open
Abstract
Animals have repeatedly evolved specialized organs and anatomical structures to produce and deliver a mixture of potent bioactive molecules to subdue prey or predators: venom. This makes it one of the most widespread, convergent functions in the animal kingdom. Whether animals have adopted the same genetic toolkit to evolve venom systems is a fascinating question that still eludes us. Here, we performed a comparative analysis of venom gland transcriptomes from 20 venomous species spanning the main Metazoan lineages to test whether different animals have independently adopted similar molecular mechanisms to perform the same function. We found a strong convergence in gene expression profiles, with venom glands being more similar to each other than to any other tissue from the same species, and their differences closely mirroring the species phylogeny. Although venom glands secrete some of the fastest evolving molecules (toxins), their gene expression does not evolve faster than evolutionarily older tissues. We found 15 venom gland-specific gene modules enriched in endoplasmic reticulum stress and unfolded protein response pathways, indicating that animals have independently adopted stress response mechanisms to cope with mass production of toxins. This, in turn, activates regulatory networks for epithelial development, cell turnover, and maintenance, which seem composed of both convergent and lineage-specific factors, possibly reflecting the different developmental origins of venom glands. This study represents a first step toward an understanding of the molecular mechanisms underlying the repeated evolution of one of the most successful adaptive traits in the animal kingdom.
Affiliation(s)
- Giulia Zancolli
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland;
- Evolutionary Bioinformatics Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Maarten Reijnders
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Evolutionary-Functional Genomics Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Robert M Waterhouse
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Evolutionary-Functional Genomics Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Evolutionary Bioinformatics Group, Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| |
|
28
|
Schriml LM, Munro JB, Schor M, Olley D, McCracken C, Felix V, Baron JA, Jackson R, Bello SM, Bearer C, Lichenstein R, Bisordi K, Dialo NC, Giglio M, Greene C. The Human Disease Ontology 2022 update. Nucleic Acids Res 2021; 50:D1255-D1261. [PMID: 34755882 PMCID: PMC8728220 DOI: 10.1093/nar/gkab1063] [Citation(s) in RCA: 119] [Impact Index Per Article: 29.8] [Received: 09/22/2021] [Revised: 10/13/2021] [Accepted: 10/18/2021] [Indexed: 01/31/2023] Open
Abstract
The Human Disease Ontology (DO) (www.disease-ontology.org) database has significantly expanded the disease content and enhanced our userbase and website since the DO’s 2018 Nucleic Acids Research DATABASE issue paper. Conservatively, based on available resource statistics, terms from the DO have been annotated to over 1.5 million biomedical data elements and citations, a 10× increase in the past 5 years. The DO, funded as an NHGRI Genomic Resource, plays a key role in disease knowledge organization, representation, and standardization, serving as a reference framework for multiscale biomedical data integration and analysis across thousands of clinical, biomedical and computational research projects and genomic resources around the world. This update reports on the addition of 1,793 new disease terms, a 14% increase of textual definitions and the integration of 22 137 new SubClassOf axioms defining disease to disease connections representing the DO’s complex disease classification. The DO’s updated website provides multifaceted etiology searching, enhanced documentation and educational resources.
Affiliation(s)
- Lynn M Schriml
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - James B Munro
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Mike Schor
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Dustin Olley
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Carrie McCracken
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Victor Felix
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - J Allen Baron
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | | | - Susan M Bello
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME, USA
| | | | | | | | | | - Michelle Giglio
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Carol Greene
- University of Maryland School of Medicine, Baltimore, MD, USA
| |
|
29
|
Csabai L, Fazekas D, Kadlecsik T, Szalay-Bekő M, Bohár B, Madgwick M, Módos D, Ölbei M, Gul L, Sudhakar P, Kubisch J, Oyeyemi OJ, Liska O, Ari E, Hotzi B, Billes VA, Molnár E, Földvári-Nagy L, Csályi K, Demeter A, Pápai N, Koltai M, Varga M, Lenti K, Farkas IJ, Türei D, Csermely P, Vellai T, Korcsmáros T. SignaLink3: a multi-layered resource to uncover tissue-specific signaling networks. Nucleic Acids Res 2021; 50:D701-D709. [PMID: 34634810 PMCID: PMC8728204 DOI: 10.1093/nar/gkab909] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/15/2021] [Revised: 09/16/2021] [Accepted: 09/22/2021] [Indexed: 12/26/2022] Open
Abstract
Signaling networks represent the molecular mechanisms controlling a cell's response to various internal or external stimuli. Most currently available signaling databases contain only a part of the complex network of intertwining pathways, leaving out key interactions or processes. Hence, we have developed SignaLink3 (http://signalink.org/), a value-added knowledge-base that provides manually curated data on signaling pathways and integrated data from several types of databases (interaction, regulation, localisation, disease, etc.) for humans, and three major animal model organisms. SignaLink3 contains over 400 000 newly added human protein-protein interactions resulting in a total of 700 000 interactions for Homo sapiens, making it one of the largest integrated signaling network resources. Next to H. sapiens, SignaLink3 is the only current signaling network resource to provide regulatory information for the model species Caenorhabditis elegans and Danio rerio, and the largest resource for Drosophila melanogaster. Compared to previous versions, we have integrated gene expression data as well as subcellular localization of the interactors, therefore uniquely allowing tissue-, or compartment-specific pathway interaction analysis to create more accurate models. Data is freely available for download in widely used formats, including CSV, PSI-MI TAB or SQL.
Affiliation(s)
- Luca Csabai
- Earlham Institute, Norwich NR4 7UZ, UK.,Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - Dávid Fazekas
- Earlham Institute, Norwich NR4 7UZ, UK.,Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - Tamás Kadlecsik
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | | | - Balázs Bohár
- Earlham Institute, Norwich NR4 7UZ, UK.,Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - Matthew Madgwick
- Earlham Institute, Norwich NR4 7UZ, UK.,Gut Microbes and Health Programme, Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
| | - Dezső Módos
- Earlham Institute, Norwich NR4 7UZ, UK.,Gut Microbes and Health Programme, Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
| | - Márton Ölbei
- Earlham Institute, Norwich NR4 7UZ, UK.,Gut Microbes and Health Programme, Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
| | - Lejla Gul
- Earlham Institute, Norwich NR4 7UZ, UK
| | - Padhmanand Sudhakar
- Earlham Institute, Norwich NR4 7UZ, UK.,Translational Research in GastroIntestinal Disorders, Leuven BE-3000, Belgium
| | - János Kubisch
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | | | - Orsolya Liska
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary.,HCEMM-BRC Metabolic Systems Biology Lab, Szeged H-6726, Hungary.,Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH), Szeged H-6726, Hungary.,Doctoral School in Biology, University of Szeged, Szeged H-6720 Hungary
| | - Eszter Ari
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary.,HCEMM-BRC Metabolic Systems Biology Lab, Szeged H-6726, Hungary.,Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH), Szeged H-6726, Hungary
| | - Bernadette Hotzi
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - Viktor A Billes
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary.,ELKH/MTA-ELTE Genetics Research Group, Budapest H-1117, Hungary
| | - Eszter Molnár
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - László Földvári-Nagy
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary.,Department of Morphology and Physiology, Semmelweis University, Budapest H-1088, Hungary
| | - Kitti Csályi
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - Amanda Demeter
- Earlham Institute, Norwich NR4 7UZ, UK.,Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - Nóra Pápai
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary.,Institute of Molecular Biotechnology, Vienna A-1030, Austria
| | - Mihály Koltai
- Centre for the Mathematical Modelling of Infectious Diseases (CMMID), London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
| | - Máté Varga
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary
| | - Katalin Lenti
- Department of Morphology and Physiology, Semmelweis University, Budapest H-1088, Hungary
| | - Illés J Farkas
- Citibank Europe plc Hungarian Branch Office, Budapest H-1133, Hungary
| | - Dénes Türei
- Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
| | - Péter Csermely
- Department of Molecular Biology, Semmelweis University, Budapest H-1094, Hungary
| | - Tibor Vellai
- Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary.,ELKH/MTA-ELTE Genetics Research Group, Budapest H-1117, Hungary
| | - Tamás Korcsmáros
- Earlham Institute, Norwich NR4 7UZ, UK.,Department of Genetics, ELTE Eötvös Loránd University, Budapest H-1117, Hungary.,Gut Microbes and Health Programme, Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
| |
|
30
|
Fischer DS, Dony L, König M, Moeed A, Zappia L, Heumos L, Tritschler S, Holmberg O, Aliee H, Theis FJ. Sfaira accelerates data and model reuse in single cell genomics. Genome Biol 2021; 22:248. [PMID: 34433466 PMCID: PMC8386039 DOI: 10.1186/s13059-021-02452-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/04/2020] [Accepted: 08/03/2021] [Indexed: 12/15/2022] Open
Abstract
Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of data sets using ontologies for metadata. We propose an adaption of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.
Affiliation(s)
- David S Fischer
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
| | - Leander Dony
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
- Department of Translational Psychiatry, Max Planck Institute of Psychiatry, and International Max Planck Research School for Translational Psychiatry (IMPRS-TP), 80804, Munich, Germany
| | - Martin König
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
| | - Abdul Moeed
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
| | - Luke Zappia
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany
| | - Lukas Heumos
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
- Institute of Lung Biology and Disease and Comprehensive Pneumology Center, Helmholtz Zentrum München, Member of the German Center for Lung Research (DZL), Munich, Germany
| | - Sophie Tritschler
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
| | - Olle Holmberg
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
| | - Hananeh Aliee
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany.
- Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany.
| |
|
31
|
Wilson SL, Way GP, Bittremieux W, Armache JP, Haendel MA, Hoffman MM. Sharing biological data: why, when, and how. FEBS Lett 2021; 595:847-863. [PMID: 33843054 PMCID: PMC10390076 DOI: 10.1002/1873-3468.14067] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 12/13/2022]
Affiliation(s)
- Samantha L Wilson
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
| | - Gregory P Way
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA.,Department of Computer Science, University of Antwerp, Antwerpen, Belgium
| | - Jean-Paul Armache
- Department of Biochemistry & Molecular Biology, The Huck Institutes of Life Sciences, Pennsylvania State University, University Park, PA, USA
| | | | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada.,Department of Medical Biophysics, Department of Computer Science, University of Toronto, Toronto, ON, Canada.,Vector Institute, Toronto, ON, Canada
| |
|
32
|
Babcock S, Beverley J, Cowell LG, Smith B. The Infectious Disease Ontology in the age of COVID-19. J Biomed Semantics 2021; 12:13. [PMID: 34275487 PMCID: PMC8286442 DOI: 10.1186/s13326-021-00245-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/31/2020] [Accepted: 06/21/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Effective response to public health emergencies, such as we are now experiencing with COVID-19, requires data sharing across multiple disciplines and data systems. Ontologies offer a powerful data sharing tool, and this holds especially for those ontologies built on the design principles of the Open Biomedical Ontologies Foundry. These principles are exemplified by the Infectious Disease Ontology (IDO), a suite of interoperable ontology modules aiming to provide coverage of all aspects of the infectious disease domain. At its center is IDO Core, a disease- and pathogen-neutral ontology covering just those types of entities and relations that are relevant to infectious diseases generally. IDO Core is extended by disease and pathogen-specific ontology modules. RESULTS To assist the integration and analysis of COVID-19 data, and viral infectious disease data more generally, we have recently developed three new IDO extensions: IDO Virus (VIDO); the Coronavirus Infectious Disease Ontology (CIDO); and an extension of CIDO focusing on COVID-19 (IDO-COVID-19). Reflecting the fact that viruses lack cellular parts, we have introduced into IDO Core the term acellular structure to cover viruses and other acellular entities studied by virologists. We now distinguish between infectious agents - organisms with an infectious disposition - and infectious structures - acellular structures with an infectious disposition. This in turn has led to various updates and refinements of IDO Core's content. We believe that our work on VIDO, CIDO, and IDO-COVID-19 can serve as a model for yielding greater conformance with ontology building best practices. CONCLUSIONS IDO provides a simple recipe for building new pathogen-specific ontologies in a way that allows data about novel diseases to be easily compared, along multiple dimensions, with data represented by existing disease ontologies. 
The IDO strategy, moreover, supports ontology coordination, providing a powerful method of data integration and sharing that allows physicians, researchers, and public health organizations to respond rapidly and efficiently to current and future public health crises.
Affiliation(s)
- Shane Babcock
- Department of Philosophy, Niagara University, Lewiston, NY, USA.
- National Center for Ontological Research, University at Buffalo, Buffalo, NY, USA.
| | - John Beverley
- National Center for Ontological Research, University at Buffalo, Buffalo, NY, USA
- Department of Philosophy, Northwestern University, Evanston, IL, USA
| | - Lindsay G Cowell
- National Center for Ontological Research, University at Buffalo, Buffalo, NY, USA
- Cowell Lab, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Barry Smith
- National Center for Ontological Research, University at Buffalo, Buffalo, NY, USA
- Department of Philosophy, University at Buffalo, Buffalo, NY, USA
| |
|
33
|
Wang Y, Guo B. The divergence of alternative splicing between ohnologs in teleost fishes. BMC Ecol Evol 2021; 21:98. [PMID: 34034651 PMCID: PMC8146666 DOI: 10.1186/s12862-021-01833-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/27/2020] [Accepted: 05/19/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Gene duplication and alternative splicing (AS) are two distinct mechanisms generating new materials for genetic innovations. The evolutionary link between gene duplication and AS is still controversial, because earlier studies used duplicates from duplication events of inconsistent ages. With the aid of RNA-seq data, we explored the evolutionary scenario of AS divergence between duplicates using ohnologs that resulted from the teleost genome duplication event in zebrafish, medaka, and stickleback. RESULTS Ohnologs in zebrafish have fewer AS forms compared to their singleton orthologs, supporting the function-sharing model of AS divergence between duplicates. Ohnologs in stickleback have more AS forms compared to their singleton orthologs, which supports the accelerated model of AS divergence between duplicates. The evolution of AS in ohnologs in medaka supports a combined scenario of the function-sharing and the accelerated model of AS divergence between duplicates. We also found that a small number of ohnolog pairs in each of the three teleosts showed significantly asymmetric AS divergence. For example, the well-known ovary-factor gene cyp19a1a has no AS form but its ohnolog cyp19a1b has multiple AS forms in medaka, suggesting that functional divergence between duplicates might have resulted from AS divergence. CONCLUSIONS We found a combined scenario of function-sharing and accelerated models for AS evolution in ohnologs in teleosts and ruled out the independent model that assumes a lack of correlation between gene duplication and AS. Our study thus provides insights into the link between gene duplication and AS in general and into ohnolog divergence in teleosts from an AS perspective in particular.
Affiliation(s)
- Yuwei Wang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China; University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Baocheng Guo
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China; University of Chinese Academy of Sciences, Beijing, 100049, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650201, China
| |
|
34
|
Ruiz C, Zitnik M, Leskovec J. Identification of disease treatment mechanisms through the multiscale interactome. Nat Commun 2021; 12:1796. [PMID: 33741907 PMCID: PMC7979814 DOI: 10.1038/s41467-021-21770-8] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Received: 05/22/2020] [Accepted: 02/04/2021] [Indexed: 12/12/2022] Open
Abstract
Most diseases disrupt multiple proteins, and drugs treat such diseases by restoring the functions of the disrupted proteins. How drugs restore these functions, however, is often unknown as a drug's therapeutic effects are not limited to the proteins that the drug directly targets. Here, we develop the multiscale interactome, a powerful approach to explain disease treatment. We integrate disease-perturbed proteins, drug targets, and biological functions into a multiscale interactome network. We then develop a random walk-based method that captures how drug effects propagate through a hierarchy of biological functions and physical protein-protein interactions. On three key pharmacological tasks, the multiscale interactome predicts drug-disease treatment, identifies proteins and biological functions related to treatment, and predicts genes that alter a treatment's efficacy and adverse reactions. Our results indicate that physical interactions between proteins alone cannot explain treatment since many drugs treat diseases by affecting the biological functions disrupted by the disease rather than directly targeting disease proteins or their regulators. We provide a general framework for explaining treatment, even when drugs seem unrelated to the diseases they are recommended for.
Affiliation(s)
- Camilo Ruiz
- Computer Science Department, Stanford University, Stanford, CA, USA
- Bioengineering Department, Stanford University, Stanford, CA, USA
| | - Marinka Zitnik
- Biomedical Informatics Department, Harvard University, Boston, MA, USA
| | - Jure Leskovec
- Computer Science Department, Stanford University, Stanford, CA, USA.
- Chan Zuckerberg Biohub, San Francisco, CA, USA.
| |
|
35
|
Yamada I, Campbell MP, Edwards N, Castro LJ, Lisacek F, Mariethoz J, Ono T, Ranzinger R, Shinmachi D, Aoki-Kinoshita KF. The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application. Glycobiology 2021; 31:741-750. [PMID: 33677548 PMCID: PMC8351504 DOI: 10.1093/glycob/cwab013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/03/2020] [Revised: 12/31/2020] [Accepted: 01/01/2021] [Indexed: 01/19/2023] Open
Abstract
Recent years have seen great advances in the development of glycoproteomics protocols and methods, resulting in a sustained increase in the reporting of proteins, their attached glycans and glycosylation sites. However, only very few of these reports find their way into databases or data repositories. One of the major reasons is the absence of a digital standard to represent glycoproteins and the challenging annotations with glycans. Depending on the experimental method, such a standard must be able to represent glycans as complete structures or as compositions, store not just single glycans but also represent glycoforms on a specific glycosylation site, deal with partially missing site information if no site mapping was performed, and store abundances or ratios of glycans within a glycoform of a specific site. To support the above, we have developed the GlycoConjugate Ontology (GlycoCoO) as a standard semantic framework to describe and represent glycoproteomics data. GlycoCoO can be used to represent glycoproteomics data in triplestores and can serve as a basis for data exchange formats. The ontology, database providers and supporting documentation are available online (https://github.com/glycoinfo/GlycoCoO).
Affiliation(s)
- Issaku Yamada
- Research Department, The Noguchi Institute, 1-9-7 Kaga, Itabashi, Tokyo 173-0003, Japan
| | - Matthew P Campbell
- Institute for Glycomics, Griffith University at Gold Coast, Southport, QLD 4215, Australia
| | - Nathan Edwards
- Department of Biochemistry, Molecular and Cellular Biology, Georgetown University Medical Center, Washington, D.C. 20007, USA
| | - Leyla Jael Castro
- ZB MED Information Centre for Life Sciences, Gleueler Str. 60, 50931 Cologne, Germany
| | - Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Computer Science Department, University of Geneva, route de Drize 7, CH - 1227 Geneva Switzerland, and also Section of Biology, University of Geneva, Geneva, Switzerland
| | - Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 7 Route de Drize, 1227 Geneva, Switzerland
| | - Tamiko Ono
- Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
| | - Rene Ranzinger
- Complex Carbohydrate Research Center, The University of Georgia, 315 Riverbend Rd, Athens, Georgia 30602, USA
| | - Daisuke Shinmachi
- R&D Department, SparqLite LLC., 1615-22 Ishikawamachi, Hachioji, Tokyo 192-0032, Japan
| | - Kiyoko F Aoki-Kinoshita
- Glycan & Life Science Integration Center (GaLSIC), Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo 192-8577, Japan
| |
|
36
|
Lim N, Tesar S, Belmadani M, Poirier-Morency G, Mancarci BO, Sicherman J, Jacobson M, Leong J, Tan P, Pavlidis P. Curation of over 10 000 transcriptomic studies to enable data reuse. Database (Oxford) 2021; 2021:6143045. [PMID: 33599246 PMCID: PMC7904053 DOI: 10.1093/database/baab006] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/29/2020] [Revised: 12/09/2020] [Accepted: 01/28/2021] [Indexed: 01/07/2023]
Abstract
Vast amounts of transcriptomic data reside in public repositories, but effective reuse remains challenging. Issues include unstructured dataset metadata, inconsistent data processing and quality control, and inconsistent probe-gene mappings across microarray technologies. Thus, extensive curation and data reprocessing are necessary prior to any reuse. The Gemma bioinformatics system was created to help address these issues. Gemma consists of a database of curated transcriptomic datasets, analytical software, a web interface and web services. Here we present an update on Gemma's holdings, data processing and analysis pipelines, our curation guidelines, and software features. As of June 2020, Gemma contains 10 811 manually curated datasets (primarily human, mouse and rat), over 395 000 samples and hundreds of curated transcriptomic platforms (both microarray and RNA sequencing). Dataset topics were represented with 10 215 distinct terms from 12 ontologies, for a total of 54 316 topic annotations (mean topics/dataset = 5.2). While Gemma has broad coverage of conditions and tissues, it captures a large majority of available brain-related datasets, accounting for 34% of its holdings. Users can access the curated data and differential expression analyses through the Gemma website, RESTful service and an R package. Database URL: https://gemma.msl.ubc.ca/home.html.
Affiliation(s)
- Nathaniel Lim
- Genome Science and Technology Graduate Program, University of British Columbia, Vancouver, BC V6T1Z4, Canada; Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada
| | - Stepan Tesar
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada
| | - Manuel Belmadani
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada
| | - Guillaume Poirier-Morency
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada
| | - Burak Ogan Mancarci
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada; Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC V6T1Z4, Canada
| | - Jordan Sicherman
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada; Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC V6T1Z4, Canada
| | - Matthew Jacobson
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada
| | - Justin Leong
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada
| | - Patrick Tan
- Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC V6T1Z4, Canada
| | | |
|
37
|
Diniz WJS, Crouse MS, Cushman RA, McLean KJ, Caton JS, Dahlen CR, Reynolds LP, Ward AK. Cerebrum, liver, and muscle regulatory networks uncover maternal nutrition effects in developmental programming of beef cattle during early pregnancy. Sci Rep 2021; 11:2771. [PMID: 33531552 PMCID: PMC7854659 DOI: 10.1038/s41598-021-82156-w] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 07/09/2020] [Accepted: 01/13/2021] [Indexed: 01/30/2023] Open
Abstract
The molecular basis underlying fetal programming in response to maternal nutrition remains unclear. Herein, we investigated the regulatory relationships between genes in fetal cerebrum, liver, and muscle tissues to shed light on the putative mechanisms that underlie the effects of early maternal nutrient restriction on bovine developmental programming. To this end, cerebrum, liver, and muscle gene expression were measured with RNA-Seq in 14 fetuses collected on day 50 of gestation from dams fed a diet initiated at breeding to either achieve 60% (RES, n = 7) or 100% (CON, n = 7) of energy requirements. To build a tissue-to-tissue gene network, we prioritized tissue-specific genes, transcription factors, and differentially expressed genes. Furthermore, we built condition-specific networks to identify differentially co-expressed or connected genes. Nutrient restriction led to differential tissue regulation between the treatments. Myogenic factors differentially regulated by ZBTB33 and ZNF131 may negatively affect myogenesis. Additionally, nutrient-sensing pathways, such as mTOR and PI3K/Akt, were affected by gene expression changes in response to nutrient restriction. By unveiling the network properties, we identified major regulators driving gene expression. However, further research is still needed to determine the impact of early maternal nutrition and strategic supplementation on pre- and post-natal performance.
Affiliation(s)
- Wellison J. S. Diniz
- Department of Animal Sciences, Center for Nutrition and Pregnancy, North Dakota State University, Fargo, ND USA (grid.261055.5; 0000 0001 2293 4611)
| | - Matthew S. Crouse
- USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE USA (grid.463419.d; 0000 0001 0946 3608)
| | - Robert A. Cushman
- USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE USA (grid.463419.d; 0000 0001 0946 3608)
| | - Kyle J. McLean
- Department of Animal Science, University of Tennessee, Knoxville, TN USA (grid.411461.7; 0000 0001 2315 1184)
| | - Joel S. Caton
- Department of Animal Sciences, Center for Nutrition and Pregnancy, North Dakota State University, Fargo, ND USA (grid.261055.5; 0000 0001 2293 4611)
| | - Carl R. Dahlen
- Department of Animal Sciences, Center for Nutrition and Pregnancy, North Dakota State University, Fargo, ND USA (grid.261055.5; 0000 0001 2293 4611)
| | - Lawrence P. Reynolds
- Department of Animal Sciences, Center for Nutrition and Pregnancy, North Dakota State University, Fargo, ND USA (grid.261055.5; 0000 0001 2293 4611)
| | - Alison K. Ward
- Department of Animal Sciences, Center for Nutrition and Pregnancy, North Dakota State University, Fargo, ND USA (grid.261055.5; 0000 0001 2293 4611)
| |
|
38
|
Bastian FB, Roux J, Niknejad A, Comte A, Fonseca Costa SS, de Farias TM, Moretti S, Parmentier G, de Laval VR, Rosikiewicz M, Wollbrett J, Echchiki A, Escoriza A, Gharib WH, Gonzales-Porta M, Jarosz Y, Laurenczy B, Moret P, Person E, Roelli P, Sanjeev K, Seppey M, Robinson-Rechavi M. The Bgee suite: integrated curated expression atlas and comparative transcriptomics in animals. Nucleic Acids Res 2021; 49:D831-D847. [PMID: 33037820 PMCID: PMC7778977 DOI: 10.1093/nar/gkaa793] [Citation(s) in RCA: 113] [Impact Index Per Article: 28.3] [Received: 07/14/2020] [Revised: 08/24/2020] [Accepted: 09/15/2020] [Indexed: 01/24/2023] Open
Abstract
Bgee is a database to retrieve and compare gene expression patterns in multiple animal species, produced by integrating multiple data types (RNA-Seq, Affymetrix, in situ hybridization, and EST data). It is based exclusively on curated healthy wild-type expression data (e.g., no gene knock-out, no treatment, no disease), to provide a comparable reference of normal gene expression. Curation includes very large datasets such as GTEx (re-annotation of samples as ‘healthy’ or not) as well as many small ones. Data are integrated and made comparable between species thanks to consistent data annotation and processing, and to calls of presence/absence of expression, along with expression scores. As a result, Bgee is capable of detecting the conditions of expression of any single gene, accommodating any data type and species. Bgee provides several tools for analyses, allowing, e.g., automated comparisons of gene expression patterns within and between species, retrieval of the preferred conditions of expression of any gene, or enrichment analyses of conditions with expression of sets of genes. Bgee release 14.1 includes 29 animal species, and is available at https://bgee.org/ and through its Bioconductor R package BgeeDB.
Affiliation(s)
- Frederic B Bastian
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Julien Roux
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Anne Niknejad
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Aurélie Comte
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Sara S Fonseca Costa
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Tarcisio Mendes de Farias
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Sébastien Moretti
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Gilles Parmentier
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Valentine Rech de Laval
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Marta Rosikiewicz
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Julien Wollbrett
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Amina Echchiki
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Angélique Escoriza
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Walid H Gharib
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Mar Gonzales-Porta
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Yohan Jarosz
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Balazs Laurenczy
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Philippe Moret
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Emilie Person
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Patrick Roelli
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Komal Sanjeev
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Mathieu Seppey
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland; SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| |
|
39
|
Carbon S, Douglass E, Good BM, Unni DR, Harris NL, Mungall CJ, Basu S, Chisholm RL, Dodson RJ, Hartline E, Fey P, Thomas PD, Albou LP, Ebert D, Kesling MJ, Mi H, Muruganujan A, Huang X, Mushayahama T, LaBonte SA, Siegele DA, Antonazzo G, Attrill H, Brown NH, Garapati P, Marygold SJ, Trovisco V, dos Santos G, Falls K, Tabone C, Zhou P, Goodman JL, Strelets VB, Thurmond J, Garmiri P, Ishtiaq R, Rodríguez-López M, Acencio ML, Kuiper M, Lægreid A, Logie C, Lovering RC, Kramarz B, Saverimuttu SCC, Pinheiro SM, Gunn H, Su R, Thurlow KE, Chibucos M, Giglio M, Nadendla S, Munro J, Jackson R, Duesbury MJ, Del-Toro N, Meldal BHM, Paneerselvam K, Perfetto L, Porras P, Orchard S, Shrivastava A, Chang HY, Finn RD, Mitchell AL, Rawlings ND, Richardson L, Sangrador-Vegas A, Blake JA, Christie KR, Dolan ME, Drabkin HJ, Hill DP, Ni L, Sitnikov DM, Harris MA, Oliver SG, Rutherford K, Wood V, Hayles J, Bähler J, Bolton ER, De Pons JL, Dwinell MR, Hayman GT, Kaldunski ML, Kwitek AE, Laulederkind SJF, Plasterer C, Tutaj MA, Vedi M, Wang SJ, D’Eustachio P, Matthews L, Balhoff JP, Aleksander SA, Alexander MJ, Cherry JM, Engel SR, Gondwe F, Karra K, Miyasato SR, Nash RS, Simison M, Skrzypek MS, Weng S, Wong ED, Feuermann M, Gaudet P, Morgat A, Bakker E, Berardini TZ, Reiser L, Subramaniam S, Huala E, Arighi CN, Auchincloss A, Axelsen K, Argoud-Puy G, Bateman A, Blatter MC, Boutet E, Bowler E, Breuza L, Bridge A, Britto R, Bye-A-Jee H, Casas CC, Coudert E, Denny P, Estreicher A, Famiglietti ML, Georghiou G, Gos A, Gruaz-Gumowski N, Hatton-Ellis E, Hulo C, Ignatchenko A, Jungo F, Laiho K, Le Mercier P, Lieberherr D, Lock A, Lussi Y, MacDougall A, Magrane M, Martin MJ, Masson P, Natale DA, Hyka-Nouspikel N, Orchard S, Pedruzzi I, Pourcel L, Poux S, Pundir S, Rivoire C, Speretta E, Sundaram S, Tyagi N, Warner K, Zaru R, Wu CH, Diehl AD, Chan JN, Grove C, Lee RYN, Muller HM, Raciti D, Van Auken K, Sternberg PW, Berriman M, Paulini M, Howe K, Gao S, Wright A, Stein L, Howe DG, Toro S, Westerfield M, Jaiswal P, Cooper L, Elser J. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res 2021; 49:D325-D334. [PMID: 33290552 PMCID: PMC7779012 DOI: 10.1093/nar/gkaa1113] [Citation(s) in RCA: 2155] [Impact Index Per Article: 538.8] [Received: 09/15/2020] [Revised: 10/22/2020] [Accepted: 12/02/2020] [Indexed: 12/28/2022] Open
Abstract
The Gene Ontology Consortium (GOC) provides the most comprehensive resource currently available for computable knowledge regarding the functions of genes and gene products. Here, we report the advances of the consortium over the past two years. The new GO-CAM annotation framework was notably improved, and we formalized the model with a computational schema to check and validate the rapidly increasing repository of 2838 GO-CAMs. In addition, we describe the impacts of several collaborations to refine GO and report a 10% increase in the number of GO annotations, a 25% increase in annotated gene products, and over 9400 new scientific articles annotated. As the project matures, we continue our efforts to review older annotations in light of newer findings and to maintain consistency with other ontologies. As a result, 20 000 annotations derived from experimental data were reviewed, corresponding to 2.5% of experimental GO annotations. The website (http://geneontology.org) was redesigned for quick access to documentation, downloads and tools. To maintain an accurate resource and support traceability and reproducibility, we have made available a historical archive covering the past 15 years of GO data with a consistent format and file structure for both the ontology and annotations.
|
40
|
Köhler S, Gargano M, Matentzoglu N, Carmody LC, Lewis-Smith D, Vasilevsky NA, Danis D, Balagura G, Baynam G, Brower AM, Callahan TJ, Chute CG, Est JL, Galer PD, Ganesan S, Griese M, Haimel M, Pazmandi J, Hanauer M, Harris NL, Hartnett M, Hastreiter M, Hauck F, He Y, Jeske T, Kearney H, Kindle G, Klein C, Knoflach K, Krause R, Lagorce D, McMurry JA, Miller JA, Munoz-Torres M, Peters RL, Rapp CK, Rath AM, Rind SA, Rosenberg A, Segal MM, Seidel MG, Smedley D, Talmy T, Thomas Y, Wiafe SA, Xian J, Yüksel Z, Helbig I, Mungall CJ, Haendel MA, Robinson PN. The Human Phenotype Ontology in 2021. Nucleic Acids Res 2021; 49:D1207-D1217. [PMID: 33264411 PMCID: PMC7778952 DOI: 10.1093/nar/gkaa1043] [Citation(s) in RCA: 662] [Impact Index Per Article: 165.5] [Received: 09/17/2020] [Revised: 10/11/2020] [Accepted: 11/16/2020] [Indexed: 12/21/2022] Open
Abstract
The Human Phenotype Ontology (HPO, https://hpo.jax.org) was launched in 2008 to provide a comprehensive logical standard to describe and computationally analyze phenotypic abnormalities found in human disease. The HPO is now a worldwide standard for phenotype exchange. The HPO has grown steadily since its inception due to considerable contributions from clinical experts and researchers from a diverse range of disciplines. Here, we present recent major extensions of the HPO for neurology, nephrology, immunology, pulmonology, newborn screening, and other areas. For example, the seizure subontology now reflects the International League Against Epilepsy (ILAE) guidelines and these enhancements have already shown clinical validity. We present new efforts to harmonize computational definitions of phenotypic abnormalities across the HPO and multiple phenotype ontologies used for animal models of disease. These efforts will benefit software such as Exomiser by improving the accuracy and scope of cross-species phenotype matching. The computational modeling strategy used by the HPO to define disease entities and phenotypic features and distinguish between them is explained in detail. We also report on recent efforts to translate the HPO into indigenous languages. Finally, we summarize recent advances in the use of HPO in electronic health record systems.
Affiliation(s)
| | - Michael Gargano
- Monarch Initiative
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Nicolas Matentzoglu
- Monarch Initiative
- Semanticly Ltd, London, UK
- European Bioinformatics Institute (EMBL-EBI)
| | - Leigh C Carmody
- Monarch Initiative
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - David Lewis-Smith
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK
- Clinical Neurosciences, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
| | - Nicole A Vasilevsky
- Monarch Initiative
- Oregon Clinical & Translational Research Institute, Oregon Health & Science University
| | | | - Ganna Balagura
- Department of Neurosciences, Rehabilitation, Ophthalmology, Genetics, and Maternal and Child Health, University of Genoa, Genoa, Italy
- Pediatric Neurology and Muscular Diseases Unit, IRCCS ‘G. Gaslini’ Institute, Genoa, Italy
| | - Gareth Baynam
- Western Australian Register of Developmental Anomalies, King Edward Memorial Hospital, Perth, Australia
- Telethon Kids Institute and the Division of Paediatrics, Faculty of Health and Medical Sciences, University of Western Australia, Perth, Australia
| | - Amy M Brower
- American College of Medical Genetics and Genomics (ACMG), Bethesda, MD, USA
| | - Tiffany J Callahan
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Colorado, USA
| | | | - Johanna L Est
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Peter D Galer
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Shiva Ganesan
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Matthias Griese
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
- Ludwig-Maximilians University, German Center for Lung Research (DZL), Munich, Germany
| | - Matthias Haimel
- Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Vienna, Austria
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Julia Pazmandi
- Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Vienna, Austria
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA
| | - Marc Hanauer
- INSERM, US14-Orphanet, Plateforme Maladies Rares, Paris, France
| | - Nomi L Harris
- Monarch Initiative
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Michael J Hartnett
- American College of Medical Genetics and Genomics (ACMG), Bethesda, MD, USA
| | - Maximilian Hastreiter
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Fabian Hauck
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
- German Centre for Infection Research (DZIF), Munich, Germany
| | - Yongqun He
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Tim Jeske
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Hugh Kearney
- FutureNeuro, SFI Research Centre for Chronic and Rare Neurological Diseases, Ireland
| | - Gerhard Kindle
- Institute for Immunodeficiency, Center for Chronic Immunodeficiency (CCI), Faculty of Medicine, Medical Center - University of Freiburg, Freiburg, Germany
- Centre for Biobanking FREEZE, Faculty of Medicine, Medical Center - University of Freiburg, Freiburg, Germany
| | - Christoph Klein
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Katrin Knoflach
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
- Ludwig-Maximilians University, German Center for Lung Research (DZL), Munich, Germany
| | - Roland Krause
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4367 Belvaux, Luxembourg
| | - David Lagorce
- INSERM, US14-Orphanet, Plateforme Maladies Rares, Paris, France
| | - Julie A McMurry
- Monarch Initiative
- Translational and Integrative Sciences Center, Department of Environmental and Molecular Toxicology, Oregon State University, OR, USA
| | - Jillian A Miller
- American College of Medical Genetics and Genomics (ACMG), Bethesda, MD, USA
| | - Monica C Munoz-Torres
- Monarch Initiative
- Translational and Integrative Sciences Center, Department of Environmental and Molecular Toxicology, Oregon State University, OR, USA
| | - Rebecca L Peters
- American College of Medical Genetics and Genomics (ACMG), Bethesda, MD, USA
| | - Christina K Rapp
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
- Ludwig-Maximilians University, German Center for Lung Research (DZL), Munich, Germany
| | - Ana M Rath
- INSERM, US14-Orphanet, Plateforme Maladies Rares, Paris, France
| | - Shahmir A Rind
- WA Register of Developmental Anomalies
- Curtin University, Western Australia, Australia
| | - Avi Z Rosenberg
- Division of Kidney-Urologic Pathology, Johns Hopkins University, Baltimore, MD 21205, USA
| | | | - Markus G Seidel
- Research Unit for Pediatric Hematology and Immunology, Division of Pediatric Hemato-Oncology, Department of Pediatrics and Adolescent Medicine, Medical University of Graz, Graz, Austria
| | - Damian Smedley
- The William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Tomer Talmy
- Genomic Research Department, Emedgene Technologies, Tel Aviv, Israel
- Faculty of Medicine, Hebrew University Hadassah Medical School, Jerusalem, Israel
| | - Yarlalu Thomas
- West Australian Register of Developmental Anomalies, East Perth, WA, Australia
| | | | - Julie Xian
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, PA, USA
| | - Zafer Yüksel
- Human Genetics, Bioscientia GmbH, Ingelheim, Germany
| | - Ingo Helbig
- Department of Neurology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
- The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Christopher J Mungall
- Monarch Initiative
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Melissa A Haendel
- Monarch Initiative
- Oregon Clinical & Translational Research Institute, Oregon Health & Science University
- Translational and Integrative Sciences Center, Department of Environmental and Molecular Toxicology, Oregon State University, OR, USA
| | - Peter N Robinson
- Monarch Initiative
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA
| |
|
41
|
Miller JA, Gouwens NW, Tasic B, Collman F, van Velthoven CTJ, Bakken TE, Hawrylycz MJ, Zeng H, Lein ES, Bernard A. Common cell type nomenclature for the mammalian brain. eLife 2020; 9:e59928. [PMID: 33372656 PMCID: PMC7790494 DOI: 10.7554/elife.59928] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 06/12/2020] [Accepted: 12/28/2020] [Indexed: 12/22/2022]
Abstract
The advancement of single-cell RNA-sequencing technologies has led to an explosion of cell type definitions across multiple organs and organisms. While standards for data and metadata intake are arising, organization of cell types has largely been left to individual investigators, resulting in widely varying nomenclature and limited alignment between taxonomies. To facilitate cross-dataset comparison, the Allen Institute created the common cell type nomenclature (CCN) for matching and tracking cell types across studies that is qualitatively similar to gene transcript management across different genome builds. The CCN can be readily applied to new or established taxonomies and was applied herein to diverse cell type datasets derived from multiple quantifiable modalities. The CCN facilitates assigning accurate yet flexible cell type names in the mammalian cortex as a step toward community-wide efforts to organize multi-source, data-driven information related to cell type taxonomies from any organism.
|
42
|
Kamaraj US, Chen J, Katwadi K, Ouyang JF, Yang Sun YB, Lim YM, Liu X, Handoko L, Polo JM, Petretto E, Rackham OJ. EpiMogrify Models H3K4me3 Data to Identify Signaling Molecules that Improve Cell Fate Control and Maintenance. Cell Syst 2020; 11:509-522.e10. [DOI: 10.1016/j.cels.2020.09.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/19/2019] [Revised: 04/30/2020] [Accepted: 09/14/2020] [Indexed: 12/14/2022]
|
43
|
Fernando PC, Mabee PM, Zeng E. Integration of anatomy ontology data with protein-protein interaction networks improves the candidate gene prediction accuracy for anatomical entities. BMC Bioinformatics 2020; 21:442. [PMID: 33028186 PMCID: PMC7542696 DOI: 10.1186/s12859-020-03773-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/22/2020] [Accepted: 09/22/2020] [Indexed: 01/04/2023]
Abstract
Background: Identification of genes responsible for anatomical entities is a major requirement in many fields including developmental biology, medicine, and agriculture. Current wet lab techniques used for this purpose, such as gene knockout, are high in resource and time consumption. Protein–protein interaction (PPI) networks are frequently used to predict disease genes for humans and gene candidates for molecular functions, but they are rarely used to predict genes for anatomical entities. Moreover, PPI networks suffer from network quality issues, which can be a limitation for their usage in predicting candidate genes. Therefore, we developed an integrative framework to improve the candidate gene prediction accuracy for anatomical entities by combining existing experimental knowledge about gene-anatomical entity relationships with PPI networks using anatomy ontology annotations. We hypothesized that this integration improves the quality of the PPI networks by reducing the number of false positive and false negative interactions and is better optimized to predict candidate genes for anatomical entities. We used existing Uberon anatomical entity annotations for zebrafish and mouse genes to construct gene networks by calculating semantic similarity between the genes. These anatomy-based gene networks were semantic networks, as they were constructed based on the anatomy ontology annotations that were obtained from the experimental data in the literature. We integrated these anatomy-based gene networks with mouse and zebrafish PPI networks retrieved from the STRING database and compared the performance of their network-based candidate gene predictions.
Results: According to evaluations of candidate gene prediction performance tested under four different semantic similarity calculation methods (Lin, Resnik, Schlicker, and Wang), the integrated networks, which were semantically improved PPI networks, showed better performances by having higher area under the curve values for receiver operating characteristic and precision-recall curves than PPI networks for both zebrafish and mouse.
Conclusion: Integration of existing experimental knowledge about gene-anatomical entity relationships with PPI networks via anatomy ontology improved the candidate gene prediction accuracy and optimized them for predicting candidate genes for anatomical entities.
Affiliation(s)
- Pasan C Fernando
- Department of Biology, University of South Dakota, Vermillion, SD, USA.
| | - Paula M Mabee
- Department of Biology, University of South Dakota, Vermillion, SD, USA.,National Ecological Observatory Network, Battelle Memorial Institute, 1685 38th St., Suite 100, Boulder, CO, 80301, USA
| | - Erliang Zeng
- Division of Biostatistics and Computational Biology, College of Dentistry, University of Iowa, Iowa City, IA, USA.,Department of Preventive and Community Dentistry, College of Dentistry, University of Iowa, Iowa City, IA, USA.,Department of Biostatistics, College of Public Health, University of Iowa, Iowa City, IA, USA.,Department of Biomedical Engineering, College of Engineering, University of Iowa, Iowa City, IA, USA.
| |
|
44
|
Vogel P, Ding ZM, Read R, DaCosta CM, Hansard M, Small DL, Ye GL, Hansen G, Brommage R, Powell DR. Progressive Degenerative Myopathy and Myosteatosis in ASNSD1-Deficient Mice. Vet Pathol 2020; 57:723-735. [PMID: 32638637 DOI: 10.1177/0300985820939251] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]
Abstract
Mice with an inactivating mutation in the gene encoding asparagine synthetase domain containing 1 (ASNSD1) develop a progressive degenerative myopathy that results in severe sarcopenia and myosteatosis. ASNSD1 is conserved across many species, and whole body gene expression surveys show maximal expression levels of ASNSD1 in skeletal muscle. However, potential functions of this protein have not been previously reported. Asnsd1-/- mice demonstrated severe muscle weakness, and their normalized body fat percentage on both normal chow and high fat diets was greater than 2 SD above the mean for 3651 chow-fed and 2463 high-fat-diet-fed knockout (KO) lines tested. Histologic lesions were essentially limited to the muscle and were characterized by a progressive degenerative myopathy with extensive transdifferentiation and replacement of muscle by well-differentiated adipose tissue. There was minimal inflammation, fibrosis, and muscle regeneration associated with this myopathy. In addition, the absence of any signs of lipotoxicity in Asnsd1-/- mice despite their extremely elevated body fat percentage and low muscle mass suggests a role for metabolic dysfunctions in the development of this phenotype. Asnsd1-/- mice provide the first insight into the function of this protein, and this mouse model could prove useful in elucidating fundamental metabolic interactions between skeletal muscle and adipose tissue.
Affiliation(s)
- Peter Vogel
- Lexicon Pharmaceuticals Inc, The Woodlands, TX, USA
| | - Zhi-Ming Ding
- Lexicon Pharmaceuticals Inc, The Woodlands, TX, USA
| | - Robert Read
- Lexicon Pharmaceuticals Inc, The Woodlands, TX, USA
| | | | | | - Daniel L Small
- Lexicon Pharmaceuticals Inc, The Woodlands, TX, USA
| | - Gui-Lan Ye
- Lexicon Pharmaceuticals Inc, The Woodlands, TX, USA
| | - Gwenn Hansen
- Lexicon Pharmaceuticals Inc, The Woodlands, TX, USA
| | | | - David R Powell
- Lexicon Pharmaceuticals Inc, The Woodlands, TX, USA
| |
|
45
|
Zhao M, Havrilla JM, Fang L, Chen Y, Peng J, Liu C, Wu C, Sarmady M, Botas P, Isla J, Lyon GJ, Weng C, Wang K. Phen2Gene: rapid phenotype-driven gene prioritization for rare diseases. NAR Genom Bioinform 2020; 2:lqaa032. [PMID: 32500119 PMCID: PMC7252576 DOI: 10.1093/nargab/lqaa032] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 12/16/2019] [Revised: 04/10/2020] [Accepted: 04/28/2020] [Indexed: 02/07/2023]
Abstract
Human Phenotype Ontology (HPO) terms are increasingly used in diagnostic settings to aid in the characterization of patient phenotypes. The HPO annotation database is updated frequently and can provide detailed phenotype knowledge on various human diseases, and many HPO terms are now mapped to candidate causal genes with binary relationships. To further improve the genetic diagnosis of rare diseases, we incorporated these HPO annotations, gene-disease databases and gene-gene databases in a probabilistic model to build a novel HPO-driven gene prioritization tool, Phen2Gene. Phen2Gene accesses a database built upon this information called the HPO2Gene Knowledgebase (H2GKB), which provides weighted and ranked gene lists for every HPO term. Phen2Gene is then able to access the H2GKB for patient-specific lists of HPO terms or PhenoPacket descriptions supported by GA4GH (http://phenopackets.org/), calculate a prioritized gene list based on a probabilistic model and output gene-disease relationships with great accuracy. Phen2Gene outperforms existing gene prioritization tools in speed and acts as a real-time phenotype-driven gene prioritization tool to aid the clinical diagnosis of rare undiagnosed diseases. In addition to a command line tool released under the MIT license (https://github.com/WGLab/Phen2Gene), we also developed a web server and web service (https://phen2gene.wglab.org/) for running the tool via web interface or RESTful API queries. Finally, we have curated a large amount of benchmarking data for phenotype-to-gene tools involving 197 patients across 76 scientific articles and 85 patients' de-identified HPO term data from the Children's Hospital of Philadelphia.
Affiliation(s)
- Mengge Zhao
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - James M Havrilla
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Li Fang
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Ying Chen
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Jacqueline Peng
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Cong Liu
- Department of Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA
| | - Chao Wu
- Division of Genomic Diagnostics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Mahdi Sarmady
- Division of Genomic Diagnostics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Pablo Botas
- Foundation 29, Pozuelo de Alarcon, 28223 Madrid, Spain
| | - Julián Isla
- Foundation 29, Pozuelo de Alarcon, 28223 Madrid, Spain
- Dravet Syndrome European Federation, 29200 Brest, France
| | - Gholson J Lyon
- Institute for Basic Research in Developmental Disabilities (IBR), Staten Island, NY 10314, USA
| | - Chunhua Weng
- Department of Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA
| | - Kai Wang
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| |
|
46
|
Alliance of Genome Resources Portal: unified model organism research platform. Nucleic Acids Res 2020; 48:D650-D658. [PMID: 31552413 PMCID: PMC6943066 DOI: 10.1093/nar/gkz813] [Citation(s) in RCA: 125] [Impact Index Per Article: 25.0] [Received: 07/31/2019] [Revised: 09/03/2019] [Accepted: 09/19/2019] [Indexed: 01/13/2023]
Abstract
The Alliance of Genome Resources (Alliance) is a consortium of the major model organism databases and the Gene Ontology that is guided by the vision of facilitating exploration of related genes in human and well-studied model organisms by providing a highly integrated and comprehensive platform that enables researchers to leverage the extensive body of genetic and genomic studies in these organisms. Initiated in 2016, the Alliance is building a central portal (www.alliancegenome.org) for access to data for the primary model organisms along with gene ontology data and human data. All data types represented in the Alliance portal (e.g. genomic data and phenotype descriptions) have common data models and workflows for curation. All data are open and freely available via a variety of mechanisms. Long-term plans for the Alliance project include a focus on coverage of additional model organisms including those without dedicated curation communities, and the inclusion of new data types with a particular focus on providing data and tools for the non-model-organism researcher that support enhanced discovery about human health and disease. Here we review current progress and present immediate plans for this new bioinformatics resource.
|
47
|
Köhler S, Carmody L, Vasilevsky N, Jacobsen JOB, Danis D, Gourdine JP, Gargano M, Harris NL, Matentzoglu N, McMurry JA, Osumi-Sutherland D, Cipriani V, Balhoff JP, Conlin T, Blau H, Baynam G, Palmer R, Gratian D, Dawkins H, Segal M, Jansen AC, Muaz A, Chang WH, Bergerson J, Laulederkind SJF, Yüksel Z, Beltran S, Freeman AF, Sergouniotis PI, Durkin D, Storm AL, Hanauer M, Brudno M, Bello SM, Sincan M, Rageth K, Wheeler MT, Oegema R, Lourghi H, Della Rocca MG, Thompson R, Castellanos F, Priest J, Cunningham-Rundles C, Hegde A, Lovering RC, Hajek C, Olry A, Notarangelo L, Similuk M, Zhang XA, Gómez-Andrés D, Lochmüller H, Dollfus H, Rosenzweig S, Marwaha S, Rath A, Sullivan K, Smith C, Milner JD, Leroux D, Boerkoel CF, Klion A, Carter MC, Groza T, Smedley D, Haendel MA, Mungall C, Robinson PN. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res 2019; 47:D1018-D1027. [PMID: 30476213 PMCID: PMC6324074 DOI: 10.1093/nar/gky1105] [Citation(s) in RCA: 435] [Impact Index Per Article: 87.0] [Received: 09/17/2018] [Accepted: 10/24/2018] [Indexed: 12/12/2022]
Abstract
The Human Phenotype Ontology (HPO)—a standardized vocabulary of phenotypic abnormalities associated with 7000+ diseases—is used by thousands of researchers, clinicians, informaticians and electronic health record systems around the world. Its detailed descriptions of clinical abnormalities and computable disease definitions have made HPO the de facto standard for deep phenotyping in the field of rare disease. The HPO’s interoperability with other ontologies has enabled it to be used to improve diagnostic accuracy by incorporating model organism data. It also plays a key role in the popular Exomiser tool, which identifies potential disease-causing variants from whole-exome or whole-genome sequencing data. Since the HPO was first introduced in 2008, its users have become both more numerous and more diverse. To meet these emerging needs, the project has added new content, language translations, mappings and computational tooling, as well as integrations with external community data. The HPO continues to collaborate with clinical adopters to improve specific areas of the ontology and extend standardized disease descriptions. The newly redesigned HPO website (www.human-phenotype-ontology.org) simplifies browsing terms and exploring clinical features, diseases, and human genes.
Affiliation(s)
- Sebastian Köhler
- Charité Centrum für Therapieforschung, Charité-Universitätsmedizin Berlin Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin 10117, Germany.,Einstein Center Digital Future, Berlin 10117, Germany.,Monarch Initiative, monarchinitiative.org
| | - Leigh Carmody
- Monarch Initiative, monarchinitiative.org.,The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - Nicole Vasilevsky
- Monarch Initiative, monarchinitiative.org.,Oregon Health & Science University, Portland, OR 97217, USA
| | - Julius O B Jacobsen
- Monarch Initiative, monarchinitiative.org.,Genomics England, Queen Mary University of London, Dawson Hall, Charterhouse Square, London EC1M 6BQ, UK
| | - Daniel Danis
- Monarch Initiative, monarchinitiative.org.,The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - Jean-Philippe Gourdine
- Monarch Initiative, monarchinitiative.org.,Oregon Health & Science University, Portland, OR 97217, USA
| | - Michael Gargano
- Monarch Initiative, monarchinitiative.org.,The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - Nomi L Harris
- Monarch Initiative, monarchinitiative.org.,Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nicolas Matentzoglu
- Monarch Initiative, monarchinitiative.org.,European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK
| | - Julie A McMurry
- Monarch Initiative, monarchinitiative.org.,Linus Pauling Institute, Oregon State University, Corvallis, OR, USA
| | - David Osumi-Sutherland
- Monarch Initiative, monarchinitiative.org.,European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, UK
| | - Valentina Cipriani
- Monarch Initiative, monarchinitiative.org.,William Harvey Research Institute, Queen Mary University College of London.,UCL Genetics Institute, University College of London.,UCL Institute of Ophthalmology, University College of London
| | - James P Balhoff
- Monarch Initiative, monarchinitiative.org.,Renaissance Computing Institute, University of North Carolina at Chapel Hill
| | - Tom Conlin
- Monarch Initiative, monarchinitiative.org.,Linus Pauling Institute, Oregon State University, Corvallis, OR, USA
| | - Hannah Blau
- Monarch Initiative, monarchinitiative.org.,The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - Gareth Baynam
- Western Australian Register of Developmental Anomalies and Genetic Services of Western Australia, Department of Health, Government of Western Australia, WA, Australia.,School of Paediatrics and Telethon Kids Institute, University of Western Australia, Perth, WA, Australia.,Institute for Immunology and Infectious Diseases, Murdoch University, Perth, WA, Australia.,Spatial Sciences, Department of Science and Engineering, Curtin University, Perth, WA, Australia.,The Office of Population Health Genomics, Department of Health, Government of Western Australia, Perth, WA, Australia
| | - Richard Palmer
- Spatial Sciences, Department of Science and Engineering, Curtin University, Perth, WA, Australia
| | - Dylan Gratian
- Western Australian Register of Developmental Anomalies and Genetic Services of Western Australia, Department of Health, Government of Western Australia, WA, Australia
| | - Hugh Dawkins
- The Office of Population Health Genomics, Department of Health, Government of Western Australia, Perth, WA, Australia
| | | | - Anna C Jansen
- Neurogenetics Research Group, Vrije Universiteit Brussel, Brussels, Belgium.,Pediatric Neurology Unit, Department of Pediatrics, UZ Brussel, Brussels, Belgium
| | - Ahmed Muaz
- Monarch Initiative, monarchinitiative.org.,Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia
| | - Willie H Chang
- Centre for Computational Medicine, Hospital for Sick Children and Department of Computer Science, University of Toronto, Toronto, Canada
| | - Jenna Bergerson
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Stanley J F Laulederkind
- Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin & Marquette University, 8701 Watertown Plank Road Milwaukee, WI 53226, USA
| | | | - Sergi Beltran
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, Barcelona 08028, Spain.,Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Alexandra F Freeman
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | | | - Daniel Durkin
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - Andrea L Storm
- ICF, Rockville, MD, USA.,National Center for Advancing Translational Sciences, Office of Rare Diseases Research, National Institutes of Health, Bethesda, MD, USA
| | - Marc Hanauer
- INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
| | - Michael Brudno
- Centre for Computational Medicine, Hospital for Sick Children and Department of Computer Science, University of Toronto, Toronto, Canada
| | | | - Murat Sincan
- Sanford Imagenetics, Sanford Health, Sioux Falls, SD, USA
| | - Kayli Rageth
- Sanford Imagenetics, Sanford Health, Sioux Falls, SD, USA
| | - Matthew T Wheeler
- Center for Undiagnosed Diseases, Stanford University School of Medicine, Stanford, CA, USA
| | - Renske Oegema
- Department of Genetics, University Medical Center Utrecht, the Netherlands
| | - Halima Lourghi
- INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
| | - Maria G Della Rocca
- ICF, Rockville, MD, USA.,National Center for Advancing Translational Sciences, Office of Rare Diseases Research, National Institutes of Health, Bethesda, MD, USA
| | - Rachel Thompson
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
| | | | - James Priest
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA
| | | | - Ayushi Hegde
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - Ruth C Lovering
- Institute of Cardiovascular Science, University College London, UK
| | | | - Annie Olry
- INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
| | - Luigi Notarangelo
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Morgan Similuk
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Xingmin A Zhang
- Monarch Initiative, monarchinitiative.org.,The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
| | - David Gómez-Andrés
- Child Neurology Unit, Hospital Universitari Vall d'Hebron, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
| | - Hanns Lochmüller
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, Barcelona 08028, Spain.,Department of Neuropediatrics and Muscle Disorders, Medical Center-University of Freiburg, Faculty of Medicine, Freiburg, Germany.,Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Canada.,Division of Neurology, Department of Medicine, The Ottawa Hospital, Ottawa, Canada
| | - Hélène Dollfus
- Centre for Rare Eye Diseases CARGO, SENSGENE FSMR Network, Strasbourg University Hospital, Strasbourg, France
| | - Sergio Rosenzweig
- Immunology Service, Department of Laboratory Medicine, NIH Clinical Center, Bethesda, MD, USA
| | - Shruti Marwaha
- Center for Undiagnosed Diseases, Stanford University School of Medicine, Stanford, CA, USA
| | - Ana Rath
- INSERM, US14-Orphanet, Plateforme Maladies Rares, 75014 Paris, France
| | - Kathleen Sullivan
- Department of Pediatrics, Division of Allergy Immunology, The Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, 3615 Civic Center Boulevard, Philadelphia, PA 19104, USA
| | | | - Joshua D Milner
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Dorothée Leroux
- Centre for Rare Eye Diseases CARGO, SENSGENE FSMR Network, Strasbourg University Hospital, Strasbourg, France
| | | | - Amy Klion
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Melody C Carter
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Tudor Groza
- Monarch Initiative, monarchinitiative.org.,Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia
| | - Damian Smedley
- Monarch Initiative, monarchinitiative.org.,Genomics England, Queen Mary University of London, Dawson Hall, Charterhouse Square, London EC1M 6BQ, UK
| | - Melissa A Haendel
- Monarch Initiative, monarchinitiative.org.,Oregon Health & Science University, Portland, OR 97217, USA.,Linus Pauling Institute, Oregon State University, Corvallis, OR, USA
| | - Chris Mungall
- Monarch Initiative, monarchinitiative.org.,Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Peter N Robinson
- Monarch Initiative, monarchinitiative.org.,The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA.,Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA
| |
|
48
|
Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Félix E, Magariños MP, Mosquera JF, Mutowo P, Nowotka M, Gordillo-Marañón M, Hunter F, Junco L, Mugumbate G, Rodriguez-Lopez M, Atkinson F, Bosc N, Radoux CJ, Segura-Cabrera A, Hersey A, Leach AR. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 2019; 47:D930-D940. [PMID: 30398643 PMCID: PMC6323927 DOI: 10.1093/nar/gky1075] [Citation(s) in RCA: 1211] [Impact Index Per Article: 242.2] [Received: 09/15/2018] [Accepted: 10/18/2018] [Indexed: 12/31/2022]
Abstract
ChEMBL is a large, open-access bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012, 2014 and 2017 Nucleic Acids Research Database Issues. In the last two years, several important improvements have been made to the database and are described here. These include more robust capture and representation of assay details; a new data deposition system, allowing updating of data sets and deposition of supplementary data; and a completely redesigned web interface, with enhanced search and filtering capabilities.
Affiliation(s)
- David Mendez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Anna Gaulton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- A Patrícia Bento
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Jon Chambers
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Marleen De Veij
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Eloy Félix
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- María Paula Magariños
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK; Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Juan F Mosquera
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Prudence Mutowo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Michal Nowotka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- María Gordillo-Marañón
- Institute of Cardiovascular Science, University College London, Gower Street, London WC1E 6BT, UK
- Fiona Hunter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Laura Junco
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Grace Mugumbate
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Milagros Rodriguez-Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Francis Atkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Nicolas Bosc
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Chris J Radoux
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK; Open Targets, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Aldo Segura-Cabrera
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Anne Hersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Andrew R Leach
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
|
49
|
Smith CM, Hayamizu TF, Finger JH, Bello SM, McCright IJ, Xu J, Baldarelli RM, Beal JS, Campbell J, Corbani LE, Frost PJ, Lewis JR, Giannatto SC, Miers D, Shaw DR, Kadin JA, Richardson JE, Smith CL, Ringwald M. The mouse Gene Expression Database (GXD): 2019 update. Nucleic Acids Res 2020; 47:D774-D779. [PMID: 30335138 PMCID: PMC6324054 DOI: 10.1093/nar/gky922] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 09/13/2018] [Accepted: 10/04/2018] [Indexed: 11/13/2022]
Abstract
The mouse Gene Expression Database (GXD) is an extensive, well-curated community resource freely available at www.informatics.jax.org/expression.shtml. Covering all developmental stages, GXD includes data from RNA in situ hybridization, immunohistochemistry, RT-PCR, northern blot and western blot experiments in wild-type and mutant mice. GXD's gene expression information is integrated with the other data in Mouse Genome Informatics and interconnected with other databases, placing these data in the larger biological and biomedical context. Since the last report, the ability of GXD to provide insights into the molecular mechanisms of development and disease has been greatly enhanced by the addition of new data and by the implementation of new web features. These include: improvements to the Differential Gene Expression Data Search, facilitating searches for genes that have been shown to be exclusively expressed in a specified structure and/or developmental stage; an enhanced anatomy browser that now provides access to expression data and phenotype data for a given anatomical structure; direct access to the wild-type gene expression data for the tissues affected in a specific mutant; and a comparison matrix that juxtaposes tissues where a gene is normally expressed against tissues where mutations in that gene cause abnormalities.
Affiliation(s)
- Terry F Hayamizu
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Susan M Bello
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Jingxia Xu
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Jonathan S Beal
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Jeffrey Campbell
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Lori E Corbani
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Pete J Frost
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Jill R Lewis
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Dave Miers
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- David R Shaw
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- James A Kadin
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Cynthia L Smith
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Martin Ringwald
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
|
50
|
Mabee PM, Balhoff JP, Dahdul WM, Lapp H, Mungall CJ, Vision TJ. A Logical Model of Homology for Comparative Biology. Syst Biol 2020; 69:345-362. [PMID: 31596473 PMCID: PMC7672696 DOI: 10.1093/sysbio/syz067] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/09/2019] [Revised: 09/20/2019] [Accepted: 09/26/2019] [Indexed: 01/09/2023]
Abstract
There is a growing body of research on the evolution of anatomy in a wide variety of organisms. Discoveries in this field could be greatly accelerated by computational methods and resources that enable these findings to be compared across different studies and different organisms and linked with the genes responsible for anatomical modifications. Homology is a key concept in comparative anatomy; two important types are historical homology (the similarity of organisms due to common ancestry) and serial homology (the similarity of repeated structures within an organism). We explored how to most effectively represent historical and serial homology across anatomical structures to facilitate computational reasoning. We assembled a collection of homology assertions from the literature with a set of taxon phenotypes for the skeletal elements of vertebrate fins and limbs from the Phenoscape Knowledgebase. Using seven competency questions, we evaluated the reasoning ramifications of two logical models: the Reciprocal Existential Axioms (REA) homology model and the Ancestral Value Axioms (AVA) homology model. The AVA model returned all user-expected results in addition to the search term and any of its subclasses. The AVA model also returns any superclass of the query term in which a homology relationship has been asserted. The REA model returned the user-expected results for five out of seven queries. We identify some challenges of implementing complete homology queries due to limitations of OWL reasoning. This work lays the foundation for homology reasoning to be incorporated into other ontology-based tools, such as those that enable synthetic supermatrix construction and candidate gene discovery. [Homology; ontology; anatomy; morphology; evolution; knowledgebase; phenoscape.].
Affiliation(s)
- Paula M Mabee
- Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
- James P Balhoff
- Renaissance Computing Institute, University of North Carolina, 100 Europa Drive, Suite 540, Chapel Hill, NC 27517, USA
- Wasila M Dahdul
- Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
- Hilmar Lapp
- Center for Genomic and Computational Biology, Duke University, 101 Science Drive, Durham, NC 27708, USA
- Christopher J Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Todd J Vision
- Department of Biology and School of Information and Library Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3280, USA
|