1
|
Repositioning the Early Pathology of Type 1 Diabetes to the Extraislet Vasculature. J Immunol 2024; 212:1094-1104. [PMID: 38426888 PMCID: PMC10944819 DOI: 10.4049/jimmunol.2300769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 01/29/2024] [Indexed: 03/02/2024]
Abstract
Type 1 diabetes (T1D) is a prototypic T cell-mediated autoimmune disease. Because the islets of Langerhans are insulated from blood vessels by a double basement membrane and lack detectable lymphatic drainage, interactions between endocrine and circulating T cells are not permitted. Thus, we hypothesized that initiation and progression of anti-islet immunity required islet neolymphangiogenesis to allow T cell access to the islet. Combining microscopy and single cell approaches, the timing of this phenomenon in mice was situated between 5 and 8 wk of age when activated anti-insulin CD4 T cells became detectable in peripheral blood while peri-islet pathology developed. This "peri-insulitis," dominated by CD4 T cells, respected the islet basement membrane and was limited on the outside by lymphatic endothelial cells that gave it the attributes of a tertiary lymphoid structure. As in most tissues, lymphangiogenesis seemed to be secondary to local segmental endothelial inflammation at the collecting postcapillary venule. In addition to classic markers of inflammation such as CD29, V-CAM, and NOS, MHC class II molecules were expressed by nonhematopoietic cells in the same location both in mouse and human islets. This CD45- MHC class II+ cell population was capable of spontaneously presenting islet Ags to CD4 T cells. Altogether, these observations favor an alternative model for the initiation of T1D, outside of the islet, in which a vascular-associated cell appears to be an important MHC class II-expressing and -presenting cell.
|
2
|
Author Correction: A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses. BMC Biol 2023; 21:261. [PMID: 37974169 PMCID: PMC10655412 DOI: 10.1186/s12915-023-01764-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023] Open
|
3
|
DrugMechDB: A Curated Database of Drug Mechanisms. Sci Data 2023; 10:632. [PMID: 37717042 PMCID: PMC10505144 DOI: 10.1038/s41597-023-02534-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 09/01/2023] [Indexed: 09/18/2023] Open
Abstract
Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to a disease prediction. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repositioning models or as a valuable resource for training such models.
|
4
|
An approach for collaborative development of a federated biomedical knowledge graph-based question-answering system: Question-of-the-Month challenges. J Clin Transl Sci 2023; 7:e214. [PMID: 37900350 PMCID: PMC10603356 DOI: 10.1017/cts.2023.619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 08/21/2023] [Indexed: 10/31/2023] Open
Abstract
Knowledge graphs have become a common approach for knowledge representation. Yet, the application of graph methodology is elusive due to the sheer number and complexity of knowledge sources. In addition, semantic incompatibilities hinder efforts to harmonize and integrate across these diverse sources. As part of The Biomedical Translator Consortium, we have developed a knowledge graph-based question-answering system designed to augment human reasoning and accelerate translational scientific discovery: the Translator system. We have applied the Translator system to answer biomedical questions in the context of a broad array of diseases and syndromes, including Fanconi anemia, primary ciliary dyskinesia, multiple sclerosis, and others. A variety of collaborative approaches have been used to research and develop the Translator system. One recent approach involved the establishment of a monthly "Question-of-the-Month (QotM) Challenge" series. Herein, we describe the structure of the QotM Challenge; the six challenges that have been conducted to date on drug-induced liver injury, cannabidiol toxicity, coronavirus infection, diabetes, psoriatic arthritis, and ATP1A3-related phenotypes; the scientific insights that have been gleaned during the challenges; and the technical issues that were identified over the course of the challenges and that can now be addressed to foster further development of the prototype Translator system. We close with a discussion on Large Language Models such as ChatGPT and highlight differences between those models and the Translator system.
|
5
|
BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs. Bioinformatics 2023; 39:7273783. [PMID: 37707514 PMCID: PMC11015316 DOI: 10.1093/bioinformatics/btad570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 08/18/2023] [Accepted: 09/12/2023] [Indexed: 09/15/2023] Open
Abstract
SUMMARY Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. AVAILABILITY AND IMPLEMENTATION More information can be found at https://explorer.biothings.io and code is available at https://github.com/biothings/biothings_explorer.
|
6
|
Association study between drug prescriptions and Alzheimer's disease claims in a commercial insurance database. Alzheimers Res Ther 2023; 15:118. [PMID: 37355615 PMCID: PMC10290352 DOI: 10.1186/s13195-023-01255-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 06/01/2023] [Indexed: 06/26/2023]
Abstract
In the ongoing effort to discover treatments for Alzheimer's disease (AD), there has been considerable focus on investigating the use of repurposed drug candidates. Mining of electronic health record data has the potential to identify novel correlated effects between commonly used drugs and AD. In this study, claims from members with commercial health insurance coverage were analyzed to determine the correlation between the use of various drugs and AD incidence and claim frequency. We found that, within the insured population, several medications for psychotic and mental illnesses were associated with higher disease incidence and frequency, while, to a lesser extent, antibiotics and anti-inflammatory drugs were associated with lower AD incidence rates. The observations thus provide a general overview of the prescription and claim relationships between various drug types and Alzheimer's disease, with insights into which drugs have possible implications for a resulting AD diagnosis.
|
7
|
DrugMechDB: A Curated Database of Drug Mechanisms. bioRxiv 2023:2023.05.01.538993. [PMID: 37205439 PMCID: PMC10187194 DOI: 10.1101/2023.05.01.538993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to disease predictions. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repurposing models or as a valuable resource for training such models.
|
8
|
Schema Playground: a tool for authoring, extending, and using metadata schemas to improve FAIRness of biomedical data. BMC Bioinformatics 2023; 24:159. [PMID: 37081398 PMCID: PMC10116472 DOI: 10.1186/s12859-023-05258-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 03/27/2023] [Indexed: 04/22/2023] Open
Abstract
BACKGROUND Biomedical researchers are strongly encouraged to make their research outputs more Findable, Accessible, Interoperable, and Reusable (FAIR). While many biomedical research outputs are more readily accessible through open data efforts, finding relevant outputs remains a significant challenge. Schema.org is a metadata vocabulary standardization project that enables web content creators to make their content more FAIR. Leveraging Schema.org could benefit biomedical research resource providers, but it can be challenging to apply Schema.org standards to biomedical research outputs. We created an online browser-based tool that empowers researchers and repository developers to utilize Schema.org or other biomedical schema projects. RESULTS Our browser-based tool includes features that can help address many of the barriers to Schema.org compliance, such as the ability to easily browse for relevant Schema.org classes, the ability to extend and customize a class to be more suitable for biomedical research outputs, the ability to create data validation to ensure adherence of a research output to a customized class, and the ability to register a custom class to our schema registry, enabling others to search and re-use it. We demonstrate the use of our tool with the creation of the Outbreak.info schema, a large multi-class schema for harmonizing various COVID-19 related resources. CONCLUSIONS We have created a browser-based tool to empower biomedical research resource providers to leverage Schema.org classes to make their research outputs more FAIR.
|
9
|
BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs. arXiv 2023:arXiv:2304.09344v1. [PMID: 37131885 PMCID: PMC10153288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. More information can be found at https://explorer.biothings.io, and code is available at https://github.com/biothings/biothings_explorer.
|
10
|
Outbreak.info Research Library: a standardized, searchable platform to discover and explore COVID-19 resources. Nat Methods 2023; 20:536-540. [PMID: 36823331 PMCID: PMC10393269 DOI: 10.1038/s41592-023-01770-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 06/03/2022] [Accepted: 01/17/2023] [Indexed: 02/25/2023]
Abstract
Outbreak.info Research Library is a standardized, searchable interface of coronavirus disease 2019 (COVID-19) and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) publications, clinical trials, datasets, protocols and other resources, built with a reusable framework. We developed a rigorous schema to enforce consistency across different sources and resource types and linked related resources. Researchers can quickly search the latest research across data repositories, regardless of resource type or repository location, via a search interface, public application programming interface (API) and R package.
|
11
|
Outbreak.info genomic reports: scalable and dynamic surveillance of SARS-CoV-2 variants and mutations. Nat Methods 2023; 20:512-522. [PMID: 36823332 PMCID: PMC10399614 DOI: 10.1038/s41592-023-01769-3] [Citation(s) in RCA: 83] [Impact Index Per Article: 83.0] [Received: 06/03/2022] [Accepted: 01/17/2023] [Indexed: 02/25/2023]
Abstract
In response to the emergence of SARS-CoV-2 variants of concern, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of Pango lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials and the general public. We describe the interpretable visualizations available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data and the server infrastructure that enables widespread data dissemination via a high-performance API that can be accessed using an R package. We show how outbreak.info can be used for genomic surveillance and as a hypothesis-generation tool to understand the ongoing pandemic at varying geographic and temporal scales.
|
12
|
Addressing barriers in FAIR data practices for biomedical data. Sci Data 2023; 10:98. [PMID: 36823198 PMCID: PMC9950056 DOI: 10.1038/s41597-023-01969-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/12/2022] [Accepted: 01/13/2023] [Indexed: 02/25/2023] Open
|
13
|
Developing a standardized but extendable framework to increase the findability of infectious disease datasets. Sci Data 2023; 10:99. [PMID: 36823157 PMCID: PMC9950378 DOI: 10.1038/s41597-023-01968-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/12/2022] [Accepted: 01/13/2023] [Indexed: 02/25/2023] Open
Abstract
Biomedical datasets are increasing in size, stored in many repositories, and face challenges in FAIRness (findability, accessibility, interoperability, reusability). As a Consortium of infectious disease researchers from 15 Centers, we aim to adopt open science practices to promote transparency, encourage reproducibility, and accelerate research advances through data reuse. To improve FAIRness of our datasets and computational tools, we evaluated metadata standards across established biomedical data repositories. The vast majority do not adhere to a single standard, such as Schema.org, which is widely-adopted by generalist repositories. Consequently, datasets in these repositories are not findable in aggregation projects like Google Dataset Search. We alleviated this gap by creating a reusable metadata schema based on Schema.org and catalogued nearly 400 datasets and computational tools we collected. The approach is easily reusable to create schemas interoperable with community standards, but customized to a particular context. Our approach enabled data discovery, increased the reusability of datasets from a large research consortium, and accelerated research. Lastly, we discuss ongoing challenges with FAIRness beyond discoverability.
|
14
|
Outbreak.info Research Library: A standardized, searchable platform to discover and explore COVID-19 resources. bioRxiv 2022:2022.01.20.477133. [PMID: 35132411 PMCID: PMC8820656 DOI: 10.1101/2022.01.20.477133] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 05/17/2023]
Abstract
To combat the ongoing COVID-19 pandemic, scientists have been conducting research at breakneck speeds, producing over 52,000 peer-reviewed articles within the first year. To address the challenge in tracking the vast amount of new research located in separate repositories, we developed outbreak.info Research Library, a standardized, searchable interface of COVID-19 and SARS-CoV-2 resources. Unifying metadata from sixteen repositories, we assembled a collection of over 350,000 publications, clinical trials, datasets, protocols, and other resources as of October 2022. We used a rigorous schema to enforce consistency across different sources and resource types and linked related resources. Researchers can quickly search the latest research across data repositories, regardless of resource type or repository location, via a search interface, public API, and R package. Finally, we discuss the challenges inherent in combining metadata from scattered and heterogeneous resources and provide recommendations to streamline this process to aid scientific research.
|
15
|
A retrospective evaluation of a decade of Gene Wiki Reviews and their impact. Gene X 2022; 830:146534. [PMID: 35525475 DOI: 10.1016/j.gene.2022.146534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022] Open
|
16
|
Outbreak.info genomic reports: scalable and dynamic surveillance of SARS-CoV-2 variants and mutations. Research Square 2022:rs.3.rs-1723829. [PMID: 35794893 PMCID: PMC9258294 DOI: 10.21203/rs.3.rs-1723829/v1] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 05/17/2023]
Abstract
The emergence of SARS-CoV-2 variants of concern has prompted the need for near real-time genomic surveillance to inform public health interventions. In response to this need, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of PANGO lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials, and the general public. We describe the interpretable and opinionated visualizations in the variant and location focussed reports available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data, and the server infrastructure that enables widespread data dissemination via a high performance API that can be accessed using an R package. We present a case study that illustrates how outbreak.info can be used for genomic surveillance and as a hypothesis generation tool to understand the ongoing pandemic at varying geographic and temporal scales. With an emphasis on scalability, interactivity, interpretability, and reusability, outbreak.info provides a template to enable genomic surveillance at a global and localized scale.
|
17
|
Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science. Clin Transl Sci 2022; 15:1848-1855. [PMID: 36125173 PMCID: PMC9372416 DOI: 10.1111/cts.13302] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 03/04/2022] [Revised: 04/27/2022] [Accepted: 05/02/2022] [Indexed: 12/12/2022] Open
Abstract
Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph‐based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these “knowledge graphs” (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open‐access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open‐source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object‐oriented classification and graph‐oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
|
18
|
Progress toward a universal biomedical data translator. Clin Transl Sci 2022; 15:1838-1847. [PMID: 35611543 PMCID: PMC9372428 DOI: 10.1111/cts.13301] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 03/04/2022] [Revised: 04/27/2022] [Accepted: 05/02/2022] [Indexed: 11/28/2022] Open
Abstract
Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well-being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline-specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph-based "Translator" system capable of integrating existing biomedical data sets and "translating" those data into insights intended to augment human reasoning and accelerate translational science. Having demonstrated feasibility of the Translator system, the Translator program has since moved into development, and the Translator Consortium has made significant progress in the research, design, and implementation of an operational system. Herein, we describe the current system's architecture, performance, and quality of results. We apply Translator to several real-world use cases developed in collaboration with subject-matter experts. Finally, we discuss the scientific and technical features of Translator and compare those features to other state-of-the-art, biomedical graph-based question-answering systems.
Grants
- OT3TR002019 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- ZIA TR000276-05 National Center for Advancing Translational Sciences, Intramural Research Program
- OT2TR003449 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- U01 DK065201 NIDDK NIH HHS
- OT2TR002515 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003443 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR002584 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003434 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2 TR003449 NCATS NIH HHS
- OT2TR003433 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003435 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR002517 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002027 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003422 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003441 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002020 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003448 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003428 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003445 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- I75N95021P00636 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR002520 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003427 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003436 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- ZIA TR000276 Intramural NIH HHS
- OT2TR002514 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002025 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2 TR003428 NCATS NIH HHS
- 5U01DK065201 NIDDK NIH HHS
- OT2TR003437 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003450 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT3TR002026 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- OT2TR003430 National Center for Advancing Translational Sciences, Biomedical Data Translator Program
- National Institute of Diabetes and Digestive and Kidney Diseases
- National Center for Advancing Translational Sciences, Biomedical Data Translator Program
|
19
|
Design and application of a knowledge network for automatic prioritization of drug mechanisms. Bioinformatics 2022; 38:2880-2891. [PMID: 35561182 PMCID: PMC9113361 DOI: 10.1093/bioinformatics/btac205] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2021] [Revised: 02/17/2022] [Accepted: 04/04/2022] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Drug repositioning is an attractive alternative to de novo drug discovery due to reduced time and costs to bring drugs to market. Computational repositioning methods, particularly non-black-box methods that can account for and predict a drug's mechanism, may provide great benefit for directing future development. By tuning both data and algorithm to utilize relationships important to drug mechanisms, a computational repositioning algorithm can be trained to both predict and explain mechanistically novel indications. RESULTS In this work, we examined the 123 curated drug mechanism paths found in the drug mechanism database (DrugMechDB) and, after identifying the most important relationships, integrated 18 data sources to produce a heterogeneous knowledge graph, MechRepoNet, capable of capturing the information in these paths. We applied the Rephetio repurposing algorithm to MechRepoNet using only a subset of relationships known to be mechanistic in nature and found adequate predictive ability on an evaluation set, with an AUROC value of 0.83. The resulting repurposing model allowed us to prioritize paths in our knowledge graph to produce a predicted treatment mechanism. We found that DrugMechDB paths, when present in the network, were rated highly among predicted mechanisms. We then demonstrated MechRepoNet's ability to use mechanistic insight to identify a drug's mechanistic target, with a mean reciprocal rank of 0.525 on a test set of known drug-target interactions. Finally, we walked through repurposing examples of the anti-cancer drug imatinib for use in the treatment of asthma, and metolazone for use in the treatment of osteoporosis, to demonstrate this method's utility in providing mechanistic insight into the repurposing predictions it makes. AVAILABILITY AND IMPLEMENTATION The Python code to reproduce the entirety of this analysis is available at: https://github.com/SuLab/MechRepoNet (archived at https://doi.org/10.5281/zenodo.6456335).
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
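The mean reciprocal rank quoted in this abstract (0.525 for target identification) averages, over all test drugs, the reciprocal of the rank at which the known target appears. The sketch below is a minimal illustration of that metric only; the function name and the toy rankings are hypothetical and are not taken from the MechRepoNet code or data.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    true_items: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the true item across queries.

    A query whose true item is absent from its ranking contributes 0.
    """
    if len(rankings) != len(true_items):
        raise ValueError("rankings and true_items must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")

    total = 0.0
    for ranked, truth in zip(rankings, true_items):
        for position, candidate in enumerate(ranked, start=1):
            if candidate == truth:
                total += 1.0 / position
                break
    return total / len(rankings)


# Hypothetical toy data: three drugs, each with a ranked list of candidate targets.
example_rankings = [
    ["T1", "T2", "T3"],  # true target ranked 1st -> contributes 1/1
    ["T4", "T5", "T6"],  # true target ranked 2nd -> contributes 1/2
    ["T7", "T8", "T9"],  # true target not ranked -> contributes 0
]
example_truth = ["T1", "T5", "T0"]
example_mrr = mean_reciprocal_rank(example_rankings, example_truth)
```

Under this definition a value near 0.5 means the known target sits, on average, around the second position of the ranked list.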
|
20
|
Quantitative metaproteomics and activity-based protein profiling of patient fecal microbiome identifies host and microbial serine-type endopeptidase activity associated with ulcerative colitis. Mol Cell Proteomics 2022; 21:100197. [PMID: 35033677 PMCID: PMC8941213 DOI: 10.1016/j.mcpro.2022.100197] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/04/2021] [Revised: 01/10/2022] [Accepted: 01/11/2022] [Indexed: 12/12/2022] Open
Abstract
The gut microbiota plays an important yet incompletely understood role in the induction and propagation of ulcerative colitis (UC). Organism-level efforts to identify UC-associated microbes have revealed the importance of community structure, but less is known about the molecular effectors of disease. We performed 16S rRNA gene sequencing in parallel with label-free data-dependent LC-MS/MS proteomics to characterize the stool microbiomes of healthy (n = 8) and UC (n = 10) patients. Comparisons of taxonomic composition between techniques revealed major differences in community structure partially attributable to the additional detection of host, fungal, viral, and food peptides by metaproteomics. Differential expression analysis of metaproteomic data identified 176 significantly enriched protein groups between healthy and UC patients. Gene ontology analysis revealed several enriched functions with serine-type endopeptidase activity overrepresented in UC patients. Using a biotinylated fluorophosphonate probe and streptavidin-based enrichment, we show that serine endopeptidases are active in patient fecal samples and that additional putative serine hydrolases are detectable by this approach compared with unenriched profiling. Finally, as metaproteomic databases expand, they are expected to asymptotically approach completeness. Using ComPIL and de novo peptide sequencing, we estimate the size of the probable peptide space unidentified (“dark peptidome”) by our large database approach to establish a rough benchmark for database sufficiency. Despite high variability inherent in patient samples, our analysis yielded a catalog of differentially enriched proteins between healthy and UC fecal proteomes. This catalog provides a clinically relevant jumping-off point for further molecular-level studies aimed at identifying the microbial underpinnings of UC. Identified 176 significantly altered protein groups between healthy and UC patients. 
Serine-type endopeptidase activity is overrepresented in UC patients. Fluorophosphonate ABPP shows that endopeptidases are active in fecal samples. ABPP enrichment helps identify additional putative serine hydrolases in samples. De novo sequencing used to estimate number of MS2 spectra unidentified by ComPIL.
|
21
|
BioThings SDK: a toolkit for building high-performance data APIs in biomedical research. Bioinformatics 2022; 38:2077-2079. [PMID: 35020801 PMCID: PMC8963279 DOI: 10.1093/bioinformatics/btac017] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/20/2021] [Revised: 12/10/2021] [Accepted: 01/08/2022] [Indexed: 02/04/2023] Open
Abstract
SUMMARY To meet the increased need of making biomedical resources more accessible and reusable, Web Application Programming Interfaces (APIs) or web services have become a common way to disseminate knowledge sources. The BioThings APIs are a collection of high-performance, scalable, annotation as a service APIs that automate the integration of biological annotations from disparate data sources. This collection of APIs currently includes MyGene.info, MyVariant.info and MyChem.info for integrating annotations on genes, variants and chemical compounds, respectively. These APIs are used by both individual researchers and application developers to simplify the process of annotation retrieval and identifier mapping. Here, we describe the BioThings Software Development Kit (SDK), a generalizable and reusable toolkit for integrating data from multiple disparate data sources and creating high-performance APIs. This toolkit allows users to easily create their own BioThings APIs for any data type of interest to them, as well as keep APIs up-to-date with their underlying data sources. AVAILABILITY AND IMPLEMENTATION The BioThings SDK is built in Python and released via PyPI (https://pypi.org/project/biothings/). Its source code is hosted at its github repository (https://github.com/biothings/biothings.api). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
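As an illustration of the annotation-retrieval use case described above, the sketch below builds a gene query URL for MyGene.info. The endpoint path (`/v3/query`) and the `q`, `fields`, and `species` parameters are given as best recalled from the public MyGene.info v3 documentation and should be checked against it; the helper name `build_gene_query_url` is hypothetical and is not part of the BioThings SDK. No network request is made.

```python
from typing import Sequence
from urllib.parse import urlencode

# Assumed base URL of the MyGene.info v3 query endpoint.
MYGENE_QUERY_ENDPOINT = "https://mygene.info/v3/query"


def build_gene_query_url(
    query: str,
    fields: Sequence[str],
    species: str = "human",
) -> str:
    """Return a query URL asking for selected annotation fields of a gene."""
    params = {
        "q": query,                  # e.g. a fielded query such as "symbol:CDK2"
        "fields": ",".join(fields),  # comma-separated list of fields to return
        "species": species,          # restrict hits to one species
    }
    # urlencode percent-escapes reserved characters such as ":" and ",".
    return f"{MYGENE_QUERY_ENDPOINT}?{urlencode(params)}"


example_url = build_gene_query_url("symbol:CDK2", ["symbol", "name", "entrezgene"])
```

Fetching `example_url` with any HTTP client would be expected to return a JSON document of matching genes; that response format is not reproduced here.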
|
22
|
Cano M, Tsueng G, Zhou X, Hughes LD, Mullen JL, Xin J, Su AI, Wu C. Schema Playground: A tool for authoring, extending, and using metadata schemas to improve FAIRness of biomedical data. [PMID: 35677074 PMCID: PMC9176648 DOI: 10.1101/2021.09.02.458726] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
Abstract
Background: Biomedical researchers are strongly encouraged to make their research outputs more Findable, Accessible, Interoperable, and Reusable (FAIR). While many biomedical research outputs are more readily accessible through open data efforts, finding relevant outputs remains a significant challenge. Schema.org is a metadata vocabulary standardization project that enables web content creators to make their content more FAIR. Leveraging schema.org could benefit biomedical research resource providers, but it can be challenging to apply schema.org standards to biomedical research outputs. We created an online browser-based tool that empowers researchers and repository developers to utilize schema.org or other biomedical schema projects. Results: Our browser-based tool includes features which can help address many of the barriers towards schema.org-compliance such as: The ability to easily browse for relevant schema.org classes, the ability to extend and customize a class to be more suitable for biomedical research outputs, the ability to create data validation to ensure adherence of a research output to a customized class, and the ability to register a custom class to our schema registry enabling others to search and re-use it. We demonstrate the use of our tool with the creation of the Outbreak.info schema—a large multi-class schema for harmonizing various COVID-19 related resources. Conclusions: We have created a browser-based tool to empower biomedical research resource providers to leverage schema.org classes to make their research outputs more FAIR.
|
23
|
Mohawk is a transcription factor that promotes meniscus cell phenotype and tissue repair and reduces osteoarthritis severity. Sci Transl Med 2021; 12:12/567/eaan7967. [PMID: 33115953 DOI: 10.1126/scitranslmed.aan7967] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/14/2019] [Revised: 02/06/2020] [Accepted: 09/21/2020] [Indexed: 12/13/2022]
Abstract
Meniscus tears are common knee injuries and a major osteoarthritis (OA) risk factor. Knowledge gaps that limit the development of therapies for meniscus injury and degeneration concern transcription factors that control the meniscus cell phenotype. Analysis of RNA sequencing data from 37 human tissues in the Genotype-Tissue Expression database and RNA sequencing data from meniscus and articular cartilage showed that transcription factor Mohawk (MKX) is highly enriched in meniscus. In human meniscus cells, MKX regulates the expression of meniscus marker genes, OA-related genes, and other transcription factors, including Scleraxis (SCX), SRY Box 5 (SOX5), and Runt domain-related transcription factor 2 (RUNX2). In mesenchymal stem cells (MSCs), the combination of adenoviral MKX (Ad-MKX) and transforming growth factor-β3 (TGF-β3) induced a meniscus cell phenotype. When Ad-MKX-transduced MSCs were seeded on TGF-β3-conjugated decellularized meniscus scaffold (DMS) and inserted into experimental tears in meniscus explants, they increased glycosaminoglycan content, extracellular matrix interconnectivity, cell infiltration into the DMS, and improved biomechanical properties. Ad-MKX injection into mouse knee joints with experimental OA induced by surgical destabilization of the meniscus suppressed meniscus and cartilage damage, reducing OA severity. Ad-MKX injection into human OA meniscus tissue explants corrected pathogenic gene expression. These results identify MKX as a previously unidentified key transcription factor that regulates the meniscus cell phenotype. The combination of Ad-MKX with TGF-β3 is effective for differentiation of MSCs to a meniscus cell phenotype and useful for meniscus repair. MKX is a promising therapeutic target for meniscus tissue engineering, repair, and prevention of OA.
|
24
|
Structured reviews for data and knowledge-driven research. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021; 2020:5818923. [PMID: 32283553 PMCID: PMC7153956 DOI: 10.1093/database/baaa015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/12/2019] [Revised: 01/21/2020] [Accepted: 02/07/2020] [Indexed: 12/25/2022]
Abstract
Hypothesis generation is a critical step in research and a cornerstone in the rare disease field. Research is most efficient when those hypotheses are based on the entirety of knowledge known to date. Systematic review articles are commonly used in biomedicine to summarize existing knowledge and contextualize experimental data. But the information contained within review articles is typically only expressed as free-text, which is difficult to use computationally. Researchers struggle to navigate, collect and remix prior knowledge as it is scattered in several silos without seamless integration and access. This lack of a structured information framework hinders research by both experimental and computational scientists. To better organize knowledge and data, we built a structured review article that is specifically focused on NGLY1 Deficiency, an ultra-rare genetic disease first reported in 2012. We represented this structured review as a knowledge graph and then stored this knowledge graph in a Neo4j database to simplify dissemination, querying and visualization of the network. Relative to free-text, this structured review better promotes the principles of findability, accessibility, interoperability and reusability (FAIR). In collaboration with domain experts in NGLY1 Deficiency, we demonstrate how this resource can improve the efficiency and comprehensiveness of hypothesis generation. We also developed a read–write interface that allows domain experts to contribute FAIR structured knowledge to this community resource. In contrast to traditional free-text review articles, this structured review exists as a living knowledge graph that is curated by humans and accessible to computational analyses. Finally, we have generalized this workflow into modular and repurposable components that can be applied to other domain areas. This NGLY1 Deficiency-focused network is publicly available at http://ngly1graph.org/. 
Availability and implementation Database URL: http://ngly1graph.org/. Network data files are at: https://github.com/SuLab/ngly1-graph and source code at: https://github.com/SuLab/bioknowledge-reviewer. Contact asu@scripps.edu
|
25
|
A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses. BMC Biol 2021; 19:12. [PMID: 33482803 PMCID: PMC7820539 DOI: 10.1186/s12915-020-00940-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/12/2020] [Accepted: 12/13/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Pandemics, even more than other medical problems, require swift integration of knowledge. When caused by a new virus, understanding the underlying biology may help in finding solutions. In a setting where there are a large number of loosely related projects and initiatives, we need common ground, also known as a "commons." Wikidata, a public knowledge graph aligned with Wikipedia, is such a commons and uses unique identifiers to link knowledge in other knowledge bases. However, Wikidata may not always have the right schema for the urgent questions. In this paper, we address this problem by showing how a data schema required for the integration can be modeled with entity schemas represented by Shape Expressions. RESULTS As a telling example, we describe the process of aligning resources on the genomes and proteomes of the SARS-CoV-2 virus and related viruses as well as how Shape Expressions can be defined for Wikidata to model the knowledge, helping others studying the SARS-CoV-2 pandemic. How this model can be used to make data between various resources interoperable is demonstrated by integrating data from NCBI (National Center for Biotechnology Information) Taxonomy, NCBI Genes, UniProt, and WikiPathways. Based on that model, a set of automated applications or bots were written for regular updates of these sources in Wikidata and added to a platform for automatically running these updates. CONCLUSIONS Although this workflow is developed and applied in the context of the COVID-19 pandemic, to demonstrate its broader applicability it was also applied to other human coronaviruses (MERS, SARS, human coronavirus NL63, human coronavirus 229E, human coronavirus HKU1, human coronavirus OC43).
|
26
|
Multi-Omics Database Analysis of Aminoacyl-tRNA Synthetases in Cancer. Genes (Basel) 2020; 11:genes11111384. [PMID: 33266490 PMCID: PMC7700366 DOI: 10.3390/genes11111384] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/03/2020] [Revised: 10/24/2020] [Accepted: 11/20/2020] [Indexed: 12/23/2022] Open
Abstract
Aminoacyl-tRNA synthetases (aaRSs) are key enzymes in the mRNA translation machinery, yet they possess numerous non-canonical functions developed during the evolution of complex organisms. The aaRSs and aaRS-interacting multi-functional proteins (AIMPs) are continually being implicated in tumorigenesis, but these connections are often limited in scope, focusing on specific aaRSs in distinct cancer subtypes. Here, we analyze publicly available genomic and transcriptomic data on human cytoplasmic and mitochondrial aaRSs across many cancer types. As high-throughput technologies have improved exponentially, large-scale projects have systematically quantified genetic alteration and expression from thousands of cancer patient samples. One such project is the Cancer Genome Atlas (TCGA), which processed over 20,000 primary cancer and matched normal samples from 33 cancer types. The wealth of knowledge provided from this undertaking has streamlined the identification of cancer drivers and suppressors. We examined aaRS expression data produced by the TCGA project and combined this with patient survival data to recognize trends in aaRSs' impact on cancer both molecularly and prognostically. We further compared these trends to an established tumor suppressor and a proto-oncogene. We observed apparent upregulation of many tRNA synthetase genes with aggressive cancer types, yet, at the individual gene level, some aaRSs resemble a tumor suppressor while others show similarities to an oncogene. This study provides an unbiased, overarching perspective on the relationship of aaRSs with cancers and identifies certain aaRS family members as promising therapeutic targets or potential leads for developing biological therapy for cancer.
|
27
|
Discovery of SARS-CoV-2 antiviral drugs through large-scale compound repurposing. Nature 2020; 586:113-119. [PMID: 32707573 PMCID: PMC7603405 DOI: 10.1038/s41586-020-2577-1] [Citation(s) in RCA: 559] [Impact Index Per Article: 139.8] [Received: 04/20/2020] [Accepted: 07/17/2020] [Indexed: 02/08/2023]
Abstract
The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in 2019 has triggered an ongoing global pandemic of the severe pneumonia-like disease coronavirus disease 2019 (COVID-19)1. The development of a vaccine is likely to take at least 12-18 months, and the typical timeline for approval of a new antiviral therapeutic agent can exceed 10 years. Thus, repurposing of known drugs could substantially accelerate the deployment of new therapies for COVID-19. Here we profiled a library of drugs encompassing approximately 12,000 clinical-stage or Food and Drug Administration (FDA)-approved small molecules to identify candidate therapeutic drugs for COVID-19. We report the identification of 100 molecules that inhibit viral replication of SARS-CoV-2, including 21 drugs that exhibit dose-response relationships. Of these, thirteen were found to harbour effective concentrations commensurate with probable achievable therapeutic doses in patients, including the PIKfyve kinase inhibitor apilimod2-4 and the cysteine protease inhibitors MDL-28170, Z LVG CHN2, VBY-825 and ONO 5334. Notably, MDL-28170, ONO 5334 and apilimod were found to antagonize viral replication in human pneumocyte-like cells derived from induced pluripotent stem cells, and apilimod also demonstrated antiviral efficacy in a primary human lung explant model. Since most of the molecules identified in this study have already advanced into the clinic, their known pharmacological and human safety profiles will enable accelerated preclinical and clinical evaluation of these drugs for the treatment of COVID-19.
|
28
|
Abstract
Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.
|
29
|
Functional Annotation of the Transcriptome of the Pig, Sus scrofa, Based Upon Network Analysis of an RNAseq Transcriptional Atlas. Front Genet 2020; 10:1355. [PMID: 32117413 PMCID: PMC7034361 DOI: 10.3389/fgene.2019.01355] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 08/27/2019] [Accepted: 12/11/2019] [Indexed: 12/15/2022] Open
Abstract
The domestic pig (Sus scrofa) is both an economically important livestock species and a model for biomedical research. Two highly contiguous pig reference genomes have recently been released. To support functional annotation of the pig genomes and comparative analysis with large human transcriptomic data sets, we aimed to create a pig gene expression atlas. To achieve this objective, we extended a previous approach developed for the chicken. We downloaded RNAseq data sets from public repositories, down-sampled to a common depth, and quantified expression against a reference transcriptome using the mRNA quantitation tool, Kallisto. We then used the network analysis tool Graphia to identify clusters of transcripts that were coexpressed across the merged data set. Consistent with the principle of guilt-by-association, we identified coexpression clusters that were highly tissue or cell-type restricted and contained transcription factors that have previously been implicated in lineage determination. Other clusters were enriched for transcripts associated with biological processes, such as the cell cycle and oxidative phosphorylation. The same approach was used to identify coexpression clusters within RNAseq data from multiple individual liver and brain samples, highlighting cell type, process, and region-specific gene expression. Evidence of conserved expression can add confidence to assignment of orthology between pig and human genes. Many transcripts currently identified as novel genes with ENSSSCG or LOC IDs were found to be coexpressed with annotated neighbouring transcripts in the same orientation, indicating they may be products of the same transcriptional unit. The meta-analytic approach to utilising public RNAseq data is extendable to include new data sets and new species and provides a framework to support the Functional Annotation of Animals Genomes (FAANG) initiative.
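The coexpression step described in this abstract can be illustrated with a small sketch: compute pairwise Pearson correlations between transcript expression profiles, link pairs above a threshold, and read clusters off the resulting graph. This is not Graphia's implementation; the abstract does not state the clustering method, so plain connected components are used here as a stand-in, and the threshold, function names, and toy expression values are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Set


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either profile is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def coexpression_clusters(
    expression: Dict[str, Sequence[float]],
    threshold: float = 0.9,
) -> List[List[str]]:
    """Group transcripts whose profiles correlate at or above the threshold."""
    # Build an undirected graph with an edge for each highly correlated pair.
    neighbours: Dict[str, Set[str]] = {name: set() for name in expression}
    for a, b in combinations(expression, 2):
        if pearson(expression[a], expression[b]) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Connected components by depth-first search (stand-in for real clustering).
    clusters: List[List[str]] = []
    seen: Set[str] = set()
    for start in expression:
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical expression values for four transcripts across four samples.
example_expression = {
    "tx_a": [1.0, 2.0, 3.0, 4.0],
    "tx_b": [2.0, 4.1, 6.0, 8.2],  # rises with tx_a
    "tx_c": [4.0, 3.0, 2.0, 1.0],  # falls as tx_a rises
    "tx_d": [5.0, 5.0, 5.0, 5.0],  # constant profile
}
example_clusters = coexpression_clusters(example_expression, threshold=0.9)
```

On real atlases the all-pairs loop is quadratic in the number of transcripts, so a vectorised correlation matrix would be the practical choice; the sketch favours readability over speed.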
|
30
|
Advancing computational biology and bioinformatics research through open innovation competitions. PLoS One 2019; 14:e0222165. [PMID: 31560691 PMCID: PMC6764653 DOI: 10.1371/journal.pone.0222165] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/08/2019] [Accepted: 08/22/2019] [Indexed: 11/19/2022] Open
Abstract
Open data science and algorithm development competitions offer a unique avenue for rapid discovery of better computational strategies. We highlight three examples in computational biology and bioinformatics research in which the use of competitions has yielded significant performance gains over established algorithms. These include algorithms for antibody clustering, imputing gene expression data, and querying the Connectivity Map (CMap). Performance gains are evaluated quantitatively using realistic, albeit sanitized, data sets. The solutions produced through these competitions are then examined with respect to their utility and the prospects for implementation in the field. We present the decision process and competition design considerations that led to these successful outcomes as a model for researchers who want to use competitions and non-domain crowds as collaborators to further their research.
|
31
|
FoxO transcription factors modulate autophagy and proteoglycan 4 in cartilage homeostasis and osteoarthritis. Sci Transl Med 2019; 10:10/428/eaan0746. [PMID: 29444976 DOI: 10.1126/scitranslmed.aan0746] [Citation(s) in RCA: 167] [Impact Index Per Article: 33.4] [Received: 05/19/2017] [Accepted: 01/08/2018] [Indexed: 12/14/2022]
Abstract
Aging is a main risk factor for osteoarthritis (OA). FoxO transcription factors protect against cellular and organismal aging, and FoxO expression in cartilage is reduced with aging and in OA. To investigate the role of FoxO in cartilage, Col2Cre-FoxO1, 3, and 4 single knockout (KO) and triple KO mice (Col2Cre-TKO) were analyzed. Articular cartilage in Col2Cre-TKO and Col2Cre-FoxO1 KO mice was thicker than in control mice at 1 or 2 months of age. This was associated with increased proliferation of chondrocytes of Col2Cre-TKO mice in vivo and in vitro. OA-like changes developed in cartilage, synovium, and subchondral bone between 4 and 6 months of age in Col2Cre-TKO and Col2Cre-FoxO1 KO mice. Col2Cre-FoxO3 and FoxO4 KO mice showed no cartilage abnormalities until 18 months of age when Col2Cre-FoxO3 KO mice had more severe OA than control mice. Autophagy and antioxidant defense genes were reduced in Col2Cre-TKO mice. Deletion of FoxO1/3/4 in mature mice using Aggrecan(Acan)-CreERT2 (AcanCreERT-TKO) also led to spontaneous cartilage degradation and increased OA severity in a surgical model or treadmill running. The superficial zone of knee articular cartilage of Col2Cre-TKO and AcanCreERT-TKO mice exhibited reduced cell density and markedly decreased Prg4. In vitro, ectopic FoxO1 expression increased Prg4 and synergized with transforming growth factor-β stimulation. In OA chondrocytes, overexpression of FoxO1 reduced inflammatory mediators and cartilage-degrading enzymes, increased protective genes, and antagonized interleukin-1β effects. Our observations suggest that FoxO transcription factors play a key role in postnatal cartilage development, maturation, and homeostasis and protect against OA-associated cartilage damage.
|
32
|
Applying citizen science to gene, drug and disease relationship extraction from biomedical abstracts. Bioinformatics 2019; 36:1226-1233. [PMID: 31504205 PMCID: PMC8104067 DOI: 10.1093/bioinformatics/btz678] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/17/2019] [Revised: 08/05/2019] [Accepted: 08/29/2019] [Indexed: 01/31/2023] Open
Abstract
MOTIVATION Biomedical literature is growing at a rate that outpaces our ability to harness the knowledge contained therein. To mine valuable inferences from the large volume of literature, many researchers use information extraction algorithms to harvest information in biomedical texts. Information extraction is usually accomplished via a combination of manual expert curation and computational methods. Advances in computational methods usually depend on the time-consuming generation of gold standards by a limited number of expert curators. Citizen science is public participation in scientific research. We previously found that citizen scientists are willing and capable of performing named entity recognition of disease mentions in biomedical abstracts, but did not know if this was true with relationship extraction (RE). RESULTS In this article, we introduce the Relationship Extraction Module of the web-based application Mark2Cure (M2C) and demonstrate that citizen scientists can perform RE. We confirm the importance of accurate named entity recognition on user performance of RE and identify design issues that impacted data quality. We find that the data generated by citizen scientists can be used to identify relationship types not currently available in the M2C Relationship Extraction Module. We compare the citizen science-generated data with algorithm-mined data and identify ways in which the two approaches may complement one another. We also discuss opportunities for future improvement of this system, as well as the potential synergies between citizen science, manual biocuration and natural language processing. AVAILABILITY AND IMPLEMENTATION Mark2Cure platform: https://mark2cure.org; Mark2Cure source code: https://github.com/sulab/mark2cure; and data and analysis code for this article: https://github.com/gtsueng/M2C_rel_nb. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
33
|
ChlamBase: a curated model organism database for the Chlamydia research community. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2019; 2019:5519651. [PMID: 31211397 PMCID: PMC6580685 DOI: 10.1093/database/baz091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/04/2022]
|
34
|
exRNA Atlas Analysis Reveals Distinct Extracellular RNA Cargo Types and Their Carriers Present across Human Biofluids. Cell 2019; 177:463-477.e15. [PMID: 30951672 PMCID: PMC6616370 DOI: 10.1016/j.cell.2019.02.018] [Citation(s) in RCA: 187] [Impact Index Per Article: 37.4] [Received: 04/28/2018] [Revised: 11/06/2018] [Accepted: 02/11/2019] [Indexed: 12/11/2022]
Abstract
To develop a map of cell-cell communication mediated by extracellular RNA (exRNA), the NIH Extracellular RNA Communication Consortium created the exRNA Atlas resource (https://exrna-atlas.org). The Atlas version 4P1 hosts 5,309 exRNA-seq and exRNA qPCR profiles from 19 studies and a suite of analysis and visualization tools. To analyze variation between profiles, we apply computational deconvolution. The analysis leads to a model with six exRNA cargo types (CT1, CT2, CT3A, CT3B, CT3C, CT4), each detectable in multiple biofluids (serum, plasma, CSF, saliva, urine). Five of the cargo types associate with known vesicular and non-vesicular (lipoprotein and ribonucleoprotein) exRNA carriers. To validate utility of this model, we re-analyze an exercise response study by deconvolution to identify physiologically relevant response pathways that were not detected previously. To enable wide application of this model, as part of the exRNA Atlas resource, we provide tools for deconvolution and analysis of user-provided case-control studies.
|
35
|
Aligning Needs: Integrating Citizen Science Efforts into Schools Through Service Requirements. HUMAN COMPUTATION (FAIRFAX, VA.) 2019; 6:56-82. [PMID: 31363486 PMCID: PMC6667230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Citizen science is the participation in scientific research by members of the public, and it is an increasingly valuable tool for both scientists and educators. For researchers, citizen science is a means of more quickly investigating questions which would otherwise be time-consuming and costly to study. For educators, citizen science offers a means to engage students in actual research and improve learning outcomes. Since most citizen science projects are usually designed with research goals in mind, many lack the necessary educator materials for successful integration in a formal science education (FSE) setting. In an ideal world, researchers and educators would build the necessary materials together; however, many researchers lack the time, resources, and networks to create these materials early on in the life of a citizen science project. For resource-poor projects, we propose an intermediate entry point for recruiting from the educational setting: community service or service learning requirements (CSSLRs). Many schools require students to participate in community service or service learning activities in order to graduate. When implemented well, CSSLRs provide students with growth and development opportunities outside the classroom while contributing to the community and other worthwhile causes. However, CSSLRs take time, resources, and effort to implement well. Just as citizen science projects need to establish relationships to transition well into formal science education, schools need to cultivate relationships with community service organizations. Students and educators at schools with CSSLRs where implementation is still a work in progress may be left with a burdensome requirement and inadequate support. 
With the help of a volunteer fulfilling a CSSLR, we investigated the number of students impacted by CSSLRs set at different levels of government and explored the qualifications needed for citizen science projects to fulfill CSSLRs by examining the explicitly-stated justifications for having CSSLRs, surveying how CSSLRs are verified, and using these qualifications to demonstrate how an online citizen science project, Mark2Cure, could use this information to meet the needs of students fulfilling CSSLRs.
|
36
|
ChlamBase: a curated model organism database for the Chlamydia research community. Database (Oxford) 2019; 2019:baz041. [PMID: 30985891 PMCID: PMC6463448 DOI: 10.1093/database/baz041] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/31/2018] [Revised: 02/22/2019] [Accepted: 03/07/2019] [Indexed: 02/06/2023]
Abstract
The accelerating growth of genomic and proteomic information for Chlamydia species, coupled with unique biological aspects of these pathogens, necessitates bioinformatic tools and features that are not provided by major public databases. To meet these growing needs, we developed ChlamBase, a model organism database for Chlamydia that is built upon the WikiGenomes application framework, and Wikidata, a community-curated database. ChlamBase was designed to serve as a central access point for genomic and proteomic information for the Chlamydia research community. ChlamBase integrates information from numerous external databases, as well as important data extracted from the literature that are otherwise not available in structured formats that are easy to use. In addition, a key feature of ChlamBase is that it empowers users in the field to contribute new annotations and data as the field advances with continued discoveries. ChlamBase is freely and publicly available at chlambase.org.
|
37
|
Identification of transcription factors responsible for dysregulated networks in human osteoarthritis cartilage by global gene expression analysis. Osteoarthritis Cartilage 2018; 26:1531-1538. [PMID: 30081074 PMCID: PMC6245598 DOI: 10.1016/j.joca.2018.07.012] [Citation(s) in RCA: 130] [Impact Index Per Article: 21.7] [Received: 02/05/2018] [Revised: 06/28/2018] [Accepted: 07/13/2018] [Indexed: 02/02/2023]
Abstract
OBJECTIVE Osteoarthritis (OA) is the most prevalent joint disease. As disease-modifying therapies are not available, novel therapeutic targets need to be discovered and prioritized for their importance in mediating the abnormal phenotype of cells in OA-affected joints. Here, we generated a genome-wide molecular profile of OA to elucidate regulatory mechanisms of OA pathogenesis and to identify possible therapeutic targets using integrative analysis of mRNA-sequencing data obtained from human knee cartilage. DESIGN RNA-sequencing (RNA-seq) was performed on 18 normal and 20 OA human knee cartilage tissues. RNA-seq datasets were analysed to identify genes, pathways and regulatory networks that were dysregulated in OA. RESULTS RNA-seq data analysis revealed 1332 differentially expressed (DE) genes between OA and non-OA samples, including known and novel transcription factors (TFs). Pathway analysis identified 15 significantly perturbed pathways in OA with ECM-related, PI3K-Akt, HIF-1, FoxO and circadian rhythm pathways being the most significantly dysregulated. We selected DE TFs that are enriched for regulating DE genes in OA and prioritized these TFs by creating a cartilage-specific interaction subnetwork. This analysis revealed eight TFs, including JUN, Early growth response (EGR)1, JUND, FOSL2, MYC, KLF4, RELA, and FOS that both target large numbers of dysregulated genes in OA and are themselves suppressed in OA. CONCLUSIONS We identified a novel subnetwork of dysregulated TFs that represent new mediators of abnormal gene expression and promising therapeutic targets in OA.
|
38
|
The ReFRAME library as a comprehensive drug repurposing library and its application to the treatment of cryptosporidiosis. Proc Natl Acad Sci U S A 2018; 115:10750-10755. [PMID: 30282735 PMCID: PMC6196526 DOI: 10.1073/pnas.1810137115] [Citation(s) in RCA: 128] [Impact Index Per Article: 21.3] [Indexed: 02/01/2023] Open
Abstract
The chemical diversity and known safety profiles of drugs previously tested in humans make them a valuable set of compounds to explore potential therapeutic utility in indications outside those originally targeted, especially neglected tropical diseases. This practice of "drug repurposing" has become commonplace in academic and other nonprofit drug-discovery efforts, with the appeal that significantly less time and resources are required to advance a candidate into the clinic. Here, we report a comprehensive open-access, drug repositioning screening set of 12,000 compounds (termed ReFRAME; Repurposing, Focused Rescue, and Accelerated Medchem) that was assembled by combining three widely used commercial drug competitive intelligence databases (Clarivate Integrity, GVK Excelra GoStar, and Citeline Pharmaprojects), together with extensive patent mining of small molecules that have been dosed in humans. To date, 12,000 compounds (∼80% of compounds identified from data mining) have been purchased or synthesized and subsequently plated for screening. To exemplify its utility, this collection was screened against Cryptosporidium spp., a major cause of childhood diarrhea in the developing world, and two active compounds previously tested in humans for other therapeutic indications were identified. Both compounds, VB-201 and a structurally related analog of ASP-7962, were subsequently shown to be efficacious in animal models of Cryptosporidium infection at clinically relevant doses, based on available human doses. In addition, an open-access data portal (https://reframedb.org) has been developed to share ReFRAME screen hits to encourage additional follow-up and maximize the impact of the ReFRAME screening collection.
|
39
|
Abstract
The lysis and extraction of soluble bacterial proteins from cells is a common practice for proteomics analyses, but insoluble bacterial biomasses are often left behind. Here, we show that with triflic acid treatment, the insoluble bacterial biomass of Gram- and Gram+ bacteria can be rendered soluble. We use LC-MS/MS shotgun proteomics to show that bacterial proteins in the soluble and insoluble postlysis fractions differ significantly. Additionally, in the case of Gram- Pseudomonas aeruginosa, triflic acid treatment enables the enrichment of cell-envelope-associated proteins. Finally, we apply triflic acid to a human microbiome sample to show that this treatment is robust and enables the identification of a new, complementary subset of proteins from a complex microbial mixture.
|
40
|
Correcting the F508del-CFTR variant by modulating eukaryotic translation initiation factor 3-mediated translation initiation. J Biol Chem 2018; 293:13477-13495. [PMID: 30006345 DOI: 10.1074/jbc.ra118.003192] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/28/2018] [Revised: 07/05/2018] [Indexed: 12/31/2022] Open
Abstract
Inherited and somatic rare diseases result from >200,000 genetic variants leading to loss- or gain-of-toxic function, often caused by protein misfolding. Many of these misfolded variants fail to properly interact with other proteins. Understanding the link between factors mediating the transcription, translation, and protein folding of these disease-associated variants remains a major challenge in cell biology. Herein, we utilized the cystic fibrosis transmembrane conductance regulator (CFTR) protein as a model and performed a proteomics-based high-throughput screen (HTS) to identify pathways and components affecting the folding and function of the most common cystic fibrosis-associated mutation, the F508del variant of CFTR. Using a shortest-path algorithm we developed, we mapped HTS hits to the CFTR interactome to provide functional context to the targets and identified the eukaryotic translation initiation factor 3a (eIF3a) as a central hub for the biogenesis of CFTR. Of note, siRNA-mediated silencing of eIF3a reduced the polysome-to-monosome ratio in F508del-expressing cells, which, in turn, decreased the translation of CFTR variants, leading to increased CFTR stability, trafficking, and function at the cell surface. This finding suggested that eIF3a is involved in mediating the impact of genetic variations in CFTR on the folding of this protein. We posit that the number of ribosomes on a CFTR mRNA transcript is inversely correlated with the stability of the translated polypeptide. Polysome-based translation challenges the capacity of the proteostasis environment to balance message fidelity with protein folding, leading to disease. We suggest that this deficit can be corrected through control of translation initiation.
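The idea of mapping screen hits onto an interactome by shortest paths can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not describe; it is a plain breadth-first search on a toy undirected graph. All node names other than CFTR (HUB_1, NODE_X, HIT_A to HIT_D) and all edges are hypothetical placeholders, not real interactions.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search on an unweighted graph.

    Returns one shortest path from start to goal as a list of nodes,
    or None if goal cannot be reached from start.
    """
    if start == goal:
        return [start]
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == goal:
                # Walk back through the parents to rebuild the path.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical toy interactome (undirected edges); not real data.
edges = [
    ("CFTR", "HUB_1"),
    ("HUB_1", "HIT_A"),
    ("HUB_1", "HIT_B"),
    ("CFTR", "NODE_X"),
    ("NODE_X", "HIT_C"),
]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

# Hypothetical screen hits; HIT_D is absent from the toy interactome.
hits = ["HIT_A", "HIT_B", "HIT_C", "HIT_D"]
paths = {hit: shortest_path(graph, hit, "CFTR") for hit in hits}

# Count how often each intermediate node lies on a hit-to-CFTR path.
hub_counts = {}
for path in paths.values():
    if path:
        for node in path[1:-1]:
            hub_counts[node] = hub_counts.get(node, 0) + 1
```

In this toy graph, the node that sits on the most hit-to-target paths is the candidate hub, which is the intuition behind describing a protein as a central hub. A real analysis would need edge weights, handling of tied paths, and a statistical background model.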
|
41
|
Abstract
Background The Jurkat cell line has an extensive history as a model of T cell signaling. However, at the turn of the 21st century, some expression irregularities were observed, raising doubts about how closely the cell line paralleled normal human T cells. While numerous expression deficiencies have been described in Jurkat, genetic explanations have only been provided for a handful of defects. Results Here, we report a comprehensive catalog of genomic variation in the Jurkat cell line based on whole-genome sequencing. With this list of all detectable, non-reference sequences, we prioritize potentially damaging mutations by mining public databases for functional effects. We confirm documented mutations in Jurkat and propose links from detrimental gene variants to observed expression abnormalities in the cell line. Conclusions The Jurkat cell line harbors many mutations that are associated with cancer and contribute to Jurkat’s unique characteristics. Genes with damaging mutations in the Jurkat cell line are involved in T-cell receptor signaling (PTEN, INPP5D, CTLA4, and SYK), maintenance of genome stability (TP53, BAX, and MSH2), and O-linked glycosylation (C1GALT1C1). This work ties together decades of molecular experiments and serves as a resource that will streamline both the interpretation of past research and the design of future Jurkat studies. Electronic supplementary material The online version of this article (10.1186/s12864-018-4718-6) contains supplementary material, which is available to authorized users.
|
42
|
Common PIEZO1 Allele in African Populations Causes RBC Dehydration and Attenuates Plasmodium Infection. Cell 2018; 173:443-455.e12. [PMID: 29576450 DOI: 10.1016/j.cell.2018.02.047] [Citation(s) in RCA: 140] [Impact Index Per Article: 23.3] [Received: 10/04/2017] [Revised: 01/06/2018] [Accepted: 02/14/2018] [Indexed: 01/05/2023]
Abstract
Hereditary xerocytosis is thought to be a rare genetic condition characterized by red blood cell (RBC) dehydration with mild hemolysis. RBC dehydration is linked to reduced Plasmodium infection in vitro; however, the role of RBC dehydration in protection against malaria in vivo is unknown. Most cases of hereditary xerocytosis are associated with gain-of-function mutations in PIEZO1, a mechanically activated ion channel. We engineered a mouse model of hereditary xerocytosis and show that Plasmodium infection fails to cause experimental cerebral malaria in these mice due to the action of Piezo1 in RBCs and in T cells. Remarkably, we identified a novel human gain-of-function PIEZO1 allele, E756del, present in a third of the African population. RBCs from individuals carrying this allele are dehydrated and display reduced Plasmodium infection in vitro. The existence of a gain-of-function PIEZO1 at such high frequencies is surprising and suggests an association with malaria resistance.
|
43
|
Exploring applications of crowdsourcing to cryo-EM. J Struct Biol 2018; 203:37-45. [PMID: 29486249 PMCID: PMC6086358 DOI: 10.1016/j.jsb.2018.02.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/15/2017] [Revised: 02/19/2018] [Accepted: 02/22/2018] [Indexed: 11/28/2022]
Abstract
Extraction of particles from cryo-electron microscopy (cryo-EM) micrographs is a crucial step in processing single-particle datasets. Although algorithms have been developed for automatic particle picking, these algorithms generally rely on two-dimensional templates for particle identification, which may exhibit biases that can propagate artifacts through the reconstruction pipeline. Manual picking is viewed as a gold-standard solution for particle selection, but it is too time-consuming to perform on data sets of thousands of images. In recent years, crowdsourcing has proven effective at leveraging the open web to manually curate datasets. In particular, citizen science projects such as Galaxy Zoo have shown the power of appealing to users’ scientific interests to process enormous amounts of data. To this end, we explored the possible applications of crowdsourcing in cryo-EM particle picking, presenting a variety of novel experiments including the production of a fully annotated particle set from untrained citizen scientists. We show the possibilities and limitations of crowdsourcing particle selection tasks, and explore further options for crowdsourcing cryo-EM data processing.
|
44
|
Metaproteomics of Colonic Microbiota Unveils Discrete Protein Functions among Colitic Mice and Control Groups. Proteomics 2018; 18:10.1002/pmic.201700391. [PMID: 29319931 PMCID: PMC5921860 DOI: 10.1002/pmic.201700391] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/26/2017] [Revised: 12/19/2017] [Indexed: 12/14/2022]
Abstract
Metaproteomics can greatly assist established high-throughput sequencing methodologies to provide systems biological insights into the alterations of microbial protein functionalities correlated with disease-associated dysbiosis of the intestinal microbiota. Here, the authors utilize the well-characterized murine T cell transfer model of colitis to find specific changes within the intestinal luminal proteome associated with inflammation. MS proteomic analysis of colonic samples permitted the identification of ≈10 000-12 000 unique peptides that corresponded to 5610 protein clusters identified across three groups, including the colitic Rag1-/- T cell recipients, isogenic Rag1-/- controls, and wild-type mice. The authors demonstrate that the colitic mice exhibited a significant increase in Proteobacteria and Verrucomicrobia and show that such alterations in the microbial communities contributed to the enrichment of specific proteins with transcription and translation gene ontology terms. In combination with 16S sequencing, the authors' metaproteomics-based microbiome studies provide a foundation for assessing alterations in intestinal luminal protein functionalities in a robust and well-characterized mouse model of colitis, and set the stage for future studies to further explore the functional mechanisms of altered protein functionalities associated with dysbiosis and inflammation.
|
45
|
Cross-linking BioThings APIs through JSON-LD to facilitate knowledge exploration. BMC Bioinformatics 2018; 19:30. [PMID: 29390967 PMCID: PMC5796402 DOI: 10.1186/s12859-018-2041-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/21/2017] [Accepted: 01/24/2018] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND Application Programming Interfaces (APIs) are now widely used to distribute biological data. Many popular biological APIs, developed by different research teams, have adopted JavaScript Object Notation (JSON) as their primary data format. While usage of a common data format offers significant advantages, that alone is not sufficient for rich integrative queries across APIs. RESULTS Here, we have implemented JSON for Linking Data (JSON-LD) technology on the BioThings APIs that we have developed: MyGene.info, MyVariant.info and MyChem.info. JSON-LD provides a standard way to add semantic context to the existing JSON data structure, for the purpose of enhancing the interoperability between APIs. We demonstrated several use cases that were facilitated by semantic annotations using JSON-LD, including simpler and more precise query capabilities as well as API cross-linking. CONCLUSIONS We believe that this pattern offers a generalizable solution for interoperability of APIs in the life sciences.
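How a semantic context can enable cross-linking between two JSON sources can be sketched in a few lines. This is a deliberately simplified illustration, not the JSON-LD expansion algorithm and not the actual context files served by the BioThings APIs. The field names, the two toy documents, and the choice of an identifiers.org URI prefix are assumptions made for the example.

```python
def annotate(doc, context):
    """Map each top-level key that the context knows to its IRI.

    Returns {iri: value}. Keys without a context entry are ignored,
    mirroring the idea that only semantically annotated fields can be
    linked across sources.
    """
    return {context[key]: value for key, value in doc.items() if key in context}


def shared_links(doc_a, ctx_a, doc_b, ctx_b):
    """Return the IRIs (with values) on which two documents agree."""
    a = annotate(doc_a, ctx_a)
    b = annotate(doc_b, ctx_b)
    return {iri: a[iri] for iri in a if iri in b and a[iri] == b[iri]}


# Hypothetical contexts: two APIs use different key names for the same
# concept, but both map the key to the same IRI.
gene_context = {"entrezgene": "http://identifiers.org/ncbigene/"}
variant_context = {"gene_id": "http://identifiers.org/ncbigene/"}

# Hypothetical toy responses; values are illustrative only.
gene_doc = {"entrezgene": "1017", "symbol": "CDK2"}
variant_doc = {"variant": "example-variant-1", "gene_id": "1017"}

links = shared_links(gene_doc, gene_context, variant_doc, variant_context)
```

Because both keys resolve to the same IRI, a client can join the two records without knowing either API's field naming in advance. In real JSON-LD the context travels with the document under the `@context` key and is processed by a JSON-LD library.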
|
46
|
WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2017; 2017:3084697. [PMID: 28365742 PMCID: PMC5467579 DOI: 10.1093/database/bax025] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/04/2016] [Accepted: 03/06/2017] [Indexed: 11/25/2022]
Abstract
With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don’t exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction. Database URL: www.wikigenomes.org
|
47
|
|
48
|
Increased DNA Methylation and Reduced Expression of Transcription Factors in Human Osteoarthritis Cartilage. Arthritis Rheumatol 2017; 68:1876-86. [PMID: 26881698 DOI: 10.1002/art.39643] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 03/24/2015] [Accepted: 02/11/2016] [Indexed: 12/13/2022]
Abstract
OBJECTIVE To analyze the methylome of normal and osteoarthritic (OA) knee articular cartilage and to determine the role of DNA methylation in the regulation of gene expression in vitro. METHODS DNA was isolated from human normal (n = 11) and OA (n = 12) knee articular cartilage and analyzed using the Infinium HumanMethylation450 BeadChip array. To integrate methylation and transcription, RNA sequencing was performed on normal and OA cartilage and validated by quantitative polymerase chain reaction. Functional validation was performed in the human TC28 cell line and primary chondrocytes that were treated with the DNA methylation inhibitor 5-aza-2'-deoxycytidine (5-aza-dC). RESULTS DNA methylation profiling revealed 929 differentially methylated sites between normal and OA cartilage, comprising a total of 500 individual genes. Among these, 45 transcription factors that harbored differentially methylated sites were identified. Integrative analysis and subsequent validation showed a subset of 6 transcription factors that were significantly hypermethylated and down-regulated in OA cartilage (ATOH8, MAFF, NCOR2, TBX4, ZBTB16, and ZHX2). Upon 5-aza-dC treatment, TC28 cells showed a significant increase in gene expression for all 6 transcription factors. In primary chondrocytes, ATOH8 and TBX4 were increased after 5-aza-dC treatment. CONCLUSION Our findings reveal that normal and OA knee articular cartilage have significantly different methylomes. The identification of a subset of epigenetically regulated transcription factors with reduced expression in OA may represent an important mechanism to explain changes in the chondrocyte transcriptome and function during OA pathogenesis.
|
49
|
Dysregulated circadian rhythm pathway in human osteoarthritis: NR1D1 and BMAL1 suppression alters TGF-β signaling in chondrocytes. Osteoarthritis Cartilage 2017; 25:943-951. [PMID: 27884645 PMCID: PMC5438901 DOI: 10.1016/j.joca.2016.11.007] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 04/29/2016] [Revised: 11/08/2016] [Accepted: 11/12/2016] [Indexed: 02/02/2023]
Abstract
OBJECTIVES Circadian rhythm (CR) was identified by RNA sequencing as the most dysregulated pathway in human osteoarthritis (OA) in articular cartilage. This study examined circadian rhythmicity in cultured chondrocytes and the role of the CR genes NR1D1 and BMAL1 in regulating chondrocyte functions. METHODS RNA was extracted from normal and OA-affected human knee cartilage (n = 14 each). Expression levels of NR1D1 and BMAL1 mRNA and protein were assessed by quantitative PCR and immunohistochemistry. Human chondrocytes were synchronized and harvested at regular intervals to examine circadian rhythmicity in RNA and protein expression. Chondrocytes were treated with small interfering RNA (siRNA) for NR1D1 or BMAL1, followed by RNA sequencing and analysis of the effects on the transforming growth factor beta (TGF-β) pathway. RESULTS NR1D1 and BMAL1 mRNA and protein levels were significantly reduced in OA compared to normal cartilage. In cultured human chondrocytes, a clear circadian rhythmicity was observed for NR1D1 and BMAL1. Increased BMAL1 expression was observed after knocking down NR1D1, and decreased NR1D1 levels were observed after knocking down BMAL1. Sequencing of RNA from chondrocytes treated with NR1D1 or BMAL1 siRNA identified 330 and 68 significantly different genes, respectively, and this predominantly affected the TGF-β signaling pathway. CONCLUSIONS The CR pathway is dysregulated in OA cartilage. Interference with circadian rhythmicity in cultured chondrocytes affects TGF-β signaling, which is a central pathway in cartilage homeostasis.
|
50
|
Quantitative Metaproteomics and Activity-Based Probe Enrichment Reveals Significant Alterations in Protein Expression from a Mouse Model of Inflammatory Bowel Disease. J Proteome Res 2017; 16:1014-1026. [PMID: 28052195 DOI: 10.1021/acs.jproteome.6b00938] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Indexed: 12/30/2022]
Abstract
Tandem mass spectrometry-based shotgun proteomics of distal gut microbiomes is exceedingly difficult due to the inherent complexity and taxonomic diversity of the samples. We introduce two new methodologies to improve metaproteomic studies of microbiome samples. These methods include stable isotope labeling in mammals to permit protein quantitation across two mouse cohorts, as well as the application of activity-based probes to enrich and analyze both host and microbial proteins with specific functionalities. We used these technologies to study the microbiota from the adoptive T cell transfer mouse model of inflammatory bowel disease (IBD) and compared these samples to an isogenic control, thereby limiting genetic and environmental variables that influence microbiome composition. The data generated highlight quantitative alterations in both host and microbial proteins due to intestinal inflammation and corroborate the observed phylogenetic changes in bacteria that accompany IBD in humans and mouse models. The combination of isotope labeling with shotgun proteomics resulted in the total identification of 4434 protein clusters expressed in the microbial proteomic environment, 276 of which demonstrated differential abundance between control and IBD mice. Notably, application of a novel cysteine-reactive probe uncovered several microbial proteases and hydrolases overrepresented in the IBD mice. Implementation of these methods demonstrated that substantial insights into the identity and dysregulation of host and microbial proteins altered in IBD can be obtained, and that the methods can be used in the interrogation of other microbiome-related diseases.
|