1
Davis AP, Wiegers TC, Wiegers J, Wyatt B, Johnson RJ, Sciaky D, Barkalow F, Strong M, Planchart A, Mattingly CJ. CTD tetramers: a new online tool that computationally links curated chemicals, genes, phenotypes, and diseases to inform molecular mechanisms for environmental health. Toxicol Sci 2023; 195:155-168. [PMID: 37486259 PMCID: PMC10535784 DOI: 10.1093/toxsci/kfad069] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/25/2023] Open
Abstract
The molecular mechanisms connecting environmental exposures to adverse endpoints are often unknown, reflecting knowledge gaps. At the Comparative Toxicogenomics Database (CTD), we developed a bioinformatics approach that integrates manually curated, literature-based interactions from CTD to generate a "CGPD-tetramer": a 4-unit block of information organized as a step-wise molecular mechanism linking an initiating Chemical, an interacting Gene, a Phenotype, and a Disease outcome. Here, we describe a novel, user-friendly tool called CTD Tetramers that generates these evidence-based CGPD-tetramers for any curated chemical, gene, phenotype, or disease of interest. Tetramers offer potential solutions for the unknown underlying mechanisms and intermediary phenotypes connecting a chemical exposure to a disease. Additionally, multiple tetramers can be assembled to construct detailed modes-of-action for chemical-induced disease pathways. As well, tetramers can help inform environmental influences on adverse outcome pathways (AOPs). We demonstrate the tool's utility with relevant use cases for a variety of environmental chemicals (eg, perfluoroalkyl substances, bisphenol A), phenotypes (eg, apoptosis, spermatogenesis, inflammatory response), and diseases (eg, asthma, obesity, male infertility). Finally, we map AOP adverse outcome terms to corresponding CTD terms, allowing users to query for tetramers that can help augment AOP pathways with additional stressors, genes, and phenotypes, as well as formulate potential AOP disease networks (eg, liver cirrhosis and prostate cancer). This novel tool, as part of the complete suite of tools offered at CTD, provides users with computational datasets and their supporting evidence to potentially fill exposure knowledge gaps and develop testable hypotheses about environmental health.
Affiliation(s)
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Jolene Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Brent Wyatt
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Robin J Johnson
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Daniela Sciaky
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Fern Barkalow
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Melissa Strong
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
- Antonio Planchart
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina 27695, USA
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina 27695, USA
2
Davis AP, Wiegers TC, Johnson RJ, Sciaky D, Wiegers J, Mattingly C. Comparative Toxicogenomics Database (CTD): update 2023. Nucleic Acids Res 2022; 51:D1257-D1262. [PMID: 36169237 PMCID: PMC9825590 DOI: 10.1093/nar/gkac833] [Citation(s) in RCA: 116] [Impact Index Per Article: 58.0] [Received: 08/19/2022] [Revised: 09/06/2022] [Accepted: 09/15/2022] [Indexed: 01/30/2023] Open
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) harmonizes cross-species heterogeneous data for chemical exposures and their biological repercussions by manually curating and interrelating chemical, gene, phenotype, anatomy, disease, taxa, and exposure content from the published literature. This curated information is integrated to generate inferences, providing potential molecular mediators to develop testable hypotheses and fill in knowledge gaps for environmental health. This dual nature, acting as both a knowledgebase and a discoverybase, makes CTD a unique resource for the scientific community. Here, we report a 20% increase in overall CTD content for 17 100 chemicals, 54 300 genes, 6100 phenotypes, 7270 diseases and 202 000 exposure statements. We also present CTD Tetramers, a novel tool that computationally generates four-unit information blocks connecting a chemical, gene, phenotype, and disease to construct potential molecular mechanistic pathways. Finally, we integrate terms for human biological media used in the CTD Exposure module to corresponding CTD Anatomy pages, allowing users to survey the chemical profiles for any tissue-of-interest and see how these environmental biomarkers are related to phenotypes for any anatomical site. These, and other webpage visual enhancements, continue to promote CTD as a practical, user-friendly, and innovative resource for finding information and generating testable hypotheses about environmental health.
Affiliation(s)
- Allan Peter Davis
  - To whom correspondence should be addressed. Tel: +1 919 515 5705; Fax: +1 919 515 3355;
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Robin J Johnson
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Daniela Sciaky
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Jolene Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
3
Davis AP, Wiegers TC, Wiegers J, Grondin CJ, Johnson RJ, Sciaky D, Mattingly CJ. CTD Anatomy: analyzing chemical-induced phenotypes and exposures from an anatomical perspective, with implications for environmental health studies. Curr Res Toxicol 2021; 2:128-139. [PMID: 33768211 PMCID: PMC7990325 DOI: 10.1016/j.crtox.2021.03.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/18/2020] [Revised: 02/01/2021] [Accepted: 03/01/2021] [Indexed: 12/12/2022] Open
Abstract
The Comparative Toxicogenomics Database (CTD) is a freely available public resource that curates and interrelates chemical, gene/protein, phenotype, disease, organism, and exposure data. CTD can be used to address toxicological mechanisms for environmental chemicals and facilitate the generation of testable hypotheses about how exposures affect human health. At CTD, manually curated interactions for chemical-induced phenotypes are enhanced with anatomy terms (tissues, fluids, and cell types) to describe the physiological system of the reported event. These same anatomy terms are used to annotate the human media (e.g., urine, hair, nail, blood, etc.) in which an environmental chemical was assayed for exposure. Currently, CTD uses more than 880 unique anatomy terms to contextualize over 255,000 chemical-phenotype interactions and 167,000 exposure statements. These annotations allow chemical-phenotype interactions and exposure data to be explored from a novel, anatomical perspective. Here, we describe CTD's anatomy curation process (including the construction of a controlled, interoperable vocabulary) and new anatomy webpages (that coalesce and organize the curated chemical-phenotype and exposure data sets). We also provide examples that demonstrate how this feature can be used to identify system- and cell-specific chemical-induced toxicities, help inform exposure data, prioritize phenotypes for environmental diseases, survey tissue and pregnancy exposomes, and facilitate data connections with external resources. Anatomy annotations advance understanding of environmental health by providing new ways to explore and survey chemical-induced events and exposure studies in the CTD framework.
Affiliation(s)
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, United States
- Thomas C. Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, United States
- Jolene Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, United States
- Cynthia J. Grondin
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, United States
- Robin J. Johnson
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, United States
- Daniela Sciaky
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, United States
- Carolyn J. Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, United States
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, United States
4
Pinkhasova DV, Jameson LE, Conrow KD, Simeone MP, Davis AP, Wiegers TC, Mattingly CJ, Leung MCK. Regulatory Status of Pesticide Residues in Cannabis: Implications to Medical Use in Neurological Diseases. Curr Res Toxicol 2021; 2:140-148. [PMID: 34308371 PMCID: PMC8296824 DOI: 10.1016/j.crtox.2021.02.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/14/2022] Open
Abstract
Movement disorders are the most common neurological category of qualifying conditions in the U.S. The number and action levels of regulated pesticides in cannabis differ vastly in 33 states and Washington, D.C. Network analysis reveals potential interactions of insecticides, cannabinoids, and seizure at a functional level.
Medical cannabis represents a potential route of pesticide exposure to susceptible populations. We compared the qualifying conditions for medical use and pesticide testing requirements of cannabis in 33 states and Washington, D.C. Movement disorders were the most common neurological category of qualifying conditions, including epilepsy, certain symptoms of multiple sclerosis, Parkinson’s Disease, and any cause of symptoms leading to seizures or spasticity. Different approaches of pesticide regulation were implemented in cannabis and cannabis-derived products. Six states imposed the strictest U.S. EPA tolerances (i.e. maximum residue levels) for food commodities on up to 400 pesticidal active ingredients in cannabis, while pesticide testing was optional in three states. Dimethomorph showed the largest variation in action levels, ranging from 0.1 to 60 ppm in 5 states. We evaluated the potential connections between insecticides, cannabinoids, and seizure using the Comparative Toxicogenomics Database. Twenty-two insecticides, two cannabinoids, and 63 genes were associated with 674 computationally generated chemical-gene-phenotype-disease (CGPD) tetramer constructs. Notable functional clusters included oxidation-reduction process (183 CGPD-tetramers), synaptic signaling pathways (151), and neuropeptide hormone activity (46). Cholinergic, dopaminergic, and retrograde endocannabinoid signaling pathways were linked to 10 genetic variants of epilepsy patients. Further research is needed to assess human health risk of cannabinoids and pesticides in support of a national standard for cannabis pesticide regulations.
Affiliation(s)
- Dorina V Pinkhasova
  - School of Mathematical and Natural Sciences, Arizona State University - West Campus, Glendale, AZ 85306
  - Pharmacology and Toxicology Program, Arizona State University - West Campus, Glendale, AZ 85306
- Laura E Jameson
  - Pharmacology and Toxicology Program, Arizona State University - West Campus, Glendale, AZ 85306
- Kendra D Conrow
  - Pharmacology and Toxicology Program, Arizona State University - West Campus, Glendale, AZ 85306
- Michael P Simeone
  - ASU Library Data Science and Analytics, Arizona State University, Tempe, AZ 85281
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695
- Maxwell C K Leung
  - School of Mathematical and Natural Sciences, Arizona State University - West Campus, Glendale, AZ 85306
  - Pharmacology and Toxicology Program, Arizona State University - West Campus, Glendale, AZ 85306
5
Davis AP, Grondin CJ, Johnson RJ, Sciaky D, Wiegers J, Wiegers TC, Mattingly CJ. Comparative Toxicogenomics Database (CTD): update 2021. Nucleic Acids Res 2021; 49:D1138-D1143. [PMID: 33068428 PMCID: PMC7779006 DOI: 10.1093/nar/gkaa891] [Citation(s) in RCA: 503] [Impact Index Per Article: 167.7] [Received: 08/14/2020] [Revised: 09/25/2020] [Accepted: 09/30/2020] [Indexed: 02/07/2023] Open
Abstract
The public Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is an innovative digital ecosystem that relates toxicological information for chemicals, genes, phenotypes, diseases, and exposures to advance understanding about human health. Literature-based, manually curated interactions are integrated to create a knowledgebase that harmonizes cross-species heterogeneous data for chemical exposures and their biological repercussions. In this biennial update, we report a 20% increase in CTD curated content and now provide 45 million toxicogenomic relationships for over 16 300 chemicals, 51 300 genes, 5500 phenotypes, 7200 diseases and 163 000 exposure events, from 600 comparative species. Furthermore, we increase the functionality of chemical-phenotype content with new data-tabs on CTD Disease pages (to help fill in knowledge gaps for environmental health) and new phenotype search parameters (for Batch Query and Venn analysis tools). As well, we introduce new CTD Anatomy pages that allow users to uniquely explore and analyze chemical-phenotype interactions from an anatomical perspective. Finally, we have enhanced CTD Chemical pages with new literature-based chemical synonyms (to improve querying) and added 1600 amino acid-based compounds (to increase chemical landscape). Together, these updates continue to augment CTD as a powerful resource for generating testable hypotheses about the etiologies and molecular mechanisms underlying environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Cynthia J Grondin
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Robin J Johnson
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Daniela Sciaky
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Jolene Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
6
Davis AP, Wiegers TC, Grondin CJ, Johnson RJ, Sciaky D, Wiegers J, Mattingly CJ. Leveraging the Comparative Toxicogenomics Database to Fill in Knowledge Gaps for Environmental Health: A Test Case for Air Pollution-induced Cardiovascular Disease. Toxicol Sci 2020; 177:392-404. [PMID: 32663284 PMCID: PMC7548289 DOI: 10.1093/toxsci/kfaa113] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/14/2022] Open
Abstract
Environmental health studies relate how exposures (eg, chemicals) affect human health and disease; however, in most cases, the molecular and biological mechanisms connecting an exposure with a disease remain unknown. To help fill in these knowledge gaps, we sought to leverage content from the public Comparative Toxicogenomics Database (CTD) to identify potential intermediary steps. In a proof-of-concept study, we systematically compute the genes, molecular mechanisms, and biological events for the environmental health association linking air pollution toxicants with 2 cardiovascular diseases (myocardial infarction and hypertension) as a test case. Our approach integrates 5 types of curated interactions in CTD to build sets of "CGPD-tetramers," computationally constructed information blocks relating a Chemical-Gene interaction with a Phenotype and Disease. This bioinformatics strategy generates 653 CGPD-tetramers for air pollution-associated myocardial infarction (involving 5 pollutants, 58 genes, and 117 phenotypes) and 701 CGPD-tetramers for air pollution-associated hypertension (involving 3 pollutants, 96 genes, and 142 phenotypes). Collectively, we identify 19 genes and 96 phenotypes shared between these 2 air pollutant-induced outcomes, and suggest important roles for oxidative stress, inflammation, immune responses, cell death, and circulatory system processes. Moreover, CGPD-tetramers can be assembled into extensive chemical-induced disease pathways involving multiple gene products and sequential biological events, and many of these computed intermediary steps are validated in the literature. Our method does not require a priori knowledge of the toxicant, interacting gene, or biological system, and can be used to analyze any environmental chemical-induced disease curated within the public CTD framework.
This bioinformatics strategy links and interrelates chemicals, genes, phenotypes, and diseases to fill in knowledge gaps for environmental health studies, as demonstrated for air pollution-associated cardiovascular disease, but can be adapted by researchers for any environmentally influenced disease-of-interest.
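As a rough illustration of the tetramer idea described in this abstract, the following Python sketch joins pairwise relations into chemical-gene-phenotype-disease blocks. It is not CTD's implementation. The abstract does not list the five interaction types, so the choice here (chemical-gene, chemical-phenotype, chemical-disease, gene-disease, gene-phenotype) is an assumption, as are the function name `build_tetramers` and the placeholder identifiers in the toy data.

```python
from collections import namedtuple

# Hypothetical record type for one chemical-gene-phenotype-disease block.
Tetramer = namedtuple("Tetramer", ["chemical", "gene", "phenotype", "disease"])


def build_tetramers(chem_gene, chem_pheno, chem_disease, gene_disease, gene_pheno):
    """Return every (C, G, P, D) block supported by all five pairwise relations.

    Each argument is a set of (left, right) tuples. Which five relation types
    CTD actually integrates is an ASSUMPTION in this sketch.
    """
    tetramers = set()
    for chemical, gene in chem_gene:
        for pheno_chem, phenotype in chem_pheno:
            # The phenotype must be linked to the same chemical and the same gene.
            if pheno_chem != chemical or (gene, phenotype) not in gene_pheno:
                continue
            for disease_chem, disease in chem_disease:
                # The disease must be linked to the same chemical and the same gene.
                if disease_chem == chemical and (gene, disease) in gene_disease:
                    tetramers.add(Tetramer(chemical, gene, phenotype, disease))
    return tetramers


# Toy data with placeholder identifiers (not real curated interactions).
chem_gene = {("chem_A", "gene_1"), ("chem_A", "gene_2")}
chem_pheno = {("chem_A", "pheno_X")}
chem_disease = {("chem_A", "disease_Y")}
gene_disease = {("gene_1", "disease_Y")}
gene_pheno = {("gene_1", "pheno_X"), ("gene_2", "pheno_X")}

tetramers = build_tetramers(chem_gene, chem_pheno, chem_disease, gene_disease, gene_pheno)
for t in sorted(tetramers):
    print(t)
```

In this toy data, `gene_2` has no gene-disease link to `disease_Y`, so only the `gene_1` block is produced. Shared genes or phenotypes between two outcomes, as counted in the abstract, could then be found by intersecting the gene or phenotype fields of two such tetramer sets.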
Affiliation(s)
- Carolyn J Mattingly
  - Department of Biological Sciences
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina 27695
7
Davis AP, Grondin CJ, Johnson RJ, Sciaky D, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2019. Nucleic Acids Res 2020; 47:D948-D954. [PMID: 30247620 PMCID: PMC6323936 DOI: 10.1093/nar/gky868] [Citation(s) in RCA: 564] [Impact Index Per Article: 141.0] [Received: 08/14/2018] [Accepted: 09/14/2018] [Indexed: 11/27/2022] Open
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a premier public resource for literature-based, manually curated associations between chemicals, gene products, phenotypes, diseases, and environmental exposures. In this biennial update, we present our new chemical–phenotype module that codes chemical-induced effects on phenotypes, curated using controlled vocabularies for chemicals, phenotypes, taxa, and anatomical descriptors; this module provides unique opportunities to explore cellular and system-level phenotypes of the pre-disease state and allows users to construct predictive adverse outcome pathways (linking chemical–gene molecular initiating events with phenotypic key events, diseases, and population-level health outcomes). We also report a 46% increase in CTD manually curated content, which when integrated with other datasets yields more than 38 million toxicogenomic relationships. We describe new querying and display features for our enhanced chemical–exposure science module, providing greater scope of content and utility. As well, we discuss an updated MEDIC disease vocabulary with over 1700 new terms and accession identifiers. To accommodate these increases in data content and functionality, CTD has upgraded its computational infrastructure. These updates continue to improve CTD and help inform new testable hypotheses about the etiology and mechanisms underlying environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Cynthia J Grondin
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Robin J Johnson
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Daniela Sciaky
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Roy McMorran
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Jolene Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
8
Abstract
Public databases provide a wealth of freely available information about chemicals, genes, proteins, biological networks, phenotypes, diseases, and exposure science that can be integrated to construct pathways for systems toxicology applications. Relating this disparate information from public repositories, however, can be challenging since databases use a variety of ways to represent, describe, and make available their content. The use of standard vocabularies to annotate key data concepts, however, allows the information to be more easily exchanged and combined for discovery of new findings. We explore some of the many public data sources currently available to support systems toxicology, and demonstrate the value of standardizing data to help construct chemical-induced outcome pathways.
Affiliation(s)
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, United States
- Jolene Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, United States
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, United States
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, United States
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina 27695, United States
9
Davis AP, Wiegers TC, Wiegers J, Johnson RJ, Sciaky D, Grondin CJ, Mattingly CJ. Chemical-Induced Phenotypes at CTD Help Inform the Predisease State and Construct Adverse Outcome Pathways. Toxicol Sci 2018; 165:145-156. [PMID: 29846728 PMCID: PMC6111787 DOI: 10.1093/toxsci/kfy131] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 12/13/2022] Open
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org) is a public resource that manually curates the scientific literature to provide content that illuminates the molecular mechanisms by which environmental exposures affect human health. We introduce our new chemical-phenotype module that describes how chemicals can affect molecular, cellular, and physiological phenotypes. At CTD, we operationally distinguish between phenotypes and diseases, wherein a phenotype refers to a nondisease biological event: eg, decreased cell cycle arrest (phenotype) versus liver cancer (disease), increased fat cell proliferation (phenotype) versus morbid obesity (disease), etc. Chemical-phenotype interactions are expressed in a formal structured notation using controlled terms for chemicals, phenotypes, taxon, and anatomical descriptors. Combining this information with CTD's chemical-disease module allows inferences to be made between phenotypes and diseases, yielding potential insight into the predisease state. Integration of all 4 CTD modules furnishes unique opportunities for toxicologists to generate computationally predictive adverse outcome pathways, linking chemical-gene molecular initiating events with phenotypic key events, adverse diseases, and population-level health outcomes. As examples, we present 3 diverse case studies discerning the effect of vehicle emissions on altered leukocyte migration, the role of cadmium in influencing phenotypes preceding Alzheimer disease, and the connection of arsenic-induced glucose metabolic phenotypes with diabetes. To date, CTD contains over 165 000 interactions that connect more than 6400 chemicals to 3900 phenotypes for 760 anatomical terms in 215 species, from over 19 000 scientific articles. To our knowledge, this is the first comprehensive set of manually curated, literature-based, contextualized, chemical-induced, nondisease phenotype data provided to the public.
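The phenotype-disease inference mentioned in this abstract can be sketched as a join on shared chemicals. This is only an illustrative Python sketch. Treating a shared chemical as the linking evidence is an assumption about how the two modules are combined, and the function name `infer_phenotype_disease` and the identifiers in the toy data are hypothetical placeholders.

```python
from collections import defaultdict


def infer_phenotype_disease(chem_pheno, chem_disease):
    """Map each (phenotype, disease) pair to the set of chemicals linking them.

    ASSUMPTION: a phenotype and a disease are inferred to be related when at
    least one chemical is associated with both.
    """
    # Index diseases by chemical so each chemical-phenotype pair is joined once.
    diseases_by_chem = defaultdict(set)
    for chemical, disease in chem_disease:
        diseases_by_chem[chemical].add(disease)

    inferred = defaultdict(set)
    for chemical, phenotype in chem_pheno:
        for disease in diseases_by_chem.get(chemical, ()):
            inferred[(phenotype, disease)].add(chemical)
    return dict(inferred)


# Toy data with placeholder identifiers (not real curated interactions).
chem_pheno = [("chem_A", "pheno_X"), ("chem_B", "pheno_X"), ("chem_B", "pheno_Z")]
chem_disease = [("chem_A", "disease_Y"), ("chem_B", "disease_Y")]

inferences = infer_phenotype_disease(chem_pheno, chem_disease)
for (phenotype, disease), chemicals in sorted(inferences.items()):
    print(phenotype, disease, sorted(chemicals))
```

Keeping the supporting chemicals for each inferred pair, rather than a bare yes/no link, lets a reader rank pairs by how many chemicals support them and trace each inference back to its evidence.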
Affiliation(s)
- Carolyn J Mattingly
  - Department of Biological Sciences
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina 27695
10
Grondin CJ, Davis AP, Wiegers TC, Wiegers JA, Mattingly CJ. Accessing an Expanded Exposure Science Module at the Comparative Toxicogenomics Database. Environ Health Perspect 2018; 126:014501. [PMID: 29351546 PMCID: PMC6014688 DOI: 10.1289/ehp2873] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 09/21/2017] [Accepted: 11/17/2017] [Indexed: 05/22/2023]
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org) is a free resource that provides manually curated information on chemical, gene, phenotype, and disease relationships to advance understanding of the effect of environmental exposures on human health. Four core content areas are independently curated: chemical-gene interactions, chemical-disease and gene-disease associations, chemical-phenotype interactions, and environmental exposure data (e.g., effects of chemical stressors on humans). Since releasing exposure data in 2015, we have vastly increased our coverage of chemicals and disease/phenotype outcomes; greatly expanded access to exposure content; added search capability by stressors, cohorts, population demographics, and measured outcomes; and created user-specified displays of content. These enhancements aim to facilitate human studies by allowing comparisons among experimental parameters and across studies involving specified chemicals, populations, or outcomes. Integration of data among CTD's four content areas and external data sets, such as Gene Ontology annotations and pathway information, links exposure data with over 1.8 million chemical-gene, chemical-disease and gene-disease interactions. Our analysis tools reveal direct and inferred relationships among the data and provide opportunities to generate predictive connections between environmental exposures and population-level health outcomes. https://doi.org/10.1289/EHP2873.
Affiliation(s)
- Cynthia J Grondin
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Allan Peter Davis
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Thomas C Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Jolene A Wiegers
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Carolyn J Mattingly
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
  - Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina, USA
11
Grondin CJ, Davis AP, Wiegers TC, King BL, Wiegers JA, Reif DM, Hoppin JA, Mattingly CJ. Advancing Exposure Science through Chemical Data Curation and Integration in the Comparative Toxicogenomics Database. Environ Health Perspect 2016; 124:1592-1599. [PMID: 27170236 PMCID: PMC5047769 DOI: 10.1289/ehp174] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 09/24/2015] [Revised: 01/11/2016] [Accepted: 04/26/2016] [Indexed: 05/17/2023]
Abstract
BACKGROUND: Exposure science studies the interactions and outcomes between environmental stressors and human or ecological receptors. To augment its role in understanding human health and the exposome, we aimed to centralize and integrate exposure science data into the broader biological framework of the Comparative Toxicogenomics Database (CTD), a public resource that promotes understanding of environmental chemicals and their effects on human health.
OBJECTIVES: We integrated exposure data within the CTD to provide a centralized, freely available resource that facilitates identification of connections between real-world exposures, chemicals, genes/proteins, diseases, biological processes, and molecular pathways.
METHODS: We developed a manual curation paradigm that captures exposure data from the scientific literature using controlled vocabularies and free text within the context of four primary exposure concepts: stressor, receptor, exposure event, and exposure outcome. Using data from the Agricultural Health Study, we have illustrated the benefits of both centralization and integration of exposure information with CTD core data.
RESULTS: We have described our curation process, demonstrated how exposure data can be accessed and analyzed in the CTD, and shown how this integration provides a broad biological context for exposure data to promote mechanistic understanding of environmental influences on human health.
CONCLUSIONS: Curation and integration of exposure data within the CTD provides researchers with new opportunities to correlate exposures with human health outcomes, to identify underlying potential molecular mechanisms, and to improve understanding about the exposome.
CITATION: Grondin CJ, Davis AP, Wiegers TC, King BL, Wiegers JA, Reif DM, Hoppin JA, Mattingly CJ. 2016. Advancing exposure science through chemical data curation and integration in the Comparative Toxicogenomics Database. Environ Health Perspect 124:1592-1599; http://dx.doi.org/10.1289/EHP174.
Collapse
Affiliation(s)
- Cynthia J. Grondin
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Address correspondence to C.J. Grondin, North Carolina State University, Department of Biological Sciences, Campus Box 7617, Raleigh, NC 27695-7617 USA. Telephone: (919) 515-1509. E-mail:
- Allan Peter Davis
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Thomas C. Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Benjamin L. King
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, USA
- Jolene A. Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- David M. Reif
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina, USA
- Jane A. Hoppin
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina, USA
- Carolyn J. Mattingly
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina, USA
|
12
|
Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Res 2016; 45:D972-D978. [PMID: 27651457 PMCID: PMC5210612 DOI: 10.1093/nar/gkw838] [Citation(s) in RCA: 378] [Impact Index Per Article: 47.3] [Received: 08/09/2016] [Accepted: 09/09/2016] [Indexed: 12/19/2022] Open
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between chemicals and gene products, and their relationships to diseases. Core CTD content (chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature) are integrated with each other as well as with select external datasets to generate expanded networks and predict novel associations. Today, core CTD includes more than 30.5 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, Gene Ontology (GO) annotations, pathways, and gene interaction modules. In this update, we report a 33% increase in our core data content since 2015, describe our new exposure module (that harmonizes exposure science information with core toxicogenomic data) and introduce a novel dataset of GO-disease inferences (that identify common molecular underpinnings for seemingly unrelated pathologies). These advancements centralize and contextualize real-world chemical exposures with molecular pathways to help scientists generate testable hypotheses in an effort to understand the etiology and mechanisms underlying environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Cynthia J Grondin
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Robin J Johnson
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Daniela Sciaky
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Benjamin L King
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Roy McMorran
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Jolene Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Thomas C Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Carolyn J Mattingly
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
|
13
|
Li J, Sun Y, Johnson RJ, Sciaky D, Wei CH, Leaman R, Davis AP, Mattingly CJ, Wiegers TC, Lu Z. BioCreative V CDR task corpus: a resource for chemical disease relation extraction. Database (Oxford) 2016; 2016:baw068. [PMID: 27161011 PMCID: PMC4860626 DOI: 10.1093/database/baw068] [Citation(s) in RCA: 123] [Impact Index Per Article: 15.4] [Received: 12/25/2015] [Accepted: 04/11/2016] [Indexed: 11/14/2022]
Abstract
Community-run, formal evaluations and manually annotated text corpora are critically important for advancing biomedical text-mining research. Recently in BioCreative V, a new challenge was organized for the tasks of disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. Given the nature of both tasks, a test collection is required to contain both disease/chemical annotations and relation annotations in the same set of articles. Despite previous efforts in biomedical corpus construction, none was found to be sufficient for the task. Thus, we developed our own corpus called BC5CDR during the challenge by inviting a team of Medical Subject Headings (MeSH) indexers for disease/chemical entity annotation and Comparative Toxicogenomics Database (CTD) curators for CID relation annotation. To ensure high annotation quality and productivity, detailed annotation guidelines and automatic annotation tools were provided. The resulting BC5CDR corpus consists of 1500 PubMed articles with 4409 annotated chemicals, 5818 diseases and 3116 chemical-disease interactions. Each entity annotation includes both the mention text spans and normalized concept identifiers, using MeSH as the controlled vocabulary. To ensure accuracy, the entities were first captured independently by two annotators, followed by a consensus annotation: the average inter-annotator agreement (IAA) scores were 87.49% and 96.05% for the diseases and chemicals, respectively, in the test set according to the Jaccard similarity coefficient. Our corpus was successfully used for the BioCreative V challenge tasks and should serve as a valuable resource for the text-mining research community. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/.
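The Jaccard similarity coefficient used for the agreement scores above can be illustrated with a minimal Python sketch. The annotation tuples and identifiers below are hypothetical placeholders rather than BC5CDR data, and how agreement was averaged across articles is not stated in the abstract, so this shows only the per-pair coefficient.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined here as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical (start, end, concept_id) annotations from two annotators on one article.
annotator_1 = {(0, 8, "ID_A"), (15, 22, "ID_B"), (40, 47, "ID_C")}
annotator_2 = {(0, 8, "ID_A"), (15, 22, "ID_B"), (60, 66, "ID_D")}

# Two shared annotations out of four distinct ones.
agreement = jaccard(annotator_1, annotator_2)
print(f"Jaccard agreement: {agreement:.2f}")
```

Treating two empty annotation sets as full agreement is a design choice made for this sketch; other conventions exist.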
Affiliation(s)
- Jiao Li
- Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
- Yueping Sun
- Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
- Robin J Johnson
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Daniela Sciaky
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Chih-Hsuan Wei
- National Center for Biotechnology Information, Bethesda, MD 20894, USA
- Robert Leaman
- National Center for Biotechnology Information, Bethesda, MD 20894, USA
- Allan Peter Davis
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Carolyn J Mattingly
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Thomas C Wiegers
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Zhiyong Lu
- National Center for Biotechnology Information, Bethesda, MD 20894, USA
|
14
|
Wei CH, Peng Y, Leaman R, Davis AP, Mattingly CJ, Li J, Wiegers TC, Lu Z. Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task. Database (Oxford) 2016; 2016:baw032. [PMID: 26994911 PMCID: PMC4799720 DOI: 10.1093/database/baw032] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Received: 11/23/2015] [Accepted: 02/25/2016] [Indexed: 11/14/2022]
Abstract
Manually curating chemicals, diseases and their relationships is important to biomedical research, but it is hampered by its high cost and the rapid growth of the biomedical literature. In recent years, there has been a growing interest in developing computational approaches for automatic chemical-disease relation (CDR) extraction. Despite these attempts, the lack of a comprehensive benchmarking dataset has limited the comparison of different techniques in order to assess and advance the current state-of-the-art. To this end, we organized a challenge task through BioCreative V to automatically extract CDRs from the literature. We designed two challenge tasks: disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. To assist system development and assessment, we created a large annotated text corpus that consisted of human annotations of chemicals, diseases and their interactions from 1500 PubMed articles. 34 teams worldwide participated in the CDR task: 16 (DNER) and 18 (CID). The best systems achieved an F-score of 86.46% for the DNER task (a result that approaches the human inter-annotator agreement of 0.8875) and an F-score of 57.03% for the CID task, the highest results ever reported for such tasks. When combining team results via machine learning, the ensemble system was able to further improve over the best team results by achieving 88.89% and 62.80% in F-score for the DNER and CID task, respectively. Additionally, another novel aspect of our evaluation is to test each participating system’s ability to return real-time results: the average response times for the teams’ DNER and CID web service systems were 5.6 and 9.3 s, respectively. Most teams used hybrid systems for their submissions based on machine learning. 
Given the level of participation and results, we found our task to be successful in engaging the text-mining research community, producing a large annotated corpus and improving the results of automatic disease recognition and CDR extraction. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/
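As a reminder of how an F-score of this kind is derived, the following Python sketch computes precision, recall and the balanced F-score (F1) from counts of true positives, false positives and false negatives. It assumes the reported F-scores are balanced F1 values, which the abstract does not state explicitly, and the counts are made up for illustration.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1) from raw counts; zero denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 8 correct extractions, 2 spurious, 8 missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```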
Affiliation(s)
- Chih-Hsuan Wei
- National Center for Biotechnology Information, Bethesda, MD 20894, USA
- Yifan Peng
- National Center for Biotechnology Information, Bethesda, MD 20894, USA
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
- Robert Leaman
- National Center for Biotechnology Information, Bethesda, MD 20894, USA
- Allan Peter Davis
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Carolyn J Mattingly
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Jiao Li
- Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100700, China
- Thomas C Wiegers
- Department of Biological Sciences and the Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
- Zhiyong Lu
- National Center for Biotechnology Information, Bethesda, MD 20894, USA
|
15
|
Davis AP, Grondin CJ, Lennon-Hopkins K, Saraceni-Richards C, Sciaky D, King BL, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database's 10th year anniversary: update 2015. Nucleic Acids Res 2014; 43:D914-20. [PMID: 25326323 PMCID: PMC4384013 DOI: 10.1093/nar/gku935] [Citation(s) in RCA: 284] [Impact Index Per Article: 28.4] [Indexed: 12/21/2022] Open
Abstract
Ten years ago, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) was developed out of a need to formalize, harmonize and centralize the information on numerous genes and proteins responding to environmental toxic agents across diverse species. CTD's initial approach was to facilitate comparisons of nucleotide and protein sequences of toxicologically significant genes by curating these sequences and electronically annotating them with chemical terms from their associated references. Since then, however, CTD has vastly expanded its scope to robustly represent a triad of chemical–gene, chemical–disease and gene–disease interactions that are manually curated from the scientific literature by professional biocurators using controlled vocabularies, ontologies and structured notation. Today, CTD includes 24 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, phenotypes, Gene Ontology annotations, pathways and interaction modules. In this 10th year anniversary update, we outline the evolution of CTD, including our increased data content, new ‘Pathway View’ visualization tool, enhanced curation practices, pilot chemical–phenotype results and impending exposure data set. The prototype database originally described in our first report has transformed into a sophisticated resource used actively today to help scientists develop and test hypotheses about the etiologies of environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Cynthia J Grondin
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Kelley Lennon-Hopkins
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Daniela Sciaky
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Benjamin L King
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Thomas C Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
- Carolyn J Mattingly
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
|
16
|
Wiegers TC, Davis AP, Mattingly CJ. Web services-based text-mining demonstrates broad impacts for interoperability and process simplification. Database (Oxford) 2014; 2014:bau050. [PMID: 24919658 PMCID: PMC4207221 DOI: 10.1093/database/bau050] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 02/06/2014] [Revised: 03/31/2014] [Accepted: 05/02/2014] [Indexed: 12/31/2022]
Abstract
The Critical Assessment of Information Extraction systems in Biology (BioCreAtIvE) challenge evaluation tasks collectively represent a community-wide effort to evaluate a variety of text-mining and information extraction systems applied to the biological domain. The BioCreative IV Workshop included five independent subject areas, including Track 3, which focused on named-entity recognition (NER) for the Comparative Toxicogenomics Database (CTD; http://ctdbase.org). Previously, CTD had organized document ranking and NER-related tasks for the BioCreative Workshop 2012; a key finding of that effort was that interoperability and integration complexity were major impediments to the direct application of the systems to CTD's text-mining pipeline. This underscored a prevailing problem with software integration efforts. Major interoperability-related issues included lack of process modularity, operating system incompatibility, tool configuration complexity and lack of standardization of high-level inter-process communications. One approach to potentially mitigate interoperability and general integration issues is the use of Web services to abstract implementation details; rather than integrating NER tools directly, HTTP-based calls from CTD's asynchronous, batch-oriented text-mining pipeline could be made to remote NER Web services for recognition of specific biological terms using BioC (an emerging family of XML formats) for inter-process communications. To test this concept, participating groups developed Representational State Transfer /BioC-compliant Web services tailored to CTD's NER requirements. Participants were provided with a comprehensive set of training materials. CTD evaluated results obtained from the remote Web service-based URLs against a test data set of 510 manually curated scientific articles. Twelve groups participated in the challenge. Recall, precision, balanced F-scores and response times were calculated. 
Top balanced F-scores for gene, chemical and disease NER were 61, 74 and 51%, respectively. Response times ranged from fractions-of-a-second to over a minute per article. We present a description of the challenge and summary of results, demonstrating how curation groups can effectively use interoperable NER technologies to simplify text-mining pipeline implementation. Database URL: http://ctdbase.org/
Affiliation(s)
- Thomas C Wiegers
- Department of Biological Sciences, North Carolina State University, 139 David Clark Lab, Campus Box 7617, Raleigh, NC 27695-7617, USA
- Allan Peter Davis
- Department of Biological Sciences, North Carolina State University, 139 David Clark Lab, Campus Box 7617, Raleigh, NC 27695-7617, USA
- Carolyn J Mattingly
- Department of Biological Sciences, North Carolina State University, 139 David Clark Lab, Campus Box 7617, Raleigh, NC 27695-7617, USA
|
17
|
Davis AP, Wiegers TC, Roberts PM, King BL, Lay JM, Lennon-Hopkins K, Sciaky D, Johnson R, Keating H, Greene N, Hernandez R, McConnell KJ, Enayetallah AE, Mattingly CJ. A CTD-Pfizer collaboration: manual curation of 88,000 scientific articles text mined for drug-disease and drug-phenotype interactions. Database (Oxford) 2013; 2013:bat080. [PMID: 24288140 PMCID: PMC3842776 DOI: 10.1093/database/bat080] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Indexed: 01/16/2023]
Abstract
Improving the prediction of chemical toxicity is a goal common to both environmental health research and pharmaceutical drug development. To improve safety detection assays, it is critical to have a reference set of molecules with well-defined toxicity annotations for training and validation purposes. Here, we describe a collaboration between safety researchers at Pfizer and the research team at the Comparative Toxicogenomics Database (CTD) to text mine and manually review a collection of 88 629 articles relating over 1200 pharmaceutical drugs to their potential involvement in cardiovascular, neurological, renal and hepatic toxicity. In 1 year, CTD biocurators curated 254 173 toxicogenomic interactions (152 173 chemical–disease, 58 572 chemical–gene, 5345 gene–disease and 38 083 phenotype interactions). All chemical–gene–disease interactions are fully integrated with public CTD, and phenotype interactions can be downloaded. We describe Pfizer’s text-mining process to collate the articles, and CTD’s curation strategy, performance metrics, enhanced data content and new module to curate phenotype information. As well, we show how data integration can connect phenotypes to diseases. This curation can be leveraged for information about toxic endpoints important to drug safety and help develop testable hypotheses for drug–disease events. The availability of these detailed, contextualized, high-quality annotations curated from seven decades’ worth of the scientific literature should help facilitate new mechanistic screening assays for pharmaceutical compound survival. This unique partnership demonstrates the importance of resource sharing and collaboration between public and private entities and underscores the complementary needs of the environmental health science and pharmaceutical communities. Database URL: http://ctdbase.org/
Affiliation(s)
- Allan Peter Davis
- Department of Biological Sciences, 3510 Thomas Hall, North Carolina State University, Raleigh, NC 27695-7617, USA, Computational Sciences Center of Emphasis, 200 Cambridgepark Drive, Pfizer Inc., Cambridge, MA 02139, USA, Department of Bioinformatics, P.O. Box 35, Old Bar Harbor Road, MDI Biological Laboratory, Salisbury Cove, ME 04672, USA, Compound Safety Prediction, MS 8118-B3, Eastern Point Road, Pfizer Inc., Groton, CT 06340, USA, Computational Sciences Center of Emphasis, Pfizer Inc., Ramsgate Road, Sandwich, Kent CT13 9NJ, UK, Computational Sciences Center of Emphasis, 558 Eastern Point Road, Pfizer Inc., Groton, CT 06340, USA and Drug Safety Research and Development, 558 Eastern Point Road, Pfizer Inc., Groton, CT 06340, USA
|
18
|
Davis AP, Wiegers TC, Johnson RJ, Lay JM, Lennon-Hopkins K, Saraceni-Richards C, Sciaky D, Murphy CG, Mattingly CJ. Text mining effectively scores and ranks the literature for improving chemical-gene-disease curation at the comparative toxicogenomics database. PLoS One 2013; 8:e58201. [PMID: 23613709 PMCID: PMC3629079 DOI: 10.1371/journal.pone.0058201] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Received: 10/26/2012] [Accepted: 01/31/2013] [Indexed: 11/30/2022] Open
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a public resource that curates interactions between environmental chemicals and gene products, and their relationships to diseases, as a means of understanding the effects of environmental chemicals on human health. CTD provides a triad of core information in the form of chemical-gene, chemical-disease, and gene-disease interactions that are manually curated from scientific articles. To increase the efficiency, productivity, and data coverage of manual curation, we have leveraged text mining to help rank and prioritize the triaged literature. Here, we describe our text-mining process that computes and assigns each article a document relevancy score (DRS), wherein a high DRS suggests that an article is more likely to be relevant for curation at CTD. We evaluated our process by first text mining a corpus of 14,904 articles triaged for seven heavy metals (cadmium, cobalt, copper, lead, manganese, mercury, and nickel). Based upon initial analysis, a representative subset corpus of 3,583 articles was then selected from this corpus and sent to five CTD biocurators for review. The resulting curation of these 3,583 articles was analyzed for a variety of parameters, including article relevancy, novel data content, interaction yield rate, mean average precision, and biological and toxicological interpretability. We show that for all measured parameters, the DRS is an effective indicator for scoring and improving the ranking of literature for the curation of chemical-gene-disease information at CTD. Here, we demonstrate how fully incorporating text mining-based DRS scoring into our curation pipeline enhances manual curation by prioritizing more relevant articles, thereby increasing data content, productivity, and efficiency.
Affiliation(s)
- Allan Peter Davis
- Department of Biology, North Carolina State University, Raleigh, North Carolina, United States of America
- Thomas C. Wiegers
- Department of Biology, North Carolina State University, Raleigh, North Carolina, United States of America
- Robin J. Johnson
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Jean M. Lay
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Kelley Lennon-Hopkins
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Cynthia Saraceni-Richards
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Daniela Sciaky
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Cynthia Grondin Murphy
- Department of Biology, North Carolina State University, Raleigh, North Carolina, United States of America
- Carolyn J. Mattingly
- Department of Biology, North Carolina State University, Raleigh, North Carolina, United States of America
|
19
|
Davis AP, Johnson RJ, Lennon-Hopkins K, Sciaky D, Rosenstein MC, Wiegers TC, Mattingly CJ. Targeted journal curation as a method to improve data currency at the Comparative Toxicogenomics Database. Database (Oxford) 2012; 2012:bas051. [PMID: 23221299 PMCID: PMC3515863 DOI: 10.1093/database/bas051] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and manually curate a triad of chemical–gene, chemical–disease and gene–disease interactions. Typically, articles for CTD are selected using a chemical-centric approach by querying PubMed to retrieve a corpus containing the chemical of interest. Although this technique ensures adequate coverage of knowledge about the chemical (i.e. data completeness), it does not necessarily reflect the most current state of all toxicological research in the community at large (i.e. data currency). Keeping databases current with the most recent scientific results, as well as providing a rich historical background from legacy articles, is a challenging process. To address this issue of data currency, CTD designed and tested a journal-centric approach of curation to complement our chemical-centric method. We first identified priority journals based on defined criteria. Next, over 7 weeks, three biocurators reviewed 2425 articles from three consecutive years (2009–2011) of three targeted journals. From this corpus, 1252 articles contained relevant data for CTD and 52 752 interactions were manually curated. Here, we describe our journal selection process, two methods of document delivery for the biocurators and the analysis of the resulting curation metrics, including data currency, and both intra-journal and inter-journal comparisons of research topics. Based on our results, we expect that curation by select journals can (i) be easily incorporated into the curation pipeline to complement our chemical-centric approach; (ii) build content more evenly for chemicals, genes and diseases in CTD (rather than biasing data by chemicals-of-interest); (iii) reflect developing areas in environmental health and (iv) improve overall data currency for chemicals, genes and diseases. 
Database URL: http://ctdbase.org/
Affiliation(s)
- Allan Peter Davis
- Department of Biology, North Carolina State University, Raleigh, NC 27695-7617, USA.
|
20
|
Abstract
The Critical Assessment of Information Extraction systems in Biology (BioCreAtIvE) challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems for the biological domain. The 'BioCreative Workshop 2012' subcommittee identified three areas, or tracks, that comprised independent, but complementary aspects of data curation in which they sought community input: literature triage (Track I); curation workflow (Track II) and text mining/natural language processing (NLP) systems (Track III). Track I participants were invited to develop tools or systems that would effectively triage and prioritize articles for curation and present results in a prototype web interface. Training and test datasets were derived from the Comparative Toxicogenomics Database (CTD; http://ctdbase.org) and consisted of manuscripts from which chemical-gene-disease data were manually curated. A total of seven groups participated in Track I. For the triage component, the effectiveness of participant systems was measured by aggregate gene, disease and chemical 'named-entity recognition' (NER) across articles; the effectiveness of 'information retrieval' (IR) was also measured based on 'mean average precision' (MAP). Top recall scores for gene, disease and chemical NER were 49, 65 and 82%, respectively; the top MAP score was 80%. Each participating group also developed a prototype web interface; these interfaces were evaluated based on functionality and ease-of-use by CTD's biocuration project manager. In this article, we present a detailed description of the challenge and a summary of the results.
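The mean average precision (MAP) measure mentioned above can be sketched in Python as follows. This uses a common textbook definition in which average precision is normalized by the number of relevant articles that appear in the ranking; whether the challenge used exactly this normalization is an assumption, and the relevance flags are invented for illustration.

```python
def average_precision(ranked_relevance):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the per-ranking average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Hypothetical relevance flags for two ranked article lists (True = relevant for curation).
rankings = [[True, False, True], [False, True]]
map_score = mean_average_precision(rankings)
print(f"MAP = {map_score:.3f}")
```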
Affiliation(s)
- Thomas C Wiegers
- Department of Biology, North Carolina State University, Raleigh, NC 27695-7617, USA.
|
21
|
King BL, Davis AP, Rosenstein MC, Wiegers TC, Mattingly CJ. Ranking transitive chemical-disease inferences using local network topology in the comparative toxicogenomics database. PLoS One 2012; 7:e46524. [PMID: 23144783 PMCID: PMC3492369 DOI: 10.1371/journal.pone.0046524] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 05/20/2011] [Accepted: 09/05/2012] [Indexed: 11/19/2022] Open
Abstract
Exposure to chemicals in the environment is believed to play a critical role in the etiology of many human diseases. To enhance understanding about environmental effects on human health, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org) provides unique curated data that enable development of novel hypotheses about the relationships between chemicals and diseases. CTD biocurators read the literature and curate direct relationships between chemicals-genes, genes-diseases, and chemicals-diseases. These direct relationships are then computationally integrated to create additional inferred relationships; for example, a direct chemical-gene statement can be combined with a direct gene-disease statement to generate a chemical-disease inference (inferred via the shared gene). In CTD, the number of inferences has increased exponentially as the number of direct chemical, gene and disease interactions has grown. To help users navigate and prioritize these inferences for hypothesis development, we implemented a statistic to score and rank them based on the topology of the local network consisting of the chemical, disease and each of the genes used to make an inference. In this network, chemicals, diseases and genes are nodes connected by edges representing the curated interactions. Like other biological networks, node connectivity is an important consideration when evaluating the CTD network, as the connectivity of nodes follows the power-law distribution. Topological methods reduce the influence of highly connected nodes that are present in biological networks. We evaluated published methods that used local network topology to determine the reliability of protein-protein interactions derived from high-throughput assays. We developed a new metric that combines and weights two of these methods and uniquely takes into account the number of common neighbors and the connectivity of each entity involved. 
We present several CTD inferences as case studies to demonstrate the value of this metric and the biological relevance of the inferences.
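The inference step described in this abstract (joining a chemical-gene statement with a gene-disease statement through the shared gene) can be sketched as follows. This is a minimal illustration only: the identifiers are made up, and `common_neighbor_score` is a simple Jaccard-style stand-in that uses shared genes and node connectivity, not the published CTD metric.

```python
from collections import defaultdict

# Toy curated pairs (hypothetical identifiers, not real CTD records).
chem_gene = {("chemA", "gene1"), ("chemA", "gene2"), ("chemB", "gene2")}
gene_disease = {("gene1", "diseaseX"), ("gene2", "diseaseX"), ("gene2", "diseaseY")}


def infer_chemical_disease(chem_gene, gene_disease):
    """Join chemical-gene and gene-disease pairs on the shared gene."""
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    # (chemical, disease) -> set of genes supporting the inference
    inferences = defaultdict(set)
    for chem, gene in chem_gene:
        for disease in diseases_by_gene.get(gene, ()):
            inferences[(chem, disease)].add(gene)
    return dict(inferences)


def common_neighbor_score(chem, disease, shared_genes, chem_gene, gene_disease):
    """Illustrative score (assumption): shared genes / all genes touching either node."""
    chem_neighbors = {g for c, g in chem_gene if c == chem}
    disease_neighbors = {g for g, d in gene_disease if d == disease}
    return len(shared_genes) / len(chem_neighbors | disease_neighbors)


inferences = infer_chemical_disease(chem_gene, gene_disease)
scores = {
    pair: common_neighbor_score(pair[0], pair[1], genes, chem_gene, gene_disease)
    for pair, genes in inferences.items()
}
# Rank inferences from best to worst supported under the toy score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Dividing by the union of neighbors is one simple way to keep highly connected chemicals or diseases from dominating the ranking; the weighting used in the paper may differ.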
Affiliation(s)
- Benjamin L. King
  - Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Allan Peter Davis
  - North Carolina State University, Raleigh, North Carolina, United States of America
- Michael C. Rosenstein
  - Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Thomas C. Wiegers
  - North Carolina State University, Raleigh, North Carolina, United States of America
- Carolyn J. Mattingly
  - North Carolina State University, Raleigh, North Carolina, United States of America
22
Davis AP, Murphy CG, Johnson R, Lay JM, Lennon-Hopkins K, Saraceni-Richards C, Sciaky D, King BL, Rosenstein MC, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2013. Nucleic Acids Res 2012; 41:D1104-14. [PMID: 23093600 PMCID: PMC3531134 DOI: 10.1093/nar/gks994] [Citation(s) in RCA: 287] [Impact Index Per Article: 23.9] [Indexed: 01/24/2023]
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between environmental chemicals and gene products and their relationships to diseases. Chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature are integrated to generate expanded networks and predict many novel associations between different data types. CTD now contains over 15 million toxicogenomic relationships. To navigate this sea of data, we added several new features, including DiseaseComps (which finds comparable diseases that share toxicogenomic profiles), statistical scoring for inferred gene-disease and pathway-chemical relationships, filtering options for several tools to refine user analysis and our new Gene Set Enricher (which provides biological annotations that are enriched for gene sets). To improve data visualization, we added a Cytoscape Web view to our ChemComps feature, included color-coded interactions and created a 'slim list' for our MEDIC disease vocabulary (allowing diseases to be grouped for meta-analysis, visualization and better data management). CTD continues to promote interoperability with external databases by providing content and cross-links to their sites. Together, this wealth of expanded chemical-gene-disease data, combined with novel ways to analyze and view content, continues to help users generate testable hypotheses about the molecular mechanisms of environmental diseases.
Affiliation(s)
- Allan Peter Davis
  - Department of Biology, North Carolina State University, Raleigh, NC 27695-7617, USA.
23
Davis AP, Wiegers TC, Rosenstein MC, Mattingly CJ. MEDIC: a practical disease vocabulary used at the Comparative Toxicogenomics Database. Database (Oxford) 2012; 2012:bar065. [PMID: 22434833 PMCID: PMC3308155 DOI: 10.1093/database/bar065] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.8] [Indexed: 12/11/2022]
Abstract
The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators manually curate a triad of chemical–gene, chemical–disease and gene–disease relationships from the scientific literature. The CTD curation paradigm uses controlled vocabularies for chemicals, genes and diseases. To curate disease information, CTD first had to identify a source of controlled terms. Two resources seemed to be good candidates: the Online Mendelian Inheritance in Man (OMIM) and the ‘Diseases’ branch of the National Library of Medicine's Medical Subject Headings (MeSH). To maximize the advantages of both, CTD biocurators undertook a novel initiative to map the flat list of OMIM disease terms into the hierarchical nature of the MeSH vocabulary. The result is CTD’s ‘merged disease vocabulary’ (MEDIC), a unique resource that integrates OMIM terms, synonyms and identifiers with MeSH terms, synonyms, definitions, identifiers and hierarchical relationships. MEDIC is both a deep and broad vocabulary, composed of 9700 unique diseases described by more than 67 000 terms (including synonyms). It is freely available to download in various formats from CTD. While neither a true ontology nor a perfect solution, this vocabulary has nonetheless proved to be extremely successful and practical for our biocurators in generating over 2.5 million disease-associated toxicogenomic relationships in CTD. Other external databases have also begun to adopt MEDIC for their disease vocabulary. Here, we describe the construction, implementation, maintenance and use of MEDIC to raise awareness of this resource and to offer it as a putative scaffold in the formal construction of an official disease ontology. Database URL: http://ctd.mdibl.org/voc.go?type=disease
Affiliation(s)
- Allan Peter Davis
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA.
24
Davis AP, Rosenstein MC, Wiegers TC, Mattingly CJ. DiseaseComps: a metric that discovers similar diseases based upon common toxicogenomic profiles at CTD. Bioinformation 2011; 7:154-6. [PMID: 22125387 PMCID: PMC3220301 DOI: 10.6026/97320630007154] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/04/2011] [Accepted: 10/05/2011] [Indexed: 11/23/2022]
Abstract
The Comparative Toxicogenomics Database (CTD) is a free resource that describes chemical-gene-disease networks to help understand the effects of environmental exposures on human health. The database contains more than 13,500 chemical-disease and 14,200 gene-disease interactions. In CTD, chemicals and genes are associated with a disease via two types of relationships: as a biomarker or molecular mechanism for the disease (M-type) or as a real or putative therapy for the disease (T-type). We leveraged these curated datasets to compute similarity indices that can be used to produce lists of comparable diseases ("DiseaseComps") based upon shared toxicogenomic profiles. This new metric now classifies diseases with common molecular characteristics, instead of the traditional approach of using histology or tissue of origin to define the disorder. In the dawning era of "personalized medicine", this feature provides a new way to view and describe diseases and will help develop testable hypotheses about chemical-gene-disease networks. Availability: The database is available for free at http://ctd.mdibl.org/
Affiliation(s)
- Allan Peter Davis
  - Department of Bioinformatics, the Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Michael C Rosenstein
  - Department of Bioinformatics, the Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Thomas Conrad Wiegers
  - Department of Bioinformatics, the Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
- Carolyn J Mattingly
  - Department of Bioinformatics, the Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
25
Davis AP, Wiegers TC, Rosenstein MC, Murphy CG, Mattingly CJ. The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database. Database (Oxford) 2011; 2011:bar034. [PMID: 21933848 PMCID: PMC3176677 DOI: 10.1093/database/bar034] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 12/29/2022]
Abstract
The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and convert free-text information into a structured format using official nomenclature, integrating third party controlled vocabularies for chemicals, genes, diseases and organisms, and a novel controlled vocabulary for molecular interactions. Manual curation produces a robust, richly annotated dataset of highly accurate and detailed information. Currently, CTD describes over 349 000 molecular interactions between 6800 chemicals, 20 900 genes (for 330 organisms) and 4300 diseases that have been manually curated from over 25 400 peer-reviewed articles. These manually curated data are further integrated with other third party data (e.g. Gene Ontology, KEGG and Reactome annotations) to generate a wealth of toxicogenomic relationships. Here, we describe our approach to manual curation that uses a powerful and efficient paradigm involving mnemonic codes. This strategy allows biocurators to quickly capture detailed information from articles by generating simple statements using codes to represent the relationships between data types. The paradigm is versatile, expandable, and able to accommodate new data challenges that arise. We have incorporated this strategy into a web-based curation tool to further increase efficiency and productivity, implement quality control in real-time and accommodate biocurators working remotely. Database URL: http://ctd.mdibl.org
Affiliation(s)
- Allan Peter Davis
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
26
Davis AP, King BL, Mockus S, Murphy CG, Saraceni-Richards C, Rosenstein M, Wiegers T, Mattingly CJ. The Comparative Toxicogenomics Database: update 2011. Nucleic Acids Res 2010; 39:D1067-72. [PMID: 20864448 PMCID: PMC3013756 DOI: 10.1093/nar/gkq813] [Citation(s) in RCA: 183] [Impact Index Per Article: 13.1] [Indexed: 12/16/2022]
Abstract
The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the interaction of environmental chemicals with gene products, and their effects on human health. Biocurators at CTD manually curate a triad of chemical–gene, chemical–disease and gene–disease relationships from the literature. These core data are then integrated to construct chemical–gene–disease networks and to predict many novel relationships using different types of associated data. Since 2009, we dramatically increased the content of CTD to 1.4 million chemical–gene–disease data points and added many features, statistical analyses and analytical tools, including GeneComps and ChemComps (to find comparable genes and chemicals that share toxicogenomic profiles), enriched Gene Ontology terms associated with chemicals, statistically ranked chemical–disease inferences, Venn diagram tools to discover overlapping and unique attributes of any set of chemicals, genes or disease, and enhanced gene pathway data content, among other features. Together, this wealth of expanded chemical–gene–disease data continues to help users generate testable hypotheses about the molecular mechanisms of environmental diseases. CTD is freely available at http://ctd.mdibl.org.
Affiliation(s)
- Allan Peter Davis
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
27
Davis AP, Murphy CG, Saraceni-Richards CA, Rosenstein MC, Wiegers TC, Hampton TH, Mattingly CJ. GeneComps and ChemComps: a new CTD metric to identify genes and chemicals with shared toxicogenomic profiles. Bioinformation 2009; 4:173-4. [PMID: 20198196 PMCID: PMC2825594 DOI: 10.6026/97320630004173] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/10/2009] [Accepted: 10/13/2009] [Indexed: 11/23/2022]
Abstract
The Comparative Toxicogenomics Database is a public resource that promotes understanding about the effects of environmental chemicals on human health. Currently, CTD describes over 184,000 molecular interactions for more than 5,100 chemicals and 16,300 genes/proteins. We have leveraged this dataset of chemical-gene relationships to compute similarity indices following the statistical method of the Jaccard index. These scores are used to produce lists of comparable genes ("GeneComps") or chemicals ("ChemComps") based on shared toxicogenomic profiles. GeneComps and ChemComps are now provided for every curated gene and chemical in CTD. ChemComps are particularly significant because they provide a way to group chemicals based upon their biological effects, instead of their physical or structural properties. These metrics provide a novel way to view and classify genes and chemicals and will help advance testable hypotheses about environmental chemical-gene-disease networks. Availability: CTD is freely available at http://ctd.mdibl.org/
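The Jaccard index mentioned above is the size of the intersection of two sets divided by the size of their union. A minimal sketch of how it could rank comparable chemicals by shared interacting genes is shown below; the profiles and the `comps` helper are hypothetical and are not CTD's implementation.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B|, or 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical chemical -> interacting-gene profiles (not real CTD data).
profiles = {
    "chemA": {"gene1", "gene2", "gene3"},
    "chemB": {"gene2", "gene3", "gene4"},
    "chemC": {"gene5"},
}


def comps(query, profiles):
    """Rank every other entity by Jaccard similarity to the query profile."""
    scored = [
        (other, jaccard_index(profiles[query], genes))
        for other, genes in profiles.items()
        if other != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


chem_comps = comps("chemA", profiles)
```

Swapping the roles of chemicals and genes in `profiles` (gene -> interacting chemicals) would give the analogous gene-to-gene ranking.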
Affiliation(s)
- Allan Peter Davis
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA.
28
Wiegers TC, Davis AP, Cohen KB, Hirschman L, Mattingly CJ. Text mining and manual curation of chemical-gene-disease networks for the comparative toxicogenomics database (CTD). BMC Bioinformatics 2009; 10:326. [PMID: 19814812 PMCID: PMC2768719 DOI: 10.1186/1471-2105-10-326] [Citation(s) in RCA: 100] [Impact Index Per Article: 6.7] [Received: 05/26/2009] [Accepted: 10/08/2009] [Indexed: 11/11/2022]
Abstract
Background: The Comparative Toxicogenomics Database (CTD) is a publicly available resource that promotes understanding about the etiology of environmental diseases. It provides manually curated chemical-gene/protein interactions and chemical- and gene-disease relationships from the peer-reviewed, published literature. The goals of the research reported here were to establish a baseline analysis of current CTD curation, develop a text-mining prototype from readily available open source components, and evaluate its potential value in augmenting curation efficiency and increasing data coverage. Results: Prototype text-mining applications were developed and evaluated using a CTD data set consisting of manually curated molecular interactions and relationships from 1,600 documents. Preliminary results indicated that the prototype found 80% of the gene, chemical, and disease terms appearing in curated interactions. These terms were used to re-rank documents for curation, resulting in increases in mean average precision (63% for the baseline vs. 73% for a rule-based re-ranking), and in the correlation coefficient of rank vs. number of curatable interactions per document (baseline 0.14 vs. 0.38 for the rule-based re-ranking). Conclusion: This text-mining project is unique in its integration of existing tools into a single workflow with direct application to CTD. We performed a baseline assessment of the inter-curator consistency and coverage in CTD, which allowed us to measure the potential of these integrated tools to improve prioritization of journal articles for manual curation. Our study presents a feasible and cost-effective approach for developing a text mining solution to enhance manual curation throughput and efficiency.
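Mean average precision, the ranking measure reported in this abstract, can be illustrated with a short sketch. The relevance lists below are invented, and the normalization assumes every curatable document appears somewhere in the ranking; the study's exact evaluation setup may differ.

```python
def average_precision(ranked_relevance):
    """Average of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-ranking average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical rankings: True marks a document judged curatable.
baseline = [False, True, False, True]   # relevant documents at ranks 2 and 4
reranked = [True, True, False, False]   # same documents moved to ranks 1 and 2

ap_baseline = average_precision(baseline)
ap_reranked = average_precision(reranked)
overall = mean_average_precision([baseline, reranked])
```

Moving curatable documents toward the top of the list raises average precision, which is the effect the re-ranking comparison measures.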
Affiliation(s)
- Thomas C Wiegers
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME, USA
- Allan Peter Davis
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME, USA
- K Bretonnel Cohen
  - Center for Computational Pharmacology, University of Colorado School of Medicine, Aurora, CO, USA
  - Information Technology Center, The MITRE Corporation, 202 Burlington Road, Bedford, MA, USA
- Lynette Hirschman
  - Information Technology Center, The MITRE Corporation, 202 Burlington Road, Bedford, MA, USA
- Carolyn J Mattingly
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME, USA
29
Davis AP, Murphy CG, Saraceni-Richards CA, Rosenstein MC, Wiegers TC, Mattingly CJ. Comparative Toxicogenomics Database: a knowledgebase and discovery tool for chemical-gene-disease networks. Nucleic Acids Res 2008; 37:D786-92. [PMID: 18782832 PMCID: PMC2686584 DOI: 10.1093/nar/gkn580] [Citation(s) in RCA: 215] [Impact Index Per Article: 13.4] [Indexed: 11/12/2022]
Abstract
The Comparative Toxicogenomics Database (CTD) is a curated database that promotes understanding about the effects of environmental chemicals on human health. Biocurators at CTD manually curate chemical–gene interactions, chemical–disease relationships and gene–disease relationships from the literature. This strategy allows data to be integrated to construct chemical–gene–disease networks. CTD is unique in numerous respects: curation focuses on environmental chemicals; interactions are manually curated; interactions are constructed using controlled vocabularies and hierarchies; additional gene attributes (such as Gene Ontology, taxonomy and KEGG pathways) are integrated; data can be viewed from the perspective of a chemical, gene or disease; results and batch queries can be downloaded and saved; and most importantly, CTD acts as both a knowledgebase (by reporting data) and a discovery tool (by generating novel inferences). Over 116 000 interactions between 3900 chemicals and 13 300 genes have been curated from 270 species, and 5900 gene–disease and 2500 chemical–disease direct relationships have been captured. By integrating these data, 350 000 gene–disease relationships and 77 000 chemical–disease relationships can be inferred. This wealth of chemical–gene–disease information yields testable hypotheses for understanding the effects of environmental chemicals on human health. CTD is freely available at http://ctd.mdibl.org.
Affiliation(s)
- Allan Peter Davis
  - Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
30
Mattingly CJ, Rosenstein MC, Davis AP, Colby GT, Forrest JN, Boyer JL. The comparative toxicogenomics database: a cross-species resource for building chemical-gene interaction networks. Toxicol Sci 2006; 92:587-95. [PMID: 16675512 PMCID: PMC1586111 DOI: 10.1093/toxsci/kfl008] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.8] [Indexed: 11/14/2022]
Abstract
Chemicals in the environment play a critical role in the etiology of many human diseases. Despite their prevalence, the molecular mechanisms of action and the effects of chemicals on susceptibility to disease are not well understood. To promote understanding of these mechanisms, the Comparative Toxicogenomics Database (CTD; http://ctd.mdibl.org/) presents scientifically reviewed and curated information on chemicals, relevant genes and proteins, and their interactions in vertebrates and invertebrates. CTD integrates sequence, reference, species, microarray, and general toxicology information to provide a unique centralized resource for toxicogenomic research. The database also provides visualization capabilities that enable cross-species comparisons of gene and protein sequences. These comparisons will facilitate understanding of structure-function correlations and the genetic basis of susceptibility. Manual curation and integration of cross-species chemical-gene and chemical-protein interactions from the literature are now underway. These data will provide information for building complex interaction networks. New CTD features include (1) cross-species gene, rather than sequence, query and visualization capabilities; (2) integrated cross-links to microarray data from chemicals, genes, and sequences in CTD; (3) a reference set related to chemical-gene and protein interactions identified by an information retrieval system; and (4) a "Chemicals in the News" initiative that provides links from CTD chemicals to environmental health articles from the popular press. Here we describe these new features and our novel cross-species curation of chemical-gene and chemical-protein interactions.
Affiliation(s)
- Carolyn J Mattingly
  - Department of Bioinformatics, Mount Desert Island Biological Laboratory, Salisbury Cove, Maine 04672, USA.
31
Sessions BR, Aston KI, Davis AP, Pate BJ, White KL. Effects of amino acid substitutions in and around the arginine-glycine-aspartic acid (RGD) sequence on fertilization and parthenogenetic development in mature bovine oocytes. Mol Reprod Dev 2006; 73:651-7. [PMID: 16493691 DOI: 10.1002/mrd.20462] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
Abstract
Integrins have been shown to be involved in the process of fertilization and many integrin-ligand interactions are mediated through the recognition of an arginine-glycine-aspartic acid (RGD) sequence. Despite the fact that the RGD domain is a principal player in determining the functional characteristics of an adhesive protein, increasing evidence has accumulated implicating the amino acids flanking the RGD sequence in determining the functional properties of the RGD-containing protein. A set of linear peptides in which the amino acid sequence in and around the RGD tri-peptide was modified was synthesized to better understand the specificity of the RGD-receptor interaction. Mature oocytes were fertilized in vitro in the presence of RGD-containing and RGD-modified peptides. Both the RGD-containing and RGD-modified peptides impaired the ability of sperm to fertilize bovine oocytes, illustrated by a reduction in cleavage. The linear modified RGD-containing peptides were also examined for their ability to induce parthenogenetic development with the objective of providing a linear RGD peptide with greater biological activity than the one (GRGDSPK) used previously (Campbell et al., 2000). The data demonstrate the specificity of the receptor for the RGD sequence, further implicate the involvement of integrins in the process of bovine fertilization, and illustrate the importance of the amino acids surrounding the RGD sequence in determining the binding and functional properties of RGD-containing peptides. The data support the findings that a linear RGD peptide can block fertilization and that amino acids around the RGD sequence have an impact on the biological activity of the receptor.
Affiliation(s)
- B R Sessions
  - Department of Animal, Dairy, and Veterinary Sciences and Center for Integrated Biosystems, Utah State University, Logan, Utah, USA
32
Hsieh CH, Davis AP. Multiple-event study of bioretention for treatment of urban storm water runoff. Water Sci Technol 2005; 51:177-181. [PMID: 15850188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Bioretention is a novel best management practice for urban storm water, employed to minimize the impact of urban runoff during storm events. Bioretention consists of porous media layers that can remove pollutants from infiltrating runoff via mechanisms that include adsorption, precipitation, and filtration. However, the effectiveness of bioretention in treating repetitive inputs of runoff has not been investigated. In this study, a bioretention test column was set up and experiments proceeded once every week for a total of 12 tests. Through all 12 repetitions, the infiltration rate remained constant (0.35 cm/min). All 12 tests demonstrated excellent removal efficiency for TSS, oil/grease, and lead (99%). For total phosphorus, the removal efficiency was about 47% the system removal efficiency ranged from 2.3% to 23%. Effluent nitrate concentration became higher than the influent concentration during the first 28 days and removal efficiency ranged from 9% to 20% afterward. Some degree of denitrification was apparently proceeding in the bioretention system. Overall, the top mulch layer filtered most of TSS in the runoff and prevented the bioretention media from clogging during 12 repetitions. Runoff quality was improved by the bioretention column.
Affiliation(s)
- C H Hsieh
  - Department of Civil and Environmental Engineering, University of Maryland, College Park, MD 20742, USA.
33
Affiliation(s)
- A J Ayling
  - School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
34
Abstract
The yeast RAD52 gene is essential for homology-dependent repair of DNA double-strand breaks. In vitro, Rad52 binds to single- and double-stranded DNA and promotes annealing of complementary single-stranded DNA. Genetic studies indicate that the Rad52 and Rad59 proteins act in the same recombination pathway either as a complex or through overlapping functions. Here we demonstrate physical interaction between Rad52 and Rad59 using the yeast two-hybrid system and co-immunoprecipitation from yeast extracts. Purified Rad59 efficiently anneals complementary oligonucleotides and is able to overcome the inhibition to annealing imposed by replication protein A (RPA). Although Rad59 has strand-annealing activity by itself in vitro, this activity is insufficient to promote strand annealing in vivo in the absence of Rad52. The rfa1-D288Y allele partially suppresses the in vivo strand-annealing defect of rad52 mutants, but this is independent of RAD59. These results suggest that in vivo Rad59 is unable to compete with RPA for single-stranded DNA and therefore is unable to promote single-strand annealing. Instead, Rad59 appears to augment the activity of Rad52 in strand annealing.
Affiliation(s)
- A P Davis
  - Department of Microbiology and Institute of Cancer Research, Columbia University College of Physicians and Surgeons, New York, New York 10032, USA
35
Abstract
Competitive photocatalytic oxidation (PCO) of mixtures of Cu(II)-EDTA and Cd(II)-EDTA was studied with variation of molar ratio of these two complexes (1 x 10(-4):0, 8 x 10(-5):2 x 10(-5), 5 x 10(-5):5 x 10(-5), 2 x 10(-5):8 x 10(-5), 0:1 x 10(-4) M) and in the pH range of 4-8. PCO rates for each compound can be described using a combined aqueous + adsorbed pathway: -dC/dt = k1Caq(1 + k2Caq) + kadsCads. This expression is valid under both noncompetitive and competitive conditions. Differences in rates under competition result from differences in the partitioning of the two species between the TiO2 surface and the aqueous phase. Total initial complex degradation rates (rTT), obtained by summation of the total destruction rates for Cu(II)-EDTA and Cd(II)-EDTA, were relatively constant at pH 4 and 5 for all ratios. At these pH values, contribution of adsorbed pathways to rTT was important, and rates were similar to those of the aqueous phase pathways. From pH 6 to 8, the degree of adsorption, and thus the adsorbed pathway rate, diminished. Through the adsorbed pathway, no difference in rate constants was found between Cu(II)-EDTA and Cd(II)-EDTA; Cd(II)-EDTA is somewhat more reactive through the aqueous phase pathway.
Affiliation(s)
- J K Yang
  - Department of Civil and Environmental Engineering, University of Maryland, College Park 20742, USA
36
Abstract
Urban stormwater runoff is being recognized as a substantial source of pollutants to receiving waters. A number of investigators have found significant levels of metals in runoff from urban areas, especially in highway runoff. As an initiatory study, this work estimates lead, copper, cadmium, and zinc loadings from various sources in a developed area utilizing information available in the literature, in conjunction with controlled experimental and sampling investigations. Specific sources examined include building siding and roofs; automobile brakes, tires, and oil leakage; and wet and dry atmospheric deposition. Important sources identified are building siding for all four metals, vehicle brake emissions for copper and tire wear for zinc. Atmospheric deposition is an important source for cadmium, copper, and lead. Loadings and source distributions depend on building and automobile density assumptions and the type of materials present in the area examined. Identified important sources are targeted for future comprehensive mechanistic studies. Improved information on the metal release and distributions from the specific sources, along with detailed characterization of watershed areas will allow refinements in the predictions.
Affiliation(s)
- A P Davis
  - Department of Civil and Environmental Engineering, University of Maryland, College Park 20742, USA.
37
Affiliation(s)
- D P Hill
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine 04609, USA.
38
Abstract
Dried waste slurry generated in seafood processing factories has been shown to be an effective adsorbent for the removal of heavy metals from dilute solutions. Characterization of the sludge surface with a scanning electron microscope and X-ray microanalyzer was carried out to evaluate the components on the sludge surface that are related to the adsorption of metal ions. Aluminum and calcium, as well as organic carbon, are distributed on the surface of the sludge. Alkalimetric titration was used to characterize the surface acidity of the sludge sample. The surface acidity constants, pKa1s and pKa2s, were 5.80 and 9.55, respectively. Batch as well as dynamic adsorption studies were conducted with 10(-5) to 5 x 10(-3) M Cu(II) and Cd(II). A surface complexation model with the diffuse layer model successfully predicted Cu(II) and Cd(II) removals in single metal solutions. Predictions of sorption in binary-adsorbate systems based on single-adsorbate data fits represented competitive sorption data reasonably well over a wide range of conditions. The breakthrough capacity found from column studies was different for each metal ion and the data reflect the order of metal affinity for the adsorbent material very well.
Collapse
Affiliation(s)
- S M Lee
- Department of Environmental Engineering, Kwandong University, Yangyang 215-800, South Korea.
|
39
|
Abstract
Adsorption of metals from aqueous solution onto oxide and other surfaces is known to affect trace metal transport in many natural and engineered systems. It is therefore important to understand whether dissolved metal inputs will be easily bound to particles or will be strongly complexed in solution and transported with the water phase. The effect of poly(acrylic acid) (PAA), representing a model compound for natural organic matter, on the adsorption of Cd(II) onto gamma-Al2O3 was determined using batch adsorption experiments over a pH range from 4 to 10. Initially, interactions among the individual components were evaluated. Cadmium adsorption onto alumina showed a typical S-shaped metal adsorption curve. PAA adsorption onto gamma-Al2O3 decreased with increase in pH. The affinity of PAA for Cd2+ increased strongly with pH. In ternary systems, the presence of PAA resulted in an enhancement of Cd(II) adsorption below pH 6, apparently due to ternary surface complex formation. Above pH 6, a decrease in cadmium adsorption onto gamma-Al2O3 was observed resulting from an increase in the concentration of soluble Cd-PAA complexes. Overall, results indicate that the presence of natural organic matter could have a significant impact on the distribution and mobility of cadmium in the environment. Simple surface complexation modeling was insufficient to describe behavior in the ternary systems due to the complexity of the PAA polymer.
Affiliation(s)
- R M Floroiu
- Department of Civil and Environmental Engineering, University of Maryland, College Park 20742, USA
|
40
|
Davis AP. Comparison of the gross motor function measure and paediatric evaluation of disability inventory in assessing motor function in children undergoing selective dorsal rhizotomy. Pediatr Phys Ther 2001; 13:91-2. [PMID: 17053662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
|
41
|
Abstract
Urban stormwater runoff contains a broad range of pollutants that are transported to natural water systems. A practice known as biological retention (bioretention) has been suggested to manage stormwater runoff from small, developed areas. Bioretention facilities consist of porous soil, a topping layer of hardwood mulch, and a variety of different plant species. A detailed study of the characteristics and performance of bioretention systems for the removal of several heavy metals (copper, lead, and zinc) and nutrients (phosphorus, total Kjeldahl nitrogen [TKN], ammonium, and nitrate) from a synthetic urban stormwater runoff was completed using batch and column adsorption studies along with pilot-scale laboratory systems. The roles of the soil, mulch, and plants in the removal of heavy metals and nutrients were evaluated to estimate the treatment capacity of laboratory bioretention systems. Reductions in concentrations of all metals were excellent (> 90%) with specific metal removals of 15 to 145 mg/m2 per event. Moderate reductions of TKN, ammonium, and phosphorus levels were found (60 to 80%). Little nitrate was removed, and nitrate production was noted in several cases. The importance of the mulch layer in metal removal was identified. Overall results support the use of bioretention as a stormwater best management practice and indicate the need for further research and development.
Affiliation(s)
- A P Davis
- Department of Civil and Environmental Engineering, University of Maryland, College Park, MD 20742, USA
|
42
|
Hobson GM, Davis AP, Stowell NC, Kolodny EH, Sistermans EA, de Coo IF, Funanage VL, Marks HG. Mutations in noncoding regions of the proteolipid protein gene in Pelizaeus-Merzbacher disease. Neurology 2000; 55:1089-96. [PMID: 11071483 DOI: 10.1212/wnl.55.8.1089] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.7] [Indexed: 11/15/2022]
Abstract
BACKGROUND Pelizaeus-Merzbacher disease (PMD) is an X-linked recessive dysmyelinating disorder of the CNS. Duplications or point mutations in exons of the proteolipid protein (PLP) gene are found in most patients. OBJECTIVE To describe five patients with PMD who have mutations in noncoding regions of the PLP gene. METHODS Quantitative multiplex PCR and Southern blot analyses were used to detect duplication of the PLP gene, and DNA sequence analysis, including exon-intron borders, was used to detect mutation of the PLP gene. RESULTS Duplication of the PLP gene was ruled out, and mutations were identified in noncoding regions of five patients in four families with PMD. In two brothers with a severe form of PMD, a G to T transversion at IVS6+3 was detected. This mutation resulted in skipping of exon 6 in the PLP mRNA of cultured fibroblasts. A patient who developed nystagmus at 16 months and progressive spastic ataxia at 18 months was found to have a 19-base pair (bp) deletion of a G-rich region near the 5' end of intron 3 of the PLP gene. A patient with a T to C transition at IVS3+2 and a patient with an A to G transition at IVS3+4 have the classic form of PMD. These, like the 19-bp deletion, are in intron 3, which is involved in PLP/DM20 alternative splice site selection. CONCLUSIONS Mutations in introns of the PLP gene, even at positions that are not 100% conserved at splice sites, are an important cause of PMD.
Affiliation(s)
- G M Hobson
- Department of Research, Alfred I. duPont Hospital for Children, Wilmington, DE 19899, USA.
|
43
|
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000. [PMID: 10802651 DOI: 10.1038/75556.gene] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 05/10/2023]
Abstract
Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
Affiliation(s)
- M Ashburner
- Department of Genetics, Stanford University School of Medicine, California, USA.
|
44
|
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25-9. [PMID: 10802651 PMCID: PMC3037419 DOI: 10.1038/75556] [Citation(s) in RCA: 26081] [Impact Index Per Article: 1086.7] [Indexed: 11/10/2022]
Affiliation(s)
- M Ashburner
- Department of Genetics, Stanford University School of Medicine, California, USA.
|
45
|
Bai Y, Davis AP, Symington LS. A novel allele of RAD52 that causes severe DNA repair and recombination deficiencies only in the absence of RAD51 or RAD59. Genetics 1999; 153:1117-30. [PMID: 10545446 PMCID: PMC1460819 DOI: 10.1093/genetics/153.3.1117] [Citation(s) in RCA: 48] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022]
Abstract
With the use of an intrachromosomal inverted repeat as a recombination reporter, we have shown that mitotic recombination is dependent on the RAD52 gene, but reduced only fivefold by mutation of RAD51. RAD59, a component of the RAD51-independent pathway, was identified previously by screening for mutations that reduced inverted-repeat recombination in a rad51 strain. Here we describe a rad52 mutation, rad52R70K, that also reduced recombination synergistically in a rad51 background. The phenotype of the rad52R70K strain, which includes weak gamma-ray sensitivity, a fourfold reduction in the rate of inverted-repeat recombination, elevated allelic recombination, sporulation proficiency, and a reduction in the efficiency of mating-type switching and single-strand annealing, was similar to that observed for deletion of the RAD59 gene. However, rad52R70K rad59 double mutants showed synergistic defects in ionizing radiation resistance, sporulation, and mating-type switching. These results suggest that Rad52 and Rad59 have partially overlapping functions and that Rad59 can substitute for this function of Rad52 in a RAD51 rad52R70K strain.
Affiliation(s)
- Y Bai
- Department of Microbiology and Institute of Cancer Research, Columbia University, New York, New York 10032, USA
|
46
|
|
47
|
Abstract
Cu(II), EDTA, Cu(II)-EDTA, Cd(II)-EDTA, and Cu(II)/Cd(II) and Cu(II)-EDTA/Cd(II)-EDTA competitive adsorption onto TiO2 has been studied with variation of pH and concentration. For Cu(II) and EDTA, typical cationic and anionic types of adsorption are noted, respectively. Ligand-type adsorption is found for Cu(II)-EDTA and Cd(II)-EDTA under both single and competitive conditions. Surface complexation modeling considered inner-sphere complexation and the diffuse layer model employing MINTEQA2; surface complexes used include Ti-(OH2)O-Cu+, Ti-(OH)EDTAH-22, Ti-(OH)EDTA-Cu-2, and Ti-(OH)EDTA-Cd-2. Experimental and model predictions suggest no competitive adsorption between Cu(II) and Cd(II) at 5 x 10^-5 M. On the other hand, adsorption data and model predictions indicate that Cd(II)-EDTA adsorption is favored over that of Cu(II)-EDTA with some competition for adsorption sites. Cd(II)-EDTA adsorption was only slightly affected by the presence of Cu(II)-EDTA; however, Cu(II)-EDTA adsorption was strongly influenced by the presence of Cd(II)-EDTA, especially as the molar ratio of Cd(II)-EDTA/Cu(II)-EDTA increased. A modified surface complexation constant for Cd(II)-EDTA is required to explain the competitive data, suggesting surface site heterogeneity. Copyright 1999 Academic Press.
Affiliation(s)
- JK Yang
- Department of Civil and Environmental Engineering, University of Maryland, College Park, Maryland, 20742
|
48
|
Abstract
The ability to rapidly and reliably genotype mice is an important concern. Traditional methods employ labour intensive and time consuming techniques such as test crossing, gel electrophoresis or nucleic acid hybridization. Here we show that a new molecular biology workstation, the WAVE DNA Fragment Analysis System, can easily resolve polymerase chain reaction (PCR) products that have small differences in their lengths. Analysis is fully automated and takes less than 7 min per sample. Approximately 200 samples can be analysed per day with only minutes of hands-on time after completion of the PCR. Genotyping with the WAVE DNA Fragment Analysis System is a fast and efficient method with minimal manual intervention.
Affiliation(s)
- A Kuklin
- Transgenomic, Inc., 2032 Concourse Drive, San Jose, CA 95131, USA
|
49
|
Affiliation(s)
- A P Davis
- Life Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-8080, USA
|
50
|
Affiliation(s)
- A P Davis
- Life Sciences Division, Oak Ridge National Laboratory, TN 37831-8080, USA.
|