1
|
Deans AR, Nastasi LF, Davis C. GallOnt: An ontology for plant gall phenotypes. Biodivers Data J 2024; 12:e128585. [PMID: 39229384 PMCID: PMC11369494 DOI: 10.3897/bdj.12.e128585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Accepted: 08/18/2024] [Indexed: 09/05/2024] Open
Abstract
Galls are novel plant structures that develop in response to select biotic stressors. These structures, extended phenotypes of the inducer, usually serve to protect and feed the inducer or its progeny. This life history strategy has evolved dozens of times, and tens of thousands of species - including many bacteria, fungi, nematodes, mites and insects - are capable of manipulating plants in this way. The variation in gall phenotypes is extraordinary across species but usually predictable for each species of inducer. We introduce here a new ontology, GallOnt, that facilitates consistent descriptions and the semantic representation of and reasoning over plant gall phenotype data. GallOnt was largely developed from ontologies in the Open Biological and Biomedical Ontology (OBO) Foundry and stands to connect plant gall phenotypes to knowledge derived from model plant systems, including genotype-phenotype and agricultural research. We also introduce the idea of a new gall data standard - Minimum Information for the Description of Galls (MIDG version 0.1) - as a starting point for discussions regarding cecidology best practices.
Affiliation(s)
- Andrew R Deans
- Frost Entomological Museum, The Pennsylvania State University, University Park, United States of America
| | - Louis Frank Nastasi
- Frost Entomological Museum, The Pennsylvania State University, University Park, United States of America
| | - Charles Davis
- Frost Entomological Museum, The Pennsylvania State University, University Park, United States of America
| |
|
2
|
Ramsay M, Crampin AC, Bawah AA, Gitau E, Herbst K. The Value Proposition of Coordinated Population Cohorts Across Africa. Annu Rev Biomed Data Sci 2024; 7:277-294. [PMID: 39178423 DOI: 10.1146/annurev-biodatasci-020722-015026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2024]
Abstract
Building longitudinal population cohorts in Africa for coordinated research and surveillance can influence the setting of national health priorities, lead to the introduction of appropriate interventions, and provide evidence for targeted treatment, leading to better health across the continent. However, compared to cohorts from the global north, longitudinal continental African population cohorts remain scarce, are relatively small in size, and lack data complexity. As infections and noncommunicable diseases disproportionately affect Africa's approximately 1.4 billion inhabitants, African cohorts present a unique opportunity for research and surveillance. High genetic diversity in African populations and multiomic research studies, together with detailed phenotyping and clinical profiling, will be a treasure trove for discovery. The outcomes, including novel drug targets, biological pathways for disease, and gene-environment interactions, will boost precision medicine approaches, not only in Africa but across the globe.
Affiliation(s)
- Michèle Ramsay
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - Amelia C Crampin
- Malawi Epidemiology and Intervention Research Unit, Lilongwe, Malawi
| | - Ayaga A Bawah
- Regional Institute for Population Studies, University of Ghana, Accra, Ghana
| | - Evelyn Gitau
- African Population and Health Research Center, Nairobi, Kenya
| | - Kobus Herbst
- Africa Health Research Institute, Durban, South Africa
- South African Population Research Infrastructure Network, Department of Science and Innovation and South African Medical Research Council, Durban, South Africa
| |
|
3
|
Montanaro G, Balhoff JP, Girón JC, Söderholm M, Tarasov S. Computable species descriptions and nanopublications: applying ontology-based technologies to dung beetles (Coleoptera, Scarabaeinae). Biodivers Data J 2024; 12:e121562. [PMID: 38912113 PMCID: PMC11190572 DOI: 10.3897/bdj.12.e121562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Accepted: 05/22/2024] [Indexed: 06/25/2024] Open
Abstract
Background: Taxonomy has long struggled with analysing vast amounts of phenotypic data due to computational and accessibility challenges. Ontology-based technologies provide a framework for modelling semantic phenotypes that are understandable by computers and compliant with FAIR principles. In this paper, we explore the use of Phenoscript, an emerging language designed for creating semantic phenotypes, to produce computable species descriptions. Our case study centers on the application of this approach to dung beetles (Coleoptera, Scarabaeinae). New information: We illustrate the effectiveness of Phenoscript for creating semantic phenotypes. We also demonstrate the ability of the Phenospy python package to automatically translate Phenoscript descriptions into natural language (NL), which eliminates the need for writing traditional NL descriptions. We introduce a computational pipeline that streamlines the generation of semantic descriptions and their conversion to NL. To demonstrate the power of the semantic approach, we apply simple semantic queries to the generated phenotypic descriptions. This paper addresses the current challenges in crafting semantic species descriptions and outlines the path towards future improvements. Furthermore, we discuss the promising integration of semantic phenotypes and nanopublications, as emerging methods for sharing scientific information. Overall, our study highlights the pivotal role of ontology-based technologies in modernising taxonomy and aligning it with the evolving landscape of big data analysis and FAIR principles.
Affiliation(s)
- Giulio Montanaro
- Finnish Museum of Natural History, University of Helsinki, Helsinki, Finland
| | - James P. Balhoff
- RENCI, University of North Carolina, Chapel Hill, North Carolina, United States of America
| | - Jennifer C. Girón
- Museum of Texas Tech University, Texas, United States of America
| | - Max Söderholm
- Finnish Museum of Natural History, University of Helsinki, Helsinki, Finland
| | - Sergei Tarasov
- Finnish Museum of Natural History, University of Helsinki, Helsinki, Finland
| |
|
4
|
Nikolski M, Hovig E, Al-Shahrour F, Blomberg N, Scollen S, Valencia A, Saunders G. Roadmap for a European cancer data management and precision medicine infrastructure. Nat Cancer 2024; 5:367-372. [PMID: 38321342 DOI: 10.1038/s43018-023-00717-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2024]
Affiliation(s)
- Macha Nikolski
- University of Bordeaux, CNRS-IBGC, UMR 5095, Bordeaux, France
- University of Bordeaux, Bordeaux Bioinformatics Center CBiB, Bordeaux, France
| | - Eivind Hovig
- Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
- Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
| | - Fatima Al-Shahrour
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | | | - Serena Scollen
- ELIXIR Hub, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Alfonso Valencia
- Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain
- ICREA, Barcelona, Spain
| | - Gary Saunders
- ELIXIR Hub, Wellcome Genome Campus, Hinxton, Cambridge, UK
- EATRIS-ERIC, Amsterdam, the Netherlands
| |
|
5
|
Girón JC, Tarasov S, González Montaña LA, Matentzoglu N, Smith AD, Koch M, Boudinot BE, Bouchard P, Burks R, Vogt L, Yoder M, Osumi-Sutherland D, Friedrich F, Beutel RG, Mikó I. Formalizing Invertebrate Morphological Data: A Descriptive Model for Cuticle-Based Skeleto-Muscular Systems, an Ontology for Insect Anatomy, and their Potential Applications in Biodiversity Research and Informatics. Syst Biol 2023; 72:1084-1100. [PMID: 37094905 DOI: 10.1093/sysbio/syad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Revised: 04/17/2023] [Accepted: 04/21/2023] [Indexed: 04/26/2023] Open
Abstract
The spectacular radiation of insects has produced a stunning diversity of phenotypes. During the past 250 years, research on insect systematics has generated hundreds of terms for naming and comparing them. In its current form, this terminological diversity is presented in natural language and lacks formalization, which prohibits computer-assisted comparison using semantic web technologies. Here we propose a Model for Describing Cuticular Anatomical Structures (MoDCAS) which incorporates structural properties and positional relationships for standardized, consistent, and reproducible descriptions of arthropod phenotypes. We applied the MoDCAS framework in creating the ontology for the Anatomy of the Insect Skeleto-Muscular system (AISM). The AISM is the first general insect ontology that aims to cover all taxa by providing generalized, fully logical, and queryable, definitions for each term. It was built using the Ontology Development Kit (ODK), which maximizes interoperability with Uberon (Uberon multispecies anatomy ontology) and other basic ontologies, enhancing the integration of insect anatomy into the broader biological sciences. A template system for adding new terms, extending, and linking the AISM to additional anatomical, phenotypic, genetic, and chemical ontologies is also introduced. 
The AISM is proposed as the backbone for taxon-specific insect ontologies and has potential applications spanning systematic biology and biodiversity informatics, allowing users to: 1) use controlled vocabularies and create semiautomated computer-parsable insect morphological descriptions; 2) integrate insect morphology into broader fields of research, including ontology-informed phylogenetic methods, logical homology hypothesis testing, evo-devo studies, and genotype to phenotype mapping; and 3) automate the extraction of morphological data from the literature, enabling the generation of large-scale phenomic data, by facilitating the production and testing of informatic tools able to extract, link, annotate, and process morphological data. This descriptive model and its ontological applications will allow for clear and semantically interoperable integration of arthropod phenotypes in biodiversity studies.
Affiliation(s)
- Jennifer C Girón
- Department of Entomology, Purdue University, West Lafayette, IN, USA
- Natural Science Research Laboratory, Museum of Texas Tech University, Lubbock, TX, USA
| | - Sergei Tarasov
- Finnish Museum of Natural History, University of Helsinki, Pohjoinen Rautatiekatu 13, FI-00014 Helsinki, Finland
| | | | | | - Aaron D Smith
- Department of Entomology, Purdue University, West Lafayette, IN, USA
| | - Markus Koch
- Institute of Evolutionary Biology and Ecology, University of Bonn, An der Immenburg 1, 53121 Bonn, Germany
| | - Brendon E Boudinot
- Department of Entomology & Nematology, University of California, Davis, One Shields Ave, CA, USA
- Institut für Zoologie und Evolutionsforschung, Friedrich-Schiller-Universität Jena, Erbertstraße 1, 07743 Jena, Germany
- Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington DC, USA
| | - Patrice Bouchard
- Biodiversity and Bioresources, Canadian National Collection of Insects, Arachnids and Nematodes, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, Ontario, K1A 0C6, Canada
| | - Roger Burks
- Entomology Department, University of California, Riverside, 900 University Ave. Riverside, CA, USA
| | - Lars Vogt
- TIB Leibniz Information Centre for Science and Technology, Welfengarten 1B, 30167 Hannover, Germany
| | - Matthew Yoder
- Illinois Natural History Survey, University of Illinois, Champaign, IL, USA
| | | | - Frank Friedrich
- Institut für Zell- und Systembiologie der Tiere, Universität Hamburg, Martin-Luther-King-Platz 3, 20146, Hamburg, Germany
| | - Rolf G Beutel
- Institut für Zoologie und Evolutionsforschung, Friedrich-Schiller-Universität Jena, Erbertstraße 1, 07743 Jena, Germany
| | - István Mikó
- Department of Biological Sciences, University of New Hampshire, Durham, NH, USA
| |
|
6
|
Grams M, Richter S. On the four complementary aspects of hierarchical character relationships and their bearing on scoring constraints, expressed in a new syntax for character dependencies. Cladistics 2023; 39:437-455. [PMID: 37428134 DOI: 10.1111/cla.12550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2022] [Revised: 06/02/2023] [Accepted: 06/03/2023] [Indexed: 07/11/2023] Open
Abstract
Morphological matrices, including the conceptualization of characters and character states and scoring thereof, still are a valuable and necessary tool for phylogenetic analyses. Although they are often seen only as numerically simplified summaries of observations for the purpose of cladistic analyses, they also hold value as collections of ideas, concepts and the current state of knowledge, conveying various hypotheses on character state identity, homology and evolutionary transformations. A common and persistent issue in scoring and analysing morphological matrices is the phenomenon of inapplicable characters ("inapplicables"). Inapplicables result from the ontological dependency (based on hierarchical relationships) between characters. Traditionally handled the same as "missing data", inapplicables were shown to be problematic in holding the potential to result in unreasonable algorithmic preference for certain cladograms over others. Recently, though, this problem has been solved by approaching parsimony as a maximization of homology rather than a minimization of transformational steps. We herein aim to further improve our theoretical understanding of the underlying hierarchical nature of morphological characters, which causes the phenomenon of ontological dependencies and, thereby, inapplicables. As a result, we present a discussion of various character-dependency scenarios and a new concept of hierarchical character relationships as being composed of four complementary sub-aspects. Building on this, a new syntax for the designation of character dependencies as part of the character statement is proposed, to help identify and apply scoring constraints for manual and automated scoring of morphological character matrices and their cladistic analysis.
Affiliation(s)
- Markus Grams
- Universität Rostock Institut für Biowissenschaften, Allgemeine & Spezielle Zoologie, Rostock, Germany
| | - Stefan Richter
- Universität Rostock Institut für Biowissenschaften, Allgemeine & Spezielle Zoologie, Rostock, Germany
| |
|
7
|
Alghamdi SM, Hoehndorf R. Improving the classification of cardinality phenotypes using collections. J Biomed Semantics 2023; 14:9. [PMID: 37550716 PMCID: PMC10405428 DOI: 10.1186/s13326-023-00290-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 07/07/2023] [Indexed: 08/09/2023] Open
Abstract
MOTIVATION: Phenotypes are observable characteristics of an organism and they can be highly variable. Information about phenotypes is collected in a clinical context to characterize disease, and is also collected in model organisms and stored in model organism databases where they are used to understand gene functions. Phenotype data is also used in computational data analysis and machine learning methods to provide novel insights into disease mechanisms and support personalized diagnosis of disease. For mammalian organisms and in a clinical context, ontologies such as the Human Phenotype Ontology and the Mammalian Phenotype Ontology are widely used to formally and precisely describe phenotypes. We specifically analyze axioms pertaining to phenotypes of collections of entities within a body, and we find that some of the axioms in phenotype ontologies lead to inferences that may not accurately reflect the underlying biological phenomena. RESULTS: We reformulate the phenotypes of collections of entities using an ontological theory of collections. By reformulating phenotypes of collections in phenotype ontologies, we avoid potentially incorrect inferences pertaining to the cardinality of these collections. We apply our method to two phenotype ontologies and show that the reformulation not only removes some problematic inferences but also quantitatively improves biological data analysis.
Affiliation(s)
- Sarah M Alghamdi
- Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- King Abdul-Aziz University, Faculty of Computing and Information Technology, 25732, Rabigh, Saudi Arabia
| | - Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
| |
|
8
|
Wong AW, Tran KC, Binka M, Janjua NZ, Sbihi H, Russell JA, Carlsten C, Levin A, Ryerson CJ. Use of latent class analysis and patient reported outcome measures to identify distinct long COVID phenotypes: A longitudinal cohort study. PLoS One 2023; 18:e0286588. [PMID: 37267379 DOI: 10.1371/journal.pone.0286588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 05/18/2023] [Indexed: 06/04/2023] Open
Abstract
OBJECTIVES: We sought to 1) identify long COVID phenotypes based on patient reported outcome measures (PROMs) and 2) determine whether the phenotypes were associated with quality of life (QoL) and/or lung function. METHODS: This was a longitudinal cohort study of hospitalized and non-hospitalized patients from March 2020 to January 2022 that was conducted across 4 Post-COVID Recovery Clinics in British Columbia, Canada. Latent class analysis was used to identify long COVID phenotypes using baseline PROMs (fatigue, dyspnea, cough, anxiety, depression, and post-traumatic stress disorder). We then explored the association between the phenotypes and QoL (using the EuroQoL 5 dimensions visual analogue scale [EQ5D VAS]) and lung function (using the diffusing capacity of the lung for carbon monoxide [DLCO]). RESULTS: There were 1,344 patients enrolled in the study (mean age 51 ± 15 years; 780 [58%] were females; 769 [57%] were of a non-White race). Three distinct long COVID phenotypes were identified: Class 1) fatigue and dyspnea, Class 2) anxiety and depression, and Class 3) fatigue, dyspnea, anxiety, and depression. Class 3 had a significantly lower EQ5D VAS at 3 (50 ± 19) and 6 months (54 ± 22) compared to Classes 1 and 2 (p < 0.001). The EQ5D VAS significantly improved between 3 and 6 months for Class 1 (median difference of 6.0 [95% CI, 4.0 to 8.0]) and Class 3 (median difference of 5.0 [95% CI, 0 to 8.5]). There were no differences in DLCO between the classes. CONCLUSIONS: There were 3 distinct long COVID phenotypes with different outcomes in QoL between 3 and 6 months after symptom onset. These phenotypes suggest that long COVID is a heterogeneous condition with distinct subpopulations who may have different outcomes and warrant tailored therapeutic approaches.
Affiliation(s)
- Alyson W Wong
- Department of Medicine, University of British Columbia, Vancouver, Canada
- Centre for Heart Lung Innovation, St. Paul's Hospital, University of British Columbia, Vancouver, Canada
| | - Karen C Tran
- Division of General Internal Medicine, Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
| | - Mawuena Binka
- Data and Analytic Services, BC Centre for Disease Control, Vancouver, British Columbia, Canada
- School of Population and Public Health, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Naveed Z Janjua
- Data and Analytic Services, BC Centre for Disease Control, Vancouver, British Columbia, Canada
- School of Population and Public Health, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Hind Sbihi
- Data and Analytic Services, BC Centre for Disease Control, Vancouver, British Columbia, Canada
- School of Population and Public Health, The University of British Columbia, Vancouver, British Columbia, Canada
| | - James A Russell
- Centre for Heart Lung Innovation, St. Paul's Hospital, University of British Columbia, Vancouver, Canada
| | - Christopher Carlsten
- Department of Medicine, University of British Columbia, Vancouver, Canada
- School of Population and Public Health, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Adeera Levin
- Department of Medicine, University of British Columbia, Vancouver, Canada
| | - Christopher J Ryerson
- Department of Medicine, University of British Columbia, Vancouver, Canada
- Centre for Heart Lung Innovation, St. Paul's Hospital, University of British Columbia, Vancouver, Canada
| |
|
9
|
Csősz S, Báthori F, Rádai Z, Herczeg G, Fisher BL. Comparing ant morphology measurements from microscope and online AntWeb.org 2D z-stacked images. Ecol Evol 2023; 13:e9897. [PMID: 36950369 PMCID: PMC10025076 DOI: 10.1002/ece3.9897] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/22/2022] [Revised: 02/17/2023] [Accepted: 02/21/2023] [Indexed: 03/22/2023] Open
Abstract
Unprecedented technological advances in digitization and the steadily expanding open-access digital repositories are yielding new opportunities to quickly and efficiently measure morphological traits without transportation and advanced/expensive microscope machinery. A prime example is the AntWeb.org database, which allows researchers from all over the world to study taxonomic, ecological, or evolutionary questions on the same ant specimens with ease. However, the reproducibility and reliability of morphometric data deduced from AntWeb compared to traditional microscope measurements has not yet been tested. Here, we compared 12 morphological traits of 46 Temnothorax ant specimens measured either directly by stereomicroscope on physical specimens or via the widely used open-access software tpsDig utilizing AntWeb digital images. We employed a complex statistical framework to test several aspects of reproducibility and reliability between the methods. We estimated (i) the agreement between the measurement methods and (ii) the trait value dependence of the agreement, then (iii) compared the coefficients of variation produced by the different methods, and finally, (iv) tested for systematic bias between the methods in a mixed modeling-based statistical framework. The stereomicroscope measurements were extremely precise. Our comparisons showed that agreement between the two methods was exceptionally high, without trait value dependence. Furthermore, the coefficients of variation did not differ between the methods. However, we found systematic bias in eight traits: apart from one trait where software measurements overestimated the microscopic measurements, the former underestimated the latter. Our results shed light on the fact that relying solely on the level of agreement between methods can be highly misleading. 
In our case, even though the software measurements predicted microscope measurements very well, replacing traditional microscope measurements with software measurements, and especially mixing data collected by the different methods, might result in erroneous conclusions. We provide guidance on the best way to utilize virtual specimens (2D z-stacked images) as a source of morphometric data, emphasizing the method's limitations in certain fields and applications.
Affiliation(s)
- Sándor Csősz
- ELKH‐ELTE‐MTM Integrative Ecology Research Group, Budapest, Hungary
| | - Ferenc Báthori
- Department of Systematic Zoology and Ecology, Institute of Biology, ELTE‐Eötvös Loránd University, Budapest, Hungary
| | - Zoltán Rádai
- Lendület Seed Ecology Research Group, Institute of Ecology and Botany, Centre for Ecological Research, Vácrátót, Hungary
| | - Gábor Herczeg
- ELKH‐ELTE‐MTM Integrative Ecology Research Group, Budapest, Hungary
- Department of Systematic Zoology and Ecology, Institute of Biology, ELTE‐Eötvös Loránd University, Budapest, Hungary
| | - Brian L. Fisher
- Entomology, California Academy of Sciences, San Francisco, California, USA
| |
|
10
|
Hines HM, Kilpatrick SK, Mikó I, Snellings D, López-Uribe MM, Tian L. The diversity, evolution, and development of setal morphologies in bumble bees (Hymenoptera: Apidae: Bombus spp.). PeerJ 2022; 10:e14555. [PMID: 36573237 PMCID: PMC9789693 DOI: 10.7717/peerj.14555] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2022] [Accepted: 11/21/2022] [Indexed: 12/24/2022] Open
Abstract
Bumble bees are characterized by their thick setal pile that imparts aposematic color patterns often used for species-level identification. Like all bees, the single-celled setae of bumble bees are branched, an innovation thought important for pollen collection. To date no studies have quantified the types of setal morphologies and their distribution on these bees, information that can facilitate understanding of their adaptive ecological function. This study defines several major setal morphotypes in the common eastern bumble bee Bombus impatiens Cresson, revealing these setal types differ by location across the body. The positions of these types of setae are similar across individuals, castes, and sexes within species. We analyzed the distribution of the two most common setal types (plumose and spinulate) across the body dorsum of half of the described bumble bee species. This revealed consistently high density of plumose (long-branched) setae across bumble bees on the head and mesosoma, but considerable variation in the amount of metasomal plumosity. Variation on the metasoma shows strong phylogenetic signal at subgeneric and smaller group levels, making it a useful trait for species delimitation research, and plumosity has increased from early Bombus ancestors. The distribution of these setal types suggests these setae may serve several functions, including pollen-collecting and thermoregulatory roles, and probable mechanosensory functions. This study further examines how and when setae of the pile develop, evidence for mechanosensory function, and the timing of pigmentation as a foundation for future genetic and developmental research in these bees.
Affiliation(s)
- Heather M. Hines
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States
- Department of Entomology, Pennsylvania State University, University Park, Pennsylvania, United States
| | - Shelby Kerrin Kilpatrick
- Department of Entomology, Pennsylvania State University, University Park, Pennsylvania, United States
- Department of Entomology, Texas A & M University, College Station, Texas, United States
| | - István Mikó
- Department of Entomology, Pennsylvania State University, University Park, Pennsylvania, United States
- Department of Biological Sciences, University of New Hampshire, Durham, New Hampshire, United States
| | - Daniel Snellings
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States
- Division of Genetics & Genomics, Boston Children’s Hospital, Boston, Massachusetts, United States
| | - Margarita M. López-Uribe
- Department of Entomology, Pennsylvania State University, University Park, Pennsylvania, United States
| | - Li Tian
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States
- Department of Entomology, China Agricultural University, Beijing, China
| |
|
11
|
Zeng C, Bastarache LA, Tao R, Venner E, Hebbring S, Andujar JD, Bland ST, Crosslin DR, Pratap S, Cooley A, Pacheco JA, Christensen KD, Perez E, Zawatsky CLB, Witkowski L, Zouk H, Weng C, Leppig KA, Sleiman PMA, Hakonarson H, Williams MS, Luo Y, Jarvik GP, Green RC, Chung WK, Gharavi AG, Lennon NJ, Rehm HL, Gibbs RA, Peterson JF, Roden DM, Wiesner GL, Denny JC. Association of Pathogenic Variants in Hereditary Cancer Genes With Multiple Diseases. JAMA Oncol 2022; 8:835-844. [PMID: 35446370 PMCID: PMC9026237 DOI: 10.1001/jamaoncol.2022.0373] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 12/19/2022]
Abstract
Importance: Knowledge about the spectrum of diseases associated with hereditary cancer syndromes may improve disease diagnosis and management for patients and help to identify high-risk individuals. Objective: To identify phenotypes associated with hereditary cancer genes through a phenome-wide association study. Design, Setting, and Participants: This phenome-wide association study used health data from participants in 3 cohorts. The Electronic Medical Records and Genomics Sequencing (eMERGEseq) data set recruited predominantly healthy individuals from 10 US medical centers from July 16, 2016, through February 18, 2018, with a mean (SD) follow-up through electronic health records (EHRs) of 12.7 (7.4) years. The UK Biobank (UKB) cohort recruited participants from March 15, 2006, through August 1, 2010, with a mean (SD) follow-up of 12.4 (1.0) years. The Hereditary Cancer Registry (HCR) recruited patients undergoing clinical genetic testing at Vanderbilt University Medical Center from May 1, 2012, through December 31, 2019, with a mean (SD) follow-up through EHRs of 8.8 (6.5) years. Exposures: Germline variants in 23 hereditary cancer genes. Pathogenic and likely pathogenic variants for each gene were aggregated for association analyses. Main Outcomes and Measures: Phenotypes in the eMERGEseq and HCR cohorts were derived from the linked EHRs. Phenotypes in UKB were from multiple sources of health-related data. Results: A total of 214 020 participants were identified, including 23 544 in the eMERGEseq cohort (mean [SD] age, 47.8 [23.7] years; 12 611 [53.6%] women), 187 234 in the UKB cohort (mean [SD] age, 56.7 [8.1] years; 104 055 [55.6%] women), and 3242 in the HCR cohort (mean [SD] age, 52.5 [15.5] years; 2851 [87.9%] women). All 38 established gene-cancer associations were replicated, and 19 new associations were identified.
These included the following 7 associations with neoplasms: CHEK2 with leukemia (odds ratio [OR], 3.81 [95% CI, 2.64-5.48]) and plasma cell neoplasms (OR, 3.12 [95% CI, 1.84-5.28]), ATM with gastric cancer (OR, 4.27 [95% CI, 2.35-7.44]) and pancreatic cancer (OR, 4.44 [95% CI, 2.66-7.40]), MUTYH (biallelic) with kidney cancer (OR, 32.28 [95% CI, 6.40-162.73]), MSH6 with bladder cancer (OR, 5.63 [95% CI, 2.75-11.49]), and APC with benign liver/intrahepatic bile duct tumors (OR, 52.01 [95% CI, 14.29-189.29]). The remaining 12 associations with nonneoplastic diseases included BRCA1/2 with ovarian cysts (OR, 3.15 [95% CI, 2.22-4.46] and 3.12 [95% CI, 2.36-4.12], respectively), MEN1 with acute pancreatitis (OR, 33.45 [95% CI, 9.25-121.02]), APC with gastritis and duodenitis (OR, 4.66 [95% CI, 2.61-8.33]), and PTEN with chronic gastritis (OR, 15.68 [95% CI, 6.01-40.92]). Conclusions and Relevance: The findings of this genetic association study analyzing the EHRs of 3 large cohorts suggest that these new phenotypes associated with hereditary cancer genes may facilitate early detection and better management of cancers. This study highlights the potential benefits of using EHR data in genomic medicine.
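For readers unfamiliar with the OR [95% CI] notation used in the abstract above, the following is a minimal sketch of how an unadjusted odds ratio and a Wald-type 95% confidence interval can be computed from a 2x2 carrier-by-phenotype table. It is not the method of the cited study, which the abstract does not specify; the function name `odds_ratio_ci` and the counts are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: carriers with the phenotype      b: carriers without it
    c: non-carriers with the phenotype  d: non-carriers without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the cited study.
estimate, lower, upper = odds_ratio_ci(a=20, b=80, c=100, d=1800)
print(f"OR, {estimate:.2f} [95% CI, {lower:.2f}-{upper:.2f}]")
```

Very wide intervals such as the one reported for MUTYH and kidney cancer typically arise when one of the four cells is small, because the `1 / a` terms dominate the standard error.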
Affiliation(s)
- Chenjie Zeng
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
| | - Lisa A Bastarache
- Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Ran Tao
- Department of Biostatistics, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Eric Venner
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
| | - Scott Hebbring
- Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, Wisconsin
| | - Justin D Andujar
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Clinical and Translational Hereditary Cancer Program, Division of Genetic Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University, Nashville, Tennessee
| | - Sarah T Bland
- Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - David R Crosslin
- Department of Biomedical Informatics and Medical Education, University of Washington School of Medicine, Seattle
| | - Siddharth Pratap
- School of Graduate Studies and Research, Meharry Medical College, Nashville, Tennessee
| | - Ayorinde Cooley
- Department of Microbiology, Immunology and Physiology, Meharry Medical College, Nashville, Tennessee
| | - Jennifer A Pacheco
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
| | - Kurt D Christensen
- PRecisiOn Medicine Translational Research (PROMoTeR) Center, Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, Massachusetts; Department of Population Medicine, Harvard Medical School, Boston, Massachusetts
| | - Emma Perez
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts
| | - Carrie L Blout Zawatsky
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts
| | - Leora Witkowski
- Centre Universitaire de Santé McGill, McGill University Health Centre, Montreal, Quebec, Canada
| | - Hana Zouk
- Laboratory for Molecular Medicine, Partners Healthcare Personalized Medicine, Cambridge, Massachusetts; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston
| | - Chunhua Weng
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York
| | - Kathleen A Leppig
- Genetic Services and Kaiser Permanente Washington Health Research Institute, Kaiser Permanente of Washington, Seattle
| | - Patrick M A Sleiman
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania; Division of Human Genetics, Department of Pediatrics, The University of Pennsylvania Perelman School of Medicine, Philadelphia
| | - Hakon Hakonarson
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania; Division of Human Genetics, Department of Pediatrics, The University of Pennsylvania Perelman School of Medicine, Philadelphia
| | - Marc S Williams
- Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
| | - Yuan Luo
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
| | - Gail P Jarvik
- Department of Medicine (Medical Genetics), University of Washington, Seattle; Department of Genome Sciences, University of Washington, Seattle
| | - Robert C Green
- Brigham and Women's Hospital, Broad Institute, Ariadne Labs and Harvard Medical School, Boston, Massachusetts
| | - Wendy K Chung
- Department of Pediatrics, Columbia University, New York, New York; Department of Medicine, Columbia University, New York, New York
| | - Ali G Gharavi
- Division of Nephrology, Department of Medicine, Columbia University Irving Medical Center, New York, New York; Center for Precision Medicine and Genomics, Department of Medicine, Columbia University Irving Medical Center, New York, New York
| | - Niall J Lennon
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Heidi L Rehm
- Medical & Population Genetics Program and Genomics Platform, Broad Institute of MIT and Harvard Cambridge, Cambridge, Massachusetts; Center for Genomic Medicine, Massachusetts General Hospital, Boston; Department of Pathology, Harvard Medical School, Boston, Massachusetts
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas
| | - Josh F Peterson
- Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Dan M Roden
- Center for Precision Medicine, Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee; Divisions of Cardiovascular Medicine and Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Department of Pharmacology, Vanderbilt University, Nashville, Tennessee
| | - Georgia L Wiesner
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Clinical and Translational Hereditary Cancer Program, Division of Genetic Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University, Nashville, Tennessee
| | - Joshua C Denny
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
|
12
|
Dhombres F, Morgan P, Chaudhari BP, Filges I, Sparks TN, Lapunzina P, Roscioli T, Agarwal U, Aggarwal S, Beneteau C, Cacheiro P, Carmody LC, Collardeau‐Frachon S, Dempsey EA, Dufke A, Duyzend MH, el Ghosh M, Giordano JL, Glad R, Grinfelde I, Iliescu DG, Ladewig MS, Munoz‐Torres MC, Pollazzon M, Radio FC, Rodo C, Silva RG, Smedley D, Sundaramurthi JC, Toro S, Valenzuela I, Vasilevsky NA, Wapner RJ, Zemet R, Haendel MA, Robinson PN. Prenatal phenotyping: A community effort to enhance the Human Phenotype Ontology. Am J Med Genet C Semin Med Genet 2022; 190:231-242. [PMID: 35872606 PMCID: PMC9588534 DOI: 10.1002/ajmg.c.31989] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/14/2022] [Accepted: 07/01/2022] [Indexed: 01/07/2023]
Abstract
Technological advances in both genome sequencing and prenatal imaging are increasing our ability to accurately recognize and diagnose Mendelian conditions prenatally. Phenotype-driven early genetic diagnosis of fetal genetic disease can help to strategize treatment options and clinical preventive measures during the perinatal period, to plan in utero therapies, and to inform parental decision-making. Fetal phenotypes of genetic diseases are often unique and at present are not well understood; more comprehensive knowledge about prenatal phenotypes and computational resources have an enormous potential to improve diagnostics and translational research. The Human Phenotype Ontology (HPO) has been widely used to support diagnostics and translational research in human genetics. To better support prenatal usage, the HPO consortium conducted a series of workshops with a group of domain experts in a variety of medical specialties, diagnostic techniques, as well as diseases and phenotypes related to prenatal medicine, including perinatal pathology, musculoskeletal anomalies, neurology, medical genetics, hydrops fetalis, craniofacial malformations, cardiology, neonatal-perinatal medicine, fetal medicine, placental pathology, prenatal imaging, and bioinformatics. We expanded the representation of prenatal phenotypes in HPO by adding 95 new phenotype terms under the Abnormality of prenatal development or birth (HP:0001197) grouping term, and revised definitions, synonyms, and disease annotations for most of the 152 terms that existed before the beginning of this effort. The expansion of prenatal phenotypes in HPO will support phenotype-driven prenatal exome and genome sequencing for precision genetic diagnostics of rare diseases to support prenatal care.
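As an illustration of what placing terms "under" a grouping term such as HP:0001197 means computationally, here is a minimal sketch that collects every descendant of a grouping term from child-to-parent `is_a` links. Only HP:0001197 comes from the abstract; every `EX:` identifier, the `IS_A` mapping, and the `descendants` helper are hypothetical placeholders and do not correspond to real HPO terms or to any HPO tooling.

```python
from collections import defaultdict

# Hypothetical child -> parents links. Only HP:0001197 is a real identifier
# (taken from the abstract); the EX: terms are invented for illustration.
IS_A = {
    "EX:child_a": ["HP:0001197"],
    "EX:child_b": ["HP:0001197"],
    "EX:grandchild": ["EX:child_a"],
    "EX:unrelated": ["EX:other_root"],
}


def descendants(is_a, root):
    """Return all terms reachable from `root` by following is_a links downward."""
    # Invert the child -> parents mapping into parent -> children.
    children = defaultdict(list)
    for child, parents in is_a.items():
        for parent in parents:
            children[parent].append(child)

    found = set()
    stack = [root]
    while stack:
        term = stack.pop()
        for child in children[term]:
            if child not in found:  # guard against revisiting shared subtrees
                found.add(child)
                stack.append(child)
    return found


prenatal_terms = descendants(IS_A, "HP:0001197")
print(sorted(prenatal_terms))
```

A query of this kind is what lets an annotation made with a specific prenatal term also be retrieved by searching on the broader grouping term.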
Affiliation(s)
- Ferdinand Dhombres
- Sorbonne University, GRC26, INSERM, Limics, Armand Trousseau Hospital, Fetal Medicine Department, APHP, Paris, France
| | - Patricia Morgan
- American College of Medical Genetics and Genomics, Newborn Screening Translational Research Network, Bethesda, Maryland, USA
| | - Bimal P. Chaudhari
- Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, Ohio, USA
| | - Isabel Filges
- University Hospital Basel and University of Basel, Medical Genetics, Basel, Switzerland
| | - Teresa N. Sparks
- Department of Obstetrics, Gynecology, & Reproductive Sciences, University of California, San Francisco, San Francisco, California, USA
| | - Pablo Lapunzina
- CIBERER and Hospital Universitario La Paz, INGEMM‐Institute of Medical and Molecular Genetics, Madrid, Spain
| | - Tony Roscioli
- Neuroscience Research Australia (NeuRA), University of New South Wales, Sydney, New South Wales, Australia
| | - Umber Agarwal
- Department of Maternal and Fetal Medicine, Liverpool Women's NHS Foundation Trust, Liverpool, UK
| | - Shagun Aggarwal
- Department of Medical Genetics, Nizam's Institute of Medical Sciences, Hyderabad, Telangana, India
| | - Claire Beneteau
- Service de Génétique Médicale, UF 9321 de Fœtopathologie et Génétique, CHU de Nantes, Nantes, France
| | - Pilar Cacheiro
- William Harvey Research Institute, Queen Mary University of London, London, UK
| | - Leigh C. Carmody
- Department of Genomic Medicine, The Jackson Laboratory, Farmington, Connecticut, USA
| | | | - Esther A. Dempsey
- St George's University of London, Molecular and Clinical Sciences Research Institute, London, UK
| | - Andreas Dufke
- University of Tübingen, Institute of Medical Genetics and Applied Genomics, Tübingen, Germany
| | | | | | - Jessica L. Giordano
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, New York, USA
| | - Ragnhild Glad
- Department of Obstetrics and Gynecology, University Hospital of North Norway, Tromsø, Norway
| | - Ieva Grinfelde
- Department of Medical Genetics and Prenatal diagnosis, Children's University Hospital, Riga, Latvia
| | - Dominic G. Iliescu
- Department of Obstetrics and Gynecology, University of Medicine and Pharmacy Craiova, Craiova, Dolj, Romania
| | - Markus S. Ladewig
- Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Saarland, Germany
| | - Monica C. Munoz‐Torres
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Marzia Pollazzon
- Azienda USL‐IRCCS di Reggio Emilia, Medical Genetics Unit, Reggio Emilia, Italy
| | | | - Carlota Rodo
- Vall d'Hebron Hospital Campus, Maternal & Fetal Medicine, Barcelona, Spain
| | - Raquel Gouveia Silva
- Hospital Santa Maria, Serviço de Genética, Departamento de Pediatria, Hospital de Santa Maria, Centro Hospitalar Universitário Lisboa Norte, Centro Académico de Medicina de Lisboa, Lisboa, Portugal
| | - Damian Smedley
- William Harvey Research Institute, Queen Mary University of London, London, UK
| | | | - Sabrina Toro
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Irene Valenzuela
- Hospital Vall d'Hebron, Clinical and Molecular Genetics Area, Barcelona, Spain
| | - Nicole A. Vasilevsky
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Ronald J. Wapner
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, New York, USA
| | - Roni Zemet
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
| | - Melissa A Haendel
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Peter N. Robinson
- Department of Genomic Medicine, The Jackson Laboratory, Farmington, Connecticut, USA
|
13
|
Farrell MJ, Brierley L, Willoughby A, Yates A, Mideo N. Past and future uses of text mining in ecology and evolution. Proc Biol Sci 2022; 289:20212721. [PMID: 35582795 PMCID: PMC9114983 DOI: 10.1098/rspb.2021.2721] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023] Open
Abstract
Ecology and evolutionary biology, like other scientific fields, are experiencing an exponential growth of academic manuscripts. As domain knowledge accumulates, scientists will need new computational approaches for identifying relevant literature to read and include in formal literature reviews and meta-analyses. Importantly, these approaches can also facilitate automated, large-scale data synthesis tasks and build structured databases from the information in the texts of primary journal articles, books, grey literature, and websites. The increasing availability of digital text, computational resources, and machine-learning-based language models has led to a revolution in text analysis and natural language processing (NLP) in recent years. NLP has been widely adopted across the biomedical sciences but is rarely used in ecology and evolutionary biology. Applying computational tools from text mining and NLP will increase the efficiency of data synthesis, improve the reproducibility of literature reviews, formalize analyses of research biases and knowledge gaps, and promote data-driven discovery of patterns across ecology and evolutionary biology. Here we present recent use cases from ecology and evolution, and discuss future applications, limitations and ethical issues.
Affiliation(s)
- Maxwell J. Farrell
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada
| | - Liam Brierley
- Department of Health Data Science, University of Liverpool, Liverpool, UK
| | - Anna Willoughby
- Odum School of Ecology, University of Georgia, Athens, GA, USA; Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, USA
| | - Andrew Yates
- University of Amsterdam, Amsterdam, The Netherlands
| | - Nicole Mideo
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada
|
14
|
Porto DS, Dahdul WM, Lapp H, Balhoff JP, Vision TJ, Mabee PM, Uyeda J. Assessing Bayesian Phylogenetic Information Content of Morphological Data Using Knowledge from Anatomy Ontologies. Syst Biol 2022; 71:1290-1306. [PMID: 35285502 PMCID: PMC9558846 DOI: 10.1093/sysbio/syac022] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/03/2021] [Revised: 02/09/2022] [Accepted: 03/05/2022] [Indexed: 11/18/2022] Open
Abstract
Morphology remains a primary source of phylogenetic information for many groups of organisms, and the only one for most fossil taxa. Organismal anatomy is not a collection of randomly assembled and independent “parts”, but instead a set of dependent and hierarchically nested entities resulting from ontogeny and phylogeny. How do we make sense of these dependent and at times redundant characters? One promising approach is using ontologies—structured controlled vocabularies that summarize knowledge about different properties of anatomical entities, including developmental and structural dependencies. Here, we assess whether evolutionary patterns can explain the proximity of ontology-annotated characters within an ontology. To do so, we measure phylogenetic information across characters and evaluate if it matches the hierarchical structure given by ontological knowledge—in much the same way as across-species diversity structure is given by phylogeny. We implement an approach to evaluate the Bayesian phylogenetic information (BPI) content and phylogenetic dissonance among ontology-annotated anatomical data subsets. We applied this to data sets representing two disparate animal groups: bees (Hexapoda: Hymenoptera: Apoidea, 209 chars) and characiform fishes (Actinopterygii: Ostariophysi: Characiformes, 463 chars). For bees, we find that BPI is not substantially explained by anatomy since dissonance is often high among morphologically related anatomical entities. For fishes, we find substantial information for two clusters of anatomical entities instantiating concepts from the jaws and branchial arch bones, but among-subset information decreases and dissonance increases substantially moving to higher-level subsets in the ontology. We further applied our approach to address particular evolutionary hypotheses with an example of morphological evolution in miniature fishes. 
While we show that phylogenetic information does match ontology structure for some anatomical entities, additional relationships and processes, such as convergence, likely play a substantial role in explaining BPI and dissonance, and merit future investigation. Our work demonstrates how complex morphological data sets can be interrogated with ontologies by allowing one to assess how information is spread hierarchically across anatomical concepts, how congruent this information is, and what sorts of processes may play a role in explaining it: phylogeny, development, or convergence. [Apidae; Bayesian phylogenetic information; Ostariophysi; Phenoscape; phylogenetic dissonance; semantic similarity.]
Affiliation(s)
- Diego S Porto
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, 926 West Campus Drive, Blacksburg, VA 24061, USA
| | - Wasila M Dahdul
- UCI Libraries, University of California, Irvine, Irvine, CA 92623, USA
- Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
| | - Hilmar Lapp
- Center for Genomic and Computational Biology, Duke University, 101 Science Drive, Durham, NC 27708, USA
| | - James P Balhoff
- Renaissance Computing Institute, University of North Carolina, 100 Europa Drive, Suite 540, Chapel Hill, NC 27517, USA
| | - Todd J Vision
- Department of Biology and School of Information and Library Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Paula M Mabee
- Department of Biology, University of South Dakota, 414 East Clark Street, Vermillion, SD 57069, USA
- Battelle, National Ecological Observatory Network, Boulder, CO 80301, USA
| | - Josef Uyeda
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, 926 West Campus Drive, Blacksburg, VA 24061, USA
|
15
|
Taylor RA, Fiellin D, D’Onofrio G, Venkatesh A. Identifying opioid-related electronic health record phenotypes for emergency care research and surveillance: An expert consensus driven concept mapping process. Subst Abuse 2022; 43:841-847. [DOI: 10.1080/08897077.2021.1975864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- R. Andrew Taylor
- Department of Emergency Medicine, Yale University School of Medicine, New Haven, CT, USA
| | - David Fiellin
- Department of Medicine, Yale University School of Medicine, New Haven, CT, USA
| | - Gail D’Onofrio
- Department of Emergency Medicine, Yale University School of Medicine, New Haven, CT, USA
| | - Arjun Venkatesh
- Department of Emergency Medicine, Yale University School of Medicine, New Haven, CT, USA
|
16
|
Tuggle CK, Clarke J, Dekkers JCM, Ertl D, Lawrence-Dill CJ, Lyons E, Murdoch BM, Scott NM, Schnable PS. The Agricultural Genome to Phenome Initiative (AG2PI): creating a shared vision across crop and livestock research communities. Genome Biol 2022; 23:3. [PMID: 34980221 PMCID: PMC8722016 DOI: 10.1186/s13059-021-02570-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/10/2022] Open
Affiliation(s)
| | | | | | - David Ertl
- Iowa Corn Growers Association, Johnston, USA
|
17
|
Lobo D. Formalizing Phenotypes of Regeneration. Methods Mol Biol 2022; 2450:663-679. [PMID: 35359335 PMCID: PMC9761515 DOI: 10.1007/978-1-0716-2172-1_36] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Regeneration experiments can produce complex phenotypes including morphological outcomes and gene expression patterns that are crucial for the understanding of the mechanisms of regeneration. However, due to their inherent complexity, variability between individuals, and heterogeneous data spreading across the literature, extracting mechanistic knowledge from them is a current challenge. Toward this goal, here we present protocols to unambiguously formalize the phenotypes of regeneration and their experimental procedures using precise mathematical morphological descriptions and standardized gene expression patterns. We illustrate the application of the methodology with step-by-step protocols for planaria and limb regeneration phenotypes. The curated datasets with these methods are not only helpful for human scientists, but they represent a key formalized resource that can be easily integrated into downstream reverse engineering methodologies for the automatic extraction of mechanistic knowledge. This approach can pave the way for discovering comprehensive systems-level models of regeneration.
Affiliation(s)
- Daniel Lobo
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD, USA.
|
18
|
Janko K, Bartoš O, Kočí J, Roslein J, Drdová EJ, Kotusz J, Eisner J, Mokrejš M, Štefková-Kašparová E. Genome Fractionation and Loss of Heterozygosity in Hybrids and Polyploids: Mechanisms, Consequences for Selection, and Link to Gene Function. Mol Biol Evol 2021; 38:5255-5274. [PMID: 34410426 PMCID: PMC8662595 DOI: 10.1093/molbev/msab249] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022] Open
Abstract
Hybridization and genome duplication have played crucial roles in the evolution of many animal and plant taxa. The subgenomes of parental species undergo considerable changes in hybrids and polyploids, which often selectively eliminate segments of one subgenome. However, the mechanisms underlying these changes are not well understood, particularly when the hybridization is linked with asexual reproduction that opens up unexpected evolutionary pathways. To elucidate this problem, we compared published cytogenetic and RNAseq data with exome sequences of asexual diploid and polyploid hybrids between three fish species: Cobitis elongatoides, C. taenia, and C. tanaitica. Clonal genomes remained generally static at chromosome-scale levels but their heterozygosity gradually deteriorated at the level of individual genes owing to allelic deletions and conversions. Interestingly, the impact of both processes varies among animals and genomic regions depending on ploidy level and the properties of affected genes. Namely, polyploids were more tolerant to deletions than diploid asexuals where conversions prevailed, and genomic restructuring events accumulated preferentially in genes characterized by high transcription levels and GC-content, strong purifying selection and specific functions like interacting with intracellular membranes. Although hybrids were phenotypically more similar to C. taenia, we found that they preferentially retained C. elongatoides alleles. This demonstrates that the favored subgenome is not necessarily the transcriptionally dominant one. This study demonstrated that subgenomes in asexual hybrids and polyploids evolve under a complex interplay of selection and several molecular mechanisms whose efficiency depends on the organism's ploidy level, as well as functional properties and parental ancestry of the genomic region.
Affiliation(s)
- Karel Janko
- Laboratory of Fish Genetics, Institute of Animal Physiology and Genetics of the Czech Academy of Sciences, Liběchov, Czech Republic
- Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
| | - Oldřich Bartoš
- Laboratory of Fish Genetics, Institute of Animal Physiology and Genetics of the Czech Academy of Sciences, Liběchov, Czech Republic
- Department of Zoology, Faculty of Science, Charles University, Prague, Czech Republic
| | - Jan Kočí
- Laboratory of Fish Genetics, Institute of Animal Physiology and Genetics of the Czech Academy of Sciences, Liběchov, Czech Republic
- Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
| | - Jan Roslein
- Laboratory of Fish Genetics, Institute of Animal Physiology and Genetics of the Czech Academy of Sciences, Liběchov, Czech Republic
- Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
| | - Edita Janková Drdová
- Institute of Experimental Botany, Academy of Sciences of the Czech Republic, Prague, Czech Republic
| | - Jan Kotusz
- Museum of Natural History, University of Wroclaw, Wroclaw, Poland
| | - Jan Eisner
- Department of Mathematics, Faculty of Science, University of South Bohemia in České Budějovice, České Budějovice, Czech Republic
| | - Martin Mokrejš
- Laboratory of Fish Genetics, Institute of Animal Physiology and Genetics of the Czech Academy of Sciences, Liběchov, Czech Republic
- IT4Innovations, VŠB—Technical University of Ostrava, Ostrava-Poruba, Czech Republic
| | - Eva Štefková-Kašparová
- Laboratory of Fish Genetics, Institute of Animal Physiology and Genetics of the Czech Academy of Sciences, Liběchov, Czech Republic
- Department of Genetics and Breeding, FAFNR, Czech University of Life Sciences Prague, Czech Republic
|
19
|
Billiet B, Amati-Bonneau P, Desquiret-Dumas V, Guehlouz K, Milea D, Gohier P, Lenaers G, Mirebeau-Prunier D, den Dunnen JT, Reynier P, Ferré M. NR2F1 database: 112 variants and 84 patients support refining the clinical synopsis of Bosch-Boonstra-Schaaf optic atrophy syndrome. Hum Mutat 2021; 43:128-142. [PMID: 34837429 DOI: 10.1002/humu.24305] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2021] [Revised: 10/12/2021] [Accepted: 11/16/2021] [Indexed: 11/09/2022]
Abstract
Pathogenic variants of the nuclear receptor subfamily 2 group F member 1 gene (NR2F1) are responsible for Bosch-Boonstra-Schaaf optic atrophy syndrome (BBSOAS), an autosomal dominant disorder characterized by optic atrophy associated with developmental delay and intellectual disability, but with a clinical presentation that appears to be multifaceted. We created the first public locus-specific database dedicated to NR2F1. All variants and clinical cases reported in the literature, as well as new unpublished cases, were integrated into the database using standard nomenclature to describe both molecular and phenotypic anomalies. We subsequently pursued a comprehensive approach based on computed representation and analysis, suggesting a refinement of the BBSOAS clinical description with respect to neurological features and the inclusion of additional signs of hypotonia and feeding difficulties. This database is fully accessible to both clinicians and molecular biologists and should prove useful in further refining the clinical synopsis of NR2F1 as new data are recorded.
Affiliation(s)
- Benjamin Billiet
- Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Patrizia Amati-Bonneau
- Unité MITOVASC, Équipe Mitolab, SFR ICAT, INSERM, CNRS, Université d'Angers, Angers, France; Laboratoire de Biochimie et Biologie moléculaire, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Valérie Desquiret-Dumas
- Unité MITOVASC, Équipe Mitolab, SFR ICAT, INSERM, CNRS, Université d'Angers, Angers, France; Laboratoire de Biochimie et Biologie moléculaire, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Khadidja Guehlouz
- Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Dan Milea
- Singapore Eye Research Institute, Singapore National Eye Centre, Duke-NUS, Singapore
| | - Philippe Gohier
- Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Guy Lenaers
- Unité MITOVASC, Équipe Mitolab, SFR ICAT, INSERM, CNRS, Université d'Angers, Angers, France
| | - Delphine Mirebeau-Prunier
- Unité MITOVASC, Équipe Mitolab, SFR ICAT, INSERM, CNRS, Université d'Angers, Angers, France; Laboratoire de Biochimie et Biologie moléculaire, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Johan T den Dunnen
- Department of Human Genetics, Department of Clinical Genetics, Leiden University Medical Centre, Leiden, The Netherlands
| | - Pascal Reynier
- Unité MITOVASC, Équipe Mitolab, SFR ICAT, INSERM, CNRS, Université d'Angers, Angers, France; Laboratoire de Biochimie et Biologie moléculaire, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Marc Ferré
- Unité MITOVASC, Équipe Mitolab, SFR ICAT, INSERM, CNRS, Université d'Angers, Angers, France
|
20
|
Vogt L. FAIR data representation in times of eScience: a comparison of instance-based and class-based semantic representations of empirical data using phenotype descriptions as example. J Biomed Semantics 2021; 12:20. [PMID: 34823588 PMCID: PMC8613519 DOI: 10.1186/s13326-021-00254-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/17/2021] [Accepted: 11/11/2021] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND: The size, velocity, and heterogeneity of Big Data outclasses conventional data management tools and requires data and metadata to be fully machine-actionable (i.e., eScience-compliant) and thus findable, accessible, interoperable, and reusable (FAIR). This can be achieved by using ontologies and through representing them as semantic graphs. Here, we discuss two different semantic graph approaches of representing empirical data and metadata in a knowledge graph, with phenotype descriptions as an example. Almost all phenotype descriptions are still being published as unstructured natural language texts, with far-reaching consequences for their FAIRness, substantially impeding their overall usability within the life sciences. However, with an increasing amount of anatomy ontologies becoming available and semantic applications emerging, a solution to this problem becomes available. Researchers are starting to document and communicate phenotype descriptions through the Web in the form of highly formalized and structured semantic graphs that use ontology terms and Uniform Resource Identifiers (URIs) to circumvent the problems connected with unstructured texts.
RESULTS: Using phenotype descriptions as an example, we compare and evaluate two basic representations of empirical data and their accompanying metadata in the form of semantic graphs: the class-based TBox semantic graph approach called Semantic Phenotype and the instance-based ABox semantic graph approach called Phenotype Knowledge Graph. Their main difference is that only the ABox approach allows for identifying every individual part and property mentioned in the description in a knowledge graph. This technical difference results in substantial practical consequences that significantly affect the overall usability of empirical data. The consequences affect findability, accessibility, and explorability of empirical data as well as their comparability, expandability, universal usability and reusability, and overall machine-actionability. Moreover, TBox semantic graphs often require querying under entailment regimes, which is computationally more complex.
CONCLUSIONS: We conclude that, from a conceptual point of view, the advantages of the instance-based ABox semantic graph approach outweigh its shortcomings and outweigh the advantages of the class-based TBox semantic graph approach. Therefore, we recommend the instance-based ABox approach as a FAIR approach for documenting and communicating empirical data and metadata in a knowledge graph.
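To make the contrast drawn in this abstract concrete, the following minimal sketch encodes one phenotype description both ways as plain Python tuples. It is an illustration only: every `ex:` identifier and both helper functions are hypothetical, none are taken from the paper or from a published ontology, and the tuples only loosely mimic RDF/OWL triples.

```python
# All "ex:" names are hypothetical and exist only for this illustration.

# Class-based (TBox) style: the description is a single class expression.
# The part ("a red head") appears only inside an anonymous restriction.
tbox_graph = [
    ("ex:RedHeadedSpecimen", "rdf:type", "owl:Class"),
    ("ex:RedHeadedSpecimen", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "ex:has_part"),
    ("_:r1", "owl:someValuesFrom", "ex:RedHead"),
]

# Instance-based (ABox) style: every part and quality is a named individual.
abox_graph = [
    ("ex:specimen_1", "rdf:type", "ex:Specimen"),
    ("ex:specimen_1", "ex:has_part", "ex:head_1"),
    ("ex:head_1", "rdf:type", "ex:Head"),
    ("ex:head_1", "ex:has_quality", "ex:color_1"),
    ("ex:color_1", "rdf:type", "ex:Red"),
]


def named_individuals(graph):
    """Subjects typed with a domain class, skipping blank nodes and OWL constructs."""
    return sorted({
        s for s, p, o in graph
        if p == "rdf:type" and not s.startswith("_:") and not o.startswith("owl:")
    })


def parts_of(graph, whole):
    """Directly asserted parts of `whole`; no reasoning or entailment is applied."""
    return sorted(o for s, p, o in graph if s == whole and p == "ex:has_part")


if __name__ == "__main__":
    print(named_individuals(tbox_graph))
    print(named_individuals(abox_graph))
    print(parts_of(abox_graph, "ex:specimen_1"))
```

In the class-based list the head and its colour occur only inside an anonymous restriction, so a plain lookup finds nothing to address individually. In the instance-based list each has its own identifier that further statements can point to.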
Affiliation(s)
- Lars Vogt
  - TIB Leibniz Information Centre for Science and Technology, Welfengarten 1B, 30167, Hanover, Germany
|
21
|
Tarasov S, Mikó I, Yoder MJ. ontoFAST: An R package for interactive and semi‐automatic annotation of characters with biological ontologies. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Sergei Tarasov
  - Finnish Museum of Natural History, Helsinki, Finland
  - National Institute for Mathematical and Biological Synthesis, University of Tennessee, Knoxville, TN, USA
|
22
|
Pleiotropy data resource as a primer for investigating co-morbidities/multi-morbidities and their role in disease. Mamm Genome 2021; 33:135-142. [PMID: 34524473 PMCID: PMC8913486 DOI: 10.1007/s00335-021-09917-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2021] [Accepted: 09/06/2021] [Indexed: 11/06/2022]
Abstract
Most current biomedical and protein research focuses only on a small proportion of genes, which results in a lost opportunity to identify new gene-disease associations and explore new opportunities for therapeutic intervention. The International Mouse Phenotyping Consortium (IMPC) focuses on elucidating gene function at scale for poorly characterized and/or under-studied genes. A key component of the IMPC initiative is the implementation of a broad phenotyping pipeline, which is facilitating the discovery of pleiotropy. Characterizing pleiotropy is essential to identify gene-disease associations, and it is of particular importance when elucidating the genetic causes of syndromic disorders. Here we show how the IMPC is effectively uncovering pleiotropy and how the new mouse models and gene function hypotheses generated by the IMPC are increasing our understanding of the mammalian genome, forming the basis of new research and identifying new gene-disease associations.
|
23
|
Guehlouz K, Foulonneau T, Amati-Bonneau P, Charif M, Colin E, Bris C, Desquiret-Dumas V, Milea D, Gohier P, Procaccio V, Bonneau D, den Dunnen JT, Lenaers G, Reynier P, Ferré M. ACO2 clinicobiological dataset with extensive phenotype ontology annotation. Sci Data 2021; 8:205. [PMID: 34354088 PMCID: PMC8342444 DOI: 10.1038/s41597-021-00984-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2020] [Accepted: 06/22/2021] [Indexed: 11/08/2022] Open
Abstract
Pathogenic variants of the aconitase 2 gene (ACO2) are responsible for a broad clinical spectrum involving optic nerve degeneration, ranging from isolated optic neuropathy with recessive or dominant inheritance, to complex neurodegenerative syndromes with recessive transmission. We created the first public locus-specific database (LSDB) dedicated to ACO2 within the "Global Variome shared LOVD" using exclusively the Human Phenotype Ontology (HPO), a standard vocabulary for describing phenotypic abnormalities. All the variants and clinical cases listed in the literature were incorporated into the database, from which we produced a dataset. We followed a rational and comprehensive approach based on the HPO thesaurus, demonstrating that ACO2 patients should not be classified separately between isolated and syndromic cases. Our data highlight that certain syndromic patients do not have optic neuropathy and provide support for the classification of the recurrent pathogenic variants c.220C>G and c.336C>G as likely pathogenic. Overall, our data records demonstrate that the clinical spectrum of ACO2 should be considered as a continuum of symptoms and refines the classification of some common variants.
Affiliation(s)
- Khadidja Guehlouz
  - Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
- Thomas Foulonneau
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
- Patrizia Amati-Bonneau
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
- Majida Charif
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Genetics, and immuno-cell therapy Team, Mohammed First University, Oujda, Morocco
- Estelle Colin
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
- Céline Bris
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
- Valérie Desquiret-Dumas
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
- Dan Milea
  - Singapore National Eye Centre, Singapore Eye Research Institute, Duke-NUS, Singapore
- Philippe Gohier
  - Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
- Vincent Procaccio
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
- Dominique Bonneau
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
- Johan T den Dunnen
  - Human Genetics and Clinical Genetics, Leiden University Medical Centre, Leiden, The Netherlands
- Guy Lenaers
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
- Pascal Reynier
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
  - Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
- Marc Ferré
  - Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
|
24
|
Liu L, Zhu S. Computational Methods for Prediction of Human Protein-Phenotype Associations: A Review. PHENOMICS (CHAM, SWITZERLAND) 2021; 1:171-185. [PMID: 36939789 PMCID: PMC9590544 DOI: 10.1007/s43657-021-00019-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/23/2021] [Revised: 06/05/2021] [Accepted: 06/16/2021] [Indexed: 12/01/2022]
Abstract
Deciphering the relationship between human proteins (genes) and phenotypes is one of the fundamental tasks in phenomics research. The Human Phenotype Ontology (HPO) builds upon a standardized logical vocabulary to describe the abnormal phenotypes encountered in human diseases and paves the way towards the computational analysis of their genetic causes. To date, many computational methods have been proposed to predict the HPO annotations of proteins. In this paper, we conduct a comprehensive review of the existing approaches to predicting HPO annotations of novel proteins, identifying missing HPO annotations, and prioritizing candidate proteins with respect to a certain HPO term. For each topic, we first give the formalized description of the problem, and then systematically revisit the published literatures highlighting their advantages and disadvantages, followed by the discussion on the challenges and promising future directions. In addition, we point out several potential topics to be worthy of exploration including the selection of negative HPO annotations and detecting HPO misannotations. We believe that this review will provide insight to the researchers in the field of computational phenotype analyses in terms of comprehending and developing novel prediction algorithms.
Affiliation(s)
- Lizhi Liu
  - School of Computer Science, Fudan University, Shanghai, 200433 China
- Shanfeng Zhu
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433 China
  - Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, 200433 China
  - MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433 China
  - Zhangjiang Fudan International Innovation Center, Shanghai, 200433 China
  - Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, 200433 China
|
25
|
Kulmanov M, Smaili FZ, Gao X, Hoehndorf R. Semantic similarity and machine learning with ontologies. Brief Bioinform 2021; 22:bbaa199. [PMID: 33049044 PMCID: PMC8293838 DOI: 10.1093/bib/bbaa199] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/07/2020] [Revised: 08/03/2020] [Accepted: 08/04/2020] [Indexed: 12/13/2022] Open
Abstract
Ontologies have long been employed in the life sciences to formally represent and reason over domain knowledge and they are employed in almost every major biological database. Recently, ontologies are increasingly being used to provide background knowledge in similarity-based analysis and machine learning models. The methods employed to combine ontologies and machine learning are still novel and actively being developed. We provide an overview over the methods that use ontologies to compute similarity and incorporate them in machine learning methods; in particular, we outline how semantic similarity measures and ontology embeddings can exploit the background knowledge in ontologies and how ontologies can provide constraints that improve machine learning models. The methods and experiments we describe are available as a set of executable notebooks, and we also provide a set of slides and additional resources at https://github.com/bio-ontology-research-group/machine-learning-with-ontologies.
Affiliation(s)
- Xin Gao
  - Computational Bioscience Research Center and lead of the Structural and Functional Bioinformatics Group at King Abdullah University of Science and Technology
|
26
|
Kalfakakou D, Fostira F, Papathanasiou A, Apostolou P, Dellatola V, Gavra IE, Vlachos IS, Scouras ZG, Drosopoulou E, Yannoukakos D, Konstantopoulou I. CanVaS: Documenting the genetic variation spectrum of Greek cancer patients. Hum Mutat 2021; 42:1081-1093. [PMID: 34174131 DOI: 10.1002/humu.24249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2021] [Revised: 05/13/2021] [Accepted: 06/14/2021] [Indexed: 11/08/2022]
Abstract
National genetic variation registries vastly increase the level of detail for the relevant population, while directly affecting patient management. Herein, we report CanVaS, a Cancer Variation reSource aiming to document the genetic variation of cancer patients in Greece. CanVaS comprises germline genetic data from 7,363 Greek individuals with a personal and/or family history of malignancy. The data set incorporates approximately 24,000 functionally annotated rare variants in 97 established or suspected cancer susceptibility genes. For each variant, allele frequency for the Greek population, interpretation for clinical significance, anonymized family and segregation information, as well as phenotypic traits of the carriers, are included. Moreover, information on the geographic distribution of the variants across the country is provided, enabling the study of Greek population isolates. Direct comparisons between Greek (sub)populations with relevant genetic resources are supported, allowing fine-grain localized adjustment of guidelines and clinical decision-making. Most importantly, anonymized data are available for download, while the Leiden Open Variation Database schema is adopted, enabling integration/interconnection with central resources. CanVaS could become a stepping-stone for a countrywide effort to characterize the cancer genetic variation landscape, concurrently supporting national and international cancer research. The database can be accessed at: http://ithaka.rrp.demokritos.gr/CanVaS.
Affiliation(s)
- Despoina Kalfakakou
  - Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
- Florentia Fostira
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
- Athanasios Papathanasiou
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
- Paraskevi Apostolou
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
- Vasiliki Dellatola
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
- Ioanna E Gavra
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
- Ioannis S Vlachos
  - Department of Pathology, Cancer Research Institute, Beth Israel Deaconess Medical Center/Harvard Medical School, Boston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Zacharias G Scouras
  - Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Eleni Drosopoulou
  - Department of Genetics, Development & Molecular Biology, School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Drakoulis Yannoukakos
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
- Irene Konstantopoulou
  - Molecular Diagnostics Laboratory, Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research "Demokritos", Athens, Greece
|
27
|
Lafuente E, Alves F, King JG, Peralta CM, Beldade P. Many ways to make darker flies: Intra- and interspecific variation in Drosophila body pigmentation components. Ecol Evol 2021; 11:8136-8155. [PMID: 34188876 PMCID: PMC8216949 DOI: 10.1002/ece3.7646] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/03/2021] [Revised: 04/14/2021] [Accepted: 04/18/2021] [Indexed: 12/13/2022] Open
Abstract
Body pigmentation is an evolutionarily diversified and ecologically relevant trait with substantial variation within and between species, and important roles in animal survival and reproduction. Insect pigmentation, in particular, provides some of the most compelling examples of adaptive evolution, including its ecological significance and genetic bases. Pigmentation includes multiple aspects of color and color pattern that may vary more or less independently, and can be under different selective pressures. We decompose Drosophila thorax and abdominal pigmentation, a valuable eco-evo-devo model, into distinct measurable traits related to color and color pattern. We investigate intra- and interspecific variation for those traits and assess its different sources. For each body part, we measured overall darkness, as well as four other pigmentation properties distinguishing between background color and color of the darker pattern elements that decorate each body part. By focusing on two standard D. melanogaster laboratory populations, we show that pigmentation components vary and covary in distinct manners depending on sex, genetic background, and temperature during development. Studying three natural populations of D. melanogaster along a latitudinal cline and five other Drosophila species, we then show that evolution of lighter or darker bodies can be achieved by changing distinct component traits. Our results paint a much more complex picture of body pigmentation variation than previous studies could uncover, including patterns of sexual dimorphism, thermal plasticity, and interspecific diversity. These findings underscore the value of detailed quantitative phenotyping and analysis of different sources of variation for a better understanding of phenotypic variation and diversification, and the ecological pressures and genetic mechanisms underlying them.
Affiliation(s)
- Elvira Lafuente
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Present address: Swiss Federal Institute of Aquatic Science and Technology, Department of Aquatic Ecology, Dübendorf, Switzerland
- Jessica G. King
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Present address: Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Carolina M. Peralta
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Present address: Max Planck Institute for Evolutionary Biology, Plön, Germany
- Patrícia Beldade
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - CE3C: Centre for Ecology, Evolution, and Environmental Changes, Faculty of Sciences, University of Lisbon, Lisbon, Portugal
|
28
|
Folk RA, Siniscalchi CM. Biodiversity at the global scale: the synthesis continues. AMERICAN JOURNAL OF BOTANY 2021; 108:912-924. [PMID: 34181762 DOI: 10.1002/ajb2.1694] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/04/2021] [Accepted: 04/14/2021] [Indexed: 06/13/2023]
Abstract
Traditionally, the generation and use of biodiversity data and their associated specimen objects have been primarily the purview of individuals and small research groups. While deposition of data and specimens in herbaria and other repositories has long been the norm, throughout most of their history, these resources have been accessible only to a small community of specialists. Through recent concerted efforts, primarily at the level of national and international governmental agencies over the last two decades, the pace of biodiversity data accumulation has accelerated, and a wider array of biodiversity scientists has gained access to this massive accumulation of resources, applying them to an ever-widening compass of research pursuits. We review how these new resources and increasing access to them are affecting the landscape of biodiversity research in plants today, focusing on new applications across evolution, ecology, and other fields that have been enabled specifically by the availability of these data and the global scope that was previously beyond the reach of individual investigators. We give an overview of recent advances organized along three lines: broad-scale analyses of distributional data and spatial information, phylogenetic research circumscribing large clades with comprehensive taxon sampling, and data sets derived from improved accessibility of biodiversity literature. We also review synergies between large data resources and more traditional data collection paradigms, describe shortfalls and how to overcome them, and reflect on the future of plant biodiversity analyses in light of increasing linkages between data types and scientists in our field.
Affiliation(s)
- Ryan A Folk
  - Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi, USA
- Carolina M Siniscalchi
  - Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi, USA
|
29
|
Savojardo C, Babbi G, Martelli PL, Casadio R. Mapping OMIM Disease-Related Variations on Protein Domains Reveals an Association Among Variation Type, Pfam Models, and Disease Classes. Front Mol Biosci 2021; 8:617016. [PMID: 34026820 PMCID: PMC8138129 DOI: 10.3389/fmolb.2021.617016] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/13/2020] [Accepted: 04/09/2021] [Indexed: 12/23/2022] Open
Abstract
Human genome resequencing projects provide an unprecedented amount of data about single-nucleotide variations occurring in protein-coding regions and often leading to observable changes in the covalent structure of gene products. For many of these variations, links to Online Mendelian Inheritance in Man (OMIM) genetic diseases are available and are reported in many databases that are collecting human variation data such as Humsavar. However, the current knowledge on the molecular mechanisms that are leading to diseases is, in many cases, still limited. For understanding the complex mechanisms behind disease insurgence, the identification of putative models, when considering the protein structure and chemico-physical features of the variations, can be useful in many contexts, including early diagnosis and prognosis. In this study, we investigate the occurrence and distribution of human disease–related variations in the context of Pfam domains. The aim of this study is the identification and characterization of Pfam domains that are statistically more likely to be associated with disease-related variations. The study takes into consideration 2,513 human protein sequences with 22,763 disease-related variations. We describe patterns of disease-related variation types in biunivocal relation with Pfam domains, which are likely to be possible markers for linking Pfam domains to OMIM diseases. Furthermore, we take advantage of the specific association between disease-related variation types and Pfam domains for clustering diseases according to the Human Disease Ontology, and we establish a relation among variation types, Pfam domains, and disease classes. We find that Pfam models are specific markers of patterns of variation types and that they can serve to bridge genes, diseases, and disease classes. Data are available as Supplementary Material for 1,670 Pfam models, including 22,763 disease-related variations associated to 3,257 OMIM diseases.
Affiliation(s)
- Castrense Savojardo
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Giulia Babbi
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Pier Luigi Martelli
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Rita Casadio
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy
|
30
|
Ontological representation, classification and data-driven computing of phenotypes. J Biomed Semantics 2020; 11:15. [PMID: 33349245 PMCID: PMC7751121 DOI: 10.1186/s13326-020-00230-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/05/2020] [Accepted: 11/03/2020] [Indexed: 11/21/2022] Open
Abstract
Background: The successful determination and analysis of phenotypes plays a key role in the diagnostic process, the evaluation of risk factors and the recruitment of participants for clinical and epidemiological studies. The development of computable phenotype algorithms to solve these tasks is a challenging problem, caused by various reasons. Firstly, the term ‘phenotype’ has no generally agreed definition and its meaning depends on context. Secondly, the phenotypes are most commonly specified as non-computable descriptive documents. Recent attempts have shown that ontologies are a suitable way to handle phenotypes and that they can support clinical research and decision making. The SMITH Consortium is dedicated to rapidly establish an integrative medical informatics framework to provide physicians with the best available data and knowledge and enable innovative use of healthcare data for research and treatment optimisation. In the context of a methodological use case ‘phenotype pipeline’ (PheP), a technology to automatically generate phenotype classifications and annotations based on electronic health records (EHR) is developed. A large series of phenotype algorithms will be implemented. This implies that for each algorithm a classification scheme and its input variables have to be defined. Furthermore, a phenotype engine is required to evaluate and execute developed algorithms.
Results: In this article, we present a Core Ontology of Phenotypes (COP) and the software Phenotype Manager (PhenoMan), which implements a novel ontology-based method to model, classify and compute phenotypes from already available data. Our solution includes an enhanced iterative reasoning process combining classification tasks with mathematical calculations at runtime. The ontology as well as the reasoning method were successfully evaluated with selected phenotypes including SOFA score, socio-economic status, body surface area and WHO BMI classification based on available medical data.
Conclusions: We developed a novel ontology-based method to model phenotypes of living beings with the aim of automated phenotype reasoning based on available data. This new approach can be used in clinical context, e.g., for supporting the diagnostic process, evaluating risk factors, and recruiting appropriate participants for clinical and epidemiological studies.
|
31
|
Lecointre G, Schnell NK, Teletchea F. Hierarchical analysis of ontogenetic time to describe heterochrony and taxonomy of developmental stages. Sci Rep 2020; 10:19732. [PMID: 33184336 PMCID: PMC7665009 DOI: 10.1038/s41598-020-76270-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/13/2020] [Accepted: 10/12/2020] [Indexed: 12/11/2022] Open
Abstract
Even though an accurate description of early life stages is available for some teleostean species in form of embryonic and post-embryonic developmental tables, there is poor overlap between species-specific staging vocabularies beyond the taxonomic family level. What is called "embryonic period", "larval period", "metamorphosis", or "juvenile" is anatomically different across teleostean families. This problem, already pointed out 50 years ago, challenges the consistency of developmental biology, embryology, systematics, and hampers an efficient aquaculture diversification. We propose a general solution by producing a proof-of-concept hierarchical analysis of ontogenetic time using a set of four freshwater species displaying strongly divergent reproductive traits. With a parsimony analysis of a matrix where "operational taxonomic units" are species at a given ontogenetic time segment and characters are organs or structures which are coded present or absent at this time, we show that the hierarchies obtained have both very high consistency and retention index, indicating that the ontogenetic time is correctly grasped through a hierarchical graph. This allows to formally detect developmental heterochronies and might provide a baseline to name early life stages for any set of species. The present method performs a phylogenetic segmentation of ontogenetic time, which can be correctly seen as depicting ontophylogenesis.
Affiliation(s)
- Guillaume Lecointre
  - Institut de Systématique, Évolution, Biodiversité (ISYEB), UMR 7205 Muséum national d'Histoire naturelle, CNRS, SU, EPHE, UA, Sorbonne Universités, CP24, Muséum national d'Histoire naturelle, 57 rue Cuvier, 75005, Paris, France
- Nalani K Schnell
  - Institut de Systématique, Évolution, Biodiversité (ISYEB), UMR 7205 Muséum national d'Histoire naturelle, CNRS, SU, EPHE, UA, Sorbonne Universités, Station Marine de Concarneau, Place de la Croix, 29900, Concarneau, France
- Fabrice Teletchea
  - Université de Lorraine, Unité de Recherche Animal and Fonctionnalités des Produits Animaux, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, 54505, Vandœuvre-lès-Nancy, France
|
32
|
Thessen AE, Walls RL, Vogt L, Singer J, Warren R, Buttigieg PL, Balhoff JP, Mungall CJ, McGuinness DL, Stucky BJ, Yoder MJ, Haendel MA. Transforming the study of organisms: Phenomic data models and knowledge bases. PLoS Comput Biol 2020; 16:e1008376. [PMID: 33232313 PMCID: PMC7685442 DOI: 10.1371/journal.pcbi.1008376] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/07/2023] Open
Abstract
The rapidly decreasing cost of gene sequencing has resulted in a deluge of genomic data from across the tree of life; however, outside a few model organism databases, genomic data are limited in their scientific impact because they are not accompanied by computable phenomic data. The majority of phenomic data are contained in countless small, heterogeneous phenotypic data sets that are very difficult or impossible to integrate at scale because of variable formats, lack of digitization, and linguistic problems. One powerful solution is to represent phenotypic data using data models with precise, computable semantics, but adoption of semantic standards for representing phenotypic data has been slow, especially in biodiversity and ecology. Some phenotypic and trait data are available in a semantic language from knowledge bases, but these are often not interoperable. In this review, we will compare and contrast existing ontology and data models, focusing on nonhuman phenotypes and traits. We discuss barriers to integration of phenotypic data and make recommendations for developing an operationally useful, semantically interoperable phenotypic data ecosystem.
Affiliation(s)
- Anne E. Thessen
- Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America
- Ronin Institute for Independent Scholarship, Montclair, New Jersey, United States of America
| | - Ramona L. Walls
- Bio5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Lars Vogt
- TIB Leibniz Information Centre for Science and Technology, Hannover, Germany
| | - Pier Luigi Buttigieg
- Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung, Bremerhaven, Germany
| | - James P. Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, North Carolina, United States of America
| | - Christopher J. Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Brian J. Stucky
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, United States of America
| | - Matthew J. Yoder
- Illinois Natural History Survey, Champaign, Illinois, United States of America
| | - Melissa A. Haendel
- Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America
|
33
|
MacLeod N, Kolska Horwitz L. Machine-learning strategies for testing patterns of morphological variation in small samples: sexual dimorphism in gray wolf (Canis lupus) crania. BMC Biol 2020; 18:113. [PMID: 32883273 PMCID: PMC7470621 DOI: 10.1186/s12915-020-00832-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/16/2020] [Accepted: 07/20/2020] [Indexed: 01/25/2023]
Abstract
BACKGROUND: Studies of mammalian sexual dimorphism have traditionally involved the measurement of selected dimensions of particular skeletal elements and use of single data-analysis procedures. Consequently, such studies have been limited by a variety of both practical and conceptual constraints. To compare and contrast what might be gained from a more exploratory, multifactorial approach to the quantitative assessment of form-variation, images of a small sample of modern Israeli gray wolf (Canis lupus) crania were analyzed via elliptical Fourier analysis of cranial outlines, a Naïve Bayes machine-learning approach to the analysis of these same outline data, and a deep-learning analysis of whole images in which all aspects of these cranial morphologies were represented. The statistical significance and stability of each discriminant result were tested using bootstrap and jackknife procedures.
RESULTS: Our results reveal no evidence for statistically significant sexual size dimorphism, but significant sex-mediated shape dimorphism. These are consistent with the findings of prior wolf sexual dimorphism studies and extend these studies by identifying new aspects of dimorphic variation. Additionally, our results suggest that shape-based sexual dimorphism in the C. lupus cranial complex may be more widespread morphologically than had been appreciated by previous researchers.
CONCLUSION: Our results suggest that size and shape dimorphism can be detected in small samples and may be dissociated in mammalian morphologies. This result is particularly noteworthy in that it implies there may be a need to refine allometric hypothesis tests that seek to account for phenotypic sexual dimorphism. The methods we employed in this investigation are fully generalizable and can be applied to a wide range of biological materials and could facilitate the rapid evaluation of a diverse array of morphological/phenomic hypotheses.
Affiliation(s)
- Norman MacLeod
- School of Earth Science and Engineering, Zhu Gongshan Building, Nanjing University, 163 Xianlin Avenue, Nanjing, 210023 Jiangsu China
| | - Liora Kolska Horwitz
- National Natural History Collections, Faculty of Life Sciences, The Hebrew University of Jerusalem, The Edmond J. Safra Campus - Givat Ram, 9190401 Jerusalem, Israel
|
34
|
Borges LM, Reis VC, Izbicki R. Schrödinger's phenotypes: Herbarium specimens show two-dimensional images are both good and (not so) bad sources of morphological data. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13450] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 11/30/2022]
Affiliation(s)
- Leonardo M. Borges
- Departamento de Botânica, Universidade Federal de São Carlos, São Carlos, SP, Brazil
| | - Victor Candido Reis
- Departamento de Estatística, Universidade Federal de São Carlos, São Carlos, SP, Brazil
| | - Rafael Izbicki
- Departamento de Estatística, Universidade Federal de São Carlos, São Carlos, SP, Brazil
|
35
|
Davis AP, Grondin CJ, Johnson RJ, Sciaky D, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2019. Nucleic Acids Res 2019; 47:D948-D954. [PMID: 30247620 PMCID: PMC6323936 DOI: 10.1093/nar/gky868] [Citation(s) in RCA: 589] [Impact Index Per Article: 147.3] [Received: 08/14/2018] [Accepted: 09/14/2018] [Indexed: 11/27/2022]
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a premier public resource for literature-based, manually curated associations between chemicals, gene products, phenotypes, diseases, and environmental exposures. In this biennial update, we present our new chemical–phenotype module that codes chemical-induced effects on phenotypes, curated using controlled vocabularies for chemicals, phenotypes, taxa, and anatomical descriptors; this module provides unique opportunities to explore cellular and system-level phenotypes of the pre-disease state and allows users to construct predictive adverse outcome pathways (linking chemical–gene molecular initiating events with phenotypic key events, diseases, and population-level health outcomes). We also report a 46% increase in CTD manually curated content, which when integrated with other datasets yields more than 38 million toxicogenomic relationships. We describe new querying and display features for our enhanced chemical–exposure science module, providing greater scope of content and utility. As well, we discuss an updated MEDIC disease vocabulary with over 1700 new terms and accession identifiers. To accommodate these increases in data content and functionality, CTD has upgraded its computational infrastructure. These updates continue to improve CTD and help inform new testable hypotheses about the etiology and mechanisms underlying environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Cynthia J Grondin
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Robin J Johnson
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Daniela Sciaky
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Roy McMorran
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
| | - Jolene Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Thomas C Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Carolyn J Mattingly
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
|
36
|
Gallagher RV, Falster DS, Maitner BS, Salguero-Gómez R, Vandvik V, Pearse WD, Schneider FD, Kattge J, Poelen JH, Madin JS, Ankenbrand MJ, Penone C, Feng X, Adams VM, Alroy J, Andrew SC, Balk MA, Bland LM, Boyle BL, Bravo-Avila CH, Brennan I, Carthey AJR, Catullo R, Cavazos BR, Conde DA, Chown SL, Fadrique B, Gibb H, Halbritter AH, Hammock J, Hogan JA, Holewa H, Hope M, Iversen CM, Jochum M, Kearney M, Keller A, Mabee P, Manning P, McCormack L, Michaletz ST, Park DS, Perez TM, Pineda-Munoz S, Ray CA, Rossetto M, Sauquet H, Sparrow B, Spasojevic MJ, Telford RJ, Tobias JA, Violle C, Walls R, Weiss KCB, Westoby M, Wright IJ, Enquist BJ. Open Science principles for accelerating trait-based science across the Tree of Life. Nat Ecol Evol 2020; 4:294-303. [PMID: 32066887 DOI: 10.1038/s41559-020-1109-6] [Citation(s) in RCA: 75] [Impact Index Per Article: 18.8] [Received: 03/31/2019] [Accepted: 01/10/2020] [Indexed: 01/22/2023]
Abstract
Synthesizing trait observations and knowledge across the Tree of Life remains a grand challenge for biodiversity science. Species traits are widely used in ecological and evolutionary science, and new data and methods have proliferated rapidly. Yet accessing and integrating disparate data sources remains a considerable challenge, slowing progress toward a global synthesis to integrate trait data across organisms. Trait science needs a vision for achieving global integration across all organisms. Here, we outline how the adoption of key Open Science principles-open data, open source and open methods-is transforming trait science, increasing transparency, democratizing access and accelerating global synthesis. To enhance widespread adoption of these principles, we introduce the Open Traits Network (OTN), a global, decentralized community welcoming all researchers and institutions pursuing the collaborative goal of standardizing and integrating trait data across organisms. We demonstrate how adherence to Open Science principles is key to the OTN community and outline five activities that can accelerate the synthesis of trait data across the Tree of Life, thereby facilitating rapid advances to address scientific inquiries and environmental issues. Lessons learned along the path to a global synthesis of trait data will provide a framework for addressing similarly complex data science and informatics challenges.
Affiliation(s)
- Rachael V Gallagher
- Department of Biological Sciences, Macquarie University, Sydney, New South Wales, Australia.
| | - Daniel S Falster
- Evolution and Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, New South Wales, Australia
| | - Brian S Maitner
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
| | - Roberto Salguero-Gómez
- Department of Zoology, Oxford University, Oxford, UK
- Centre for Biodiversity and Conservation Science, University of Queensland, Brisbane, Queensland, Australia
- Evolutionary Demography Laboratory, Max Planck Institute for Demographic Research, Rostock, Germany
| | - Vigdis Vandvik
- Department of Biological Sciences, University of Bergen, Bergen, Norway
- Bjerknes Centre for Climate Research, University of Bergen, Bergen, Norway
| | - William D Pearse
- Ecology Center and Department of Biology, Utah State University, Logan, UT, USA
| | - Jens Kattge
- Max Planck Institute for Biogeochemistry, Jena, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
| | - Joshua S Madin
- Hawai'i Institute of Marine Biology, University of Hawai'i at Manoa, Manoa, HI, USA
| | - Markus J Ankenbrand
- Department of Bioinformatics, Biocenter, University of Wuerzburg, Wuerzburg, Germany
- Center for Computational and Theoretical Biology, Biocenter, University of Wuerzburg, Wuerzburg, Germany
- Comprehensive Heart Failure Center, University Hospital Wuerzburg, Wuerzburg, Germany
| | - Caterina Penone
- Institute of Plant Sciences, University of Bern, Bern, Switzerland
| | - Xiao Feng
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
| | - Vanessa M Adams
- Discipline of Geography and Spatial Sciences, University of Tasmania, Hobart, Tasmania, Australia
| | - John Alroy
- Department of Biological Sciences, Macquarie University, Sydney, New South Wales, Australia
| | - Samuel C Andrew
- Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, Australian Capital Territory, Australia
| | - Meghan A Balk
- Bio5 Institute, University of Arizona, Tucson, AZ, USA
| | - Lucie M Bland
- School of Life and Environmental Sciences, Centre for Integrative Ecology, Deakin University, Geelong, Victoria, Australia
| | - Brad L Boyle
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
| | - Catherine H Bravo-Avila
- Department of Biology, University of Miami, Miami, FL, USA
- Fairchild Tropical Botanic Garden, Coral Gables, FL, USA
| | - Ian Brennan
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Alexandra J R Carthey
- Department of Biological Sciences, Macquarie University, Sydney, New South Wales, Australia
| | - Renee Catullo
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Brittany R Cavazos
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
| | - Dalia A Conde
- Species360 Conservation Science Alliance, Bloomington, MN, USA
- Interdisciplinary Center on Population Dynamics, University of Southern Denmark, Odense, Denmark
- Department of Biology, University of Southern Denmark, Odense, Denmark
| | - Steven L Chown
- School of Biological Sciences, Monash University, Melbourne, Victoria, Australia
| | - Belen Fadrique
- Department of Biology, University of Miami, Miami, FL, USA
| | - Heloise Gibb
- Department of Ecology, Environment and Evolution and Centre for Future Landscapes, La Trobe University, Melbourne, Victoria, Australia
| | - Aud H Halbritter
- Department of Biological Sciences, University of Bergen, Bergen, Norway
- Bjerknes Centre for Climate Research, University of Bergen, Bergen, Norway
| | - Jennifer Hammock
- National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
| | - J Aaron Hogan
- International Center for Tropical Botany, Department of Biological Sciences, Florida International University, Miami, FL, USA
| | - Hamish Holewa
- Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, Australian Capital Territory, Australia
| | - Michael Hope
- Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, Australian Capital Territory, Australia
| | - Colleen M Iversen
- Climate Change Science Institute and Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Malte Jochum
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Institute of Plant Sciences, University of Bern, Bern, Switzerland
- Institute of Biology, Leipzig University, Leipzig, Germany
| | - Michael Kearney
- School of BioSciences, The University of Melbourne, Melbourne, Victoria, Australia
| | - Alexander Keller
- Department of Bioinformatics, Biocenter, University of Wuerzburg, Wuerzburg, Germany
- Center for Computational and Theoretical Biology, Biocenter, University of Wuerzburg, Wuerzburg, Germany
| | - Paula Mabee
- Department of Biology, University of South Dakota, Vermillion, SD, USA
| | - Peter Manning
- Senckenberg Biodiversity and Climate Research Centre (SBiK-F), Frankfurt, Germany
| | - Luke McCormack
- Center for Tree Science, The Morton Arboretum, Lisle, IL, USA
| | - Sean T Michaletz
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
| | - Daniel S Park
- Department of Organismic and Evolutionary Biology and Harvard University Herbaria, Harvard University, Cambridge, MA, USA
| | - Timothy M Perez
- Department of Biology, University of Miami, Miami, FL, USA
- Fairchild Tropical Botanic Garden, Coral Gables, FL, USA
| | - Silvia Pineda-Munoz
- School of Biological Sciences and School of Earth & Atmospheric Sciences, Georgia Institute of Technology, Atlanta, GA, USA
| | - Courtenay A Ray
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Maurizio Rossetto
- National Herbarium of New South Wales, Royal Botanic Gardens and Domain Trust, Sydney, New South Wales, Australia
- Queensland Alliance of Agriculture and Food Innovation, University of Queensland, Brisbane, Queensland, Australia
| | - Hervé Sauquet
- Evolution and Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, New South Wales, Australia
- National Herbarium of New South Wales, Royal Botanic Gardens and Domain Trust, Sydney, New South Wales, Australia
- Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech, Universite Paris-Saclay, Orsay, France
| | - Benjamin Sparrow
- TERN / School of Biological Sciences, Faculty of Science, The University of Adelaide, Adelaide, South Australia, Australia
| | - Marko J Spasojevic
- Department of Evolution, Ecology, and Organismal Biology, University of California Riverside, Riverside, CA, USA
| | - Richard J Telford
- Department of Biological Sciences, University of Bergen, Bergen, Norway
- Bjerknes Centre for Climate Research, University of Bergen, Bergen, Norway
| | - Joseph A Tobias
- Department of Life Sciences, Imperial College London, London, UK
| | - Cyrille Violle
- CEFE, CNRS, Univ Montpellier, Université Paul Valéry Montpellier, Montpellier, France
| | - Mark Westoby
- Department of Biological Sciences, Macquarie University, Sydney, New South Wales, Australia
| | - Ian J Wright
- Department of Biological Sciences, Macquarie University, Sydney, New South Wales, Australia
| | - Brian J Enquist
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Santa Fe Institute, Santa Fe, NM, USA
|
37
|
Clavel J, Morlon H. Reliable Phylogenetic Regressions for Multivariate Comparative Data: Illustration with the MANOVA and Application to the Effect of Diet on Mandible Morphology in Phyllostomid Bats. Syst Biol 2020; 69:927-943. [DOI: 10.1093/sysbio/syaa010] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/11/2019] [Revised: 02/02/2020] [Accepted: 02/07/2020] [Indexed: 11/12/2022]
Abstract
Understanding what shapes species phenotypes over macroevolutionary timescales from comparative data often requires studying the relationship between phenotypes and putative explanatory factors or testing for differences in phenotypes across species groups. In phyllostomid bats, for example, is mandible morphology associated with diet preferences? Performing such analyses depends upon reliable phylogenetic regression techniques and associated tests (e.g., phylogenetic Generalized Least Squares, pGLS, and phylogenetic analyses of variance and covariance, pANOVA, pANCOVA). While these tools are well established for univariate data, their multivariate counterparts are lagging behind. This is particularly true for high-dimensional phenotypic data, such as morphometric data. Here, we implement much-needed likelihood-based multivariate pGLS, pMANOVA, and pMANCOVA, and use a recently developed penalized-likelihood framework to extend their application to the difficult case when the number of traits $p$ approaches or exceeds the number of species $n$. We then focus on the pMANOVA and use intensive simulations to assess the performance of the approach as $p$ increases, under various levels of phylogenetic signal and correlations between the traits, phylogenetic structure in the predictors, and under various types of phenotypic differences across species groups. We show that our approach outperforms available alternatives under all circumstances, with greater power to detect phenotypic differences across species groups when they exist, and a lower risk of improperly detecting nonexistent differences. Finally, we provide an empirical illustration of our pMANOVA on a geometric-morphometric data set describing mandible morphology in phyllostomid bats along with data on their diet preferences. Overall our results show significant differences between ecological groups.
Our approach, implemented in the R package mvMORPH and illustrated in a tutorial for end-users, provides efficient multivariate phylogenetic regression tools for understanding what shapes phenotypic differences across species. [Generalized least squares; high-dimensional data sets; multivariate phylogenetic comparative methods; penalized likelihood; phenomics; phyllostomid bats; phylogenetic MANOVA; phylogenetic regression.]
Affiliation(s)
- Julien Clavel
- Institut de Biologie de l’École Normale Supérieure (IBENS), École Normale Supérieure, Paris Sciences et Lettres (PSL) Research University, CNRS UMR 8197, INSERM U1024, 46 rue d’Ulm, F-75005 Paris, France
- Life Sciences Department, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Univ Lyon, Laboratoire d’Ecologie des Hydrosystèmes Naturels et Anthropisés, UMR CNRS 5023, Université Claude Bernard Lyon 1, ENTPE, Boulevard du 11 Novembre 1918, F-69622 Villeurbanne Cedex, France
| | - Hélène Morlon
- Institut de Biologie de l’École Normale Supérieure (IBENS), École Normale Supérieure, Paris Sciences et Lettres (PSL) Research University, CNRS UMR 8197, INSERM U1024, 46 rue d’Ulm, F-75005 Paris, France
|
38
|
Roy J, Cheung E, Bhatti J, Muneem A, Lobo D. Curation and annotation of planarian gene expression patterns with segmented reference morphologies. Bioinformatics 2020; 36:2881-2887. [DOI: 10.1093/bioinformatics/btaa023] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/18/2019] [Revised: 12/07/2019] [Accepted: 01/14/2020] [Indexed: 12/30/2022]
Abstract
Motivation
Morphological and genetic spatial data from functional experiments based on genetic, surgical and pharmacological perturbations are being produced at an extraordinary pace in developmental and regenerative biology. However, our ability to extract knowledge from these large datasets is hindered by the lack of formalization methods and tools able to unambiguously describe, centralize and interpret them. Formalizing spatial phenotypes and gene expression patterns is especially challenging in organisms with highly variable morphologies such as planarian worms, which due to their extraordinary regenerative capability can experimentally result in phenotypes with almost any combination of body regions or parts.
Results
Here, we present a computational methodology and mathematical formalism to encode and curate the morphological outcomes and gene expression patterns in planaria. Worm morphologies are encoded with mathematical graphs based on anatomical ontology terms to automatically generate reference morphologies. Gene expression patterns are registered to these standard reference morphologies, which can then be annotated automatically with anatomical ontology terms by analyzing the spatial expression patterns and their textual descriptions. This methodology enables the curation and annotation of complex experimental morphologies together with their gene expression patterns in a centralized standardized dataset, paving the way for the extraction of knowledge and reverse-engineering of the much sought-after mechanistic models in planaria and other regenerative organisms.
Availability and implementation
We implemented this methodology in a user-friendly graphical software tool, PlanGexQ, freely available together with the data in the manuscript at https://lobolab.umbc.edu/plangexq.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Joy Roy
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
| | - Eric Cheung
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
| | - Junaid Bhatti
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
| | - Abraar Muneem
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
| | - Daniel Lobo
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
|
39
|
Bartoš O, Röslein J, Kotusz J, Paces J, Pekárik L, Petrtýl M, Halačka K, Štefková Kašparová E, Mendel J, Boroń A, Juchno D, Leska A, Jablonska O, Benes V, Šídová M, Janko K. The Legacy of Sexual Ancestors in Phenotypic Variability, Gene Expression, and Homoeolog Regulation of Asexual Hybrids and Polyploids. Mol Biol Evol 2019; 36:1902-1920. [PMID: 31077330 PMCID: PMC6735777 DOI: 10.1093/molbev/msz114] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/18/2022]
Abstract
Hybridization and polyploidization are important evolutionary processes whose impacts range from the alteration of gene expression and phenotypic variation to the triggering of asexual reproduction. We investigated fishes of the Cobitis taenia-elongatoides hybrid complex, which allowed us to disentangle the direct effects of both processes, due to the co-occurrence of parental species with their diploid and triploid hybrids. Employing morphological, ecological, and RNAseq approaches, we investigated the molecular determinants of hybrid and polyploid forms. In contrast with other studies, hybridization and polyploidy induced relatively very little transgressivity. Instead, Cobitis hybrids appeared intermediate with a clear effect of genomic dosing when triploids expressed higher similarity to the parent contributing two genome sets. This dosage effect was symmetric in the germline (oocyte gene expression), interestingly though, we observed an overall bias toward C. taenia in somatic tissues and traits. At the level of individual genes, expression-level dominance vastly prevailed over additivity or transgressivity. Also, trans-regulation of gene expression was less efficient in diploid hybrids than in triploids, where the expression modulation of homoeologs derived from the "haploid" parent was stronger than those derived from the "diploid" parent. Our findings suggest that the apparent intermediacy of hybrid phenotypes results from the combination of individual genes with dominant expression rather than from simple additivity. The efficiency of cross-talk between trans-regulatory elements further appears dosage dependent. Important effects of polyploidization may thus stem from changes in relative concentrations of trans-regulatory elements and their binding sites between hybridizing genomes. Links between gene regulation and asexuality are discussed.
Affiliation(s)
- Oldřich Bartoš
- Institute of Animal Physiology and Genetics, Laboratory of Fish Genetics, The Czech Academy of Sciences, Libechov, Czech Republic
- Department of Zoology, Faculty of Science, Charles University, Prague, Czech Republic
| | - Jan Röslein
- Institute of Animal Physiology and Genetics, Laboratory of Fish Genetics, The Czech Academy of Sciences, Libechov, Czech Republic
- Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
| | - Jan Kotusz
- Museum of Natural History, University of Wroclaw, Wroclaw, Poland
| | - Jan Paces
- Institute of Animal Physiology and Genetics, Laboratory of Fish Genetics, The Czech Academy of Sciences, Libechov, Czech Republic
- Institute of Molecular Genetics, Laboratory of Genomics and Bioinformatics, The Czech Academy of Sciences, Prague, Czech Republic
| | - Ladislav Pekárik
- Plant Science and Biodiversity Center, Institute of Botany, Slovak Academy of Sciences, Bratislava, Slovakia
- Faculty of Education, Trnava University, Trnava, Slovakia
| | - Miloslav Petrtýl
- Institute of Animal Physiology and Genetics, Laboratory of Fish Genetics, The Czech Academy of Sciences, Libechov, Czech Republic
- Department of Zoology and Fisheries, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Prague, Czech Republic
| | - Karel Halačka
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
| | - Eva Štefková Kašparová
- Institute of Animal Physiology and Genetics, Laboratory of Fish Genetics, The Czech Academy of Sciences, Libechov, Czech Republic
| | - Jan Mendel
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic
| | - Alicja Boroń
- Department of Zoology, Faculty of Biology and Biotechnology, University of Warmia and Mazury in Olsztyn, Olsztyn, Poland
| | - Dorota Juchno
- Department of Zoology, Faculty of Biology and Biotechnology, University of Warmia and Mazury in Olsztyn, Olsztyn, Poland
| | - Anna Leska
- Department of Zoology, Faculty of Biology and Biotechnology, University of Warmia and Mazury in Olsztyn, Olsztyn, Poland
| | - Olga Jablonska
- Department of Zoology, Faculty of Biology and Biotechnology, University of Warmia and Mazury in Olsztyn, Olsztyn, Poland
| | - Vladimir Benes
- Genomics Core Facility, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
| | - Monika Šídová
- Institute of Biotechnology of the Czech Academy of Sciences - BIOCEV, Vestec, Czech Republic
| | - Karel Janko
- Institute of Animal Physiology and Genetics, Laboratory of Fish Genetics, The Czech Academy of Sciences, Libechov, Czech Republic
- Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
|
40
|
Waller JT, Willink B, Tschol M, Svensson EI. The odonate phenotypic database, a new open data resource for comparative studies of an old insect order. Sci Data 2019; 6:316. [PMID: 31831730 PMCID: PMC6908694 DOI: 10.1038/s41597-019-0318-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/06/2019] [Accepted: 11/12/2019] [Indexed: 11/09/2022]
Abstract
We present The Odonate Phenotypic Database (OPD): an online data resource of dragonfly and damselfly phenotypes (Insecta: Odonata). Odonata is a relatively small insect order that currently consists of about 6400 species belonging to 32 families. The database consists of multiple morphological, life-history and behavioral traits, and biogeographical information collected from literature sources. We see taxon-specific phenotypic databases from Odonata and other organismal groups as becoming an increasingly valuable resource in comparative studies. Our database has phenotypic records for 1011 of all 6400 known odonate species. The database is accessible at http://www.odonatephenotypicdatabase.org/, and a static version with an information file about the variables in the database is archived at Dryad.
Affiliation(s)
- John T Waller
- Department of Biology, Lund University, SE-223 62, Lund, Sweden
- Global Biodiversity Information Facility (GBIF), GBIF Secretariat Universitetsparken 15, DK-2100, Copenhagen Ø, Denmark
| | - Beatriz Willink
- Department of Biology, Lund University, SE-223 62, Lund, Sweden
- School of Biology, University of Costa Rica, San Jose, 11501-2060, Costa Rica
| | - Maximilian Tschol
- Department of Biology, Lund University, SE-223 62, Lund, Sweden
- School of Biological Sciences, Zoology Building, University of Aberdeen, Tillydrone Avenue, Aberdeen, AB24 2TZ, UK
| | - Erik I Svensson
- Department of Biology, Lund University, SE-223 62, Lund, Sweden.
|
41
|
Rutter MT, Murren CJ, Callahan HS, Bisner AM, Leebens-Mack J, Wolyniak MJ, Strand AE. Distributed phenomics with the unPAK project reveals the effects of mutations. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019; 100:199-211. [PMID: 31155775 DOI: 10.1111/tpj.14427] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/14/2019] [Revised: 05/01/2019] [Accepted: 05/10/2019] [Indexed: 06/09/2023]
Abstract
Determining how genes are associated with traits in plants and other organisms is a major challenge in modern biology. The unPAK project - undergraduates phenotyping Arabidopsis knockouts - has generated phenotype data for thousands of non-lethal insertion mutation lines within a single Arabidopsis thaliana genomic background. The focal phenotypes examined by unPAK are complex macroscopic fitness-related traits, which have ecological, evolutionary and agricultural importance. These phenotypes are placed in the context of the wild-type and also natural accessions (phytometers), and standardized for environmental differences between assays. Data from the unPAK project are used to describe broad patterns in the phenotypic consequences of insertion mutation, and to identify individual mutant lines with distinct phenotypes as candidates for further study. Inclusion of undergraduate researchers is at the core of unPAK activities, and an important broader impact of the project is providing students an opportunity to obtain research experience.
Affiliation(s)
- Matthew T Rutter
- Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
| | - Courtney J Murren
- Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
| | - Hilary S Callahan
- Department of Biology, Barnard College, 3009 Broadway, New York, NY, 10027, USA
| | - April M Bisner
- Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
| | - Jim Leebens-Mack
- Department of Plant Biology, University of Georgia, 120 Carlton St, Athens, GA, 30602, USA
| | | | - Allan E Strand
- Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
|
42
|
Costa JM, Marques da Silva J, Pinheiro C, Barón M, Mylona P, Centritto M, Haworth M, Loreto F, Uzilday B, Turkan I, Oliveira MM. Opportunities and Limitations of Crop Phenotyping in Southern European Countries. FRONTIERS IN PLANT SCIENCE 2019; 10:1125. [PMID: 31608085 PMCID: PMC6774291 DOI: 10.3389/fpls.2019.01125] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/09/2019] [Accepted: 08/15/2019] [Indexed: 05/31/2023]
Abstract
The Mediterranean climate is characterized by hot dry summers and frequent droughts. Mediterranean crops are frequently subjected to high evapotranspiration demands, soil water deficits, high temperatures, and photo-oxidative stress. These conditions will become more severe due to global warming which poses major challenges to the sustainability of the agricultural sector in Mediterranean countries. Selection of crop varieties adapted to future climatic conditions and more tolerant to extreme climatic events is urgently required. Plant phenotyping is a crucial approach to address these challenges. High-throughput plant phenotyping (HTPP) helps to monitor the performance of improved genotypes and is one of the most effective strategies to improve the sustainability of agricultural production. In spite of the remarkable progress in basic knowledge and technology of plant phenotyping, there are still several practical, financial, and political constraints to implement HTPP approaches in field and controlled conditions across the Mediterranean. The European panorama of phenotyping is heterogeneous and integration of phenotyping data across different scales and translation of "phytotron research" to the field, and from model species to crops, remain major challenges. Moreover, solutions specifically tailored to Mediterranean agriculture (e.g., crops and environmental stresses) are in high demand, as the region is vulnerable to climate change and to desertification processes. The specific phenotyping requirements of Mediterranean crops have not yet been fully identified. The high cost of HTPP infrastructures is a major limiting factor, though the limited availability of skilled personnel may also impair its implementation in Mediterranean countries. We propose that the lack of suitable phenotyping infrastructures is hindering the development of new Mediterranean agricultural varieties and will negatively affect future competitiveness of the agricultural sector. 
We provide an overview of the heterogeneous panorama of phenotyping within Mediterranean countries, describing the state of the art of agricultural production, breeding initiatives, and phenotyping capabilities in five countries: Italy, Greece, Portugal, Spain, and Turkey. We characterize some of the main impediments for development of plant phenotyping in those countries and identify strategies to overcome barriers and maximize the benefits of phenotyping and modeling approaches to Mediterranean agriculture and related sustainability.
Affiliation(s)
| | - Jorge Marques da Silva
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, Universidade de Lisboa, Lisbon, Portugal
| | - Carla Pinheiro
- FCT NOVA, Universidade Nova de Lisboa, Monte da Caparica, Portugal
- ITQB NOVA, Universidade Nova de Lisboa, Oeiras, Portugal
| | - Matilde Barón
- Estación Experimental del Zaidín, Consejo Superior de Investigaciones Científicas (CSIC), Granada, Spain
| | - Photini Mylona
- HAO-DEMETER, Institute of Plant Breeding and Genetic Resources, Thermi, Greece
| | - Mauro Centritto
- Institute for Sustainable Plant Protection, Italian National Research Council (IPSP-CNR), Sesto Fiorentino, Italy
| | | | - Francesco Loreto
- Department of Biology, Agriculture and Food Sciences, CNR, Rome, Italy
| | - Baris Uzilday
- Department of Biology, Faculty of Science, Ege University, İzmir, Turkey
| | - Ismail Turkan
- Department of Biology, Faculty of Science, Ege University, İzmir, Turkey
|
43
|
Ebiki M, Okazaki T, Kai M, Adachi K, Nanba E. Comparison of Causative Variant Prioritization Tools Using Next-generation Sequencing Data in Japanese Patients with Mendelian Disorders. Yonago Acta Med 2019; 62:244-252. [PMID: 31582890 DOI: 10.33160/yam.2019.09.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/20/2019] [Accepted: 07/17/2019] [Indexed: 12/24/2022]
Abstract
Background During the investigation of causative variants of Mendelian disorders using next-generation sequencing, the enormous number of possible candidates makes the detection process complex, and the use of multidimensional methods is required. Although the utility of several variant prioritization tools has been reported, their effectiveness in Japanese patients remains largely unknown. Methods We selected 5 free variant prioritization tools (PhenIX, hiPHIVE, Phen-Gen, eXtasy-order statistics, and eXtasy-combined max) and assessed their effectiveness in Japanese patient populations. To compare these tools, we conducted 2 studies: one based on simulated data of 100 diseases and another based on the exome data of 20 in-house patients with Mendelian disorders. To this end we selected 100 pathogenic variants from the "Database of Pathogenic Variants (DPV)" and created 100 variant call format (VCF) files that each had pathogenic variants based on reference human genome data from the 1000 Genomes Project. The latter "in-house" study used exome data from 20 Japanese patients with Mendelian disorders. In both studies, we utilized 1-5 terms of "Human Phenotype Ontology" as clinical information. Results In our analysis based on simulated disease data, the detection rate of the top 10 causative variants was 91% for hiPHIVE and 88% for PhenIX, based on 100 sets of simulated disease VCF data. Also, both software packages detected 82% of the top 1 causative variants. When we used data from our in-house patients instead, we found that these two programs (PhenIX and hiPHIVE) produced higher detection rates than the other three systems in our study. The detection rate of the top 1 causative variant was 71.4% for PhenIX and 65.0% for hiPHIVE. Conclusion The rates of detecting causative variants in two Exomizer software packages, hiPHIVE and PhenIX, were higher than for the other three software systems we analyzed, with respect to Japanese patients.
Affiliation(s)
- Mitsutaka Ebiki
- The Development of Innovative Future Medical Treatment, Graduate School of Medical Sciences, Tottori University, Yonago 683-8504, Japan.,KUSUNOKI SCALE INC., Yonago 683-0832, Japan
| | - Tetsuya Okazaki
- Division of Child Neurology, Department of Brain and Neurosciences, School of Medicine, Tottori University Faculty of Medicine, Yonago 683-8504, Japan.,Division of Clinical Genetics, Tottori University Hospital, Yonago 683-8504, Japan, Technical Department, Tottori University, Yonago 683-8503, Japan
| | - Masachika Kai
- Research Initiative Center, Organization for Research Initiative and Promotion, Tottori University, Yonago 683-8503, Japan
| | - Kaori Adachi
- Research Strategy Division, Organization for Research Initiative and Promotion, Tottori University, Yonago 683-8503, Japan
| | - Eiji Nanba
- Division of Clinical Genetics, Tottori University Hospital, Yonago 683-8504, Japan, Technical Department, Tottori University, Yonago 683-8503, Japan.,Research Strategy Division, Organization for Research Initiative and Promotion, Tottori University, Yonago 683-8503, Japan
|
44
|
Le Roux B, Lenaers G, Zanlonghi X, Amati-Bonneau P, Chabrun F, Foulonneau T, Caignard A, Leruez S, Gohier P, Procaccio V, Milea D, den Dunnen JT, Reynier P, Ferré M. OPA1: 516 unique variants and 831 patients registered in an updated centralized Variome database. Orphanet J Rare Dis 2019; 14:214. [PMID: 31500643 PMCID: PMC6734442 DOI: 10.1186/s13023-019-1187-1] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 07/03/2019] [Accepted: 08/30/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The dysfunction of OPA1, a dynamin GTPase involved in mitochondrial fusion, is responsible for a large spectrum of neurological disorders, each of which includes optic neuropathy. The database dedicated to OPA1 (https://www.lovd.nl/OPA1), created in 2005, has now evolved towards a centralized and more reliable database using the Global Variome shared Leiden Open-source Variation Database (LOVD) installation. RESULTS The updated OPA1 database, which registers all the patients from our center as well as those reported in the literature, now covers a total of 831 patients: 697 with isolated dominant optic atrophy (DOA), 47 with DOA "plus", and 83 with asymptomatic or unclassified DOA. It comprises 516 unique OPA1 variants, of which more than 80% (414) are considered pathogenic. Full clinical data for 118 patients are documented using the Human Phenotype Ontology, a standard vocabulary for referencing phenotypic abnormalities. Contributors may now make online submissions of phenotypes related to OPA1 mutations, giving clinical and molecular descriptions together with detailed ophthalmological and neurological data, according to an international thesaurus. CONCLUSIONS The evolution of the OPA1 database towards the LOVD, using unified nomenclature, should ensure its interoperability with other databases and prove useful for molecular diagnoses based on gene-panel sequencing, large-scale mutation statistics, and genotype-phenotype correlations.
Affiliation(s)
- Bastien Le Roux
- Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Guy Lenaers
- Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
| | - Xavier Zanlonghi
- Centre de Compétence Maladie Rare, Clinique Jules Verne, Nantes, France
| | - Patrizia Amati-Bonneau
- Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France.,Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Floris Chabrun
- Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France.,Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Thomas Foulonneau
- Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France
| | - Angélique Caignard
- Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Stéphanie Leruez
- Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Philippe Gohier
- Département d'Ophtalmologie, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Vincent Procaccio
- Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France.,Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Dan Milea
- Singapore National Eye Center, Singapore Eye Research Institute, Duke-NUS, Singapore, Singapore
| | - Johan T den Dunnen
- Human Genetics and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Pascal Reynier
- Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France.,Département de Biochimie et Génétique, Centre Hospitalier Universitaire d'Angers, Angers, France
| | - Marc Ferré
- Unité Mixte de Recherche MITOVASC, CNRS 6015/INSERM 1083, Université d'Angers, Angers, France.
|
45
|
Tarasov S. Integration of Anatomy Ontologies and Evo-Devo Using Structured Markov Models Suggests a New Framework for Modeling Discrete Phenotypic Traits. Syst Biol 2019; 68:698-716. [PMID: 30668800 PMCID: PMC6701457 DOI: 10.1093/sysbio/syz005] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 09/19/2017] [Revised: 01/06/2019] [Accepted: 01/15/2019] [Indexed: 11/12/2022] Open
Abstract
Modeling discrete phenotypic traits for either ancestral character state reconstruction or morphology-based phylogenetic inference suffers from ambiguities of character coding, homology assessment, dependencies, and selection of adequate models. These drawbacks occur because trait evolution is driven by two key processes - hierarchical and hidden - which are not accommodated simultaneously by the available phylogenetic methods. The hierarchical process refers to the dependencies between anatomical body parts, while the hidden process refers to the evolution of gene regulatory networks (GRNs) underlying trait development. Herein, I demonstrate that these processes can be efficiently modeled using structured Markov models (SMM) equipped with hidden states, which resolves the majority of the problems associated with discrete traits. Integration of SMM with anatomy ontologies can adequately incorporate the hierarchical dependencies, while the use of the hidden states accommodates hidden evolution of GRNs and substitution rate heterogeneity. I assess the new models using simulations and theoretical synthesis. The new approach solves the long-standing "tail color problem," in which the trait is scored for species with tails of different colors or no tails. It also presents a previously unknown issue called the "two-scientist paradox," in which the nature of coding the trait and the hidden processes driving the trait's evolution are confounded; failing to account for the hidden process may result in a bias, which can be avoided by using hidden state models. All this provides a clear guideline for coding traits into characters. This article gives practical examples of using the new framework for phylogenetic inference and comparative analysis.
Affiliation(s)
- Sergei Tarasov
- National Institute for Mathematical and Biological Synthesis, University of Tennessee, Knoxville, TN 37996, USA
- Department of Biological Sciences, Virginia Tech, 4076 Derring Hall, 926 West Campus Drive, Blacksburg, VA 24061, USA
|
46
|
Lamichhaney S, Card DC, Grayson P, Tonini JFR, Bravo GA, Näpflin K, Termignoni-Garcia F, Torres C, Burbrink F, Clarke JA, Sackton TB, Edwards SV. Integrating natural history collections and comparative genomics to study the genetic architecture of convergent evolution. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180248. [PMID: 31154982 PMCID: PMC6560268 DOI: 10.1098/rstb.2018.0248] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Accepted: 02/25/2019] [Indexed: 12/20/2022] Open
Abstract
Evolutionary convergence has long been considered primary evidence of adaptation driven by natural selection and provides opportunities to explore evolutionary repeatability and predictability. In recent years, there has been increased interest in exploring the genetic mechanisms underlying convergent evolution, in part, owing to the advent of genomic techniques. However, the current 'genomics gold rush' in studies of convergence has overshadowed the reality that most trait classifications are quite broadly defined, resulting in incomplete or potentially biased interpretations of results. Genomic studies of convergence would be greatly improved by integrating deep 'vertical', natural history knowledge with 'horizontal' knowledge focusing on the breadth of taxonomic diversity. Natural history collections have been and continue to be best positioned for increasing our comprehensive understanding of phenotypic diversity, with modern practices of digitization and databasing of morphological traits providing exciting improvements in our ability to evaluate the degree of morphological convergence. Combining more detailed phenotypic data with the well-established field of genomics will enable scientists to make progress on an important goal in biology: to understand the degree to which genetic or molecular convergence is associated with phenotypic convergence. Although the fields of comparative biology or comparative genomics alone can separately reveal important insights into convergent evolution, here we suggest that the synergistic and complementary roles of natural history collection-derived phenomic data and comparative genomics methods can be particularly powerful in together elucidating the genomic basis of convergent evolution among higher taxa. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Sangeet Lamichhaney
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Daren C. Card
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
- Department of Biology, University of Texas Arlington, Arlington, TX 76019, USA
| | - Phil Grayson
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - João F. R. Tonini
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Kathrin Näpflin
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Flavia Termignoni-Garcia
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Christopher Torres
- Department of Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Geological Sciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Frank Burbrink
- Department of Herpetology, The American Museum of Natural History, New York, NY 10024, USA
| | - Julia A. Clarke
- Department of Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Geological Sciences, The University of Texas at Austin, Austin, TX 78712, USA
| | | | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
|
47
|
Eliason CM, Edwards SV, Clarke JA. phenotools: An R package for visualizing and analysing phenomic datasets. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Affiliation(s)
- Chad M. Eliason
- Department of Geological Sciences, University of Texas Austin, Austin, Texas
- Grainger Bioinformatics Center, Field Museum of Natural History, Chicago, Illinois
| | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts
| | - Julia A. Clarke
- Department of Geological Sciences, University of Texas Austin, Austin, Texas
|
48
|
Larmande P, Do H, Wang Y. OryzaGP: rice gene and protein dataset for named-entity recognition. Genomics Inform 2019; 17:e17. [PMID: 31307132 PMCID: PMC6808627 DOI: 10.5808/gi.2019.17.2.e17] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/14/2018] [Accepted: 05/30/2019] [Indexed: 11/20/2022] Open
Abstract
Text mining has become an important research method in biology, with its original purpose to extract biological entities, such as genes, proteins and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development for plant molecular biology data have been performed, especially for rice, resulting in a lack of datasets available to solve named-entity recognition tasks for this species. Since few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extract information from gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of a set of titles and abstracts, extracted from scientific papers focusing on the rice species, and is downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task of rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using the dataset, to facilitate an open comparison and evaluation of different approaches to the task.
Affiliation(s)
- Pierre Larmande
- UMR DIADE, Institute of Research for Sustainable Development (IRD), F-34394 Montpellier, France.,ICT Lab, University of Science and Technology of Hanoi (USTH), 100000 Hanoi, Vietnam
| | - Huy Do
- ICT Lab, University of Science and Technology of Hanoi (USTH), 100000 Hanoi, Vietnam
| | - Yue Wang
- Database Center for Life Science (DBCLS), Chiba 277-0871, Japan
|
49
|
Cianciarullo AM, Bonini-Domingos CR, Vizotto LD, Kobashi LS, Beçak ML, Beçak W. Whole-genome duplication and hemoglobin differentiation traits between allopatric populations of Brazilian Odontophrynus americanus species complex (Amphibia, Anura). Genet Mol Biol 2019; 42:436-444. [PMID: 31259358 PMCID: PMC6726162 DOI: 10.1590/1678-4685-gmb-2017-0260] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/19/2017] [Accepted: 07/25/2018] [Indexed: 11/21/2022] Open
Abstract
Two allopatric populations of Brazilian diploid and tetraploid Odontophrynus americanus species complex, both from São Paulo state, had their blood hemoglobin biochemically analyzed. In addition, these specimens were cytogenetically characterized. Biochemical characterization of hemoglobin expression showed a distinct banding pattern between the allopatric specimens. Besides this, two distinct phenotypes, not linked to ploidy, sex, or age, were observed in adult animals of both populations. Phenotype A exhibits dark-colored body with small papillae, ogival-shaped jaw with reduced interpupillary distance and shorter hind limbs. Phenotype B shows yellowish-colored body with larger papillae, arch-shaped jaw with broader interpupillary distance and longer hind limbs. Intermediate phenotypes were also found. Considering the geographical isolation of both populations, differences in chromosomal secondary constrictions and distinct hemoglobins banding patterns, these data indicate that 2n and 4n populations represent cryptic species in the O. americanus species complex. The observed phenotypic diversity can be interpreted as population genetic variability. Eventually future data may indicate a probable beginning of speciation in these Brazilian frogs. Such inter- and intrapopulational differentiation/speciation process indicates that O. americanus species complex taxonomy deserves further evaluation by genomics and metabarcoding communities, also considering the pattern of hemoglobin expression, in South American frogs.
Affiliation(s)
| | - Claudia R Bonini-Domingos
- Department of Biology, Laboratory of Hemoglobins and Genetics of the Hematological Diseases, Universidade Estadual Paulista "Julio de Mesquita Filho" (UNESP), São José do Rio Preto, SP, Brazil
| | - Luiz D Vizotto
- Department of Zoology, Universidade Estadual Paulista "Julio de Mesquita Filho" (UNESP), São José do Rio Preto, SP, Brazil
| | - Leonardo S Kobashi
- Laboratory of Ecology and Evolution, Instituto Butantan, São Paulo, SP, Brazil.,Universidade Paulista (UNIP) São Paulo, SP, Brazil
| | | | - Willy Beçak
- Laboratory of Genetics, Instituto Butantan, São Paulo, SP, Brazil
|
50
|
Vogt L. Organizing phenotypic data - a semantic data model for anatomy. J Biomed Semantics 2019; 10:12. [PMID: 31221226 PMCID: PMC6585074 DOI: 10.1186/s13326-019-0204-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/30/2019] [Accepted: 06/05/2019] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Currently, almost all morphological data are published as unstructured free text descriptions. This not only brings about terminological problems regarding semantic transparency, which hampers their re-use by non-experts, but the data cannot be parsed by computers either, which in turn hampers their integration across many fields in the life sciences, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. With an ever-increasing number of available ontologies and the development of adequate semantic technology, however, a solution to this problem becomes available. Instead of free text descriptions, morphological data can be recorded, stored, and communicated through the Web in the form of highly formalized and structured directed graphs (semantic graphs) that use ontology terms and URIs as terminology. RESULTS After introducing an instance-based approach of recording morphological descriptions as semantic graphs (i.e., Semantic Instance Anatomy Knowledge Graphs) and discussing accompanying metadata graphs, I propose a general scheme of how to efficiently organize the resulting graphs in a tuple store framework based on instances of defined named graph ontology classes. The use of such named graph resources allows meaningful fragmentation of the data, which in turn enables subsequent specification of all kinds of data views for managing and accessing morphological data. CONCLUSIONS Morphological data that comply with the semantic data model proposed here will not only be computer-parsable but also re-usable by non-experts and could be better integrated with other sources of data in the life sciences. This would allow morphology as a discipline to further participate in eScience and Big Data.
Affiliation(s)
- Lars Vogt
- Institut für Evolutionsbiologie und Ökologie, Rheinische Friedrich-Wilhelms-Universität Bonn, An der Immenburg 1, 53121, Bonn, Germany.
|