1
Lell M, Gogna A, Kloesgen V, Avenhaus U, Dörnte J, Eckhoff WM, Eschholz T, Gils M, Kirchhoff M, Koch M, Kollers S, Pfeiffer N, Rapp M, Wimmer V, Wolf M, Reif J, Zhao Y. Breaking down data silos across companies to train genome-wide predictions: A feasibility study in wheat. PLANT BIOTECHNOLOGY JOURNAL 2025. [PMID: 40253615 DOI: 10.1111/pbi.70095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Revised: 03/07/2025] [Accepted: 04/07/2025] [Indexed: 04/22/2025]
Abstract
Big data, combined with artificial intelligence (AI) techniques, holds the potential to significantly enhance the accuracy of genome-wide predictions. Motivated by the success reported for wheat hybrids, we extended the scope to inbred lines by integrating phenotypic and genotypic data from four commercial wheat breeding programs. Acting as an academic data trustee, we merged these data with historical experimental series from previous public-private partnerships. The integrated data spanned 12 years, 168 environments, and provided a genomic prediction training set of up to ~9500 genotypes for grain yield, plant height and heading date. Despite the heterogeneous phenotypic and genotypic data, we were able to obtain high-quality data by implementing rigorous data curation, including SNP imputation. We utilized the data to compare genomic best linear unbiased predictions with convolutional neural network-based genomic prediction. Our analysis revealed that we could flexibly combine experimental series for genomic prediction, with prediction ability steadily improving as the training set sizes increased, peaking at around 4000 genotypes. As training set sizes were further increased, the gains in prediction ability decreased, approaching a plateau well below the theoretical limit defined by the square root of the heritability. Potential avenues, such as designed training sets or novel non-linear prediction approaches, could overcome this plateau and help to more fully exploit the high-value big data generated by breaking down data silos across companies.
Affiliation(s)
- Moritz Lell
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Abhishek Gogna
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Vincent Kloesgen
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Ulrike Avenhaus
- W. von Borries-Eckendorf GmbH & Co. KG, Leopoldshöhe, Germany
- Jost Dörnte
- Deutsche Saatveredelung AG, Lippstadt, Germany
- Mario Gils
- Nordsaat Saatzucht GmbH, Langenstein, Germany
- Matthias Rapp
- W. von Borries-Eckendorf GmbH & Co. KG, Leopoldshöhe, Germany
- Jochen Reif
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Yusheng Zhao
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
2
Hafner A, DeLeo V, Deng CH, Elsik CG, Fleming DS, Harrison PW, Kalbfleisch TS, Petry B, Pucker B, Quezada-Rodríguez EH, Tuggle CK, Koltes JE. Data reuse in agricultural genomics research: challenges and recommendations. Gigascience 2025; 14:giae106. [PMID: 39804724 PMCID: PMC11727710 DOI: 10.1093/gigascience/giae106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 09/17/2024] [Accepted: 11/26/2024] [Indexed: 01/16/2025] Open
Abstract
The scientific community has long benefited from the opportunities provided by data reuse. Recognizing the need to identify the challenges and bottlenecks to reuse in the agricultural research community and propose solutions for them, the data reuse working group was started within the AgBioData consortium framework. Here, we identify the limitations of data standards, metadata deficiencies, data interoperability, data ownership, data availability, user skill level, resource availability, and equity issues, with a specific focus on agricultural genomics research. We propose possible solutions stakeholders could implement to mitigate and overcome these challenges and provide an optimistic perspective on the future of genomics and transcriptomics data reuse.
Affiliation(s)
- Alenka Hafner
- Department of Biology, Frear North, Pennsylvania State University, University Park, PA, 16802, US
- Intercollege Graduate Degree Program in Plant Biology, Pennsylvania State University, University Park, PA, 16802, US
- Cecilia H Deng
- New Cultivar Innovation, The New Zealand Institute for Plant and Food Research Limited, Auckland, 1025, New Zealand
- Christine G Elsik
- Division of Animal Sciences and Division of Plant Science & Technology, University of Missouri, MO, 65211, US
- Institute for Data Science & Informatics, University of Missouri, MO, 65211, US
- Damarius S Fleming
- Animal Parasitic Diseases Laboratory, United States Department of Agriculture Agricultural Research Service, Beltsville, MD, 20705, US
- Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, Cambridgeshire, CB10 1SD, UK
- Theodore S Kalbfleisch
- Department of Veterinary Science, Martin-Gatton College of Agriculture, Food, and Environment, University of Kentucky, Lexington, KY, 40202, US
- Bruna Petry
- Department of Animal Science, Iowa State University, Ames, IA, 50011, US
- Boas Pucker
- Institute of Plant Biology & BRICS, TU Braunschweig, Braunschweig, 38106, Germany
- Elsa H Quezada-Rodríguez
- Departamento de Producción Agrícola y Animal, Universidad Autónoma Metropolitana-Xochimilco, Ciudad de México, 04510, México
- Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Ciudad de México, 04510, México
- James E Koltes
- Department of Animal Science, Iowa State University, Ames, IA, 50011, US
3
Mascher M, Jayakodi M, Shim H, Stein N. Promises and challenges of crop translational genomics. Nature 2024; 636:585-593. [PMID: 39313530 PMCID: PMC7616746 DOI: 10.1038/s41586-024-07713-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 06/13/2024] [Indexed: 09/25/2024]
Abstract
Crop translational genomics applies breeding techniques based on genomic datasets to improve crops. Technological breakthroughs in the past ten years have made it possible to sequence the genomes of increasing numbers of crop varieties and have assisted in the genetic dissection of crop performance. However, translating research findings to breeding applications remains challenging. Here we review recent progress and future prospects for crop translational genomics in bringing results from the laboratory to the field. Genetic mapping, genomic selection and sequence-assisted characterization and deployment of plant genetic resources utilize rapid genotyping of large populations. These approaches have all had an impact on breeding for qualitative traits, where single genes with large phenotypic effects exert their influence. Characterization of the complex genetic architectures that underlie quantitative traits such as yield and flowering time, especially in newly domesticated crops, will require further basic research, including research into regulation and interactions of genes and the integration of genomic approaches and high-throughput phenotyping, before targeted interventions can be designed. Future priorities for translation include supporting genomics-assisted breeding in low-income countries and adaptation of crops to changing environments.
Affiliation(s)
- Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany.
- Murukarthick Jayakodi
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Hyeonah Shim
- Department of Agriculture, Forestry and Bioresources, Plant Genomics and Breeding Institute, Research Institute of Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, Korea
- Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
- Martin Luther University Halle-Wittenberg, Halle, Germany.
4
Murphy KM, Ludwig E, Gutierrez J, Gehan MA. Deep Learning in Image-Based Plant Phenotyping. ANNUAL REVIEW OF PLANT BIOLOGY 2024; 75:771-795. [PMID: 38382904 DOI: 10.1146/annurev-arplant-070523-042828] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 02/23/2024]
Abstract
A major bottleneck in the crop improvement pipeline is our ability to phenotype crops quickly and efficiently. Image-based, high-throughput phenotyping has a number of advantages because it is nondestructive and reduces human labor, but a new challenge arises in extracting meaningful information from large quantities of image data. Deep learning, a type of artificial intelligence, is an approach used to analyze image data and make predictions on unseen images that ultimately reduces the need for human input in computation. Here, we review the basics of deep learning, assessments of deep learning success, examples of applications of deep learning in plant phenomics, best practices, and open challenges.
Affiliation(s)
- Ella Ludwig
- Donald Danforth Plant Science Center, St. Louis, Missouri, USA
- Jorge Gutierrez
- Donald Danforth Plant Science Center, St. Louis, Missouri, USA
- Malia A Gehan
- Donald Danforth Plant Science Center, St. Louis, Missouri, USA;
5
Faria D, Eugénio P, Contreiras Silva M, Balbi L, Bedran G, Kallor AA, Nunes S, Palkowski A, Waleron M, Alfaro JA, Pesquita C. The Immunopeptidomics Ontology (ImPO). Database (Oxford) 2024; 2024:baae014. [PMID: 38857186 PMCID: PMC11164101 DOI: 10.1093/database/baae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 11/30/2023] [Accepted: 02/22/2024] [Indexed: 06/12/2024]
Abstract
The adaptive immune response plays a vital role in eliminating infected and aberrant cells from the body. This process hinges on the presentation of short peptides by major histocompatibility complex Class I molecules on the cell surface. Immunopeptidomics, the study of peptides displayed on cells, delves into the wide variety of these peptides. Understanding the mechanisms behind antigen processing and presentation is crucial for effectively evaluating cancer immunotherapies. As an emerging domain, immunopeptidomics currently lacks standardization: there is neither an established terminology nor formally defined semantics, a critical concern considering the complexity, heterogeneity, and growing volume of data involved in immunopeptidomics studies. Additionally, there is a disconnection between how the proteomics community delivers the information about antigen presentation and its uptake by the clinical genomics community. Considering the significant relevance of immunopeptidomics in cancer, this shortcoming must be addressed to bridge the gap between research and clinical practice. In this work, we detail the development of the ImmunoPeptidomics Ontology, ImPO, the first effort at standardizing the terminology and semantics in the domain. ImPO aims to encapsulate and systematize data generated by immunopeptidomics experimental processes and bioinformatics analysis. ImPO establishes cross-references to 24 relevant ontologies, including the National Cancer Institute Thesaurus, Mondo Disease Ontology, Logical Observation Identifier Names and Codes and Experimental Factor Ontology. Although ImPO was developed using expert knowledge to characterize a large and representative data collection, it may be readily used to encode other datasets within the domain. Ultimately, ImPO facilitates data integration and analysis, enabling querying, inference and knowledge generation and importantly bridging the gap between the clinical proteomics and genomics communities. As the field of immunogenomics uses protein-level immunopeptidomics data, we expect ImPO to play a key role in supporting a rich and standardized description of the large-scale data that emerging high-throughput technologies are expected to bring in the near future. Ontology URL: https://zenodo.org/record/10237571 Project GitHub: https://github.com/liseda-lab/ImPO/blob/main/ImPO.owl.
Affiliation(s)
- Daniel Faria
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Rua Alves Redol, 9, Lisboa 1000-029, Portugal
- Patrícia Eugénio
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Marta Contreiras Silva
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Laura Balbi
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Georges Bedran
- International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Ashwin Adrian Kallor
- International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Susana Nunes
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Aleksander Palkowski
- International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Michal Waleron
- International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Javier A Alfaro
- International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Department of Biochemistry and Microbiology, University of Victoria, 3800 Finnerty Rd, Victoria, British Columbia, BC V8P 5C2, Canada
- Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Old College, South Bridge, Edinburgh, EH8 9YL, UK
- The Canadian Association for Responsible AI in Medicine, Victoria, Canada
- Catia Pesquita
- LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
6
Nguyen HA, Martre P, Collet C, Draye X, Salon C, Jeudy C, Rincent R, Muller B. Are high-throughput root phenotyping platforms suitable for informing root system architecture models with genotype-specific parameters? An evaluation based on the root model ArchiSimple and a small panel of wheat cultivars. JOURNAL OF EXPERIMENTAL BOTANY 2024; 75:2510-2526. [PMID: 38520390 DOI: 10.1093/jxb/erae009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 03/21/2024] [Indexed: 03/25/2024]
Abstract
Given the difficulties in accessing plant roots in situ, high-throughput root phenotyping (HTRP) platforms under controlled conditions have been developed to meet the growing demand for characterizing root system architecture (RSA) for genetic analyses. However, a proper evaluation of their capacity to provide the same estimates for strictly identical root traits across platforms has never been achieved. In this study, we performed such an evaluation based on six major parameters of the RSA model ArchiSimple, using a diversity panel of 14 bread wheat cultivars in two HTRP platforms that had different growth media and non-destructive imaging systems together with a conventional set-up that had a solid growth medium and destructive sampling. Significant effects of the experimental set-up were found for all the parameters and no significant correlations across the diversity panel among the three set-ups could be detected. Differences in temperature, irradiance, and/or the medium in which the plants were growing might partly explain both the differences in the parameter values across the experiments as well as the genotype × set-up interactions. Furthermore, the values and the rankings across genotypes of only a subset of parameters were conserved between contrasting growth stages. As the parameters chosen for our analysis are root traits that have strong impacts on RSA and are close to parameters used in a majority of RSA models, our results highlight the need to carefully consider both developmental and environmental drivers in root phenomics studies.
Affiliation(s)
- Hong Anh Nguyen
- LEPSE, Université de Montpellier, INRAE, Institut Agro Montpellier, Montpellier, France
- Pierre Martre
- LEPSE, Université de Montpellier, INRAE, Institut Agro Montpellier, Montpellier, France
- Clothilde Collet
- Earth and Life Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Xavier Draye
- Earth and Life Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Christophe Salon
- Agroécologie, AgroSup Dijon, INRAE, Université Bourgogne Franche-Comté, Dijon, France
- Christian Jeudy
- Agroécologie, AgroSup Dijon, INRAE, Université Bourgogne Franche-Comté, Dijon, France
- Renaud Rincent
- GDEC, Université Clermont-Auvergne, INRAE, Clermont-Ferrand, France
- Bertrand Muller
- LEPSE, Université de Montpellier, INRAE, Institut Agro Montpellier, Montpellier, France
7
Vargas-Rojas L, Ting TC, Rainey KM, Reynolds M, Wang DR. AgTC and AgETL: open-source tools to enhance data collection and management for plant science research. FRONTIERS IN PLANT SCIENCE 2024; 15:1265073. [PMID: 38450403 PMCID: PMC10915008 DOI: 10.3389/fpls.2024.1265073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 01/30/2024] [Indexed: 03/08/2024]
Abstract
Advancements in phenotyping technology have enabled plant science researchers to gather large volumes of information from their experiments, especially those that evaluate multiple genotypes. To fully leverage these complex and often heterogeneous data sets (i.e., those that differ in format and structure), scientists must invest considerable time in data processing, and data management has emerged as a considerable barrier for downstream application. Here, we propose a pipeline to enhance data collection, processing, and management from plant science studies comprising two newly developed open-source programs. The first, called AgTC, is a series of programming functions that generates comma-separated values file templates to collect data in a standard format using either a lab-based computer or a mobile device. The second series of functions, AgETL, executes steps for an Extract-Transform-Load (ETL) data integration process where data are extracted from heterogeneously formatted files, transformed to meet standard criteria, and loaded into a database. There, data are stored and can be accessed for data analysis-related processes, including dynamic data visualization through web-based tools. Both AgTC and AgETL are flexible for application across plant science experiments without programming knowledge on the part of the domain scientist, and their functions are executed on Jupyter Notebook, a browser-based interactive development environment. Additionally, all parameters are easily customized from central configuration files written in the human-readable YAML format. Using three experiments from research laboratories in university and non-government organization (NGO) settings as test cases, we demonstrate the utility of AgTC and AgETL to streamline critical steps from data collection to analysis in the plant sciences.
Affiliation(s)
- Luis Vargas-Rojas
- Department of Agronomy, Purdue University, West Lafayette, IN, United States
- To-Chia Ting
- Department of Agronomy, Purdue University, West Lafayette, IN, United States
- Katherine M. Rainey
- Department of Agronomy, Purdue University, West Lafayette, IN, United States
- Matthew Reynolds
- Wheat Physiology Group, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico
- Diane R. Wang
- Department of Agronomy, Purdue University, West Lafayette, IN, United States
8
Cao Y, Tian D, Tang Z, Liu X, Hu W, Zhang Z, Song S. OPIA: an open archive of plant images and related phenotypic traits. Nucleic Acids Res 2024; 52:D1530-D1537. [PMID: 37930849 PMCID: PMC10767956 DOI: 10.1093/nar/gkad975] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/15/2023] [Revised: 10/11/2023] [Accepted: 10/16/2023] [Indexed: 11/08/2023] Open
Abstract
High-throughput plant phenotype acquisition technologies have been extensively utilized in plant phenomics studies, leading to vast quantities of images and image-based phenotypic traits (i-traits) that are essential for accelerating germplasm screening, plant disease identification and biotic and abiotic stress classification. Here, we present the Open Plant Image Archive (OPIA, https://ngdc.cncb.ac.cn/opia/), an open archive of plant images and i-traits derived from high-throughput phenotyping platforms. Currently, OPIA houses 56 datasets across 11 plants, comprising a total of 566 225 images with 2 417 186 labeled instances. Notably, it incorporates 56 i-traits of 93 rice and 105 wheat cultivars based on 18 644 individual RGB images, and these i-traits are further annotated based on the Plant Phenotype and Trait Ontology (PPTO) and cross-linked with GWAS Atlas. Additionally, each dataset in OPIA is assigned an evaluation score that takes account of image data volume, image resolution, and the number of labeled instances. More importantly, OPIA is equipped with useful tools for online image pre-processing and intelligent prediction. Collectively, OPIA provides open access to valuable datasets, pre-trained models, and phenotypic traits across diverse plants and thus bears great potential to play a crucial role in facilitating artificial intelligence-assisted breeding research.
Affiliation(s)
- Yongrong Cao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Dongmei Tian
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Zhixin Tang
- University of Chinese Academy of Sciences, Beijing 100049, China
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Xiaonan Liu
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Weijuan Hu
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Zhang Zhang
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Shuhui Song
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
9
David R, Rybina A, Burel J, Heriche J, Audergon P, Boiten J, Coppens F, Crockett S, Exter K, Fahrner S, Fratelli M, Goble C, Gormanns P, Grantner T, Grüning B, Gurwitz KT, Hancock JM, Harmse H, Holub P, Juty N, Karnbach G, Karoune E, Keppler A, Klemeier J, Lancelotti C, Legras J, Lister AL, Longo DL, Ludwig R, Madon B, Massimi M, Matser V, Matteoni R, Mayrhofer MT, Ohmann C, Panagiotopoulou M, Parkinson H, Perseil I, Pfander C, Pieruschka R, Raess M, Rauber A, Richard AS, Romano P, Rosato A, Sánchez‐Pla A, Sansone S, Sarkans U, Serrano‐Solano B, Tang J, Tanoli Z, Tedds J, Wagener H, Weise M, Westerhoff HV, Wittner R, Ewbank J, Blomberg N, Gribbon P. "Be sustainable": EOSC-Life recommendations for implementation of FAIR principles in life science data handling. EMBO J 2023; 42:e115008. [PMID: 37964598 PMCID: PMC10690449 DOI: 10.15252/embj.2023115008] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/14/2023] [Revised: 09/12/2023] [Accepted: 09/18/2023] [Indexed: 11/16/2023] Open
Abstract
The main goals and challenges for the life science communities in the Open Science framework are to increase reuse and sustainability of data resources, software tools, and workflows, especially in large-scale data-driven research and computational analyses. Here, we present key findings, procedures, effective measures and recommendations for generating and establishing sustainable life science resources based on the collaborative, cross-disciplinary work done within the EOSC-Life (European Open Science Cloud for Life Sciences) consortium. Bringing together 13 European life science research infrastructures, it has laid the foundation for an open, digital space to support biological and medical research. Using lessons learned from 27 selected projects, we describe the organisational, technical, financial and legal/ethical challenges that represent the main barriers to sustainability in the life sciences. We show how EOSC-Life provides a model for sustainable data management according to FAIR (findability, accessibility, interoperability, and reusability) principles, including solutions for sensitive- and industry-related resources, by means of cross-disciplinary training and best practices sharing. Finally, we illustrate how data harmonisation and collaborative work facilitate interoperability of tools, data, solutions and lead to a better understanding of concepts, semantics and functionalities in the life sciences.
10
Dumschott K, Dörpholz H, Laporte MA, Brilhaus D, Schrader A, Usadel B, Neumann S, Arnaud E, Kranz A. Ontologies for increasing the FAIRness of plant research data. FRONTIERS IN PLANT SCIENCE 2023; 14:1279694. [PMID: 38098789 PMCID: PMC10720748 DOI: 10.3389/fpls.2023.1279694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Accepted: 11/15/2023] [Indexed: 12/17/2023]
Abstract
The importance of improving the FAIRness (findability, accessibility, interoperability, reusability) of research data is undeniable, especially in the face of large, complex datasets currently being produced by omics technologies. Facilitating the integration of a dataset with other types of data increases the likelihood of reuse, and the potential of answering novel research questions. Ontologies are a useful tool for semantically tagging datasets, as adding relevant metadata increases the understanding of how data was produced and increases its interoperability. Ontologies provide concepts for a particular domain as well as the relationships between concepts. By tagging data with ontology terms, data becomes both human- and machine-interpretable, allowing for increased reuse and interoperability. However, the task of identifying ontologies relevant to a particular research domain or technology is challenging, especially within the diverse realm of fundamental plant research. In this review, we outline the ontologies most relevant to the fundamental plant sciences and how they can be used to annotate data related to plant-specific experiments within metadata frameworks, such as Investigation-Study-Assay (ISA). We also outline repositories and platforms most useful for identifying applicable ontologies or finding ontology terms.
Affiliation(s)
- Kathryn Dumschott
- Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Hannah Dörpholz
- Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Marie-Angélique Laporte
- Digital Solutions Team, Digital Inclusion Lever, Bioversity International, Montpellier Office, Montpellier, France
- Dominik Brilhaus
- Data Science and Management & Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Andrea Schrader
- Data Science and Management & Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Björn Usadel
- Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Institute for Biological Data Science & Cluster of Excellence on Plant Sciences (CEPLAS), Faculty of Mathematics and Life Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Steffen Neumann
- Program Center MetaCom, Leibniz Institute of Plant Biochemistry, Halle, Germany
- German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany
- Elizabeth Arnaud
- Digital Solutions Team, Digital Inclusion Lever, Bioversity International, Montpellier Office, Montpellier, France
- Angela Kranz
- Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
11
Weil HL, Schneider K, Tschöpe M, Bauer J, Maus O, Frey K, Brilhaus D, Martins Rodrigues C, Doniparthi G, Wetzels F, Lukasczyk J, Kranz A, Grüning B, Zimmer D, Deßloch S, von Suchodoletz D, Usadel B, Garth C, Mühlhaus T. PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2023; 116:974-988. [PMID: 37818860 DOI: 10.1111/tpj.16474] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/02/2023] [Accepted: 09/08/2023] [Indexed: 10/13/2023]
Abstract
In modern reproducible, hypothesis-driven plant research, scientists are increasingly relying on research data management (RDM) services and infrastructures to streamline the processes of collecting, processing, sharing, and archiving research data. FAIR (i.e., findable, accessible, interoperable, and reusable) research data play a pivotal role in enabling the integration of interdisciplinary knowledge and facilitating the comparison and synthesis of a wide range of analytical findings. The PLANTdataHUB offers a solution that realizes RDM of scientific (meta)data as evolving collections of files in a directory - yielding FAIR digital objects called ARCs - with tools that enable scientists to plan, communicate, collaborate, publish, and reuse data on the same platform while gaining continuous quality control insights. The centralized platform is scalable from personal use to global communities and provides advanced federation capabilities for institutions that prefer to host their own satellite instances. This approach borrows many concepts from software development and adapts them to fit the challenges of the field of modern plant science undergoing digital transformation. The PLANTdataHUB supports researchers in each stage of a scientific project with adaptable continuous quality control insights, from the early planning phase to data publication. The central live instance of PLANTdataHUB is accessible at (https://git.nfdi4plants.org), and it will continue to evolve as a community-driven and dynamic resource that serves the needs of contemporary plant science.
Affiliation(s)
- Heinrich Lukas Weil
- Computational Systems Biology, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Kevin Schneider
- Computational Systems Biology, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Marcel Tschöpe
- Computer Center, University of Freiburg, Freiburg im Breisgau, Germany
| | - Jonathan Bauer
- Computer Center, University of Freiburg, Freiburg im Breisgau, Germany
| | - Oliver Maus
- Computational Systems Biology, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Kevin Frey
- Computational Systems Biology, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Dominik Brilhaus
- Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | | | - Gajendra Doniparthi
- Heterogenous Information Systems, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Florian Wetzels
- Scientific Visualization Lab, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Jonas Lukasczyk
- Scientific Visualization Lab, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Angela Kranz
- IBG-4 Bioinformatics, BioSC, Forschungszentrum Jülich, Jülich, Germany
| | - Björn Grüning
- Bioinformatics Group, University of Freiburg, Freiburg im Breisgau, Germany
| | - David Zimmer
- Computational Systems Biology, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Stefan Deßloch
- Heterogenous Information Systems, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | | | - Björn Usadel
- Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- IBG-4 Bioinformatics, BioSC, Forschungszentrum Jülich, Jülich, Germany
| | - Christoph Garth
- Scientific Visualization Lab, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| | - Timo Mühlhaus
- Computational Systems Biology, University of Kaiserslautern-Landau, Kaiserslautern, Germany
| |
|
12
|
Kemmer I, Keppler A, Serrano-Solano B, Rybina A, Özdemir B, Bischof J, El Ghadraoui A, Eriksson JE, Mathur A. Building a FAIR image data ecosystem for microscopy communities. Histochem Cell Biol 2023; 160:199-209. [PMID: 37341795 PMCID: PMC10492678 DOI: 10.1007/s00418-023-02203-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 04/27/2023] [Indexed: 06/22/2023]
Abstract
Bioimaging has now entered the era of big data with faster-than-ever development of complex microscopy technologies leading to increasingly complex datasets. This enormous increase in data size and informational complexity within those datasets has brought with it several difficulties in terms of common and harmonized data handling, analysis, and management practices, which are currently hampering the full potential of image data being realized. Here, we outline a wide range of efforts and solutions currently being developed by the microscopy community to address these challenges on the path towards FAIR bioimaging data. We also highlight how different actors in the microscopy ecosystem are working together, creating synergies that develop new approaches, and how research infrastructures, such as Euro-BioImaging, are fostering these interactions to shape the field.
Affiliation(s)
- Isabel Kemmer
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| | - Antje Keppler
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| | - Beatriz Serrano-Solano
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| | - Arina Rybina
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| | - Buğra Özdemir
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| | - Johanna Bischof
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| | - Ayoub El Ghadraoui
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| | - John E Eriksson
- Euro-BioImaging ERIC Statutory Seat, Tykistökatu 6, P.O. Box 123, 20521, Turku, Finland
| | - Aastha Mathur
- Euro-BioImaging ERIC Bio-Hub, European Molecular Biology Laboratory (EMBL) Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
| |
|
13
|
Poorter H, Hummel GM, Nagel KA, Fiorani F, von Gillhaussen P, Virnich O, Schurr U, Postma JA, van de Zedde R, Wiese-Klinkenberg A. Pitfalls and potential of high-throughput plant phenotyping platforms. FRONTIERS IN PLANT SCIENCE 2023; 14:1233794. [PMID: 37680357 PMCID: PMC10481964 DOI: 10.3389/fpls.2023.1233794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 08/01/2023] [Indexed: 09/09/2023]
Abstract
Automated high-throughput plant phenotyping (HTPP) enables non-invasive, fast and standardized evaluations of a large number of plants for size, development, and certain physiological variables. Many research groups recognize the potential of HTPP and have made significant investments in HTPP infrastructure, or are considering doing so. To make optimal use of limited resources, it is important to plan and use these facilities prudently and to interpret the results carefully. Here we present a number of points that users should consider before purchasing, building or utilizing such equipment. They relate to (1) the financial and time investment for acquisition, operation, and maintenance, (2) the constraints associated with such machines in terms of flexibility and growth conditions, (3) the pros and cons of frequent non-destructive measurements, (4) the level of information provided by proxy traits, and (5) the utilization of calibration curves. Using data from an Arabidopsis experiment, we demonstrate how diurnal changes in leaf angle can impact plant size estimates from top-view cameras, causing deviations of more than 20% over the day. Growth analysis data from another rosette species showed that there was a curvilinear relationship between total and projected leaf area. Neglecting this curvilinearity resulted in linear calibration curves that, although having a high r² (> 0.92), also exhibited large relative errors. Another important consideration we discuss is the frequency at which calibration curves need to be generated and whether different treatments, seasons, or genotypes require distinct calibration curves. In conclusion, HTPP systems have become a valuable addition to the toolbox of plant biologists, provided that these systems are tailored to the research questions of interest, and users are aware of both the possible pitfalls and potential involved.
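The calibration-curve point in this abstract can be illustrated with a small numerical sketch. The data below are synthetic and not taken from the paper: the power-law form, its exponent of 1.3, and the area range are assumptions chosen only to produce a curvilinear relationship between projected and total leaf area. The sketch fits a straight line by ordinary least squares and reports both r² and the largest relative error.

```python
# Synthetic illustration only: the power law and its exponent are assumed,
# not values reported by Poorter et al.

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observations."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical projected leaf areas (arbitrary units) and a curvilinear
# "true" total leaf area that grows faster than the projected area.
projected = [float(a) for a in range(1, 101)]
total = [a ** 1.3 for a in projected]

slope, intercept = linear_fit(projected, total)
predicted = [slope * a + intercept for a in projected]

r2 = r_squared(total, predicted)
rel_errors = [abs(p - t) / t for p, t in zip(predicted, total)]
max_rel_error = max(rel_errors)

print(f"r2 = {r2:.3f}")
print(f"largest relative error = {max_rel_error:.1%}")
```

In this toy setting the straight line explains most of the variance, yet the smallest plants are badly mispredicted, because r² is dominated by the large values while relative error weights every plant equally.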
Affiliation(s)
- Hendrik Poorter
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, Jülich, Germany
- Department of Natural Sciences, Macquarie University, North Ryde, NSW, Australia
| | | | - Kerstin A. Nagel
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, Jülich, Germany
| | - Fabio Fiorani
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, Jülich, Germany
| | | | - Olivia Virnich
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, Jülich, Germany
| | - Ulrich Schurr
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, Jülich, Germany
| | | | - Rick van de Zedde
- Plant Sciences Group, Wageningen University & Research, Wageningen, Netherlands
| | - Anika Wiese-Klinkenberg
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, Jülich, Germany
- Bioinformatics (IBG-4), Forschungszentrum Jülich GmbH, Jülich, Germany
| |
|
14
|
Papoutsoglou EA, Athanasiadis IN, Visser RGF, Finkers R. The benefits and struggles of FAIR data: the case of reusing plant phenotyping data. Sci Data 2023; 10:457. [PMID: 37443110 PMCID: PMC10345100 DOI: 10.1038/s41597-023-02364-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 07/03/2023] [Indexed: 07/15/2023] Open
Abstract
Plant phenotyping experiments are conducted under a variety of experimental parameters and settings for diverse purposes. The data they produce is heterogeneous, complicated, often poorly documented and, as a result, difficult to reuse. Meeting societal needs (nutrition, crop adaptation and stability) requires more efficient methods toward data integration and reuse. In this work, we examine what "making data FAIR" entails, and investigate the benefits and the struggles not only of reusing FAIR data, but also making data FAIR using genotype by environment and QTL by environment interactions for developmental traits in potato as a case study. We assume the role of a scientist discovering a phenotypic dataset on a FAIR data point, verifying the existence of related datasets with environmental data, acquiring both and integrating them. We report and discuss the challenges and the potential for reusability and reproducibility of FAIRifying existing datasets, using metadata standards such as MIAPPE, that were encountered in this process.
Affiliation(s)
- Evangelia A Papoutsoglou
- Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Taxonic B.V., De Meern, The Netherlands
| | - Ioannis N Athanasiadis
- Wageningen Data Competence Center and Geo-Information Science & Remote Sensing Lab, Wageningen University and Research, Wageningen, The Netherlands
| | - Richard G F Visser
- Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
| | - Richard Finkers
- Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- GenNovation B.V., Wageningen, The Netherlands
| |
|
15
|
Paulus S, Leiding B. Can Distributed Ledgers Help to Overcome the Need of Labeled Data for Agricultural Machine Learning Tasks? PLANT PHENOMICS (WASHINGTON, D.C.) 2023; 5:0070. [PMID: 37434757 PMCID: PMC10332799 DOI: 10.34133/plantphenomics.0070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 06/25/2023] [Indexed: 07/13/2023]
Affiliation(s)
- Stefan Paulus
- Institute of Sugar Beet Research, Holtenser Landstr. 77, 37079 Göttingen, Germany
| | - Benjamin Leiding
- Institute for Software and Systems Engineering, TU Clausthal, Wallstr. 6, 38640 Goslar, Germany
| |
|
16
|
Großkinsky DK, Faure JD, Gibon Y, Haslam RP, Usadel B, Zanetti F, Jonak C. The potential of integrative phenomics to harness underutilized crops for improving stress resilience. FRONTIERS IN PLANT SCIENCE 2023; 14:1216337. [PMID: 37409292 PMCID: PMC10318926 DOI: 10.3389/fpls.2023.1216337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 06/08/2023] [Indexed: 07/07/2023]
Affiliation(s)
- Dominik K. Großkinsky
- AIT Austrian Institute of Technology, Center for Health and Bioresources, Bioresources Unit, Tulln a. d. Donau, Austria
| | - Jean-Denis Faure
- Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin, Versailles, France
| | - Yves Gibon
- INRAE, Univ. Bordeaux, UMR BFP, Villenave d’Ornon, France
- Bordeaux Metabolome, INRAE, Univ. Bordeaux, Villenave d’Ornon, France
| | | | - Björn Usadel
- IBG-4 Bioinformatics, CEPLAS, Forschungszentrum, Jülich, Germany
- Biological Data Science, Heinrich Heine University, Universitätsstrasse 1, Düsseldorf, Germany
| | - Federica Zanetti
- Department of Agricultural and Food Sciences (DISTAL), Alma Mater Studiorum - Università di Bologna, Bologna, Italy
| | - Claudia Jonak
- AIT Austrian Institute of Technology, Center for Health and Bioresources, Bioresources Unit, Tulln a. d. Donau, Austria
| |
|
17
|
Daykin GM, Aizen MA, Barrett LG, Bartlett LJ, Batáry P, Garibaldi LA, Güncan A, Gutam S, Maas B, Mitnala J, Montaño-Centellas F, Muoni T, Öckinger E, Okechalu O, Ostler R, Potts SG, Rose DC, Topp CFE, Usieta HO, Utoblo OG, Watson C, Zou Y, Sutherland WJ, Hood ASC. AgroEcoList 1.0: A checklist to improve reporting standards in ecological research in agriculture. PLoS One 2023; 18:e0285478. [PMID: 37310957 DOI: 10.1371/journal.pone.0285478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 04/24/2023] [Indexed: 06/15/2023] Open
Abstract
Many publications lack sufficient background information (e.g. location) to be interpreted, replicated, or reused for synthesis. This impedes scientific progress and the application of science to practice. Reporting guidelines (e.g. checklists) improve reporting standards. They have been widely taken up in the medical sciences, but not in ecological and agricultural research. Here, we use a community-centred approach to develop a reporting checklist (AgroEcoList 1.0) through surveys and workshops with 23 experts and the wider agroecological community. To put AgroEcoList in context, we also assessed the agroecological community's perception of reporting standards in agroecology. A total of 345 researchers, reviewers, and editors responded to our survey. Although only 32% of respondents had prior knowledge of reporting guidelines, 76% of those that had said guidelines improved reporting standards. Overall, respondents agreed on the need for AgroEcoList 1.0; only 24% of respondents had used reporting guidelines before, but 78% indicated they would use AgroEcoList 1.0. We updated AgroEcoList 1.0 based on respondents' feedback and user-testing. AgroEcoList 1.0 consists of 42 variables in seven groups: experimental/sampling set-up, study site, soil, livestock management, crop and grassland management, outputs, and finances. It is presented here, and is also available on GitHub (https://github.com/AgroecoList/Agroecolist). AgroEcoList 1.0 can serve as a guide for authors, reviewers, and editors to improve reporting standards in agricultural ecology. Our community-centred approach is a replicable method that could be adapted to develop reporting checklists in other fields. Reporting guidelines such as AgroEcoList can improve reporting standards and therefore the application of research to practice, and we recommend that they are adopted more widely in agriculture and ecology.
Affiliation(s)
- Georgia M Daykin
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
| | - Marcelo A Aizen
- Instituto de Investigaciones en Biodiversidad y Medio Ambiente (INIBIOMA), Universidad Nacional del Comahue - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Carlos de Bariloche, Río Negro, Argentina
| | | | - Lewis J Bartlett
- Center for the Ecology of Infectious Diseases, Odum School of Ecology, University of Georgia, Athens, Georgia, United States of America
| | - Péter Batáry
- "Lendület" Landscape and Conservation Ecology, Institute of Ecology and Botany, Centre for Ecological Research, Vácrátót, Alkomány, Hungary
| | - Lucas A Garibaldi
- Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural, Universidad Nacional de Río Negro, Viedma, Río Negro, Argentina
- Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Investigaciones en Recursos Naturales, Agroecología y Desarrollo Rural, Bariloche, Río Negro, Argentina
| | - Ali Güncan
- Department of Plant Protection, Faculty of Agriculture, University of Ordu, Ordu, Turkey
| | - Sridhar Gutam
- ICAR-AICRP on Fruits, ICAR-Indian Institute of Horticultural Research, Bengaluru, Karnataka, India
| | - Bea Maas
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Agroecology, University of Goettingen, Göttingen, Germany
| | - Jayalakshmi Mitnala
- Regional Agricultural Research Station, Acharya N. G. Ranga Agricultural University, Hyderabad, Andhra Pradesh, India
| | - Flavia Montaño-Centellas
- Instituto de Ecología, Universidad Mayor de San Andrés, La Paz, Bolivia
- Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
| | - Tarirai Muoni
- CIMMYT Southern Africa Regional Office, Harare, Zimbabwe
- Department of Crop Production Ecology, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Erik Öckinger
- Department of Ecology, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Ode Okechalu
- Department of Plant Science and Biotechnology, University of Jos, Plateau, Nigeria
| | - Richard Ostler
- Computational and Analytical Sciences, Rothamsted Research, Harpenden, United Kingdom
| | - Simon G Potts
- Centre for Agri-environmental Research, School of Agriculture, Policy and Development, University of Reading, Reading, United Kingdom
| | - David C Rose
- Centre for Agri-environmental Research, School of Agriculture, Policy and Development, University of Reading, Reading, United Kingdom
- School of Water, Energy, and Environment, Cranfield University, Cranfield, United Kingdom
| | - Cairistiona F E Topp
- Agriculture, Horticulture and Engineering Sciences, Scotland's Rural College, Edinburgh, United Kingdom
| | - Hope O Usieta
- Leventis Foundation Nigeria, F. C. T. Abuja, Nigeria
| | - Obaiya G Utoblo
- Department of Plant Science and Biotechnology, University of Jos, Plateau, Nigeria
| | - Christine Watson
- Department of Crop Production Ecology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Rural Land Use, Scotland's Rural College, Craibstone Estate, Aberdeen, United Kingdom
| | - Yi Zou
- Department of Health and Environmental Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, P. R. China
| | | | - Amelia S C Hood
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- Centre for Agri-environmental Research, School of Agriculture, Policy and Development, University of Reading, Reading, United Kingdom
| |
|
18
|
Karabulut E, Erkoç K, Acı M, Aydın M, Barriball S, Braley J, Cassetta E, Craine EB, Diaz-Garcia L, Hershberger J, Meyering B, Miller AJ, Rubin MJ, Tesdell O, Schlautman B, Şakiroğlu M. Sainfoin (Onobrychis spp.) crop ontology: supporting germplasm characterization and international research collaborations. FRONTIERS IN PLANT SCIENCE 2023; 14:1177406. [PMID: 37255566 PMCID: PMC10225502 DOI: 10.3389/fpls.2023.1177406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 04/18/2023] [Indexed: 06/01/2023]
Abstract
Sainfoin (Onobrychis spp.) is a perennial forage legume that is also attracting attention as a perennial pulse with potential for human consumption. The dual use of sainfoin underpins diverse research and breeding programs focused on improving sainfoin lines for forage and pulses, which is driving the generation of complex datasets describing high dimensional phenotypes in the post-omics era. To ensure that multiple user groups, for example, breeders selecting for forage and those selecting for edible seed, can utilize these rich datasets, it is necessary to develop common ontologies and accessible ontology platforms. One such platform, Crop Ontology, was created in 2008 by the Consortium of International Agricultural Research Centers (CGIAR) to host crop-specific trait ontologies that support standardized plant breeding databases. In the present study, we describe the sainfoin crop ontology (CO). An in-depth literature review was performed to develop a comprehensive list of traits measured and reported in sainfoin. Because the same traits can be measured in different ways, ultimately, a set of 98 variables (variable = plant trait + method of measurement + scale of measurement) used to describe variation in sainfoin was identified. Variables were formatted and standardized based on guidelines provided here for inclusion in the sainfoin CO. The 98 variables contained a total of 82 traits from four trait classes of which 24 were agronomic, 31 were morphological, 19 were seed and forage quality related, and 8 were phenological. In addition to the developed variables, we have provided a roadmap for developing and submitting new traits to the sainfoin CO.
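The composition rule quoted in this abstract (variable = plant trait + method of measurement + scale of measurement) explains why the variable count exceeds the trait count. A minimal sketch of that data structure follows. The class name, the label format, and the example traits, methods, and scales are all hypothetical and are not entries from the sainfoin CO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """One ontology variable: a trait plus how and on what scale it is measured."""
    trait: str
    method: str
    scale: str

    @property
    def label(self) -> str:
        # Hypothetical label format; the real ontology may name variables differently.
        return f"{self.trait} | {self.method} | {self.scale}"


# Hypothetical examples: one trait measured in two ways yields two variables.
variables = [
    Variable("plant height", "ruler measurement", "cm"),
    Variable("plant height", "image-based estimate", "cm"),
    Variable("days to flowering", "field observation", "days"),
]

distinct_traits = {v.trait for v in variables}

for v in variables:
    print(v.label)
print(f"{len(variables)} variables describe {len(distinct_traits)} traits")
```

Freezing the dataclass makes each variable hashable, so duplicates of the same trait, method, and scale combination can be detected by placing variables in a set.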
Affiliation(s)
- Ebrar Karabulut
- Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye
| | - Kübra Erkoç
- Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye
| | - Murat Acı
- Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye
- The Land Institute, Salina, KS, United States
| | - Mahmut Aydın
- Department of Computer Engineering, Kafkas University, Kars, Türkiye
| | | | - Jackson Braley
- Donald Danforth Plant Science Center, St. Louis, MO, United States
| | | | | | - Luis Diaz-Garcia
- Department of Viticulture and Enology, University of California Davis, Davis, CA, United States
| | - Jenna Hershberger
- Plant and Environmental Sciences Department, Clemson University, Clemson, SC, United States
| | - Bo Meyering
- The Land Institute, Salina, KS, United States
| | - Allison J. Miller
- Donald Danforth Plant Science Center, St. Louis, MO, United States
- Department of Biology, Saint Louis University, St. Louis, MO, United States
| | - Matthew J. Rubin
- Donald Danforth Plant Science Center, St. Louis, MO, United States
| | - Omar Tesdell
- Department of Geography, Birzeit University, Birzeit, West Bank, Palestine
| | | | - Muhammet Şakiroğlu
- Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye
| |
|
19
|
Harfouche AL, Nakhle F, Harfouche AH, Sardella OG, Dart E, Jacobson D. A primer on artificial intelligence in plant digital phenomics: embarking on the data to insights journey. TRENDS IN PLANT SCIENCE 2023; 28:154-184. [PMID: 36167648 DOI: 10.1016/j.tplants.2022.08.021] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/04/2022] [Revised: 08/22/2022] [Accepted: 08/25/2022] [Indexed: 06/16/2023]
Abstract
Artificial intelligence (AI) has emerged as a fundamental component of global agricultural research that is poised to impact on many aspects of plant science. In digital phenomics, AI is capable of learning intricate structure and patterns in large datasets. We provide a perspective and primer on AI applications to phenome research. We propose a novel human-centric explainable AI (X-AI) system architecture consisting of data architecture, technology infrastructure, and AI architecture design. We clarify the difference between post hoc models and 'interpretable by design' models. We include guidance for effectively using an interpretable by design model in phenomic analysis. We also provide directions to sources of tools and resources for making data analytics increasingly accessible. This primer is accompanied by an interactive online tutorial.
Affiliation(s)
- Antoine L Harfouche
- Department for Innovation in Biological, Agro-Food, and Forest Systems, University of Tuscia, Viterbo, VT 01100, Italy
| | - Farid Nakhle
- Department for Innovation in Biological, Agro-Food, and Forest Systems, University of Tuscia, Viterbo, VT 01100, Italy
| | - Antoine H Harfouche
- Unité de Formation et de Recherche en Sciences Économiques, Gestion, Mathématiques, et Informatique, Université Paris Nanterre, 92001 Nanterre, France
| | - Orlando G Sardella
- Department for Innovation in Biological, Agro-Food, and Forest Systems, University of Tuscia, Viterbo, VT 01100, Italy
| | - Eli Dart
- Energy Sciences Network (ESnet), Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Daniel Jacobson
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| |
|
20
|
Yang F, Liu Z, Wang Y, Wang X, Zhang Q, Han Y, Zhao X, Pan S, Yang S, Wang S, Zhang Q, Qiu J, Wang K. A variety test platform for the standardization and data quality improvement of crop variety tests. FRONTIERS IN PLANT SCIENCE 2023; 14:1077196. [PMID: 36760650 PMCID: PMC9902355 DOI: 10.3389/fpls.2023.1077196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/22/2022] [Accepted: 01/09/2023] [Indexed: 06/18/2023]
Abstract
Variety testing is an indispensable and essential step in the process of creating new improved varieties from breeding to adoption. The performance of the varieties can be compared and evaluated based on multi-trait data from multi-location variety tests in multiple years. Although high-throughput phenotypic platforms have been used for observing some specific traits, manual phenotyping is still widely used. The efficient management of large amounts of data is still a significant problem for crop variety testing. This study reports a variety test platform (VTP) that was created to manage the whole workflow for the standardization and data quality improvement of crop variety testing. Through the VTP, the phenotype data of varieties can be integrated and reused based on standardized data elements and datasets. Moreover, the information support and automated functions for the whole testing workflow help users conduct tests efficiently through a series of functions such as test design, data acquisition and processing, and statistical analyses. The VTP has been applied to regional variety tests covering more than seven thousand locations across the whole country, and then a standardized and authoritative phenotypic database covering five crops has been generated. In addition, the VTP can be deployed on either privately or publicly available high-performance computing nodes so that test management and data analysis can be conveniently done using a web-based interface or mobile application. In this way, the system can provide variety test management services to more small and medium-sized breeding organizations, and ensures the mutual independence and security of test data. The application of VTP shows that the platform can make variety testing more efficient and can be used to generate a reliable database suitable for meta-analysis in multi-omics breeding and variety development projects.
Affiliation(s)
- Feng Yang
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Zhongqiang Liu
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Yuxi Wang
- National Agro-Tech Extension and Service Center, Beijing, China
| | - Xiaofeng Wang
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Key Laboratory of Agri-informatics, Ministry of Agriculture, Beijing, China
| | - Qiusi Zhang
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Key Laboratory of Agri-informatics, Ministry of Agriculture, Beijing, China
| | - Yanyun Han
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Key Laboratory of Agri-informatics, Ministry of Agriculture, Beijing, China
| | - Xiangyu Zhao
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- National Engineering Research Center for Information Technology in Agriculture, Beijing, China
| | - Shouhui Pan
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- National Engineering Research Center for Information Technology in Agriculture, Beijing, China
| | - Shuo Yang
- AgChip Science and Technology (Beijing) Co., Ltd., Beijing, China
| | - Shufeng Wang
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Key Laboratory of Agri-informatics, Ministry of Agriculture, Beijing, China
| | - Qi Zhang
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Key Laboratory of Agri-informatics, Ministry of Agriculture, Beijing, China
| | - Jun Qiu
- National Agro-Tech Extension and Service Center, Beijing, China
| | - Kaiyi Wang
- Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- National Engineering Research Center for Information Technology in Agriculture, Beijing, China
| |
|
21
|
Arend D, Scholz U, Lange M. The Plant Phenomics and Genomics Research Data Repository: An On-Premise Approach for FAIR-Compliant Data Acquisition. Methods Mol Biol 2023; 2703:3-22. [PMID: 37646933 DOI: 10.1007/978-1-0716-3389-2_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
The FAIR data principle as a commitment to support long-term research data management is widely accepted in the scientific community. However, although many established infrastructures provide comprehensive and long-term stable services and platforms, a large quantity of research data is still hidden. Currently, high-throughput plant genomics and phenomics technologies are producing research data in abundance, the storage of which is not covered by established core databases. This concerns the data volume, for example, time series of images or high-resolution hyperspectral data; the quality of data formatting and annotation, e.g., with regard to structure and annotation specifications of core databases; uncovered data domains; or organizational constraints prohibiting primary data storage outside institutional boundaries. To share these potentially dark data in a FAIR way and master these challenges, the ELIXIR Germany/de.NBI service Plant Genomic and Phenomics Research Data Repository (PGP) implements an on-premise approach, which allows research data to be kept in place and wrapped in FAIR-aware software infrastructure. In this chapter, the e!DAL infrastructure software and the PGP repository are presented as best practice on how to easily set up FAIR-compliant and intuitive research data services.
Affiliation(s)
- Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany
- Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany
- Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany
|
22
|
Nijsse B, Schaap PJ, Koehorst JJ. FAIR data station for lightweight metadata management and validation of omics studies. Gigascience 2022; 12:giad014. [PMID: 36879493 PMCID: PMC9989329 DOI: 10.1093/gigascience/giad014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 01/19/2023] [Accepted: 02/21/2023] [Indexed: 03/08/2023] Open
Abstract
BACKGROUND: The life sciences are one of the biggest suppliers of scientific data. Reusing and connecting these data can uncover hidden insights and lead to new concepts. Efficient reuse of these datasets is strongly promoted when they are interlinked with a sufficient amount of machine-actionable metadata. While the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles have been accepted by all stakeholders, in practice, there are only a limited number of easy-to-adopt implementations available that fulfill the needs of data producers. FINDINGS: We developed the FAIR Data Station, a lightweight application written in Java, that aims to support researchers in managing research metadata according to the FAIR principles. It implements the ISA metadata framework and uses minimal information metadata standards to capture experiment metadata. The FAIR Data Station consists of 3 modules. Based on the minimal information model(s) selected by the user, the "form generation module" creates a metadata template Excel workbook with a header row of machine-actionable attribute names. The Excel workbook is subsequently used by the data producer(s) as a familiar environment for sample metadata registration. At any point during this process, the format of the recorded values can be checked using the "validation module." Finally, the "resource module" can be used to convert the set of metadata recorded in the Excel workbook into RDF format, enabling (cross-project) (meta)data searches and, for publishing of sequence data, into a European Nucleotide Archive-compatible XML metadata file. CONCLUSIONS: Turning FAIR into reality requires the availability of easy-to-adopt data FAIRification workflows that are also of direct use for data producers. As such, the FAIR Data Station provides, in addition to the means to correctly FAIRify (omics) data, the means to build searchable metadata databases of similar projects and can assist in ENA metadata submission of sequence data.
The FAIR Data Station is available at https://fairbydesign.nl.
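The three-module workflow described in this abstract (template generation, value validation, conversion to RDF) can be illustrated with a minimal Python sketch. This is not the FAIR Data Station's Java implementation: the attribute names, the regular expressions, the `http://example.org/` namespace, and the use of CSV text in place of an Excel workbook are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical minimal-information model: attribute name -> pattern a value must match.
# Names and patterns are illustrative assumptions, not the tool's real templates.
MODEL = {
    "sample_id": r"[A-Za-z0-9_\-]+",
    "organism": r"[A-Z][a-z]+ [a-z]+",
    "collection_date": r"\d{4}-\d{2}-\d{2}",
}

BASE = "http://example.org/"  # placeholder namespace (assumption)


def template_header():
    """Form generation: header row of machine-actionable attribute names."""
    return list(MODEL)


def read_rows(text):
    """Parse a filled-in template (CSV text stands in for the Excel workbook)."""
    return list(csv.DictReader(io.StringIO(text)))


def validate(rows):
    """Validation: list (row_number, attribute, value) for each value failing its pattern."""
    errors = []
    for number, row in enumerate(rows, start=1):
        for attribute, pattern in MODEL.items():
            value = row.get(attribute) or ""
            if not re.fullmatch(pattern, value):
                errors.append((number, attribute, value))
    return errors


def to_ntriples(rows):
    """Resource conversion: one N-Triples statement per recorded value."""
    lines = []
    for row in rows:
        subject = f"<{BASE}sample/{row['sample_id']}>"
        for attribute in MODEL:
            if attribute == "sample_id":
                continue
            literal = row[attribute].replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{BASE}attr/{attribute}> "{literal}" .')
    return lines
```

In this sketch, validation is kept separate from conversion so that a data producer can check a partly filled template at any time, which mirrors the order of steps given in the abstract.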
Affiliation(s)
- Bart Nijsse
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- UNLOCK Large Scale Infrastructure for Microbial Communities, Wageningen University & Research and Delft University of Technology, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- Peter J Schaap
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- UNLOCK Large Scale Infrastructure for Microbial Communities, Wageningen University & Research and Delft University of Technology, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- Jasper J Koehorst
- Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Stippeneng 4, 6708 WE Wageningen, The Netherlands
- UNLOCK Large Scale Infrastructure for Microbial Communities, Wageningen University & Research and Delft University of Technology, Stippeneng 4, 6708 WE Wageningen, The Netherlands
|
23
|
Feser M, König P, Fiebig A, Arend D, Lange M, Scholz U. On the way to plant data commons - a genotyping use case. J Integr Bioinform 2022; 19:jib-2022-0033. [PMID: 36065132 PMCID: PMC9800039 DOI: 10.1515/jib-2022-0033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 08/04/2022] [Accepted: 08/11/2022] [Indexed: 01/09/2023] Open
Abstract
Over the last years it has been observed that the progress in data collection in life science has created increasing demand and opportunities for advanced bioinformatics. This includes data management as well as the individual data analysis and often covers the entire data life cycle. A variety of tools have been developed to store, share, or reuse the data produced in the different domains such as genotyping. Especially imputation, as a subfield of genotyping, requires good Research Data Management (RDM) strategies to enable use and re-use of genotypic data. To aim for sustainable software, it is necessary to develop tools and surrounding ecosystems, which are reusable and maintainable. Reusability in the context of streamlined tools can e.g. be achieved by standardizing the input and output of the different tools and adapting to open and broadly used file formats. By using such established file formats, the tools can also be connected with others, improving the overall interoperability of the software. Finally, it is important to build strong communities that maintain the tools by developing and contributing new features and maintenance updates. In this article, concepts for this will be presented for an imputation service.
Affiliation(s)
- Manuel Feser
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466 Seeland, Germany
- Patrick König
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466 Seeland, Germany
- Anne Fiebig
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466 Seeland, Germany
- Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466 Seeland, Germany
- Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466 Seeland, Germany
- Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466 Seeland, Germany
|
24
|
Bercovich N, Genze N, Todesco M, Owens GL, Légaré JS, Huang K, Rieseberg LH, Grimm DG. HeliantHOME, a public and centralized database of phenotypic sunflower data. Sci Data 2022; 9:735. [PMID: 36450875 PMCID: PMC9712528 DOI: 10.1038/s41597-022-01842-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 11/11/2022] [Indexed: 12/02/2022] Open
Abstract
Genomic studies often attempt to link natural genetic variation with important phenotypic variation. To succeed, robust and reliable phenotypic data, as well as curated genomic assemblies, are required. Wild sunflowers, originally from North America, are adapted to diverse and often extreme environments and have historically been a widely used model plant system for the study of population genomics, adaptation, and speciation. Moreover, cultivated sunflower, domesticated from a wild relative (Helianthus annuus), is a global oil crop, ranking fourth in production of vegetable oils worldwide. Public availability of data resources, both for the plant research community and for the associated agricultural sector, is extremely valuable. We have created HeliantHOME (http://www.helianthome.org), a curated, public, and interactive database of phenotypes including developmental, structural and environmental ones, obtained from a large collection of both wild and cultivated sunflower individuals. Additionally, the database is enriched with external genomic data and results of genome-wide association studies. Finally, being a community open-source platform, HeliantHOME is expected to expand as new knowledge and resources become available.
Affiliation(s)
- Natalia Bercovich
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada; Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
- Nikita Genze
- Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, Straubing, Germany; Weihenstephan-Triesdorf University of Applied Sciences, Straubing, Germany
- Marco Todesco
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada; Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
- Gregory L. Owens
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada; Biodiversity Research Centre, University of British Columbia, Vancouver, Canada; Department of Biology, University of Victoria, Victoria, BC, Canada
- Jean-Sébastien Légaré
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada; Biodiversity Research Centre, University of British Columbia, Vancouver, Canada; Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada; Data Science Institute, University of British Columbia, Vancouver, British Columbia, Canada
- Kaichi Huang
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada; Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
- Loren H. Rieseberg
- Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada; Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
- Dominik G. Grimm
- Technical University of Munich, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, Straubing, Germany; Weihenstephan-Triesdorf University of Applied Sciences, Straubing, Germany; Technical University of Munich, Department of Informatics, Garching, Germany
|
25
|
Röckel F, Schreiber T, Schüler D, Braun U, Krukenberg I, Schwander F, Peil A, Brandt C, Willner E, Gransow D, Scholz U, Kecke S, Maul E, Lange M, Töpfer R. PhenoApp: A mobile tool for plant phenotyping to record field and greenhouse observations. F1000Res 2022; 11:12. [PMID: 36636476 PMCID: PMC9813448 DOI: 10.12688/f1000research.74239.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 12/20/2021] [Indexed: 01/21/2023] Open
Abstract
With the ongoing cost decrease of genotyping and sequencing technologies, accurate and fast phenotyping remains the bottleneck in the utilization of plant genetic resources for breeding and breeding research. Although cost-efficient high-throughput phenotyping platforms are emerging for specific traits and/or species, manual phenotyping is still widely used and is a time- and money-consuming step. Approaches that improve data recording, processing or handling are pivotal steps towards the efficient use of genetic resources and are demanded by the research community. Therefore, we developed PhenoApp, an open-source Android app for tablets and smartphones to facilitate the digital recording of phenotypical data in the field and in greenhouses. It is a versatile tool that offers the possibility to fully customize the descriptors/scales for any possible scenario, also in accordance with international information standards such as MIAPPE (Minimum Information About a Plant Phenotyping Experiment) and FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Furthermore, PhenoApp enables the use of pre-integrated ready-to-use BBCH (Biologische Bundesanstalt für Land- und Forstwirtschaft, Bundessortenamt und CHemische Industrie) scales for apple, cereals, grapevine, maize, potato, rapeseed and rice. Additional BBCH scales can easily be added. The simple and adaptable structure of input and output files enables easy data handling by spreadsheet software or even integration into the workflow of laboratory information management systems (LIMS). PhenoApp is therefore a decisive contribution to increasing the efficiency of digital data acquisition in genebank management, and it also contributes to breeding and breeding research by accelerating the labour-intensive and time-consuming acquisition of phenotyping data.
Affiliation(s)
- Franco Röckel
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
- Toni Schreiber
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Erwin-Baur-Straße 27, Quedlinburg, 06484, Germany
- Danuta Schüler
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
- Ulrike Braun
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
- Ina Krukenberg
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Königin-Luise-Strasse 19, Berlin, 14195, Germany
- Florian Schwander
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
- Andreas Peil
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Breeding Research on Fruit Crops, Pillnitzer Platz 3a, Dresden/Pillnitz, 01326, Germany
- Christine Brandt
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Parkweg 3a, Sanitz, 18190, Germany
- Evelin Willner
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Inselstraße 9, Malchow/Poel, 23999, Germany
- Daniel Gransow
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), The Satellite Collections North, Inselstraße 9, Malchow/Poel, 23999, Germany
- Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
- Steffen Kecke
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Data Processing Department, Erwin-Baur-Straße 27, Quedlinburg, 06484, Germany
- Erika Maul
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
- Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstraße 3, Seeland, 06466, Germany
- Reinhard Töpfer
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, 76833, Germany
|
26
|
Röckel F, Schreiber T, Schüler D, Braun U, Krukenberg I, Schwander F, Peil A, Brandt C, Willner E, Gransow D, Scholz U, Kecke S, Maul E, Lange M, Töpfer R. PhenoApp: A mobile tool for plant phenotyping to record field and greenhouse observations. F1000Res 2022; 11:12. [PMID: 36636476 PMCID: PMC9813448 DOI: 10.12688/f1000research.74239.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 11/25/2022] [Indexed: 11/29/2022] Open
|
27
|
pISA-tree - a data management framework for life science research projects using a standardised directory tree. Sci Data 2022; 9:685. [DOI: 10.1038/s41597-022-01805-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 10/24/2022] [Indexed: 11/12/2022] Open
Abstract
We developed pISA-tree, a straightforward and flexible data management solution for organisation of life science project-associated research data and metadata. pISA-tree was initiated by end-user requirements; thus its strong points are practicality and low maintenance cost. It enables on-the-fly creation of an enriched directory tree structure (project/Investigation/Study/Assay) based on the ISA model, in a standardised manner via consecutive batch files. Template-based metadata is generated in parallel at each level, enabling guided submission of experiment metadata. pISA-tree is complemented by two R packages, pisar and seekr. pisar facilitates integration of pISA-tree datasets into bioinformatic pipelines and generation of ISA-Tab exports. seekr enables synchronisation with the FAIRDOMHub repository. Applicability of pISA-tree was demonstrated in several national and international multi-partner projects. The system thus supports findable, accessible, interoperable and reusable (FAIR) research and is in accordance with the Open Science initiative. Source code and documentation of pISA-tree are available at https://github.com/NIB-SI/pISA-tree.
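The nested project/Investigation/Study/Assay layout with a metadata file at each level can be illustrated with a small Python sketch. This is not pISA-tree itself, which uses batch files and its own naming scheme and templates: the `level_name` directory naming, the `metadata.txt` file name, and the tab-separated key/value format below are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# The four ISA-style levels named in the abstract, outermost first.
LEVELS = ("project", "investigation", "study", "assay")


def create_level(parent, level, name, metadata):
    """Create one level directory under `parent` and write its metadata file.

    Directory and file naming here are hypothetical, not pISA-tree's conventions.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    directory = Path(parent) / f"{level}_{name}"
    directory.mkdir(parents=True, exist_ok=True)
    record = {"level": level, "name": name, **metadata}
    lines = [f"{key}\t{value}" for key, value in record.items()]
    (directory / "metadata.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return directory


def demo():
    """Build a four-level example tree in a temporary directory and list its metadata files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        project = create_level(root, "project", "wheat", {"owner": "example lab"})
        investigation = create_level(project, "investigation", "yield", {})
        study = create_level(investigation, "study", "field2022", {})
        create_level(study, "assay", "genotyping", {"platform": "SNP array"})
        return sorted(path.relative_to(root).as_posix() for path in root.rglob("metadata.txt"))
```

Calling `demo()` returns the relative paths of the four metadata files, one per level, which shows how each directory carries its own description alongside the data it contains.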
|
28
|
Xu Y, Zhang X, Li H, Zheng H, Zhang J, Olsen MS, Varshney RK, Prasanna BM, Qian Q. Smart breeding driven by big data, artificial intelligence, and integrated genomic-enviromic prediction. Molecular Plant 2022; 15:1664-1695. [PMID: 36081348 DOI: 10.1016/j.molp.2022.09.001] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 04/04/2022] [Revised: 08/20/2022] [Accepted: 09/02/2022] [Indexed: 05/12/2023]
Abstract
The first paradigm of plant breeding involves direct selection-based phenotypic observation, followed by predictive breeding using statistical models for quantitative traits constructed based on genetic experimental design and, more recently, by incorporation of molecular marker genotypes. However, plant performance or phenotype (P) is determined by the combined effects of genotype (G), envirotype (E), and genotype by environment interaction (GEI). Phenotypes can be predicted more precisely by training a model using data collected from multiple sources, including spatiotemporal omics (genomics, phenomics, and enviromics across time and space). Integration of 3D information profiles (G-P-E), each with multidimensionality, provides predictive breeding with both tremendous opportunities and great challenges. Here, we first review innovative technologies for predictive breeding. We then evaluate multidimensional information profiles that can be integrated with a predictive breeding strategy, particularly envirotypic data, which have largely been neglected in data collection and are nearly untouched in model construction. We propose a smart breeding scheme, integrated genomic-enviromic prediction (iGEP), as an extension of genomic prediction, using integrated multiomics information, big data technology, and artificial intelligence (mainly focused on machine and deep learning). We discuss how to implement iGEP, including spatiotemporal models, environmental indices, factorial and spatiotemporal structure of plant breeding data, and cross-species prediction. A strategy is then proposed for prediction-based crop redesign at both the macro (individual, population, and species) and micro (gene, metabolism, and network) scales. Finally, we provide perspectives on translating smart breeding into genetic gain through integrative breeding platforms and open-source breeding initiatives. 
We call for coordinated efforts in smart breeding through iGEP, institutional partnerships, and innovative technological support.
Affiliation(s)
- Yunbi Xu
- Institute of Crop Sciences, CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China; CIMMYT-China Tropical Maize Research Center, School of Food Science and Engineering, Foshan University, Foshan, Guangdong 528231, China; Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261325, China
- Xingping Zhang
- Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261325, China
- Huihui Li
- Institute of Crop Sciences, CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China; National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, Hainan 572024, China
- Hongjian Zheng
- CIMMYT-China Specialty Maize Research Center, Shanghai Academy of Agricultural Sciences, Shanghai 201400, China
- Jianan Zhang
- MolBreeding Biotechnology Co., Ltd., Shijiazhuang, Hebei 050035, China
- Michael S Olsen
- CIMMYT (International Maize and Wheat Improvement Center), ICRAF Campus, United Nations Avenue, Nairobi, Kenya
- Rajeev K Varshney
- State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Australia
- Boddupalli M Prasanna
- CIMMYT (International Maize and Wheat Improvement Center), ICRAF Campus, United Nations Avenue, Nairobi, Kenya
- Qian Qian
- Institute of Crop Sciences, CIMMYT-China, Chinese Academy of Agricultural Sciences, Beijing 100081, China
|
29
|
Kotni P, van Hintum T, Maggioni L, Oppermann M, Weise S. EURISCO update 2023: the European Search Catalogue for Plant Genetic Resources, a pillar for documentation of genebank material. Nucleic Acids Res 2022; 51:D1465-D1469. [PMID: 36189883 PMCID: PMC9825528 DOI: 10.1093/nar/gkac852] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Revised: 09/13/2022] [Accepted: 09/23/2022] [Indexed: 01/30/2023] Open
Abstract
The European Search Catalogue for Plant Genetic Resources (EURISCO) is a central entry point for information on crop plant germplasm accessions from institutions in Europe and beyond. In total, it provides data on more than two million accessions, making an important contribution to unlocking the vast genetic diversity that lies deposited in >400 germplasm collections in 43 countries. EURISCO serves as the reference system for the Plant Genetic Resources Strategy for Europe and represents a significant approach for documenting and making available the world's agrobiological diversity. EURISCO is well established as a resource in this field and forms the basis for a wide range of research projects. In this paper, we present current developments of EURISCO, which is accessible at http://eurisco.ecpgr.org.
Affiliation(s)
- Pragna Kotni
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstr. 3, 06466 Seeland, Germany
- Theo van Hintum
- Centre for Genetic Resources, The Netherlands (CGN), Wageningen University & Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
- Lorenzo Maggioni
- European Cooperative Programme for Plant Genetic Resources (ECPGR), c/o Alliance of Bioversity International and CIAT, Via di San Domenico 1, 00153 Rome, Italy
- Markus Oppermann
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstr. 3, 06466 Seeland, Germany
- Stephan Weise
- To whom correspondence should be addressed. Tel: +49 39482 5 744; Fax: +49 39482 5 155;
|
30
|
“KRiShI”: a manually curated knowledgebase on rice sheath blight disease. Funct Integr Genomics 2022; 22:1403-1410. [DOI: 10.1007/s10142-022-00899-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Revised: 06/28/2022] [Accepted: 09/04/2022] [Indexed: 11/04/2022]
|
31
|
Gogna A, Schulthess AW, Röder MS, Ganal MW, Reif JC. Gabi wheat a panel of European elite lines as central stock for wheat genetic research. Sci Data 2022; 9:538. [PMID: 36056030 PMCID: PMC9440043 DOI: 10.1038/s41597-022-01651-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/07/2022] [Accepted: 08/18/2022] [Indexed: 12/20/2022] Open
Abstract
In plant sciences, curation and availability of interoperable phenotypic and genomic data is still in its infancy and represents an obstacle to rapid scientific discoveries in this field. To that end, supplementing the efforts being made to generate open access wheat genome, pan wheat genome and other bioinformatic resources, we present the GABI-WHEAT panel of elite European cultivars comprising 358 winter and 14 summer wheat varieties released between 1975 and 2007. The panel has been genotyped with SNP arrays of increasing density to investigate several important agronomic, quality and disease resistance traits. The robustness of investigated traits and interoperability of genomic and phenotypic data was assessed in the current publication with the aim to transform this panel into a public data resource for future genetic research in wheat. Subsequently, the phenotypic data was formatted to comply with FAIR principles and linked to online databases to substantiate panel origin information and quality. Thus, we were able to make a valuable resource available for plant science in a sustainable way.
Measurement(s): agronomic, quality and disease traits
Technology Type(s): manual measurement in the field
Sample Characteristic - Organism: Triticum aestivum L.
Affiliation(s)
- Abhishek Gogna
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Stadt Seeland, Germany
- Albert W Schulthess
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Stadt Seeland, Germany
- Marion S Röder
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Stadt Seeland, Germany
- Martin W Ganal
- SGS Institut Fresenius GmbH, TraitGenetics Section, Am Schwabeplan 1b, 06466, Stadt Seeland OT Gatersleben, Germany
- Jochen C Reif
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Stadt Seeland, Germany
|
32
|
Senger E, Osorio S, Olbricht K, Shaw P, Denoyes B, Davik J, Predieri S, Karhu S, Raubach S, Lippi N, Höfer M, Cockerton H, Pradal C, Kafkas E, Litthauer S, Amaya I, Usadel B, Mezzetti B. Towards smart and sustainable development of modern berry cultivars in Europe. The Plant Journal: for Cell and Molecular Biology 2022; 111:1238-1251. [PMID: 35751152 DOI: 10.1111/tpj.15876] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/04/2022] [Revised: 06/15/2022] [Accepted: 06/22/2022] [Indexed: 06/15/2023]
Abstract
Fresh berries are a popular and important component of the human diet. The demand for high-quality berries and sustainable production methods is increasing globally, challenging breeders to develop modern berry cultivars that fulfill all desired characteristics. Since 1994, research projects have characterized genetic resources, developed modern tools for high-throughput screening, and published data in publicly available repositories. However, the key findings of different disciplines are rarely linked together, and only a limited range of traits and genotypes has been investigated. The Horizon2020 project BreedingValue will address these challenges by studying a broader panel of strawberry, raspberry and blueberry genotypes in detail, in order to recover the lost genetic diversity that has limited the aroma and flavor intensity of recent cultivars. We will combine metabolic analysis with sensory panel tests and surveys to identify the key components of taste, flavor and aroma in berries across Europe, leading to a high-resolution map of quality requirements for future berry cultivars. Traits linked to berry yields and the effect of environmental stress will be investigated using modern image analysis methods and modeling. We will also use genetic analysis to determine the genetic basis of complex traits for the development and optimization of modern breeding technologies, such as molecular marker arrays, genomic selection and genome-wide association studies. Finally, the results, raw data and metadata will be made publicly available on the open platform Germinate in order to meet FAIR data principles and provide the basis for sustainable research in the future.
Affiliation(s)
- Elisa Senger
- Institute of Bio- and Geosciences, IBG-4 Bioinformatics, BioSC, CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Sonia Osorio
- Departamento de Biología Molecular y Bioquímica, Instituto de Hortofruticultura Subtropical y Mediterránea 'La Mayora', Universidad de Málaga-Consejo Superior de Investigaciones Científicas, Campus de Teatinos, Málaga, Spain
- Paul Shaw
- Department of Information and Computational Sciences, The James Hutton Institute, Invergowrie, Scotland, UK
- Béatrice Denoyes
- Université de Bordeaux, UMR BFP, INRAE, Villenave d'Ornon, France
- Jahn Davik
- Department of Molecular Plant Biology, Norwegian Institute of Bioeconomy Research (NIBIO), Ås, Norway
- Stefano Predieri
- Bio-Agrofood Department, Institute for Bioeconomy, IBE-CNR, Italian National Research Council, Bologna, Italy
- Saila Karhu
- Natural Resources Institute Finland (Luke), Turku, Finland
- Sebastian Raubach
- Department of Information and Computational Sciences, The James Hutton Institute, Invergowrie, Scotland, UK
- Nico Lippi
- Bio-Agrofood Department, Institute for Bioeconomy, IBE-CNR, Italian National Research Council, Bologna, Italy
- Monika Höfer
- Institute of Breeding Research on Fruit Crops, Federal Research Centre for Cultivated Plants (JKI), Dresden, Germany
- Helen Cockerton
- Genetics, Genomics and Breeding Department, NIAB, East Malling, UK
- Christophe Pradal
- CIRAD and UMR AGAP Institute, Montpellier, France
- INRIA and LIRMM, University Montpellier, CNRS, Montpellier, France
- Ebru Kafkas
- Department of Horticulture, Faculty of Agriculture, Çukurova University, Balcalı, Adana, Turkey
- Iraida Amaya
- Unidad Asociada de I+D+i IFAPA-CSIC Biotecnología y Mejora en Fresa, Málaga, Spain
- Laboratorio de Genómica y Biotecnología, Centro IFAPA de Málaga, Instituto Andaluz de Investigación y Formación Agraria y Pesquera, Málaga, Spain
- Björn Usadel
- Institute of Bio- and Geosciences, IBG-4 Bioinformatics, BioSC, CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Institute for Biological Data Science, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany
- Bruno Mezzetti
- Department of Agricultural, Food and Environmental Sciences, Università Politecnica delle Marche, Ancona, Italy

33
Possamai T, Wiedemann-Merdinoglu S. Phenotyping for QTL identification: A case study of resistance to Plasmopara viticola and Erysiphe necator in grapevine. FRONTIERS IN PLANT SCIENCE 2022; 13:930954. [PMID: 36035702 PMCID: PMC9403010 DOI: 10.3389/fpls.2022.930954] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/28/2022] [Accepted: 06/27/2022] [Indexed: 06/01/2023]
Abstract
Vitis vinifera is the most widely cultivated grapevine species. It is highly susceptible to Plasmopara viticola and Erysiphe necator, the causal agents of downy mildew (DM) and powdery mildew (PM), respectively. Current strategies to control DM and PM mainly rely on agrochemical applications that are potentially harmful to humans and the environment. Breeding for resistance to DM and PM in wine grape cultivars by introgressing resistance loci from wild Vitis spp. is a complementary and more sustainable solution to manage these two diseases. During the last two decades, 33 loci of resistance to P. viticola (Rpv) and 15 loci of resistance to E. necator (Ren and Run) have been identified. Phenotyping is salient for QTL characterization and understanding the genetic basis of resistance traits. However, phenotyping remains a major bottleneck for research on Rpv and Ren/Run loci and disease resistance evaluation. A thorough analysis of the literature on phenotyping methods used for DM and PM resistance evaluation highlighted phenotyping performed in the vineyard, greenhouse or laboratory with major sources of variation, such as environmental conditions, plant material (organ physiology and age), pathogen inoculum (genetics and origin), pathogen inoculation (natural or controlled), and disease assessment method (date, frequency, and method of scoring). All these factors affect resistance assessment and the quality of phenotyping data. We argue that the use of new technologies for disease symptom assessment, and the production and adoption of standardized experimental guidelines should enhance the accuracy and reliability of phenotyping data. This should improve the replicability of resistance evaluation outputs, facilitate QTL identification, and help streamline disease resistance breeding programs.
Affiliation(s)
- Tyrone Possamai
- CREA—Research Centre for Viticulture and Enology, Conegliano, Italy

34
Arend D, Psaroudakis D, Memon JA, Rey-Mazón E, Schüler D, Szymanski JJ, Scholz U, Junker A, Lange M. From data to knowledge - big data needs stewardship, a plant phenomics perspective. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2022; 111:335-347. [PMID: 35535481 DOI: 10.1111/tpj.15804] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/24/2022] [Revised: 05/02/2022] [Accepted: 05/06/2022] [Indexed: 06/14/2023]
Abstract
The research data life cycle from project planning to data publishing is an integral part of current research. Until the last decade, researchers were responsible for all associated phases in addition to the actual research and were assisted only at certain points by IT or bioinformaticians. Starting with advances in sequencing, the automation of analytical methods in all life science fields, including in plant phenotyping, has led to ever-increasing amounts of ever more complex data. The tasks associated with these challenges now often exceed the expertise of and infrastructure available to scientists, leading to an increased risk of data loss over time. The IPK Gatersleben has one of the world's largest germplasm collections and two decades of experience in crop plant research data management. In this article we show how challenges in modern, data-driven research can be addressed by data stewards. Based on concrete use cases, data management processes and best practices from plant phenotyping, we describe which expertise and skills are required and how data stewards as an integral actor can enhance the quality of a necessary digital transformation in progressive research.
Affiliation(s)
- Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Dennis Psaroudakis
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Junaid Altaf Memon
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Elena Rey-Mazón
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Danuta Schüler
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Jedrzej Jakub Szymanski
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Astrid Junker
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany
- Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT Gatersleben, Germany

35
Beier S, Fiebig A, Pommier C, Liyanage I, Lange M, Kersey PJ, Weise S, Finkers R, Koylass B, Cezard T, Courtot M, Contreras-Moreira B, Naamati G, Dyer S, Scholz U. Recommendations for the formatting of Variant Call Format (VCF) files to make plant genotyping data FAIR. F1000Res 2022; 11. [PMID: 35811804 PMCID: PMC9218589 DOI: 10.12688/f1000research.109080.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 05/17/2022] [Indexed: 11/20/2022] Open
Abstract
In this opinion article, we discuss the formatting of files from (plant) genotyping studies, in particular the formatting of metadata in Variant Call Format (VCF) files. The flexibility of the VCF format specification facilitates its use as a generic interchange format across domains but can lead to inconsistency between files in the presentation of metadata. To enable fully autonomous machine actionable data flow, generic elements need to be further specified. We strongly support the merits of the FAIR principles and see the need to facilitate them also through technical implementation specifications. They form a basis for the proposed VCF extensions here. We have learned from the existing application of VCF that the definition of relevant metadata using controlled standards, vocabulary and the consistent use of cross-references via resolvable identifiers (machine-readable) are particularly necessary and propose their encoding. VCF is an established standard for the exchange and publication of genotyping data. Other data formats are also used to capture variant data (for example, the HapMap and the gVCF formats), but none currently have the reach of VCF. For the sake of simplicity, we will only discuss VCF and our recommendations for its use, but these recommendations could also be applied to gVCF. However, the part of the VCF standard relating to metadata (as opposed to the actual variant calls) defines a syntactic format but no vocabulary, unique identifier or recommended content. In practice, often only sparse descriptive metadata is included. When descriptive metadata is provided, proprietary metadata fields are frequently added that have not been agreed upon within the community which may limit long-term and comprehensive interoperability. To address this, we propose recommendations for supplying and encoding metadata, focusing on use cases from plant sciences. We expect there to be overlap, but also divergence, with the needs of other domains.
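To illustrate the kind of header metadata the abstract refers to, the following is a minimal Python sketch that reads the `##key=value` meta-information lines of a VCF header and reports which expected keys are absent. The `REQUIRED_KEYS` set and both function names are assumptions made for this illustration; they are not the vocabulary proposed by the authors, and the VCF specification itself only mandates `##fileformat`.

```python
# Hypothetical set of header keys a project might choose to require.
# This is an assumption for illustration, not the authors' recommendation.
REQUIRED_KEYS = {"fileformat", "reference"}


def parse_vcf_meta(lines):
    """Collect ##key=value meta-information lines from a VCF header."""
    meta = {}
    for line in lines:
        if not line.startswith("##"):
            break  # meta-information ends at the #CHROM column header
        key, _, value = line[2:].rstrip("\n").partition("=")
        meta.setdefault(key, []).append(value)
    return meta


def missing_keys(meta, required=REQUIRED_KEYS):
    """Return the required keys that do not occur in the parsed header."""
    return sorted(set(required) - set(meta))


header = [
    "##fileformat=VCFv4.3",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
meta = parse_vcf_meta(header)
print(missing_keys(meta))
```

A check of this kind only covers the syntactic side; whether the values use an agreed vocabulary or resolvable identifiers, which is the gap the abstract describes, would need a separate lookup.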
Affiliation(s)
- Sebastian Beier
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
- Institute of Bio- and Geosciences, Bioinformatics (IBG-4), Forschungszentrum Jülich GmbH, Jülich, 52425, Germany
- Anne Fiebig
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
- Cyril Pommier
- BioinfOmics, Plant bioinformatics facility, Université Paris-Saclay, INRAE, Versailles, France
- Isuru Liyanage
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Matthias Lange
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
- Stephan Weise
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
- Richard Finkers
- Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
- Gennovation B.V., Wageningen, The Netherlands
- Baron Koylass
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Timothee Cezard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Ontario Institute for Cancer Research, Toronto, Canada
- Bruno Contreras-Moreira
- Laboratorio de Biología Computacional y Estructural, Estación Experimental Aula Dei-CSIC, Zaragoza, 50059, Spain
- Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Uwe Scholz
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany

36
Danilevicz MF, Gill M, Anderson R, Batley J, Bennamoun M, Bayer PE, Edwards D. Plant Genotype to Phenotype Prediction Using Machine Learning. Front Genet 2022; 13:822173. [PMID: 35664329 PMCID: PMC9159391 DOI: 10.3389/fgene.2022.822173] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 11/25/2021] [Accepted: 03/07/2022] [Indexed: 12/13/2022] Open
Abstract
Genomic prediction tools support crop breeding based on statistical methods, such as the genomic best linear unbiased prediction (GBLUP). However, these tools are not designed to capture non-linear relationships within multi-dimensional datasets, or deal with high dimension datasets such as imagery collected by unmanned aerial vehicles. Machine learning (ML) algorithms have the potential to surpass the prediction accuracy of current tools used for genotype to phenotype prediction, due to their capacity to autonomously extract data features and represent their relationships at multiple levels of abstraction. This review addresses the challenges of applying statistical and machine learning methods for predicting phenotypic traits based on genetic markers, environment data, and imagery for crop breeding. We present the advantages and disadvantages of explainable model structures, discuss the potential of machine learning models for genotype to phenotype prediction in crop breeding, and the challenges, including the scarcity of high-quality datasets, inconsistent metadata annotation and the requirements of ML models.
Affiliation(s)
- Monica F. Danilevicz
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Mitchell Gill
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Robyn Anderson
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Mohammed Bennamoun
- School of Physics, Mathematics and Computing, University of Western Australia, Perth, WA, Australia
- Philipp E. Bayer
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- David Edwards
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- *Correspondence: David Edwards,

37
Petereit J, Marsh JI, Bayer PE, Danilevicz MF, Thomas WJW, Batley J, Edwards D. Genetic and Genomic Resources for Soybean Breeding Research. PLANTS (BASEL, SWITZERLAND) 2022; 11:1181. [PMID: 35567182 PMCID: PMC9101001 DOI: 10.3390/plants11091181] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2022] [Revised: 04/21/2022] [Accepted: 04/22/2022] [Indexed: 11/17/2022]
Abstract
Soybean (Glycine max) is a legume species of significant economic and nutritional value. The yield of soybean continues to increase with the breeding of improved varieties, and this is likely to continue with the application of advanced genetic and genomic approaches for breeding. Genome technologies continue to advance rapidly, with an increasing number of high-quality genome assemblies becoming available. With accumulating data from marker arrays and whole-genome resequencing, studying variations between individuals and populations is becoming increasingly accessible. Furthermore, the recent development of soybean pangenomes has highlighted the significant structural variation between individuals, together with knowledge of what has been selected for or lost during domestication and breeding, information that can be applied for the breeding of improved cultivars. Because of this, resources such as genome assemblies, SNP datasets, pangenomes and associated databases are becoming increasingly important for research underlying soybean crop improvement.
Affiliation(s)
- Jacob I. Marsh
- School of Biological Sciences, The University of Western Australia, Perth, WA 6009, Australia; (J.P.); (J.I.M.); (P.E.B.); (M.F.D.); (W.J.W.T.); (J.B.)
- David Edwards
- School of Biological Sciences, The University of Western Australia, Perth, WA 6009, Australia; (J.P.); (J.I.M.); (P.E.B.); (M.F.D.); (W.J.W.T.); (J.B.)

38
Liyanage I, Burdett T, Droesbeke B, Erdos K, Fernandez R, Gray A, Haseeb M, Jupp S, Penim F, Pommier C, Rocca-Serra P, Courtot M, Coppens F. ELIXIR biovalidator for semantic validation of life science metadata. Bioinformatics 2022; 38:3141-3142. [PMID: 35380605 PMCID: PMC9154242 DOI: 10.1093/bioinformatics/btac195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Revised: 02/25/2022] [Accepted: 04/01/2022] [Indexed: 01/14/2023] Open
Abstract
SUMMARY: To advance biomedical research, increasingly large amounts of complex data need to be discovered and integrated. This requires syntactic and semantic validation to ensure shared understanding of relevant entities. This article describes the ELIXIR biovalidator, which extends the syntactic validation of the widely used AJV library with ontology-based validation of JSON documents. AVAILABILITY AND IMPLEMENTATION: Source code: https://github.com/elixir-europe/biovalidator, Release: v1.9.1, License: Apache License 2.0, Deployed at: https://www.ebi.ac.uk/biosamples/schema/validator/validate. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
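The distinction between syntactic and semantic validation drawn in this abstract can be sketched in a few lines of Python. The sketch below does not use the biovalidator or AJV APIs; the toy ontology, its `EX:` term identifiers, and the function names are all hypothetical and serve only to show a type check followed by an ontology-membership check.

```python
# Toy ontology mapping each term to its parent (hypothetical IDs,
# invented for this illustration; not taken from any real ontology).
TOY_ONTOLOGY = {
    "EX:0003": "EX:0002",  # "wheat" is_a "cereal"
    "EX:0002": "EX:0001",  # "cereal" is_a "plant"
    "EX:0001": None,       # "plant" is the root
}


def is_descendant(term, ancestor, ontology=TOY_ONTOLOGY):
    """True if term equals ancestor or lies below it in the toy hierarchy."""
    while term is not None:
        if term == ancestor:
            return True
        term = ontology.get(term)
    return False


def validate(doc, root="EX:0001"):
    """Return a list of error messages for one JSON-like document."""
    errors = []
    organism = doc.get("organism")
    if not isinstance(organism, str):
        # Syntactic check: the field must be present and be a string.
        errors.append("organism: expected a string")
    elif not is_descendant(organism, root):
        # Semantic check: the value must be a term under the given root.
        errors.append(f"organism: not a descendant of {root}")
    return errors


print(validate({"organism": "EX:0003"}))
print(validate({"organism": "EX:9999"}))
```

The semantic step runs only once the syntactic step has passed, so a malformed value produces a single, specific error rather than two overlapping ones.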
Affiliation(s)
- Isuru Liyanage
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Bert Droesbeke
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- VIB Center for Plant Systems Biology, 9052 Ghent, Belgium
- Karoly Erdos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Rolando Fernandez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Alasdair Gray
- Department of Computer Science, Heriot-Watt University, Edinburgh EH14 4AS, UK
- Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Simon Jupp
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Flavia Penim
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Cyril Pommier
- INRAE, BioinfOmics, Plant Bioinformatics Facility, Université Paris-Saclay, 78026 Versailles, France
- INRAE, URGI, Université Paris-Saclay, 78026 Versailles, France
- Philippe Rocca-Serra
- Department of Engineering Science, University of Oxford e-Research Centre, University of Oxford, Oxford OX1 3QG, UK
- Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada
- To whom correspondence should be addressed.
- Frederik Coppens
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- VIB Center for Plant Systems Biology, 9052 Ghent, Belgium

39
Eid R, Landès C, Pernet A, Benoît E, Santagostini P, Ghaziri AE, Bourbeillon J. DIVIS: a semantic DIstance to improve the VISualisation of heterogeneous phenotypic datasets. BioData Min 2022; 15:10. [PMID: 35379292 PMCID: PMC8981856 DOI: 10.1186/s13040-022-00293-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2021] [Accepted: 02/27/2022] [Indexed: 11/24/2022] Open
Abstract
Background: Thanks to the wider spread of high-throughput experimental techniques, biologists are accumulating large numbers of datasets which often mix quantitative and qualitative variables and are not always complete, in particular when they regard phenotypic traits. In order to get a first insight into these datasets and reduce the size of the data matrices, scientists often rely on multivariate analysis techniques. However, such approaches are not always easily practicable, in particular when faced with mixed datasets. Moreover, displaying large numbers of individuals leads to cluttered visualisations which are difficult to interpret. Results: We introduced a new methodology to overcome these limits. Its main feature is a new semantic distance tailored for both quantitative and qualitative variables which allows for a realistic representation of the relationships between individuals (phenotypic descriptions in our case). This semantic distance is based on ontologies which are engineered to represent real-life knowledge regarding the underlying variables. For easier handling by biologists, we incorporated its use into a complete tool, from raw data file to visualisation. Following the distance calculation, the next steps performed by the tool consist of (i) grouping similar individuals, (ii) representing each group by emblematic individuals we call archetypes and (iii) building sparse visualisations based on these archetypes. Our approach was implemented as a Python pipeline and applied to a rosebush dataset including passport and phenotypic data. Conclusions: The introduction of our new semantic distance and of the archetype concept allowed us to build a comprehensive representation of an incomplete dataset characterised by a large proportion of qualitative data. The methodology described here could have wider use beyond information characterizing organisms or species and beyond plant science. Indeed, we could apply the same approach to any mixed dataset. Supplementary Information: The online version contains supplementary material available at 10.1186/s13040-022-00293-y.
Affiliation(s)
- Rayan Eid
- Institut Agro, Univ Angers, INRAE, IRHS, SFR QuaSaV, Angers, 49000, France
- Claudine Landès
- Institut Agro, Univ Angers, INRAE, IRHS, SFR QuaSaV, Angers, 49000, France
- Alix Pernet
- Institut Agro, Univ Angers, INRAE, IRHS, SFR QuaSaV, Angers, 49000, France
- Julie Bourbeillon
- Institut Agro, Univ Angers, INRAE, IRHS, SFR QuaSaV, Angers, 49000, France

40
Tanner F, Tonn S, de Wit J, Van den Ackerveken G, Berger B, Plett D. Sensor-based phenotyping of above-ground plant-pathogen interactions. PLANT METHODS 2022; 18:35. [PMID: 35313920 PMCID: PMC8935837 DOI: 10.1186/s13007-022-00853-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/29/2021] [Accepted: 02/08/2022] [Indexed: 05/20/2023]
Abstract
Plant pathogens cause yield losses in crops worldwide. Breeding for improved disease resistance and management by precision agriculture are two approaches to limit such yield losses. Both rely on detecting and quantifying signs and symptoms of plant disease. To achieve this, the field of plant phenotyping makes use of non-invasive sensor technology. Compared to invasive methods, this can offer improved throughput and allow for repeated measurements on living plants. Abiotic stress responses and yield components have been successfully measured with phenotyping technologies, whereas phenotyping methods for biotic stresses are less developed, despite the relevance of plant disease in crop production. The interactions between plants and pathogens can lead to a variety of signs (when the pathogen itself can be detected) and diverse symptoms (detectable responses of the plant). Here, we review the strengths and weaknesses of a broad range of sensor technologies that are being used for sensing of signs and symptoms on plant shoots, including monochrome, RGB, hyperspectral, fluorescence, chlorophyll fluorescence and thermal sensors, as well as Raman spectroscopy, X-ray computed tomography, and optical coherence tomography. We argue that choosing and combining appropriate sensors for each plant-pathosystem and measuring with sufficient spatial resolution can enable specific and accurate measurements of above-ground signs and symptoms of plant disease.
Affiliation(s)
- Florian Tanner
- Australian Plant Phenomics Facility, School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, SA Australia
- Sebastian Tonn
- Department of Biology, Plant-Microbe Interactions, Utrecht University, 3584CH Utrecht, The Netherlands
- Jos de Wit
- Department of Imaging Physics, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands
- Guido Van den Ackerveken
- Department of Biology, Plant-Microbe Interactions, Utrecht University, 3584CH Utrecht, The Netherlands
- Bettina Berger
- Australian Plant Phenomics Facility, School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, SA Australia
- Darren Plett
- Australian Plant Phenomics Facility, School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, SA Australia

41
Sun D, Robbins K, Morales N, Shu Q, Cen H. Advances in optical phenotyping of cereal crops. TRENDS IN PLANT SCIENCE 2022; 27:191-208. [PMID: 34417079 DOI: 10.1016/j.tplants.2021.07.015] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 03/12/2021] [Revised: 07/22/2021] [Accepted: 07/24/2021] [Indexed: 06/13/2023]
Abstract
Optical sensors and sensing-based phenotyping techniques have become mainstream approaches in high-throughput phenotyping for improving trait selection and genetic gains in crops. We review recent progress and contemporary applications of optical sensing-based phenotyping (OSP) techniques in cereal crops and highlight optical sensing principles for spectral response and sensor specifications. Further, we group phenotypic traits determined by OSP into four categories - morphological, biochemical, physiological, and performance traits - and illustrate appropriate sensors for each extraction. In addition to the current status, we discuss the challenges of OSP and provide possible solutions. We propose that optical sensing-based traits need to be explored further, and that standardization of the language of phenotyping and worldwide collaboration between phenotyping researchers and other fields need to be established.
Affiliation(s)
- Dawei Sun
- College of Biosystems Engineering and Food Science, and State Key Laboratory of Modern Optical Instrumentation, Zhejiang University, Hangzhou 310058, PR China; Key Laboratory of Spectroscopy Sensing, Ministry of Agriculture and Rural Affairs, Hangzhou 310058, PR China
- Kelly Robbins
- Section of Plant Breeding and Genetics, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Nicolas Morales
- Section of Plant Breeding and Genetics, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Qingyao Shu
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Zhejiang University, Hangzhou, PR China; State Key Laboratory of Rice Biology, Zhejiang University, Hangzhou 310058, PR China
- Haiyan Cen
- College of Biosystems Engineering and Food Science, and State Key Laboratory of Modern Optical Instrumentation, Zhejiang University, Hangzhou 310058, PR China; Key Laboratory of Spectroscopy Sensing, Ministry of Agriculture and Rural Affairs, Hangzhou 310058, PR China

42
Raza A, Tabassum J, Zahid Z, Charagh S, Bashir S, Barmukh R, Khan RSA, Barbosa F, Zhang C, Chen H, Zhuang W, Varshney RK. Advances in "Omics" Approaches for Improving Toxic Metals/Metalloids Tolerance in Plants. FRONTIERS IN PLANT SCIENCE 2022; 12:794373. [PMID: 35058954 PMCID: PMC8764127 DOI: 10.3389/fpls.2021.794373] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 10/13/2021] [Accepted: 11/22/2021] [Indexed: 05/17/2023]
Abstract
Food safety has emerged as a high-urgency matter for sustainable agricultural production. Toxic metal contamination of soil and water significantly affects agricultural productivity, which is further aggravated by extreme anthropogenic activities and modern agricultural practices, leaving food safety and human health at risk. In addition to reducing crop production, increased metal/metalloid toxicity also disturbs plants' demand and supply equilibrium. Counterbalancing metal/metalloid toxicity demands a better understanding of the complex mechanisms at the physiological, biochemical, molecular, cellular, and plant levels that may result in increased crop productivity. Consequently, plants have established different internal defense mechanisms to cope with the adverse effects of toxic metals/metalloids. Nevertheless, these internal defense mechanisms are not adequate to overcome metal/metalloid toxicity. Plants produce several secondary messengers to trigger cell signaling, activating the numerous transcriptional responses correlated with plant defense. Therefore, the recent advances in omics approaches such as genomics, transcriptomics, proteomics, metabolomics, ionomics, miRNAomics, and phenomics have enabled the characterization of molecular regulators associated with toxic metal tolerance, which can be deployed for developing toxic metal tolerant plants. This review highlights various response strategies adopted by plants to tolerate metal/metalloid toxicity, including physiological, biochemical, and molecular responses. A seven-(omics)-based design is summarized with scientific clues to reveal the stress-responsive genes, proteins, metabolites, miRNAs, trace elements, stress-inducible phenotypes, and metabolic pathways that could potentially help plants to cope with metal/metalloid toxicity in the face of fluctuating environmental conditions. Finally, some bottlenecks and future directions have also been highlighted, which could enable sustainable agricultural production.
Affiliation(s)
- Ali Raza
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
| | - Javaria Tabassum
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Hangzhou, China
| | - Zainab Zahid
- School of Civil and Environmental Engineering (SCEE), Institute of Environmental Sciences and Engineering (IESE), National University of Sciences and Technology (NUST), Islamabad, Pakistan
| | - Sidra Charagh
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Hangzhou, China
| | - Shanza Bashir
- School of Civil and Environmental Engineering (SCEE), Institute of Environmental Sciences and Engineering (IESE), National University of Sciences and Technology (NUST), Islamabad, Pakistan
| | - Rutwik Barmukh
- Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
| | - Rao Sohail Ahmad Khan
- Centre of Agricultural Biochemistry and Biotechnology (CABB), University of Agriculture, Faisalabad, Pakistan
| | - Fernando Barbosa
- Department of Clinical Analysis, Toxicology and Food Sciences, University of Sao Paulo, Ribeirão Preto, Brazil
| | - Chong Zhang
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
| | - Hua Chen
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
| | - Weijian Zhuang
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
| | - Rajeev K. Varshney
- Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Center of Legume Crop Genetics and Systems Biology/College of Agriculture, Oil Crops Research Institute, Fujian Agriculture and Forestry University (FAFU), Fuzhou, China
- Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, WA, Australia
|
43
|
Langstroff A, Heuermann MC, Stahl A, Junker A. Opportunities and limits of controlled-environment plant phenotyping for climate response traits. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1-16. [PMID: 34302493 PMCID: PMC8741719 DOI: 10.1007/s00122-021-03892-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 09/19/2020] [Accepted: 06/17/2021] [Indexed: 05/19/2023]
Abstract
Rising temperatures and changing precipitation patterns will affect agricultural production substantially, exposing crops to extended and more intense periods of stress. Therefore, breeding of varieties adapted to the constantly changing conditions is pivotal to enable a quantitatively and qualitatively adequate crop production despite the negative effects of climate change. As it is not yet possible to select for adaptation to future climate scenarios in the field, simulations of future conditions in controlled-environment (CE) phenotyping facilities contribute to the understanding of the plant response to special stress conditions and help breeders to select ideal genotypes which cope with future conditions. CE phenotyping facilities enable the collection of traits that are not easy to measure under field conditions and the assessment of a plant's phenotype under repeatable, clearly defined environmental conditions using automated, non-invasive, high-throughput methods. However, extrapolation and translation of results obtained under controlled environments to field environments is ambiguous. This review outlines the opportunities and challenges of phenotyping approaches under controlled environments complementary to conventional field trials. It gives an overview of general principles and introduces existing phenotyping facilities that take up the challenge of obtaining reliable and robust phenotypic data on climate response traits to support breeding of climate-adapted crops.
Affiliation(s)
- Anna Langstroff
- Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University Giessen, Heinrich Buff-Ring 26, 35392, Giessen, Germany
| | - Marc C Heuermann
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstr. 3, OT Gatersleben, 06466, Seeland, Germany
| | - Andreas Stahl
- Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University Giessen, Heinrich Buff-Ring 26, 35392, Giessen, Germany
- Institute for Resistance Research and Stress Tolerance, Federal Research Centre for Cultivated Plants, Julius Kühn-Institut (JKI), Erwin-Baur-Strasse 27, 06484, Quedlinburg, Germany
| | - Astrid Junker
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Corrensstr. 3, OT Gatersleben, 06466, Seeland, Germany
|
44
|
Courtot M, Gupta D, Liyanage I, Xu F, Burdett T. BioSamples database: FAIRer samples metadata to accelerate research data management. Nucleic Acids Res 2021; 50:D1500-D1507. [PMID: 34747489 PMCID: PMC8728232 DOI: 10.1093/nar/gkab1046] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/15/2021] [Revised: 10/13/2021] [Accepted: 10/14/2021] [Indexed: 12/04/2022]
Abstract
The BioSamples database at EMBL-EBI is the central institutional repository for sample metadata storage and connection to EMBL-EBI archives and other resources. The technical improvements to our infrastructure described in our last update have enabled us to scale and accommodate an increasing number of communities, resulting in a higher number of submissions and more heterogeneous data. The BioSamples database now has a valuable set of features and processes to improve data quality, in particular by enriching metadata content and following FAIR principles. In this manuscript, we describe how BioSamples in 2021 handles requirements from our community of users through exemplar use cases: how increased findability of samples and improved data management practices support the goals of the ReSOLUTE project, how the plant community benefits from being able to link genotypic to phenotypic information, and how, cumulatively, those improvements contribute to more complex multi-omics data integration supporting COVID-19 research. Finally, we present underlying technical features used as pillars throughout those use cases and how they are reused for expanded engagement with communities such as FAIRplus and the Global Alliance for Genomics and Health. Availability: The BioSamples database is freely available at http://www.ebi.ac.uk/biosamples. Content is distributed under the EMBL-EBI Terms of Use available at https://www.ebi.ac.uk/about/terms-of-use. The BioSamples code is available at https://github.com/EBIBioSamples/biosamples-v4 and distributed under the Apache 2.0 license.
Affiliation(s)
- Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Isuru Liyanage
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Fuqi Xu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
|
45
|
Grapevine and Wine Metabolomics-Based Guidelines for FAIR Data and Metadata Management. Metabolites 2021; 11:757. [PMID: 34822415 PMCID: PMC8618349 DOI: 10.3390/metabo11110757] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/29/2021] [Revised: 10/29/2021] [Accepted: 10/30/2021] [Indexed: 01/12/2023]
Abstract
In the era of big and omics data, good organization, management, and description of experimental data are crucial for achieving high-quality datasets. This, in turn, is essential for producing robust results, publishing reliable papers, making data more easily available, and unlocking the huge potential of data reuse. More and more journals now require authors to share data and metadata according to the FAIR (Findable, Accessible, Interoperable, Reusable) principles. This work aims to provide a step-by-step guideline for FAIR data and metadata management specific to grapevine and wine science. In detail, the guidelines include recommendations for the organization of data and metadata regarding (i) meaningful information on experimental design and phenotyping, (ii) sample collection, (iii) sample preparation, (iv) chemotype analysis, (v) data analysis, (vi) metabolite annotation, and (vii) basic ontologies. We hope that these guidelines will be helpful for the grapevine and wine metabolomics community, and that the community will benefit from the true potential of data reuse in creating new knowledge.
|
46
|
Machwitz M, Pieruschka R, Berger K, Schlerf M, Aasen H, Fahrner S, Jiménez-Berni J, Baret F, Rascher U. Bridging the Gap Between Remote Sensing and Plant Phenotyping-Challenges and Opportunities for the Next Generation of Sustainable Agriculture. FRONTIERS IN PLANT SCIENCE 2021; 12:749374. [PMID: 34751225 PMCID: PMC8571019 DOI: 10.3389/fpls.2021.749374] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/29/2021] [Accepted: 09/27/2021] [Indexed: 05/27/2023]
Affiliation(s)
- Miriam Machwitz
- Department of Environmental Research and Innovation, Luxembourg Institute of Science and Technology, Belval, Luxembourg
| | - Roland Pieruschka
- Institute of Bio and Geosciences, Plant Sciences, Forschungszentrum Jülich, Helmholtz-Verband Deutscher Forschungszentren, Jülich, Germany
| | - Katja Berger
- Department of Geography, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Martin Schlerf
- Department of Environmental Research and Innovation, Luxembourg Institute of Science and Technology, Belval, Luxembourg
| | - Helge Aasen
- Department of Environmental Systems Science, Crop Science, Eidgenössische Technische Hochschule (ETH) Zurich, Zurich, Switzerland
| | - Sven Fahrner
- Institute of Bio and Geosciences, Plant Sciences, Forschungszentrum Jülich, Helmholtz-Verband Deutscher Forschungszentren, Jülich, Germany
| | - Jose Jiménez-Berni
- Instituto de Agricultura Sostenible, Consejo Superior de Investigaciones Científicas, Cordoba, Spain
| | | | - Uwe Rascher
- Forschungszentrum Jülich, Institute of Bio- and Geosciences Plant Sciences (IBG-2), Jülich, Germany
|
47
|
Danilevicz MF, Bayer PE, Nestor BJ, Bennamoun M, Edwards D. Resources for image-based high-throughput phenotyping in crops and data sharing challenges. PLANT PHYSIOLOGY 2021; 187:699-715. [PMID: 34608963 PMCID: PMC8561249 DOI: 10.1093/plphys/kiab301] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/04/2020] [Accepted: 05/26/2021] [Indexed: 05/06/2023]
Abstract
High-throughput phenotyping (HTP) platforms are capable of monitoring the phenotypic variation of plants through multiple types of sensors, such as red, green, and blue (RGB) cameras, hyperspectral sensors, and computed tomography, which can be associated with environmental and genotypic data. Because of the wide range of information provided, HTP datasets represent a valuable asset to characterize crop phenotypes. As HTP becomes widely employed with more tools and data being released, it is important that researchers are aware of these resources and how they can be applied to accelerate crop improvement. Researchers may exploit these datasets either for phenotype comparison or as a benchmark to assess tool performance and to support the development of tools that are better at generalizing between different crops and environments. In this review, we describe the use of image-based HTP for yield prediction, root phenotyping, development of climate-resilient crops, detecting pathogen and pest infestation, and quantitative trait measurement. We emphasize the need for researchers to share phenotypic data, and offer a comprehensive list of available datasets to assist crop breeders and tool developers to leverage these resources in order to accelerate crop breeding.
Affiliation(s)
- Monica F. Danilevicz
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, Western Australia 6009, Australia
| | - Philipp E. Bayer
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, Western Australia 6009, Australia
| | - Benjamin J. Nestor
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, Western Australia 6009, Australia
| | - Mohammed Bennamoun
- Department of Computer Science and Software Engineering, University of Western Australia, Perth, Western Australia 6009, Australia
| | - David Edwards
- School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, Western Australia 6009, Australia
|
48
|
Leipzig J, Nüst D, Hoyt CT, Ram K, Greenberg J. The role of metadata in reproducible computational research. PATTERNS (NEW YORK, N.Y.) 2021; 2:100322. [PMID: 34553169 PMCID: PMC8441584 DOI: 10.1016/j.patter.2021.100322] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 01/22/2023]
Abstract
Reproducible computational research (RCR) is the keystone of the scientific method for in silico analyses, packaging the transformation of raw data to published results. In addition to its role in research integrity, improving the reproducibility of scientific studies can accelerate evaluation and reuse. This potential and wide support for the FAIR principles have motivated interest in metadata standards supporting reproducibility. Metadata provide context and provenance to raw data and methods and are essential to both discovery and validation. Despite this shared connection with scientific data, few studies have explicitly described how metadata enable reproducible computational research. This review employs a functional content analysis to identify metadata standards that support reproducibility across an analytic stack consisting of input data, tools, notebooks, pipelines, and publications. Our review provides background context, explores gaps, and discovers component trends of embeddedness and methodology weight from which we derive recommendations for future work.
Affiliation(s)
- Jeremy Leipzig
- Metadata Research Center, College of Computing and Informatics, Drexel University, Philadelphia, PA, USA
| | - Daniel Nüst
- Institute for Geoinformatics, University of Münster, Münster, Germany
| | | | - Karthik Ram
- Berkeley Institute for Data Science, University of California, Berkeley, Berkeley, CA, USA
| | - Jane Greenberg
- Metadata Research Center, College of Computing and Informatics, Drexel University, Philadelphia, PA, USA
|
49
|
Jha SG, Borowsky AT, Cole BJ, Fahlgren N, Farmer A, Huang SSC, Karia P, Libault M, Provart NJ, Rice SL, Saura-Sanchez M, Agarwal P, Ahkami AH, Anderton CR, Briggs SP, Brophy JAN, Denolf P, Di Costanzo LF, Exposito-Alonso M, Giacomello S, Gomez-Cano F, Kaufmann K, Ko DK, Kumar S, Malkovskiy AV, Nakayama N, Obata T, Otegui MS, Palfalvi G, Quezada-Rodríguez EH, Singh R, Uhrig RG, Waese J, Van Wijk K, Wright RC, Ehrhardt DW, Birnbaum KD, Rhee SY. Vision, challenges and opportunities for a Plant Cell Atlas. eLife 2021; 10:e66877. [PMID: 34491200 PMCID: PMC8423441 DOI: 10.7554/elife.66877] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 01/25/2021] [Accepted: 08/26/2021] [Indexed: 02/06/2023]
Abstract
With growing populations and pressing environmental problems, future economies will be increasingly plant-based. Now is the time to reimagine plant science as a critical component of fundamental science, agriculture, environmental stewardship, energy, technology and healthcare. This effort requires a conceptual and technological framework to identify and map all cell types, and to comprehensively annotate the localization and organization of molecules at cellular and tissue levels. This framework, called the Plant Cell Atlas (PCA), will be critical for understanding and engineering plant development, physiology and environmental responses. A workshop was convened to discuss the purpose and utility of such an initiative, resulting in a roadmap that acknowledges the current knowledge gaps and technical challenges, and underscores how the PCA initiative can help to overcome them.
Affiliation(s)
- Suryatapa Ghosh Jha
- Department of Plant Biology, Carnegie Institution for Science, Stanford, United States
| | - Alexander T Borowsky
- Department of Botany and Plant Sciences, University of California, Riverside, Riverside, United States
| | - Benjamin J Cole
- Joint Genome Institute, Lawrence Berkeley National Laboratory, Walnut Creek, United States
| | - Noah Fahlgren
- Donald Danforth Plant Science Center, St. Louis, United States
| | - Andrew Farmer
- National Center for Genome Resources, Santa Fe, United States
| | | | - Purva Karia
- Department of Plant Biology, Carnegie Institution for Science, Stanford, United States
- Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
| | - Marc Libault
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, United States
| | - Nicholas J Provart
- Department of Cell and Systems Biology and the Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada
| | - Selena L Rice
- Department of Plant Biology, Carnegie Institution for Science, Stanford, United States
| | - Maite Saura-Sanchez
- Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Investigaciones Fisiológicas y Ecológicas Vinculadas a la Agricultura, Facultad de Agronomía, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Pinky Agarwal
- National Institute of Plant Genome Research, New Delhi, India
| | - Amir H Ahkami
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, United States
| | - Christopher R Anderton
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, United States
| | - Steven P Briggs
- Department of Biological Sciences, University of California, San Diego, San Diego, United States
| | | | | | - Luigi F Di Costanzo
- Department of Agricultural Sciences, University of Naples Federico II, Napoli, Italy
| | - Moises Exposito-Alonso
- Department of Plant Biology, Carnegie Institution for Science, Stanford, United States
- Department of Plant Biology, Carnegie Institution for Science, Tübingen, Germany
| | | | - Fabio Gomez-Cano
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, United States
| | - Kerstin Kaufmann
- Department for Plant Cell and Molecular Biology, Institute for Biology, Humboldt-Universitaet zu Berlin, Berlin, Germany
| | - Dae Kwan Ko
- Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, United States
| | - Sagar Kumar
- Department of Plant Breeding & Genetics, Mata Gujri College, Fatehgarh Sahib, Punjabi University, Patiala, India
| | - Andrey V Malkovskiy
- Department of Plant Biology, Carnegie Institution for Science, Stanford, United States
| | - Naomi Nakayama
- Department of Bioengineering, Imperial College London, London, United Kingdom
| | - Toshihiro Obata
- Department of Biochemistry, University of Nebraska-Lincoln, Madison, United States
| | - Marisa S Otegui
- Department of Botany, University of Wisconsin-Madison, Madison, United States
| | - Gergo Palfalvi
- Division of Evolutionary Biology, National Institute for Basic Biology, Okazaki, Japan
| | - Elsa H Quezada-Rodríguez
- Ciencias Agrogenómicas, Escuela Nacional de Estudios Superiores Unidad León, Universidad Nacional Autónoma de México, León, Mexico
| | - Rajveer Singh
- School of Agricultural Biotechnology, Punjab Agricultural University, Ludhiana, India
| | - R Glen Uhrig
- Department of Science, University of Alberta, Edmonton, Canada
| | - Jamie Waese
- Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada
| | - Klaas Van Wijk
- School of Integrated Plant Science, Plant Biology Section, Cornell University, Ithaca, United States
| | - R Clay Wright
- Department of Biological Systems Engineering, Virginia Tech, Blacksburg, United States
| | - David W Ehrhardt
- Department of Plant Biology, Carnegie Institution for Science, Stanford, United States
| | - Kenneth D Birnbaum
- Center for Genomics and Systems Biology, New York University, New York, United States
| | - Seung Y Rhee
- Department of Plant Biology, Carnegie Institution for Science, Stanford, United States
|
50
|
Mayer G, Müller W, Schork K, Uszkoreit J, Weidemann A, Wittig U, Rey M, Quast C, Felden J, Glöckner FO, Lange M, Arend D, Beier S, Junker A, Scholz U, Schüler D, Kestler HA, Wibberg D, Pühler A, Twardziok S, Eils J, Eils R, Hoffmann S, Eisenacher M, Turewicz M. Implementing FAIR data management within the German Network for Bioinformatics Infrastructure (de.NBI) exemplified by selected use cases. Brief Bioinform 2021; 22:bbab010. [PMID: 33589928 PMCID: PMC8425304 DOI: 10.1093/bib/bbab010] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/17/2020] [Revised: 12/21/2020] [Accepted: 01/06/2021] [Indexed: 12/21/2022]
Abstract
This article describes some use case studies and self-assessments of the FAIR status of de.NBI services to illustrate the challenges and requirements of adhering to the FAIR (findable, accessible, interoperable and reusable) data principles in a large distributed bioinformatics infrastructure. We address the challenge of heterogeneity of wet lab technologies, data, metadata, software, computational workflows and the levels of implementation and monitoring of FAIR principles within the different bioinformatics sub-disciplines joined in de.NBI. On the one hand, this broad service landscape and the excellent network of experts are a strong basis for the development of useful research data management plans. On the other hand, the large number of tools and techniques maintained by distributed teams renders FAIR compliance challenging.
Affiliation(s)
- Gerhard Mayer
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
- Ulm University, Institute of Medical Systems Biology, Ulm, Germany
| | - Wolfgang Müller
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | - Karin Schork
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
| | - Julian Uszkoreit
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
| | - Andreas Weidemann
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | - Ulrike Wittig
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | - Maja Rey
- Heidelberg Institute for Theoretical Studies (HITS gGmbH), Scientific Databases and Visualization Group, Heidelberg, Germany
| | | | - Janine Felden
- Jacobs University Bremen gGmbH, Bremen, Germany
- University of Bremen, MARUM - Center for Marine Environmental Sciences, Bremen, Germany
| | - Frank Oliver Glöckner
- Jacobs University Bremen gGmbH, Bremen, Germany
- University of Bremen, MARUM - Center for Marine Environmental Sciences, Bremen, Germany
- Alfred Wegener Institute - Helmholtz Center for Polar- and Marine Research, Bremerhaven, Germany
| | - Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Sebastian Beier
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Astrid Junker
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Danuta Schüler
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Hans A Kestler
- Ulm University, Institute of Medical Systems Biology, Ulm, Germany
- Leibniz Institute on Ageing - Fritz Lipmann Institute, Jena
| | - Daniel Wibberg
- Bielefeld University, Center for Biotechnology (CeBiTec), Bielefeld, Germany
| | - Alfred Pühler
- Bielefeld University, Center for Biotechnology (CeBiTec), Bielefeld, Germany
| | - Sven Twardziok
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Center for Digital Health, Berlin, Germany
| | - Jürgen Eils
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Center for Digital Health, Berlin, Germany
| | - Roland Eils
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Center for Digital Health, Berlin, Germany
- Heidelberg University Hospital and BioQuant, Health Data Science Unit, Heidelberg, Germany
| | - Steve Hoffmann
- Leibniz Institute on Ageing - Fritz Lipmann Institute, Jena
| | - Martin Eisenacher
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
| | - Michael Turewicz
- Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (ProDi), Medical Proteome Analysis, Bochum, Germany
|