1. Morales N, Anche MT, Kaczmar NS, Lepak N, Ni P, Romay MC, Santantonio N, Buckler ES, Gore MA, Mueller LA, Robbins KR. Spatio-temporal modeling of high-throughput multispectral aerial images improves agronomic trait genomic prediction in hybrid maize. Genetics 2024; 227:iyae037. [PMID: 38469622 PMCID: PMC11075545 DOI: 10.1093/genetics/iyae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 12/02/2023] [Accepted: 02/18/2024] [Indexed: 03/13/2024]
Abstract
Design randomizations and spatial corrections have increased understanding of genotypic, spatial, and residual effects in field experiments, but precisely measuring spatial heterogeneity in the field remains a challenge. To this end, our study evaluated approaches to improve spatial modeling using high-throughput phenotypes (HTP) via unoccupied aerial vehicle (UAV) imagery. The normalized difference vegetation index was measured by a multispectral MicaSense camera and processed using ImageBreed. In contrast to a baseline agronomic trait spatial correction and a baseline multitrait model, a two-stage approach was proposed. Using longitudinal normalized difference vegetation index data, plot-level permanent environment effects estimated spatial patterns in the field throughout the growing season. Normalized difference vegetation index permanent environment effects were separated from additive genetic effects using 2D spline, separable autoregressive, or random regression models. The permanent environment effects were leveraged within agronomic trait genomic best linear unbiased prediction either by modeling an empirical covariance for random effects, or by modeling fixed effects as an average of permanent environment effects across time or split among three growth phases. Modeling approaches were tested using simulation data and Genomes-to-Fields hybrid maize (Zea mays L.) field experiments in 2015, 2017, 2019, and 2020 for grain yield, grain moisture, and ear height. The two-stage approach improved heritability, model fit, and genotypic effect estimation compared to baseline models. Electrical conductance and elevation from a 2019 soil survey significantly improved model fit, while 2D spline permanent environment effects were most strongly correlated with the soil parameters. Simulation of field effects demonstrated improved specificity for random regression models. In summary, the use of longitudinal normalized difference vegetation index measurements increased experimental accuracy and understanding of field spatio-temporal heterogeneity.
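For readers unfamiliar with the index used in this abstract, the normalized difference vegetation index is conventionally computed from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The following is a minimal sketch of that standard formula only; it is not the authors' ImageBreed pipeline, and the function names, the sample reflectance values, and the zero-denominator convention are assumptions made for illustration.

```python
from typing import Iterable


def ndvi(nir: float, red: float) -> float:
    """Standard NDVI for one pixel: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        # Assumed convention for this sketch: report 0.0 when both bands are zero.
        return 0.0
    return (nir - red) / total


def plot_mean_ndvi(nir_values: Iterable[float], red_values: Iterable[float]) -> float:
    """Average per-pixel NDVI over one plot (hypothetical plot-level summary)."""
    pairs = list(zip(nir_values, red_values))
    if not pairs:
        raise ValueError("a plot needs at least one pixel")
    return sum(ndvi(n, r) for n, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical reflectance values for a two-pixel plot.
    print(plot_mean_ndvi([0.5, 0.3], [0.1, 0.3]))
```

Repeating such a plot-level summary at each flight date would give the kind of longitudinal series the abstract refers to, although the study's actual image processing may differ.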
Affiliation(s)
- Nicolas Morales
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Mahlet T Anche
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Nicholas S Kaczmar
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Nicholas Lepak
  - United States Department of Agriculture-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
- Pengzun Ni
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
  - College of Bioscience and Biotechnology, Shenyang Agricultural University, Shenhe District, Shenyang, Liaoning Province, PR China
- Maria Cinta Romay
  - Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853, USA
- Nicholas Santantonio
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
  - School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- Edward S Buckler
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
  - United States Department of Agriculture-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
  - Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853, USA
- Michael A Gore
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
- Lukas A Mueller
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
  - Boyce Thompson Institute, Ithaca, NY 14853, USA
- Kelly R Robbins
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
2. Morales N, Ogbonna AC, Ellerbrock BJ, Bauchet GJ, Tantikanjana T, Tecle IY, Powell AF, Lyon D, Menda N, Simoes CC, Saha S, Hosmani P, Flores M, Panitz N, Preble RS, Agbona A, Rabbi I, Kulakow P, Peteti P, Kawuki R, Esuma W, Kanaabi M, Chelangat DM, Uba E, Olojede A, Onyeka J, Shah T, Karanja M, Egesi C, Tufan H, Paterne A, Asfaw A, Jannink JL, Wolfe M, Birkett CL, Waring DJ, Hershberger JM, Gore MA, Robbins KR, Rife T, Courtney C, Poland J, Arnaud E, Laporte MA, Kulembeka H, Salum K, Mrema E, Brown A, Bayo S, Uwimana B, Akech V, Yencho C, de Boeck B, Campos H, Swennen R, Edwards JD, Mueller LA. Breedbase: a digital ecosystem for modern plant breeding. G3 Genes|Genomes|Genetics 2022; 12:6564228. [PMID: 35385099 PMCID: PMC9258556 DOI: 10.1093/g3journal/jkac078] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/19/2021] [Accepted: 02/14/2022] [Indexed: 01/17/2023]
Abstract
Modern breeding methods integrate next-generation sequencing and phenomics to identify plants with the best characteristics and greatest genetic merit for use as parents in subsequent breeding cycles to ultimately create improved cultivars able to sustain high adoption rates by farmers. This data-driven approach hinges on strong foundations in data management, quality control, and analytics. Of crucial importance is a central database able to (1) track breeding materials, (2) store experimental evaluations, (3) record phenotypic measurements using consistent ontologies, (4) store genotypic information, and (5) implement algorithms for analysis, prediction, and selection decisions. Because of the complexity of the breeding process, breeding databases also tend to be complex, difficult, and expensive to implement and maintain. Here, we present a breeding database system, Breedbase (https://breedbase.org/, last accessed 4/18/2022). Originally initiated as Cassavabase (https://cassavabase.org/, last accessed 4/18/2022) with the NextGen Cassava project (https://www.nextgencassava.org/, last accessed 4/18/2022), and later developed into a crop-agnostic system, it is presently used by dozens of different crops and projects. The system is web based and is available as open source software. It is available on GitHub (https://github.com/solgenomics/, last accessed 4/18/2022) and packaged in a Docker image for deployment (https://hub.docker.com/u/breedbase, last accessed 4/18/2022). The Breedbase system enables breeding programs to better manage and leverage their data for decision making within a fully integrated digital ecosystem.
Affiliation(s)
- Nicolas Morales
  - Boyce Thompson Institute, Ithaca, NY 14853, USA
  - Cornell University, Ithaca, NY 14853, USA
- Alex C Ogbonna
  - Boyce Thompson Institute, Ithaca, NY 14853, USA
  - Cornell University, Ithaca, NY 14853, USA
- David Lyon
  - Boyce Thompson Institute, Ithaca, NY 14853, USA
- Naama Menda
  - Boyce Thompson Institute, Ithaca, NY 14853, USA
- Surya Saha
  - Boyce Thompson Institute, Ithaca, NY 14853, USA
- Ezenwanyi Uba
  - National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Adeyemi Olojede
  - National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Joseph Onyeka
  - National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Chiedozie Egesi
  - Boyce Thompson Institute, Ithaca, NY 14853, USA
  - IITA Ibadan, 200001 Ibadan, Nigeria
  - National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Hale Tufan
  - Cornell University, Ithaca, NY 14853, USA
- Jean-Luc Jannink
  - Cornell University, Ithaca, NY 14853, USA
  - USDA-ARS, Ithaca, NY 14853, USA
- Clay L Birkett
  - Cornell University, Ithaca, NY 14853, USA
  - USDA-ARS, Ithaca, NY 14853, USA
- David J Waring
  - Cornell University, Ithaca, NY 14853, USA
  - USDA-ARS, Ithaca, NY 14853, USA
- Trevor Rife
  - Kansas State University, Manhattan, KS 66506, USA
- Jesse Poland
  - Kansas State University, Manhattan, KS 66506, USA
- Craig Yencho
  - North Carolina State University (NCSU), Raleigh, NC 27695, USA
3. Sempéré G, Larmande P, Rouard M. Managing High-Density Genotyping Data with Gigwa. Methods Mol Biol 2022; 2443:415-427. [PMID: 35037218 DOI: 10.1007/978-1-0716-2067-0_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Next-generation sequencing technologies have enabled high-density genotyping for large numbers of samples. Nowadays, SNP calling pipelines produce up to millions of such markers, which need to be filtered in various ways according to the type of analysis. One of the main challenges still lies in the management of an increasing volume of genotyping files that are difficult to handle for many applications. Here, we provide a practical guide for efficiently managing large genomic variation data using Gigwa, a user-friendly, scalable and versatile application that may be deployed either remotely on web servers or on a local machine.
Affiliation(s)
- Guilhem Sempéré
  - CIRAD, UMR INTERTRYP, Montpellier, France
  - INTERTRYP, Univ Montpellier, CIRAD, IRD, Montpellier, France
  - French Institute of Bioinformatics (IFB)-South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, Montpellier, France
- Pierre Larmande
  - French Institute of Bioinformatics (IFB)-South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, Montpellier, France
  - DIADE, Univ Montpellier, IRD, Montpellier, France
- Mathieu Rouard
  - French Institute of Bioinformatics (IFB)-South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, Montpellier, France
  - Bioversity International, Parc Scientifique Agropolis II, Montpellier, France
4. Volk GM, Byrne PF, Coyne CJ, Flint-Garcia S, Reeves PA, Richards C. Integrating Genomic and Phenomic Approaches to Support Plant Genetic Resources Conservation and Use. Plants (Basel, Switzerland) 2021; 10:2260. [PMID: 34834625 PMCID: PMC8619436 DOI: 10.3390/plants10112260] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/01/2021] [Revised: 10/20/2021] [Accepted: 10/20/2021] [Indexed: 05/17/2023]
Abstract
Plant genebanks provide genetic resources for breeding and research programs worldwide. These programs benefit from having access to high-quality, standardized phenotypic and genotypic data. Technological advances have made it possible to collect phenomic and genomic data for genebank collections, which, with the appropriate analytical tools, can directly inform breeding programs. We discuss the importance of considering genebank accession homogeneity and heterogeneity in data collection and documentation. Citing specific examples, we describe how well-documented genomic and phenomic data have met or could meet the needs of plant genetic resource managers and users. We explore future opportunities that may emerge from improved documentation and data integration among plant genetic resource information systems.
Affiliation(s)
- Gayle M. Volk
  - United States Department of Agriculture, Agricultural Research Service, National Laboratory for Genetic Resources Preservation, Fort Collins, CO 80521, USA
- Patrick F. Byrne
  - Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO 80523, USA
- Clarice J. Coyne
  - United States Department of Agriculture, Agricultural Research Service, Western Regional Plant Introduction Station, Pullman, WA 99164, USA
- Sherry Flint-Garcia
  - Plant Genetics Research Unit, United States Department of Agriculture, Agricultural Research Service, Columbia, MO 65211, USA
- Patrick A. Reeves
  - United States Department of Agriculture, Agricultural Research Service, National Laboratory for Genetic Resources Preservation, Fort Collins, CO 80521, USA
- Chris Richards
  - United States Department of Agriculture, Agricultural Research Service, National Laboratory for Genetic Resources Preservation, Fort Collins, CO 80521, USA
5. Sanderson LA, Caron CT, Tan RL, Bett KE. A PostgreSQL Tripal solution for large-scale genotypic and phenotypic data. Database (Oxford) 2021; 2021:baab051. [PMID: 34389844 PMCID: PMC8363843 DOI: 10.1093/database/baab051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/05/2021] [Revised: 05/11/2021] [Accepted: 08/03/2021] [Indexed: 11/15/2022]
Abstract
Researchers are seeking cost-effective solutions for management and analysis of large-scale genotypic and phenotypic data. Open-source software is uniquely positioned to fill this need through user-focused, crowd-sourced development. Tripal, an open-source toolkit for developing biological data web portals, uses the GMOD Chado database schema to achieve flexible, ontology-driven storage in PostgreSQL. Tripal also aids research-focused web portals in providing data according to findable, accessible, interoperable, reusable (FAIR) principles. We describe here a fully relational PostgreSQL solution to handle large-scale genotypic and phenotypic data that is implemented as a collection of freely available, open-source modules. These Tripal extension modules provide a holistic approach for importing, storage, display and analysis within a relational database schema. Furthermore, they embody the Tripal approach to FAIR data by providing multiple search tools and ensuring metadata is fully described and interoperable. Our solution focuses on data integrity, as well as optimizing performance to provide a fully functional system that is currently being used in the production of Tripal portals for crop species. We fully describe the implementation of our solution and discuss why a PostgreSQL-powered web portal provides an efficient environment for researcher-driven genotypic and phenotypic data analysis.
Affiliation(s)
- Lacey-Anne Sanderson
  - Department of Plant Sciences, University of Saskatchewan, 51 Campus Drive, Saskatoon SK S7N 5A8, Canada
- Carolyn T Caron
  - Department of Plant Sciences, University of Saskatchewan, 51 Campus Drive, Saskatoon SK S7N 5A8, Canada
- Reynold L Tan
  - Department of Plant Sciences, University of Saskatchewan, 51 Campus Drive, Saskatoon SK S7N 5A8, Canada
- Kirstin E Bett
  - Department of Plant Sciences, University of Saskatchewan, 51 Campus Drive, Saskatoon SK S7N 5A8, Canada