1
Bewick P, Forstner P, Zhang B, Collakova E. Identification of novel candidate genes for regulating oil composition in soybean seeds under environmental stresses. Frontiers in Plant Science 2025; 16:1572319. [PMID: 40313727 PMCID: PMC12044429 DOI: 10.3389/fpls.2025.1572319] [Received: 02/06/2025] [Accepted: 03/26/2025] [Indexed: 05/03/2025]
Abstract
Introduction: A key objective of soybean breeding programs is to enhance nutritional quality for human and animal consumption, with improved fatty acid (FA) composition for health benefits, and to expand soybean use for industrial applications.
Methods: We conducted a metabolite genome-wide association study (mGWAS) to identify genomic regions associated with changes in FA composition and FA ratios in soybean seeds influenced by environmental factors. This mGWAS utilized 218 soybean plant introductions (PIs) grown in two field locations in Virginia over two years.
Results: The mGWAS revealed that 20 SNPs were significantly associated with 21 FA ratios, while additional suggestive SNPs were found for 36 FA ratios, highlighting potential quantitative trait loci linked to FA composition.
Discussion: Many of these SNPs are located near or within genes related to phytohormone-mediated biotic and abiotic stress responses, suggesting the involvement of environmental factors in modulating FA composition in soybean seeds. Our findings provide novel insights into the genetic and environmental factors influencing FA composition in oilseeds. This research also lays the foundation for developing stable markers for breeding soybean cultivars with tailored FA profiles for different practical applications under variable growth conditions.
Affiliation(s)
- Patrick Bewick
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Translational Plant Science Center, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Peter Forstner
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Translational Plant Science Center, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Bo Zhang
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Translational Plant Science Center, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Eva Collakova
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Translational Plant Science Center, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Fralin Life Science Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
2
Choręziak A, Rosiejka D, Michałowska J, Bogdański P. Nutritional Quality, Safety and Environmental Benefits of Alternative Protein Sources-An Overview. Nutrients 2025; 17:1148. [PMID: 40218906 PMCID: PMC11990347 DOI: 10.3390/nu17071148] [Received: 02/23/2025] [Revised: 03/19/2025] [Accepted: 03/24/2025] [Indexed: 04/14/2025]
Abstract
Protein is a fundamental macronutrient in the human diet. It supplies the body with essential amino acids, which are needed for the growth and maintenance of cells and tissues. Conventional protein sources, despite their complete amino acid profiles and excellent digestibility, have a proven negative impact on the environment. Furthermore, their production poses many ethical challenges. This review aims to present nutritious, more ethical, and environmentally friendly alternatives that could serve as potential protein sources for the population. The available literature on alternative protein sources has been analyzed. Based on the research conducted, various products have been identified and described, including plant-based protein sources such as soybeans, peas, faba beans, lupins, and hemp seeds; aquatic sources such as algae, microalgae, and water lentils; as well as insect-based and microbial protein sources, and cell-cultured meat. Despite numerous advantages, such as a lower environmental impact, higher ethical standards of production, and beneficial nutritional profiles, alternative protein sources are not without limitations. These include lower bioavailability of certain amino acids, the presence of antinutritional compounds, technological challenges, and issues related to consumer acceptance. Nevertheless, with proper dietary composition, optimization of production processes, and further technological advancements, the alternatives presented can constitute valuable and sustainable protein sources for the growing global population.
Affiliation(s)
- Joanna Michałowska
- Department of Obesity and Metabolic Disorders Treatment and Clinical Dietetics, Poznań University of Medical Sciences, 60-355 Poznań, Poland
3
Van K, Lee S, Mian MAR, McHale LK. Network analysis combined with genome-wide association study helps identification of genes related to amino acid contents in soybean. BMC Genomics 2025; 26:21. [PMID: 39780068 PMCID: PMC11715193 DOI: 10.1186/s12864-024-11163-8] [Received: 11/05/2024] [Accepted: 12/17/2024] [Indexed: 01/11/2025]
Abstract
BACKGROUND: In addition to total protein content, the amino acid (AA) profile is important to the nutritional value of soybean seed. The AA profile in soybean seed is a complex quantitative trait controlled by multiple interconnected genes and pathways controlling the accumulation of each AA. With a total of 621 soybean germplasm, we used three genome-wide association study (GWAS)-based approaches to investigate the genomic regions controlling the AA content and profile in soybean. Among those approaches, the GWAS network analysis we implemented takes advantage of the relationships between specific AAs to identify the genetic control of AA profile.
RESULTS: For Approach I, GWAS were performed for the content of 24 single AAs under all environments combined. Significant SNPs grouping into 16 linkage disequilibrium (LD) blocks from 18 traits were identified. For Approach II, the individual AAs were grouped into five families according to their metabolic pathways and were examined based on the sum, ratios, and interactions of AAs within the same biochemical family. Significant SNPs grouping into 35 LD blocks were identified, with SNPs associated with traits from the same biochemical family often positioned on the same LD blocks. Approach III, a correlation-based network analysis, was performed to assess the empirical relationships among AAs. Two groups were described by the network topology, Group 1: Ala, Gly, Lys, available Lys (Alys), and Thr and Group 2: Ile and Tyr. Significant SNPs associated with a ratio of connected AAs or a ratio of a single AA to its fully or partially connected metabolic groups were identified within 9 LD blocks for Group 1 and 2 LD blocks for Group 2. Among 40 identified QTL for AA or AA-derived traits, three genomic regions were novel in terms of seed composition traits (oil, protein, and AA content). An additional 24 regions had previously not been specifically associated with the AA content.
CONCLUSIONS: Our results confirmed loci identified from previous studies but also suggested that network approaches for studying AA contents in soybean seed are valuable. Three genomic regions (Chr 5: 41,754,397-41,893,109 bp, Chr 9: 1,537,829-1,806,586 bp, and Chr 20: 31,554,795-33,678,257 bp) were identified as significant by all three approaches. Yet, the majority of associations between a genomic region and an AA trait were approach- and/or environment-specific. Using a combination of approaches provides insights into the genetic control and pleiotropy among AA contents, which can be applied to mechanistic understanding of variation in AA content as well as tailored nutrition in cultivars developed from soybean breeding programs.
Affiliation(s)
- Kyujung Van
- Department of Horticulture and Crop Science, The Ohio State University, Columbus, OH, 43210, USA
- Sungwoo Lee
- Department of Crop Science, Chungnam National University, Daejeon, 34134, South Korea
- M A Rouf Mian
- Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC, 27695, USA
- Soybean & Nitrogen Fixation Unit, USDA-ARS, Raleigh, NC, 27607, USA
- Leah K McHale
- Department of Horticulture and Crop Science, The Ohio State University, Columbus, OH, 43210, USA
- Center for Soybean Research and Center of Applied Plant Sciences, The Ohio State University, Columbus, OH, 43210, USA
4
Zhang Y, Yang X, Bhat JA, Zhang Y, Bu M, Zhao B, Yang S. Identification of superior haplotypes and candidate gene for seed size-related traits in soybean (Glycine max L.). Molecular Breeding 2025; 45:3. [PMID: 39717350 PMCID: PMC11663835 DOI: 10.1007/s11032-024-01525-1] [Received: 06/14/2024] [Accepted: 12/06/2024] [Indexed: 12/25/2024]
Abstract
Seed size is an economically important trait that directly determines the seed yield in soybean. In the current investigation, we used an integrated strategy of linkage mapping, association mapping, haplotype analysis and candidate gene analysis to determine the genetic makeup of four seed size-related traits, viz., 100-seed weight (HSW), seed area (SA), seed length (SL), and seed width (SW) in soybean. Linkage mapping identified a total of 23 quantitative trait loci (QTL) associated with the four seed size-related traits in the F2 population; among them, 17 were detected as novel QTLs, whereas the remaining six, viz., qHSW3-1, qHSW4-1, qHSW18-1, qHSW19-1, qSL4-1 and qSW6-1, have been previously identified. Six of the 23 QTLs were major, with phenotypic variation explained (PVE) ≥ 10%. In addition, four QTL clusters (QTL hotspots) harboring multiple QTLs for different seed size-related traits were identified on Chr.04, Chr.16, Chr.19 and Chr.20. Genome-wide association study (GWAS) identified a total of 62 SNPs significantly associated with the four seed size-related traits. Interestingly, the QTL qHSW18-1 was identified by both linkage mapping and GWAS, and was regarded as the most stable locus regulating HSW in soybean. In-silico, sequencing and qRT-PCR analysis identified Glyma.18G242400 as the most likely candidate gene underlying qHSW18-1 for regulating HSW. Moreover, three haplotype blocks, viz., Hap2, Hap6A and Hap6B, were identified for the SW trait, and one haplotype was identified within Glyma.18G242400 for HSW. These four haplotypes harbor three to seven haplotype alleles across the association mapping panel of 350 soybean accessions, regulating the seed size from lowest to highest through intermediate phenotypes. Hence, the outcome of the current investigation can be utilized as a potential genetic and genomic resource for breeding improved seed size in soybean.
Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-024-01525-1.
Affiliation(s)
- Ye Zhang
- Key Laboratory of Soybean Molecular Design Breeding, National Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102 China
- College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, China
- Xinjing Yang
- Key Laboratory of Soybean Molecular Design Breeding, National Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102 China
- College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, China
- Javaid Akhter Bhat
- Zhejiang Lab, Research Institute of Intelligent Computing, Hangzhou, 310012 China
- Yaohua Zhang
- Key Laboratory of Soybean Molecular Design Breeding, National Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102 China
- Moran Bu
- Key Laboratory of Soybean Molecular Design Breeding, National Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102 China
- College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, China
- Beifang Zhao
- Key Laboratory of Soybean Molecular Design Breeding, National Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102 China
- Suxin Yang
- Key Laboratory of Soybean Molecular Design Breeding, National Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, 130102 China
- College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, China
5
Li X, Liu K, Rideout S, Rosso L, Zhang B, Welbaum GE. Seed physiological traits and environmental factors influence seedling establishment of vegetable soybean (Glycine max L.). Frontiers in Plant Science 2024; 15:1344895. [PMID: 38887465 PMCID: PMC11180749 DOI: 10.3389/fpls.2024.1344895] [Received: 11/27/2023] [Accepted: 05/20/2024] [Indexed: 06/20/2024]
Abstract
Edamame (Glycine max (L.) Merr.), a specialty soybean prized for its nutritional value and taste, has witnessed a surge in demand within the U.S. However, subpar seedling stands have hindered its production potential, necessitating increased inputs for farmers. This study aims to uncover potential physiological factors contributing to low seedling emergence in edamame. We conducted comprehensive assessments on thirteen prominent edamame genotypes alongside two food-grade and two grain-type soybean genotypes, focusing on germination and emergence speed in both laboratory and field settings. Additionally, we employed single electrical conductivity tests and identified and quantified seed leachate components to distinguish among soybean types. Furthermore, using a LabField™ simulation table, we examined seed emergence across a wide soil temperature range (5°C to 45°C) for edamame and other soybean types. All seeds were produced under the same environmental conditions, harvested in Fall 2020, and stored under uniform conditions to minimize quality variations. Our findings revealed minimal divergence in emergence percentages among the seventeen genotypes, with over 95% germination and emergence in laboratory conditions and over 70% emergence in the field. Nonetheless, edamame genotypes typically exhibited slower germination and more leachate exudates, which contained higher levels of soluble sugars and amino acids. Seed size did not significantly impact total emergence but was negatively correlated with germination and emergence speed, although this effect could be mitigated under complex field conditions. Furthermore, this study proposed differences that distinguish edamame from other soybean types regarding ideal and base temperatures, as well as thermal time. The findings offer valuable insights into edamame establishment, potentially paving the way for supporting local edamame production in the U.S.
Affiliation(s)
- Xiaoying Li
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Department of Horticultural Sciences, Tropical Research and Education Center, University of Florida, Homestead, FL, United States
- Kathryn Liu
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Steven Rideout
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Luciana Rosso
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Bo Zhang
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Gregory E. Welbaum
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
6
He L, Sui Y, Che Y, Liu L, Liu S, Wang X, Cao G. New Insights into the Genetic Basis of Lysine Accumulation in Rice Revealed by Multi-Model GWAS. Int J Mol Sci 2024; 25:4667. [PMID: 38731885 PMCID: PMC11083390 DOI: 10.3390/ijms25094667] [Received: 04/07/2024] [Revised: 04/21/2024] [Accepted: 04/22/2024] [Indexed: 05/13/2024]
Abstract
Lysine is an essential amino acid that cannot be synthesized in humans. Rice is a global staple food for humans but has a rather low lysine content. Identification of the quantitative trait nucleotides (QTNs) and genes underlying lysine content is crucial to increase lysine accumulation. In this study, five grain and three leaf lysine content datasets and 4,630,367 single nucleotide polymorphisms (SNPs) of 387 rice accessions were used to perform a genome-wide association study (GWAS) with ten statistical models. A total of 248 and 71 common QTNs associated with grain and leaf lysine content, respectively, were identified. The accuracy of genomic selection/prediction RR-BLUP models was up to 0.85, and the significant correlation between the number of favorable alleles per accession and lysine content was up to 0.71, which validated the reliability and additive effects of these QTNs. Several key genes were uncovered for fine-tuning lysine accumulation. Additionally, 20 and 30 QTN-by-environment interactions (QEIs) were detected in grains and leaves, respectively. The candidate gene LOC_Os01g21380 for QEI-sf0111954416, which putatively accounts for gene-by-environment interaction, was identified in grains. These findings suggest that applying multi-model GWAS facilitates a better understanding of lysine accumulation in rice. The identified QTNs and genes hold potential for breeding lysine-rich rice with a normal phenotype.
Affiliation(s)
- Liqiang He
- School of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
- Yao Sui
- School of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
- Yanru Che
- School of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
- Lihua Liu
- School of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
- Shuo Liu
- School of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
- Xiaobing Wang
- Institute of Tropical Crop Genetic Resources, Chinese Academy of Tropical Agricultural Sciences, Danzhou 571737, China
- Guangping Cao
- Hainan Key Laboratory of Crop Genetics and Breeding, Institute of Food Crops, Hainan Academy of Agricultural Sciences, Haikou 571100, China
7
Wang H, Bai Y, Biligetu B. Effects of SNP marker density and training population size on prediction accuracy in alfalfa (Medicago sativa L.) genomic selection. The Plant Genome 2024; 17:e20431. [PMID: 38263612 DOI: 10.1002/tpg2.20431] [Received: 04/03/2023] [Revised: 11/29/2023] [Accepted: 01/04/2024] [Indexed: 01/25/2024]
Abstract
The effects of individual single-nucleotide polymorphism (SNP) markers and the sizes of the "training" and "test" populations influence prediction accuracy in genomic selection (GS). This study evaluated 11 subsets of 4932 SNPs using six genetic additive methods to understand the role of marker density in GS prediction in alfalfa (Medicago sativa L.). In the GS methods, the effect of the ratio of "training" to "test" population size was also evaluated. Fourteen alfalfa populations sampled from long-term grazing sites were genotyped using genotyping by sequencing for the identification of SNPs. These populations were also phenotyped for six agromorphological and three nutritive traits from 2018 to 2020. The accuracy of GS prediction improved across the six GS methods when the ratio of "training" to "test" population size increased. However, the prediction accuracy of the six GS methods dropped to a range of -0.27 to 0.11 when random, uninformative SNPs were used. In this study, five Bayesian methods and the ridge-regression best linear unbiased prediction (rrBLUP) method had similar GS accuracies for "training" sets, but rrBLUP tended to outperform Bayesian methods in independent "test" sets when SNP subsets with high mean-squared-estimated-marker effect were used. These findings can enhance the application of GS in alfalfa genetic improvement.
Affiliation(s)
- Hu Wang
- Department of Plant Sciences, College of Agriculture and Bioresources, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Yuguang Bai
- Department of Plant Sciences, College of Agriculture and Bioresources, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Bill Biligetu
- Department of Plant Sciences, College of Agriculture and Bioresources, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
8
Singer WM, Lee YC, Shea Z, Vieira CC, Lee D, Li X, Cunicelli M, Kadam SS, Khan MAW, Shannon G, Mian MAR, Nguyen HT, Zhang B. Soybean genetics, genomics, and breeding for improving nutritional value and reducing antinutritional traits in food and feed. The Plant Genome 2023; 16:e20415. [PMID: 38084377 DOI: 10.1002/tpg2.20415] [Received: 07/21/2023] [Revised: 10/25/2023] [Accepted: 10/27/2023] [Indexed: 12/22/2023]
Abstract
Soybean [Glycine max (L.) Merr.] is a globally important crop due to its valuable seed composition, versatile feed, food, and industrial end-uses, and consistent genetic gain. Successful genetic gain in soybean has led to widespread adaptation and increased value for producers, processors, and consumers. Specific focus on the nutritional quality of soybean seed composition for food and feed has further elucidated genetic knowledge and bolstered breeding progress. Seed components are historical and current targets for soybean breeders seeking to improve nutritional quality of soybean. This article reviews genetic and genomic foundations for improvement of nutritionally important traits, such as protein and amino acids, oil and fatty acids, carbohydrates, and specific food-grade considerations; discusses the application of advanced breeding technology such as CRISPR/Cas9 in creating seed composition variations; and provides future directions and breeding recommendations regarding soybean seed composition traits.
Affiliation(s)
- William M Singer
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
- Yi-Chen Lee
- Department of Agriculture, Fort Hays State University, Hays, Kansas, USA
- Zachary Shea
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
- Caio Canella Vieira
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, Arkansas, USA
- Dongho Lee
- Fisher Delta Research, Extension, and Education Center, University of Missouri, Portageville, Missouri, USA
- Xiaoying Li
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
- Mia Cunicelli
- Soybean and Nitrogen Fixation Research Unit, USDA-ARS, Raleigh, North Carolina, USA
- Shaila S Kadam
- Division of Plant Science and Technology, University of Missouri, Columbia, Missouri, USA
- Grover Shannon
- Fisher Delta Research, Extension, and Education Center, University of Missouri, Portageville, Missouri, USA
- M A Rouf Mian
- Soybean and Nitrogen Fixation Research Unit, USDA-ARS, Raleigh, North Carolina, USA
- Henry T Nguyen
- Division of Plant Science and Technology, University of Missouri, Columbia, Missouri, USA
- Bo Zhang
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
9
Wu C, Zhang Y, Ying Z, Li L, Wang J, Yu H, Zhang M, Feng X, Wei X, Xu X. A transformer-based genomic prediction method fused with knowledge-guided module. Brief Bioinform 2023; 25:bbad438. [PMID: 38058185 PMCID: PMC10701102 DOI: 10.1093/bib/bbad438] [Received: 09/07/2023] [Revised: 10/15/2023] [Accepted: 11/03/2023] [Indexed: 12/08/2023]
Abstract
Genomic prediction (GP) uses single nucleotide polymorphisms (SNPs) to establish associations between markers and phenotypes. Selection of early individuals by genomic estimated breeding value shortens the generation interval and speeds up the breeding process. Recently, methods based on deep learning (DL) have gained great attention in the field of GP. In this study, we explore the application of Transformer-based structures to GP and develop a novel deep-learning model named GPformer. GPformer obtains a global view by gleaning beneficial information from all relevant SNPs regardless of the physical distance between SNPs. Comprehensive experimental results on five different crop datasets show that GPformer outperforms ridge regression-based linear unbiased prediction (RR-BLUP), support vector regression (SVR), light gradient boosting machine (LightGBM) and deep neural network genomic prediction (DNNGP) in terms of mean absolute error, Pearson's correlation coefficient and the proposed metric consistent index. Furthermore, we introduce a knowledge-guided module (KGM) to extract genome-wide association studies-based information, which is fused into GPformer as prior knowledge. KGM is very flexible and can be plugged into any DL network. Ablation studies of KGM on three datasets illustrate the efficiency of KGM adequately. Moreover, GPformer is robust and stable to hyperparameters and can generalize to each phenotype of every dataset, which is suitable for practical application scenarios.
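Two of the evaluation metrics named in this abstract, mean absolute error and Pearson's correlation coefficient, can be illustrated with a minimal sketch. This is not the GPformer evaluation code: the function names and the phenotype values below are hypothetical, chosen only to show how predicted and observed phenotypes might be compared.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-products and sums of squares of the centered values.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    if ss_o == 0 or ss_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_o * ss_p)


# Hypothetical phenotype values (illustrative only, not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

mae = mean_absolute_error(observed, predicted)
r = pearson_r(observed, predicted)
```

In this toy case every prediction is offset by the same amount, so the ranking of individuals is preserved even though each value is wrong; this is why studies of this kind often report an error metric alongside a correlation metric.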
Affiliation(s)
- Cuiling Wu
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- Yiyi Zhang
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- Zhiwen Ying
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- Ling Li
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- Jun Wang
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- Hui Yu
- Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130012, China
- Mengchen Zhang
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Hangzhou 310006, China
- Xianzhong Feng
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130012, China
- Xinghua Wei
- Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Hangzhou 310006, China
- Xiaogang Xu
- School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou 310018, China
10
Zhang Y, Zhang M, Ye J, Xu Q, Feng Y, Xu S, Hu D, Wei X, Hu P, Yang Y. Integrating genome-wide association study into genomic selection for the prediction of agronomic traits in rice (Oryza sativa L.). Molecular Breeding 2023; 43:81. [PMID: 37965378 PMCID: PMC10641074 DOI: 10.1007/s11032-023-01423-y] [Received: 07/11/2023] [Accepted: 10/09/2023] [Indexed: 11/16/2023]
Abstract
Accurately identifying varieties with targeted agronomic traits is thought to contribute to genetic selection and accelerate rice breeding progress. Genomic selection (GS) is a promising technique that uses markers covering the whole genome to predict genomic-estimated breeding values (GEBV), with the ability to select before phenotypes are measured. To choose appropriate GS models for breeding work, we analyzed the predictability of nine agronomic traits measured in a population of 459 diverse rice varieties. Comparing eight representative GS models, we found that the prediction accuracies ranged from 0.407 to 0.896, with reproducing kernel Hilbert space (RKHS) having the highest predictive ability for most traits. Further results demonstrated that the predictivity of GS is altered by several factors. Moreover, we assessed the method of integrating genome-wide association study (GWAS) into various GS models. The predictabilities of GS combined with peak-associated markers generated from six different GWAS models were significantly different; Mixed Linear Model (MLM)-RKHS was recommended for the GWAS-GS-integrated prediction. Finally, based on the above result, we experimented with applying the P-values obtained from optimal GWAS models to ridge regression best linear unbiased prediction (rrBLUP), which benefited the traits with low predictability in rice.
Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-023-01423-y.
Affiliation(s)
- Yuanyuan Zhang
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Mengchen Zhang
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
- Junhua Ye
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Qun Xu
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Yue Feng
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
- Siliang Xu
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Dongxiu Hu
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Xinghua Wei
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
- Peisong Hu
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- Yaolong Yang
- Zhejiang Lab, Hangzhou, 311121 China
- CNRRI-Zhejiang Lab Computational Breeding Joint Laboratory, China National Rice Research Institute, Hangzhou, China
- National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya, 572024 China
|
11
|
Canella Vieira C, Zhou J, Jarquin D, Zhou J, Diers B, Riechers DE, Nguyen HT, Shannon G. Genetic architecture of soybean tolerance to off-target dicamba. FRONTIERS IN PLANT SCIENCE 2023; 14:1230068. [PMID: 37877091 PMCID: PMC10590897 DOI: 10.3389/fpls.2023.1230068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 09/27/2023] [Indexed: 10/26/2023]
Abstract
The adoption of dicamba-tolerant (DT) soybean in the United States resulted in extensive off-target dicamba damage to non-DT vegetation across soybean-producing states. Although soybeans are highly sensitive to dicamba, the intensity of observed symptoms and yield losses are affected by the genetic background of genotypes. Thus, the objective of this study was to detect novel marker-trait associations and expand on previously identified genomic regions related to soybean response to off-target dicamba. A total of 551 non-DT advanced breeding lines derived from 232 unique bi-parental populations were phenotyped for off-target dicamba across nine environments for three years. Breeding lines were genotyped using the Illumina Infinium BARCSoySNP6K BeadChip. Filtered SNPs were included as predictors in Random Forest (RF) and Support Vector Machine (SVM) models in a forward stepwise selection loop to identify the combination of SNPs yielding the highest classification accuracy. Both RF and SVM models yielded high classification accuracies (0.76 and 0.79, respectively) with minor extreme misclassifications (observed tolerant predicted as susceptible, and vice-versa). Eight genomic regions associated with off-target dicamba tolerance were identified on chromosomes 6 [Linkage Group (LG) C2], 8 (LG A2), 9 (LG K), 10 (LG O), and 19 (LG L). Although the genetic architecture of tolerance is complex, high classification accuracies were obtained when including the major effect SNP identified on chromosome 6 as the sole predictor. In addition, candidate genes with annotated functions associated with phases II (conjugation of hydroxylated herbicides to endogenous sugar molecules) and III (transportation of herbicide conjugates into the vacuole) of herbicide detoxification in plants were co-localized with significant markers within each genomic region. Genomic prediction models, as reported in this study, can greatly facilitate the identification of genotypes with superior tolerance to off-target dicamba.
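The forward stepwise selection loop described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: a leave-one-out nearest-centroid classifier stands in for the RF and SVM models, genotypes are coded 0/1/2, the toy data are invented, and the names `loo_accuracy` and `forward_stepwise` are hypothetical.

```python
def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of a nearest-centroid classifier on the SNP columns in cols.

    Stand-in scorer only; the study used Random Forest and SVM models instead.
    """
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue  # hold out sample i
            vec = sums.setdefault(label, [0.0] * len(cols))
            for k, c in enumerate(cols):
                vec[k] += row[c]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # Squared distance from held-out sample i to this class centroid.
            return sum((X[i][c] - sums[label][k] / counts[label]) ** 2
                       for k, c in enumerate(cols))

        pred = min(sorted(sums), key=dist)
        correct += pred == y[i]
    return correct / len(X)


def forward_stepwise(X, y, score=loo_accuracy):
    """Greedily add the SNP that most improves the score; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        # Highest score wins; ties go to the lowest column index.
        cand_score, neg_idx = max((score(X, y, selected + [c]), -c) for c in remaining)
        if cand_score <= best:
            break
        selected.append(-neg_idx)
        remaining.remove(-neg_idx)
        best = cand_score
    return selected, best


# Invented toy genotypes (rows = lines, columns = SNPs); column 1 separates the classes.
X = [[0, 2, 1], [1, 2, 0], [2, 2, 1], [0, 2, 2],
     [1, 0, 1], [2, 0, 0], [0, 0, 2], [1, 0, 1]]
y = ["tolerant"] * 4 + ["susceptible"] * 4

selected_snps, accuracy = forward_stepwise(X, y)
```

Passing a different `score` function is the intended way to swap in another classifier or a different cross-validation scheme.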
Affiliation(s)
- Caio Canella Vieira
- Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
- Jing Zhou
- Biological Systems Engineering, University of Wisconsin-Madison, Madison, WI, United States
- Diego Jarquin
- Agronomy Department, University of Florida, Gainesville, FL, United States
- Jianfeng Zhou
- Division of Plant Science and Technology, University of Missouri, Columbia, MO, United States
- Brian Diers
- Department of Crop Sciences, University of Illinois, Urbana, IL, United States
- Dean E. Riechers
- Department of Crop Sciences, University of Illinois, Urbana, IL, United States
- Henry T. Nguyen
- Division of Plant Science and Technology, University of Missouri, Columbia, MO, United States
- Grover Shannon
- Division of Plant Science and Technology, University of Missouri, Columbia, MO, United States
|
12
|
Shea Z, Singer WM, Rosso L, Song Q, Zhang B. Determining Genetic Markers and Seed Compositions Related to High Test Weight in Glycine max. PLANTS (BASEL, SWITZERLAND) 2023; 12:2997. [PMID: 37631207 PMCID: PMC10457734 DOI: 10.3390/plants12162997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 08/08/2023] [Accepted: 08/14/2023] [Indexed: 08/27/2023]
Abstract
Test weight, one of the primary indicators of soybean seed quality, is measured as the mass of soybean seeds in kilograms that fits into one hectoliter. The price that growers receive for their soybeans depends on test weight. Over the past 50 years, growers have observed a decreasing trend in test weight. Therefore, it is imperative to better understand the relationship between soybean test weight and other traits so that breeders can select parental lines with high test weights in breeding programs and ensure growers' profitability. The objectives of the study were to identify genetic markers associated with high test weight in soybean and to determine the correlation between high test weight and five important seed composition traits (protein, oil, sucrose, raffinose, and stachyose content). Maturity group IV and V germplasms from the USDA soybean germplasm collection were grown in Blacksburg and Warsaw, Virginia, from 2019 to 2021 and were measured for all of the above traits. Results show that test weight values ranged from 62 to 77 kg/hL over the three years. Multiple single-nucleotide polymorphisms (SNPs) significantly associated with high test weight were found on chromosome (Chr.) 15, along with a couple on chromosome 14, and 11 candidate genes were found near these SNPs. Test weight was significantly negatively correlated with oil content, inconsistently correlated with protein content in all environments, and negatively but not significantly correlated with all three sugars, except for raffinose in Blacksburg in 2019. We concluded that the genes that underlie test weight might be on chromosome 15, and that the validated associated SNPs might be used to assist breeding selection for test weight. Breeders should pay special attention to test weight when selecting for high oil content in soybean because of their negative correlation.
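As a small illustration of the kind of trait correlation reported above, the sketch below computes a Pearson correlation coefficient between test weight and oil content. It assumes Pearson's r is the statistic of interest (the abstract does not name the method), the numbers are invented and are not data from the study, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length, at least 2 values each")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence has zero variance")
    return sxy / sqrt(sxx * syy)


# Invented values for illustration only (NOT measurements from the study).
test_weight_kg_per_hl = [62.0, 66.0, 70.0, 74.0, 77.0]
oil_percent = [22.0, 21.0, 20.5, 19.0, 18.5]

r = pearson_r(test_weight_kg_per_hl, oil_percent)  # negative for these toy values
```

A significance test on `r` would be a separate step and is not shown here.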
Affiliation(s)
- Zachary Shea
- School of Plant & Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- William M. Singer
- School of Plant & Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- Luciana Rosso
- School of Plant & Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- Qijian Song
- USDA-ARS, Beltsville Agricultural Research Center, Beltsville, MD 20705, USA
- Bo Zhang
- School of Plant & Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
|
13
|
Wang Z, Yu D, Morota G, Dhakal K, Singer W, Lord N, Huang H, Chen P, Mozzoni L, Li S, Zhang B. Genome-wide association analysis of sucrose and alanine contents in edamame beans. FRONTIERS IN PLANT SCIENCE 2023; 13:1086007. [PMID: 36816489 PMCID: PMC9935843 DOI: 10.3389/fpls.2022.1086007] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/31/2022] [Accepted: 12/29/2022] [Indexed: 06/18/2023]
Abstract
The sucrose and alanine (Ala) contents of edamame beans significantly affect the sweetness of edamame-derived products, an important attribute for consumer acceptance. Unlike grain-type soybeans, edamame beans are harvested as fresh beans at the R6 to R7 growth stages, when the beans fill 80-90% of the pod capacity. The genetic basis of sucrose and Ala contents in fresh edamame beans may differ from that in dry seeds. To date, there is no report on the genetic basis of sucrose and Ala contents in edamame beans. In this study, a genome-wide association study was conducted to identify single nucleotide polymorphisms (SNPs) related to sucrose and Ala levels in edamame beans using an association mapping panel of 189 edamame accessions genotyped with a SoySNP50K BeadChip. A total of 43 and 25 SNPs were associated with sucrose content and Ala content in the edamame beans, respectively. Four genes (Glyma.10g270800, Glyma.08g137500, Glyma.10g268500, and Glyma.18g193600) with known effects on sucrose biosynthesis and 37 novel sucrose-related genes were characterized. Three genes (Gm17g070500, Glyma.14g201100 and Glyma.18g269600) with likely relevant effects in regulating Ala content and 22 novel Ala-related genes were identified. In addition, by summarizing the phenotypic data of edamame beans from three locations in two years, three PI accessions (PI 532469, PI 243551, and PI 407748) were selected as the high-sucrose and high-Ala parental lines for the prospective breeding of sweet edamame varieties. Thus, the beneficial alleles, candidate genes, and selected PI accessions identified in this study will be fundamental to developing edamame varieties with improved consumer acceptance and, eventually, to promoting edamame production as a specialty crop in the United States.
Affiliation(s)
- Zhibo Wang
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Dajun Yu
- Department of Food Science and Technology, Virginia Tech, Blacksburg, VA, United States
- Gota Morota
- School of Animal Sciences, Virginia Tech, Blacksburg, VA, United States
- Kshitiz Dhakal
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- William Singer
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Nilanka Lord
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Haibo Huang
- Department of Food Science and Technology, Virginia Tech, Blacksburg, VA, United States
- Pengyin Chen
- Fisher Delta Research Center, University of Missouri, Portageville, MO, United States
- Leandro Mozzoni
- Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States
- Song Li
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
- Bo Zhang
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, United States
|