1
Robertson E, Grinton B, Oliver K, Fearnley L, Hildebrand M, Sadleir L, Scheffer I, Berkovic S, Bennett M, Bahlo M. Identifying individuals with rare disease variants by inferring shared ancestral haplotypes from SNP array data. NAR Genom Bioinform 2025; 7:lqaf033. [PMID: 40191585 PMCID: PMC11970371 DOI: 10.1093/nargab/lqaf033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2024] [Revised: 02/23/2025] [Accepted: 03/15/2025] [Indexed: 04/09/2025]
Abstract
We describe FoundHaplo, an identity-by-descent algorithm that can be used to screen for untyped disease-causing variants using single nucleotide polymorphism (SNP) array data. FoundHaplo leverages knowledge of shared disease haplotypes for inherited variants to identify those who share the disease haplotype and are, therefore, likely to carry the rare [minor allele frequency (MAF) ≤ 0.01%] variant. We performed a simulation study to evaluate the performance of FoundHaplo across 33 disease-harbouring loci. FoundHaplo was used to infer the presence of two rare (MAF ≤ 0.01%) pathogenic variants, SCN1B c.363C>G (p.Cys121Trp) and WWOX c.49G>A (p.E17K), which can cause mild dominant and severe recessive epilepsy, respectively, in the Epi25 cohort and the UK Biobank. FoundHaplo demonstrated substantially better sensitivity at inferring the presence of these rare variants than existing genome-wide imputation. FoundHaplo is a valuable screening tool for searching for disease-causing variants with known founder effects using only SNP genotyping data. It is also applicable to nonhuman applications and nondisease-causing traits, including rare-variant drivers of quantitative traits. The FoundHaplo algorithm is available at https://github.com/bahlolab/FoundHaplo (DOI:10.5281/zenodo.8058286).
Affiliation(s)
- Erandee Robertson
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3052, Australia
- Bronwyn E Grinton
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3052, Australia
  - Epilepsy Research Centre, Department of Medicine, Austin Health, University of Melbourne, Heidelberg, Victoria 3084, Australia
- Karen L Oliver
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3052, Australia
  - Epilepsy Research Centre, Department of Medicine, Austin Health, University of Melbourne, Heidelberg, Victoria 3084, Australia
- Liam G Fearnley
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3052, Australia
- Michael S Hildebrand
  - Epilepsy Research Centre, Department of Medicine, Austin Health, University of Melbourne, Heidelberg, Victoria 3084, Australia
  - Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, Victoria 3052, Australia
- Lynette G Sadleir
  - Department of Paediatrics and Child Health, University of Otago, Wellington South 6242, New Zealand
- Ingrid E Scheffer
  - Epilepsy Research Centre, Department of Medicine, Austin Health, University of Melbourne, Heidelberg, Victoria 3084, Australia
  - Murdoch Children’s Research Institute, Royal Children’s Hospital, Parkville, Victoria 3052, Australia
  - Department of Paediatrics, The University of Melbourne, Royal Children’s Hospital, Parkville, Victoria 3052, Australia
  - Florey Institute of Neuroscience and Mental Health, Heidelberg, Victoria 3084, Australia
- Samuel F Berkovic
  - Epilepsy Research Centre, Department of Medicine, Austin Health, University of Melbourne, Heidelberg, Victoria 3084, Australia
- Mark F Bennett
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3052, Australia
  - Epilepsy Research Centre, Department of Medicine, Austin Health, University of Melbourne, Heidelberg, Victoria 3084, Australia
- Melanie Bahlo
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3052, Australia
2
Fraslin C, Robledo D, Kause A, Houston RD. Potential of low-density genotype imputation for cost-efficient genomic selection for resistance to Flavobacterium columnare in rainbow trout (Oncorhynchus mykiss). Genet Sel Evol 2023; 55:59. [PMID: 37580697 PMCID: PMC10424455 DOI: 10.1186/s12711-023-00832-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Accepted: 07/26/2023] [Indexed: 08/16/2023]
Abstract
BACKGROUND: Flavobacterium columnare is the causative agent of columnaris disease, a major emerging disease that affects rainbow trout aquaculture. Selective breeding using genomic selection has the potential to achieve cumulative improvement of host resistance. However, genomic selection is expensive, partly because of the cost of genotyping large numbers of animals using high-density single nucleotide polymorphism (SNP) arrays. The objective of this study was to assess the efficiency of genomic selection for resistance to F. columnare using in silico low-density (LD) panels combined with imputation. After a natural outbreak of columnaris disease, 2874 challenged fish and 469 fish from the parental generation (n = 81 parents) were genotyped with 27,907 SNPs. The efficiency of genomic prediction using LD panels was assessed for 10 panels of different densities, which were created in silico using two sampling methods, random and equally spaced. All LD panels were also imputed to the full 28K HD panel using the parental generation as the reference population, and genomic predictions were re-evaluated. The potential of prioritizing SNPs that are associated with resistance to F. columnare was also tested for the six lower-density panels.

RESULTS: The accuracies of both imputation and genomic predictions were similar with random and equally spaced sampling of SNPs. Using LD panels of at least 3000 SNPs, or lower-density panels (as low as 300 SNPs) combined with imputation, resulted in accuracies that were comparable to those of the 28K HD panel and were 11% higher than the pedigree-based predictions.

CONCLUSIONS: Compared to using the commercial HD panel, LD panels combined with imputation may provide a more affordable approach to genomic prediction of breeding values, which supports a more widespread adoption of genomic selection in aquaculture breeding programmes.
Affiliation(s)
- Clémence Fraslin
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
- Diego Robledo
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
- Antti Kause
  - Natural Resources Institute Finland (Luke), Myllytie 1, 31600, Jokioinen, Finland
- Ross D Houston
  - Benchmark Genetics, Edinburgh Technopole, 1 Pioneer Building, Penicuik, EH26 0GB, UK
3
Sinha D, Maurya AK, Abdi G, Majeed M, Agarwal R, Mukherjee R, Ganguly S, Aziz R, Bhatia M, Majgaonkar A, Seal S, Das M, Banerjee S, Chowdhury S, Adeyemi SB, Chen JT. Integrated Genomic Selection for Accelerating Breeding Programs of Climate-Smart Cereals. Genes (Basel) 2023; 14:1484. [PMID: 37510388 PMCID: PMC10380062 DOI: 10.3390/genes14071484] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/27/2023] [Revised: 07/14/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023]
Abstract
A rapidly rising population and climate change are two critical issues that require immediate action to achieve sustainable development goals. The rising population is increasing the demand for food, thereby pushing for an acceleration in agricultural production. Furthermore, increased anthropogenic activities have resulted in environmental pollution such as water pollution and soil degradation, as well as alterations in the composition and concentration of environmental gases. These changes are not only driving biodiversity loss but also affecting the physio-biochemical processes of crop plants, resulting in a stress-induced decline in crop yield. To overcome such problems and ensure the supply of food material, consistent efforts are being made to develop strategies and techniques to increase crop yield and to enhance tolerance toward climate-induced stress. Plant breeding evolved after domestication and initially remained dependent on phenotype-based selection for crop improvement. It has since grown through cytological and biochemical methods, and the newer contemporary methods are based on DNA-marker-based strategies that help in the selection of agronomically useful traits. These are now supported by high-end molecular biology tools like PCR, high-throughput genotyping and phenotyping, data from crop morpho-physiology, statistical tools, bioinformatics, and machine learning. After establishing its worth in animal breeding, genomic selection (GS), an improved variant of marker-assisted selection (MAS), has made its way into crop-breeding programs as a powerful selection tool. To develop novel breeding programs as well as innovative marker-based models for genetic evaluation, GS makes use of molecular genetic markers. GS can improve complex traits like yield as well as shorten the breeding period, making it advantageous over pedigree breeding and MAS. It reduces the time and resources that are required for plant breeding while allowing for an increased genetic gain of complex attributes. It has been taken to new heights by integrating innovative and advanced technologies such as speed breeding, machine learning, and environmental/weather data to further harness the GS potential, an approach known as integrated genomic selection (IGS). This review highlights the IGS strategies, procedures, integrated approaches, and associated emerging issues, with a special emphasis on cereal crops. In this domain, efforts have been made to highlight the potential of this cutting-edge innovation to develop climate-smart crops that can endure abiotic stresses, with the aim of keeping production and quality on par with the global food demand.
Affiliation(s)
- Dwaipayan Sinha
  - Department of Botany, Government General Degree College, Mohanpur 721436, India
- Arun Kumar Maurya
  - Department of Botany, Multanimal Modi College, Modinagar, Ghaziabad 201204, India
- Gholamreza Abdi
  - Department of Biotechnology, Persian Gulf Research Institute, Persian Gulf University, Bushehr 75169, Iran
- Muhammad Majeed
  - Department of Botany, University of Gujrat, Punjab 50700, Pakistan
- Rachna Agarwal
  - Applied Genomics Section, Bhabha Atomic Research Centre, Mumbai 400085, India
- Rashmi Mukherjee
  - Research Center for Natural and Applied Sciences, Department of Botany (UG & PG), Raja Narendralal Khan Women's College, Gope Palace, Midnapur 721102, India
- Sharmistha Ganguly
  - Department of Dravyaguna, Institute of Post Graduate Ayurvedic Education and Research, Kolkata 700009, India
- Robina Aziz
  - Department of Botany, Government College Women University, Sialkot 51310, Pakistan
- Manika Bhatia
  - TERI School of Advanced Studies, New Delhi 110070, India
- Aqsa Majgaonkar
  - Department of Botany, St. Xavier's College (Autonomous), Mumbai 400001, India
- Sanchita Seal
  - Department of Botany, Polba Mahavidyalaya, Polba 712148, India
- Moumita Das
  - V. Sivaram Research Foundation, Bangalore 560040, India
- Swastika Banerjee
  - Department of Botany, Kairali College of +3 Science, Champua, Keonjhar 758041, India
- Shahana Chowdhury
  - Department of Biotechnology, Faculty of Engineering Sciences, German University Bangladesh, TNT Road, Telipara, Chandona Chowrasta, Gazipur 1702, Bangladesh
- Sherif Babatunde Adeyemi
  - Ethnobotany/Phytomedicine Laboratory, Department of Plant Biology, Faculty of Life Sciences, University of Ilorin, Ilorin P.M.B 1515, Nigeria
- Jen-Tsung Chen
  - Department of Life Sciences, National University of Kaohsiung, Kaohsiung 811, Taiwan
4
Kriaridou C, Tsairidou S, Fraslin C, Gorjanc G, Looseley ME, Johnston IA, Houston RD, Robledo D. Evaluation of low-density SNP panels and imputation for cost-effective genomic selection in four aquaculture species. Front Genet 2023; 14:1194266. [PMID: 37252666 PMCID: PMC10213886 DOI: 10.3389/fgene.2023.1194266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Accepted: 04/26/2023] [Indexed: 05/31/2023]
Abstract
Genomic selection can accelerate genetic progress in aquaculture breeding programmes, particularly for traits measured on siblings of selection candidates. However, it is not widely implemented in most aquaculture species, and remains expensive due to high genotyping costs. Genotype imputation is a promising strategy that can reduce genotyping costs and facilitate the broader uptake of genomic selection in aquaculture breeding programmes. Genotype imputation can predict ungenotyped SNPs in populations genotyped at low density (LD), using a reference population genotyped at high density (HD). In this study, we used datasets of four aquaculture species (Atlantic salmon, turbot, common carp and Pacific oyster), phenotyped for different traits, to investigate the efficacy of genotype imputation for cost-effective genomic selection. The four datasets had been genotyped at HD, and eight LD panels (300-6,000 SNPs) were generated in silico. SNPs were selected to be: i) evenly distributed according to physical position, ii) selected to minimise the linkage disequilibrium between adjacent SNPs, or iii) randomly selected. Imputation was performed with three different software packages (AlphaImpute2, FImpute v.3 and findhap v.4). The results revealed that FImpute v.3 was faster and achieved higher imputation accuracies. Imputation accuracy increased with increasing panel density for both SNP selection methods, reaching correlations greater than 0.95 in the three fish species and 0.80 in Pacific oyster. In terms of genomic prediction accuracy, the LD and the imputed panels performed similarly, reaching values very close to the HD panels, except in the Pacific oyster dataset, where the LD panel performed better than the imputed panel. In the fish species, when LD panels were used for genomic prediction without imputation, selection of markers based on either physical or genetic distance (instead of randomly) resulted in a high prediction accuracy, whereas imputation achieved near maximal prediction accuracy independently of the LD panel, showing higher reliability. Our results suggest that, in fish species, well-selected LD panels may achieve near maximal genomic selection prediction accuracy, and that the addition of imputation will result in maximal accuracy independently of the LD panel. These strategies represent effective and affordable methods to incorporate genomic selection into most aquaculture settings.
Affiliation(s)
- Christina Kriaridou
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Smaragda Tsairidou
  - Global Academy of Agriculture and Food Systems, University of Edinburgh, Edinburgh, United Kingdom
- Clémence Fraslin
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Ross D. Houston
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
  - Benchmark Genetics, Penicuik, United Kingdom
- Diego Robledo
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
5
Reyes VP, Kitony JK, Nishiuchi S, Makihara D, Doi K. Utilization of Genotyping-by-Sequencing (GBS) for Rice Pre-Breeding and Improvement: A Review. Life (Basel) 2022; 12:1752. [PMID: 36362909 PMCID: PMC9694628 DOI: 10.3390/life12111752] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/13/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 09/29/2023]
Abstract
Molecular markers play a crucial role in the improvement of rice. To benefit from these markers, genotyping is carried out to identify the differences at a specific position in the genome of individuals. The advances in sequencing technologies have led to the development of different genotyping techniques such as genotyping-by-sequencing. Unlike PCR-fragment-based genotyping, genotyping-by-sequencing has enabled the parallel sequencing and genotyping of hundreds of samples in a single run, making it more cost-effective. Currently, GBS is being used in several pre-breeding programs of rice to identify beneficial genes and QTL from different rice genetic resources. In this review, we present the current advances in the utilization of genotyping-by-sequencing for the development of rice pre-breeding materials and the improvement of existing rice cultivars. The challenges and perspectives of using this approach are also highlighted.
Affiliation(s)
- Vincent Pamugas Reyes
  - Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya 464-8601, Japan
- Shunsaku Nishiuchi
  - Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya 464-8601, Japan
- Daigo Makihara
  - International Center for Research and Education in Agriculture, Nagoya University, Nagoya 464-8601, Japan
- Kazuyuki Doi
  - Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya 464-8601, Japan
6
Bernard M, Dehaullon A, Gao G, Paul K, Lagarde H, Charles M, Prchal M, Danon J, Jaffrelo L, Poncet C, Patrice P, Haffray P, Quillet E, Dupont-Nivet M, Palti Y, Lallias D, Phocas F. Development of a High-Density 665 K SNP Array for Rainbow Trout Genome-Wide Genotyping. Front Genet 2022; 13:941340. [PMID: 35923696 PMCID: PMC9340366 DOI: 10.3389/fgene.2022.941340] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/11/2022] [Accepted: 06/24/2022] [Indexed: 12/02/2022]
Abstract
Single nucleotide polymorphism (SNP) arrays, also named "SNP chips", enable very large numbers of individuals to be genotyped at a targeted set of thousands of genome-wide identified markers. We used preexisting variant datasets from USDA, a French commercial line and 30X-coverage whole genome sequencing of INRAE isogenic lines to develop an Affymetrix 665 K SNP array (HD chip) for rainbow trout. In total, we identified 32,372,492 SNPs that were polymorphic in the USDA or INRAE databases. A subset of the identified SNPs was selected for inclusion on the chip, prioritizing SNPs whose flanking sequence uniquely aligned to the Swanson reference genome, with a homogeneous distribution over the genome and the highest minor allele frequency in both USDA and French databases. Of the 664,531 SNPs which passed the Affymetrix quality filters and were manufactured on the HD chip, 65.3% and 60.9% passed filtering metrics and were polymorphic in two other distinct French commercial populations in which, respectively, 288 and 175 sampled fish were genotyped. Only 576,118 SNPs mapped uniquely on both Swanson and Arlee reference genomes, and 12,071 SNPs did not map at all on the Arlee reference genome. Among those 576,118 SNPs, 38,948 SNPs were kept from the commercially available medium-density 57 K SNP chip. We demonstrate the utility of the HD chip by describing the high rates of linkage disequilibrium at 2–10 kb in the rainbow trout genome in comparison to the linkage disequilibrium observed at 50–100 kb, which are the usual distances between markers of the medium-density chip.
Affiliation(s)
- Maria Bernard
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
  - INRAE, SIGENAE, Jouy-en-Josas, France
- Audrey Dehaullon
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Guangtu Gao
  - USDA, REE, ARS, NEA, NCCCWA, Kearneysville, WV, United States
- Katy Paul
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Henri Lagarde
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Mathieu Charles
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
  - INRAE, SIGENAE, Jouy-en-Josas, France
- Martin Prchal
  - South Bohemian Research Center of Aquaculture and Biodiversity of Hydrocenoses, Faculty of Fisheries and Protection of Waters, University of South Bohemia, Vodňany, Czechia
- Jeanne Danon
  - INRAE-UCA, Plateforme Gentyane, UMR GDEC, Clermont-Ferrand, France
- Lydia Jaffrelo
  - INRAE-UCA, Plateforme Gentyane, UMR GDEC, Clermont-Ferrand, France
- Charles Poncet
  - INRAE-UCA, Plateforme Gentyane, UMR GDEC, Clermont-Ferrand, France
- Edwige Quillet
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Yniv Palti
  - USDA, REE, ARS, NEA, NCCCWA, Kearneysville, WV, United States
- Delphine Lallias
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
- Florence Phocas
  - INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, France
  - *Correspondence: Florence Phocas