1
Grunin M, Triffon D, Beykin G, Rahmani E, Schweiger R, Tiosano L, Khateb S, Hagbi-Levi S, Rinsky B, Munitz R, Winkler TW, Heid IM, Halperin E, Carmi S, Chowers I. Genome wide association study and genomic risk prediction of age related macular degeneration in Israel. Sci Rep 2024; 14:13034. [PMID: 38844476 PMCID: PMC11156861 DOI: 10.1038/s41598-024-63065-0] [Received: 12/22/2023] [Accepted: 05/24/2024] [Indexed: 06/09/2024]
Abstract
The risk of developing age-related macular degeneration (AMD) is influenced by genetic background. In 2016, the International AMD Genomics Consortium (IAMDGC) identified 52 risk variants in 34 loci, and a polygenic risk score (PRS) from these variants was associated with AMD. The Israeli population has a unique genetic composition: Ashkenazi Jewish (AJ), Jewish non-Ashkenazi, and Arab sub-populations. We aimed to perform a genome-wide association study (GWAS) for AMD in Israel, and to evaluate PRSs for AMD. Our discovery set comprised 403 AMD patients and 256 controls recruited at Hadassah Medical Center. We genotyped individuals via a custom exome chip. We imputed non-typed variants using cosmopolitan and AJ reference panels. We recruited an additional 155 cases and 69 controls for validation. To evaluate the predictive power of PRSs for AMD, we used IAMDGC summary statistics excluding our study and developed PRSs via clumping/thresholding or LDpred2. In our discovery set, 31/34 loci reported by IAMDGC were AMD-associated (P < 0.05). Of those, all effects were directionally consistent with IAMDGC and 11 loci had a P-value below the Bonferroni-corrected threshold (0.05/34 = 0.0015). At a 5 × 10⁻⁵ threshold, we discovered four suggestive associations in FAM189A1, IGDCC4, C7orf50, and CNTNAP4. Only the FAM189A1 variant was AMD-associated in the replication cohort after Bonferroni correction. A prediction model including an LDpred2-based PRS plus covariates had an AUC of 0.82 (95% CI 0.79-0.85) and performed better than a covariates-only model (P = 5.1 × 10⁻⁹). Therefore, previously reported AMD-associated loci were nominally associated with AMD in Israel. A PRS developed based on a large international study is predictive in Israeli populations.
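As a quick check of the Bonferroni arithmetic quoted in this abstract (0.05 divided across 34 loci), the calculation can be sketched as follows. The helper name `bonferroni_threshold` is hypothetical and is not taken from the study's code; only the numbers 0.05 and 34 come from the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Values taken from the abstract: family-wise alpha of 0.05 over 34 loci.
threshold = bonferroni_threshold(0.05, 34)
print(f"{threshold:.4f}")  # prints 0.0015
```

A locus then passes the corrected threshold only if its P-value is below this per-test value.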
Affiliation(s)
- Michelle Grunin
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, POB 12271, 9112102, Jerusalem, Israel
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
- Daria Triffon
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, POB 12271, 9112102, Jerusalem, Israel
- Gala Beykin
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
- Elior Rahmani
  - Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Regev Schweiger
  - Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv, Israel
  - Department of Genetics, University of Cambridge, CB21TN, Cambridge, UK
- Liran Tiosano
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
- Samer Khateb
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
- Shira Hagbi-Levi
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
- Batya Rinsky
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
- Refael Munitz
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
- Thomas W Winkler
  - Department of Genetic Epidemiology, University of Regensburg, Regensburg, Germany
- Iris M Heid
  - Department of Genetic Epidemiology, University of Regensburg, Regensburg, Germany
- Eran Halperin
  - Molecular Microbiology and Biotechnology, Tel Aviv University, Tel Aviv, Israel
  - Department of Anesthesiology, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Shai Carmi
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, POB 12271, 9112102, Jerusalem, Israel
- Itay Chowers
  - Department of Ophthalmology, Hadassah-Hebrew University Medical Center, POB 12000, 91120, Jerusalem, Israel
2
Cahoon JL, Rui X, Tang E, Simons C, Langie J, Chen M, Lo YC, Chiang CWK. Imputation accuracy across global human populations. Am J Hum Genet 2024; 111:979-989. [PMID: 38604166 PMCID: PMC11080279 DOI: 10.1016/j.ajhg.2024.03.011] [Received: 10/31/2023] [Revised: 03/14/2024] [Accepted: 03/15/2024] [Indexed: 04/13/2024]
Abstract
Genotype imputation is now fundamental for genome-wide association studies but lacks fairness due to the underrepresentation of references from non-European ancestries. The state-of-the-art imputation reference panel released by the Trans-Omics for Precision Medicine (TOPMed) initiative improved the imputation of admixed African-ancestry and Hispanic/Latino samples, but imputation for populations primarily residing outside of North America may still fall short in performance due to persisting underrepresentation. To illustrate this point, we imputed the genotypes of over 43,000 individuals across 123 populations around the world and identified numerous populations where imputation accuracy paled in comparison to that of European-ancestry populations. For instance, the mean imputation r-squared (Rsq) for variants with minor allele frequencies between 1% and 5% in Saudi Arabians (n = 1,061), Vietnamese (n = 1,264), Thai (n = 2,435), and Papua New Guineans (n = 776) were 0.79, 0.78, 0.76, and 0.62, respectively, compared to 0.90-0.93 for comparable European populations matched in sample size and SNP array content. Outside of Africa and Latin America, Rsq appeared to decrease as genetic distances to European-ancestry reference increased, as predicted. Using sequencing data as ground truth, we also showed that Rsq may over-estimate imputation accuracy for non-European populations more than European populations, suggesting further disparity in accuracy between populations. Using 1,496 sequenced individuals from Taiwan Biobank as a second reference panel to TOPMed, we also assessed a strategy to improve imputation for non-European populations with meta-imputation, but this design did not improve accuracy across frequency spectra. Taken together, our analyses suggest that we must ultimately strive to increase diversity and size to promote equity within genetics research.
Affiliation(s)
- Jordan L Cahoon
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
  - Department of Computer Science, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
- Xinyue Rui
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Echo Tang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
- Christopher Simons
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
- Jalen Langie
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Minhui Chen
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Ying-Chu Lo
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Charleston W K Chiang
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
  - Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
3
Levi H, Elkon R, Shamir R. The predictive capacity of polygenic risk scores for disease risk is only moderately influenced by imputation panels tailored to the target population. Bioinformatics 2024; 40:btae036. [PMID: 38265251 PMCID: PMC10868313 DOI: 10.1093/bioinformatics/btae036] [Received: 08/31/2023] [Revised: 12/20/2023] [Accepted: 01/20/2024] [Indexed: 01/25/2024]
Abstract
MOTIVATION: Polygenic risk scores (PRSs) predict individuals' genetic risk of developing complex diseases. They summarize the effect of many variants discovered in genome-wide association studies (GWASs). However, to date, large GWASs exist primarily for the European population and the quality of PRS prediction declines when applied to other ethnicities. Genetic profiling of individuals in the discovery set (on which the GWAS was performed) and target set (on which the PRS is applied) is typically done by SNP arrays that genotype a fraction of common SNPs. Therefore, a key step in GWAS analysis and PRS calculation is imputing untyped SNPs using a panel of fully sequenced individuals. The imputation results depend on the ethnic composition of the imputation panel. Imputing genotypes with a panel of individuals of the same ethnicity as the genotyped individuals typically improves imputation accuracy. However, there has been no systematic investigation into the influence of the ethnic composition of imputation panels on the accuracy of PRS predictions when applied to ethnic groups that differ from the population used in the GWAS.
RESULTS: We estimated the effect of imputation of the target set on prediction accuracy of PRS when the discovery and the target sets come from different ethnic groups. We analyzed binary phenotypes on ethnically distinct sets from the UK Biobank and other resources. We generated ethnically homogeneous panels, imputed the target sets, and generated PRSs. Then, we assessed the prediction accuracy obtained from each imputation panel. Our analysis indicates that using an imputation panel matched to the ethnicity of the target population yields only a marginal improvement and only under specific conditions.
AVAILABILITY AND IMPLEMENTATION: The source code used for executing the analyses in this paper is available at https://github.com/Shamir-Lab/PRS-imputation-panels.
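The statement that a PRS summarizes the effect of many variants can be illustrated with the standard additive form: a sum of per-variant effect sizes weighted by each individual's allele dosage. The sketch below is a generic illustration, not the pipeline from the linked repository; the function name and the numeric values are hypothetical.

```python
from typing import Sequence


def polygenic_risk_score(effect_sizes: Sequence[float],
                         dosages: Sequence[float]) -> float:
    """Additive PRS: sum over variants of (GWAS effect size x allele dosage)."""
    if len(effect_sizes) != len(dosages):
        raise ValueError("effect_sizes and dosages must have the same length")
    return sum(beta * dose for beta, dose in zip(effect_sizes, dosages))


# Hypothetical example: three variants, dosages are counts (0, 1 or 2) of the
# effect allele. Imputed dosages may be fractional values between 0 and 2.
betas = [0.2, -0.1, 0.05]
individual_dosages = [2, 1, 0]
score = polygenic_risk_score(betas, individual_dosages)
print(score)
```

Because imputed dosages feed directly into this sum, the choice of imputation panel for the target set can in principle shift the score, which is the effect the abstract sets out to quantify.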
Affiliation(s)
- Hagai Levi
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Ran Elkon
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
  - Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 69978, Israel
- Ron Shamir
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
4
Grunin M, Triffon D, Beykin G, Rahmani E, Schweiger R, Tiosano L, Khateb S, Hagbi-Levi S, Rinsky B, Munitz R, Winkler TW, Heid IM, Halperin E, Carmi S, Chowers I. Genome-wide association study and genomic risk prediction of age-related macular degeneration in Israel. medRxiv 2023:2023.09.06.23295126. [PMID: 37732190 PMCID: PMC10508791 DOI: 10.1101/2023.09.06.23295126] [Indexed: 09/22/2023]
Abstract
Purpose: The risk of developing age-related macular degeneration (AMD) is influenced by genetic background. In 2016, the International AMD Genomics Consortium (IAMDGC) identified 52 risk variants in 34 loci, and a polygenic risk score (PRS) based on these variants was associated with AMD. The Israeli population has a unique genetic composition: Ashkenazi Jewish (AJ), Jewish non-Ashkenazi, and Arab sub-populations. We aimed to perform a genome-wide association study (GWAS) for AMD in Israel, and to evaluate PRSs for AMD.
Methods: For our discovery set, we recruited 403 AMD patients and 256 controls at Hadassah Medical Center. We genotyped all individuals via a custom exome chip. We imputed non-typed variants using cosmopolitan and AJ reference panels. We recruited an additional 155 cases and 69 controls for validation. To evaluate the predictive power of PRSs for AMD, we used IAMDGC summary statistics excluding our study and developed PRSs via either clumping/thresholding or LDpred2.
Results: In our discovery set, 31/34 loci previously reported by the IAMDGC were AMD-associated with P < 0.05. Of those, all effects were directionally consistent with the IAMDGC and 11 loci had a P-value below the Bonferroni-corrected threshold (0.05/34 = 0.0015). At a threshold of 5 × 10⁻⁵, we discovered four suggestive associations in FAM189A1, IGDCC4, C7orf50, and CNTNAP4. However, only the FAM189A1 variant was AMD-associated in the replication cohort after Bonferroni correction. A prediction model including an LDpred2-based PRS and other covariates had an AUC of 0.82 (95% CI: 0.79-0.85) and performed better than a covariates-only model (P = 5.1 × 10⁻⁹).
Conclusions: Previously reported AMD-associated loci were nominally associated with AMD in Israel. A PRS developed based on a large international study is predictive in Israeli populations.
5
Genome-wide data from medieval German Jews show that the Ashkenazi founder event pre-dated the 14th century. Cell 2022; 185:4703-4716.e16. [PMID: 36455558 PMCID: PMC9793425 DOI: 10.1016/j.cell.2022.11.002] [Received: 05/08/2022] [Revised: 08/26/2022] [Accepted: 11/01/2022] [Indexed: 12/05/2022]
Abstract
We report genome-wide data from 33 Ashkenazi Jews (AJ), dated to the 14th century, obtained following a salvage excavation at the medieval Jewish cemetery of Erfurt, Germany. The Erfurt individuals are genetically similar to modern AJ, but they show more variability in Eastern European-related ancestry than modern AJ. A third of the Erfurt individuals carried a mitochondrial lineage common in modern AJ and eight carried pathogenic variants known to affect AJ today. These observations, together with high levels of runs of homozygosity, suggest that the Erfurt community had already experienced the major reduction in size that affected modern AJ. The Erfurt bottleneck was more severe, implying substructure in medieval AJ. Overall, our results suggest that the AJ founder event and the acquisition of the main sources of ancestry pre-dated the 14th century and highlight late medieval genetic heterogeneity no longer present in modern AJ.
6
Song X, Ru M, Steinsnyder Z, Tkachuk K, Kopp RP, Sullivan J, Gümüş ZH, Offit K, Joseph V, Klein RJ. SNPs at SMG7 Associated with Time from Biochemical Recurrence to Prostate Cancer Death. Cancer Epidemiol Biomarkers Prev 2022; 31:1466-1472. [PMID: 35511739 PMCID: PMC9250608 DOI: 10.1158/1055-9965.epi-22-0053] [Received: 01/14/2022] [Revised: 03/25/2022] [Accepted: 05/02/2022] [Indexed: 01/03/2023]
Abstract
BACKGROUND: A previous genome-wide association study identified several loci with genetic variants associated with prostate cancer survival time in two cohorts from Sweden. Whether these variants have an effect in other populations or if their effect is homogeneous across the course of disease is unknown.
METHODS: These variants were genotyped in a cohort of 1,298 patients. Samples were linked with age, PSA level, Gleason score, cancer stage at surgery, and times from surgery to biochemical recurrence to death from prostate cancer. SNPs rs2702185 and rs73055188 were tested for association with prostate cancer-specific survival time using a multivariate Cox proportional hazard model. SNP rs2702185 was further tested for association with time to biochemical recurrence and time from biochemical recurrence to death with a multi-state model.
RESULTS: SNP rs2702185 at SMG7 was associated with prostate cancer-specific survival time, specifically the time from biochemical recurrence to prostate cancer death (HR, 2.5; 95% confidence interval, 1.4-4.5; P = 0.0014). Nine variants were in linkage disequilibrium (LD) with rs2702185; one, rs10737246, was found to be most likely to be functional based on LD patterns and overlap with open chromatin. Patterns of open chromatin and correlation with gene expression suggest that this SNP may affect expression of SMG7 in T cells.
CONCLUSIONS: The SNP rs2702185 at the SMG7 locus is associated with time from biochemical recurrence to prostate cancer death, and its LD partner rs10737246 is predicted to be functional.
IMPACT: These results suggest that future association studies of prostate cancer survival should consider various intervals over the course of disease.
Affiliation(s)
- Xiaoyu Song
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, 10029 USA
  - Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029 USA
- Meng Ru
  - Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, 10029 USA
  - Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029 USA
- Zoe Steinsnyder
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065 USA
- Kaitlyn Tkachuk
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065 USA
- Ryan P. Kopp
  - Department of Urology, Oregon Health and Science University, Portland, OR, 97239 USA
- John Sullivan
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065 USA
- Zeynep H. Gümüş
  - Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029 USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA
- Kenneth Offit
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065 USA
  - Department of Medicine, Weill Cornell Medical College, New York, NY 10065, USA
- Vijai Joseph
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065 USA
  - Department of Medicine, Weill Cornell Medical College, New York, NY 10065, USA
- Robert J. Klein
  - Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, 10029 USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA
7
Lencz T, Yu J, Khan RR, Flaherty E, Carmi S, Lam M, Ben-Avraham D, Barzilai N, Bressman S, Darvasi A, Cho JH, Clark LN, Gümüş ZH, Vijai J, Klein RJ, Lipkin S, Offit K, Ostrer H, Ozelius LJ, Peter I, Malhotra AK, Maniatis T, Atzmon G, Pe'er I. Novel ultra-rare exonic variants identified in a founder population implicate cadherins in schizophrenia. Neuron 2021; 109:1465-1478.e4. [PMID: 33756103 DOI: 10.1016/j.neuron.2021.03.004] [Received: 08/25/2020] [Revised: 12/16/2020] [Accepted: 03/01/2021] [Indexed: 12/12/2022]
Abstract
The identification of rare variants associated with schizophrenia has proven challenging due to genetic heterogeneity, which is reduced in founder populations. In samples from the Ashkenazi Jewish population, we report that schizophrenia cases had a greater frequency of novel missense or loss of function (MisLoF) ultra-rare variants (URVs) compared to controls, and the MisLoF URV burden was inversely correlated with polygenic risk scores in cases. Characterizing 141 "case-only" genes (MisLoF URVs in ≥3 cases with none in controls), the cadherin gene set was associated with schizophrenia. We report a recurrent case mutation in PCDHA3 that results in the formation of cytoplasmic aggregates and failure to engage in homophilic interactions on the plasma membrane in cultured cells. Modeling purifying selection, we demonstrate that deleterious URVs are greatly overrepresented in the Ashkenazi population, yielding enhanced power for association studies. Identification of the cadherin/protocadherin family as risk genes helps specify the synaptic abnormalities central to schizophrenia.
Affiliation(s)
- Todd Lencz
  - Departments of Psychiatry and Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY 11550, USA
  - Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of Northwell Health, Glen Oaks, NY 11004, USA
  - Institute for Behavioral Science, The Feinstein Institutes for Medical Research, Manhasset, NY 11030, USA
- Jin Yu
  - Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of Northwell Health, Glen Oaks, NY 11004, USA
  - Institute for Behavioral Science, The Feinstein Institutes for Medical Research, Manhasset, NY 11030, USA
- Raiyan Rashid Khan
  - Department of Computer Science, Columbia University, New York, NY 10027, USA
- Erin Flaherty
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY 10027, USA
- Shai Carmi
  - Braun School of Public Health and Community Medicine, Faculty of Medicine, Hebrew University of Jerusalem, Ein Kerem, Jerusalem 9112102, Israel
- Max Lam
  - Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of Northwell Health, Glen Oaks, NY 11004, USA
  - Institute for Behavioral Science, The Feinstein Institutes for Medical Research, Manhasset, NY 11030, USA
- Danny Ben-Avraham
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
  - Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Nir Barzilai
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
  - Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Susan Bressman
  - Department of Neurology, Beth Israel Medical Center, New York, NY 10003, USA
- Ariel Darvasi
  - Department of Genetics, The Institute of Life Sciences, The Hebrew University of Jerusalem, Givat Ram, Jerusalem 91904, Israel
- Judy H Cho
  - Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Lorraine N Clark
  - Department of Pathology and Cell Biology, Columbia University Medical Center, New York, NY 10032, USA
  - Taub Institute for Research of Alzheimer's Disease and the Aging Brain, Columbia University Medical Center, New York, NY 10032, USA
- Zeynep H Gümüş
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Joseph Vijai
  - Clinical Genetics Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Robert J Klein
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Taub Institute for Research of Alzheimer's Disease and the Aging Brain, Columbia University Medical Center, New York, NY 10032, USA
- Steven Lipkin
  - Departments of Medicine, Genetic Medicine and Surgery, Weill Cornell Medical College, New York, NY 10065, USA
- Kenneth Offit
  - Clinical Genetics Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
  - Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Harry Ostrer
  - Departments of Pathology and Pediatrics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Laurie J Ozelius
  - Department of Neurology, Massachusetts General Hospital, Charlestown, MA 02129, USA
- Inga Peter
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Anil K Malhotra
  - Departments of Psychiatry and Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY 11550, USA
  - Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of Northwell Health, Glen Oaks, NY 11004, USA
  - Institute for Behavioral Science, The Feinstein Institutes for Medical Research, Manhasset, NY 11030, USA
- Tom Maniatis
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Mortimer B. Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, NY 10027, USA
  - New York Genome Center, New York, NY 10013, USA
- Gil Atzmon
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
  - Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
  - Department of Human Biology, Haifa University, Haifa, Israel
- Itsik Pe'er
  - Department of Computer Science, Columbia University, New York, NY 10027, USA
  - Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA
8
Jeon Y, Jeon S, Blazyte A, Kim YJ, Lee JJ, Bhak Y, Cho YS, Park Y, Noh EK, Manica A, Edwards JS, Bolser D, Kim S, Lee Y, Yoon C, Lee S, Kim BC, Park NH, Bhak J. Welfare Genome Project: A Participatory Korean Personal Genome Project With Free Health Check-Up and Genetic Report Followed by Counseling. Front Genet 2021; 12:633731. [PMID: 33633791 PMCID: PMC7900555 DOI: 10.3389/fgene.2021.633731] [Received: 11/27/2020] [Accepted: 01/20/2021] [Indexed: 12/27/2022]
Abstract
The Welfare Genome Project (WGP) provided 1,000 healthy Korean volunteers with detailed genetic and health reports to test the social perception of integrating personal genetic and healthcare data at a large scale. WGP was launched in 2016 in the Ulsan Metropolitan City as the first large-scale genome project with public participation in Korea. The project produced a set of genetic materials, genotype information, clinical data, and lifestyle survey answers from participants aged 20–96. As compensation, the participants received a free general health check-up on 110 clinical traits, accompanied by a genetic report of their genotypes followed by genetic counseling. In a follow-up survey, 91.0% of the participants indicated that their genetic reports motivated them to improve their health. Overall, WGP expanded not only the general awareness of genomics, DNA sequencing technologies, bioinformatics, and bioethics regulations among all the parties involved, but also the general public’s understanding of how genome projects can indirectly benefit their health and lifestyle management. WGP established a data construction framework for not only scientific research but also the welfare of participants. In the future, the WGP framework can help lay the groundwork for a new personalized healthcare system that is seamlessly integrated with existing public medical infrastructure.
Affiliation(s)
- Yeonsu Jeon
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.,Department of Biomedical Engineering, College of Information-Bio Convergence Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Sungwon Jeon
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.,Department of Biomedical Engineering, College of Information-Bio Convergence Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Asta Blazyte
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.,Department of Biomedical Engineering, College of Information-Bio Convergence Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | | | - Jasmin Junseo Lee
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.,Human Biology Program, Faculty of Arts and Sciences, University of Toronto, Toronto, ON, Canada
| | - Youngjune Bhak
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.,Department of Biomedical Engineering, College of Information-Bio Convergence Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | | | - Yeshin Park
- Clinomics Inc., Ulsan, South Korea.,Department of Medical Sciences, Graduate School of Ajou University School, Suwon, South Korea
| | - Eui-Kyu Noh
- Department of Hematology and Oncology, Ulsan University Hospital, University of Ulsan College of Medicine, Ulsan, South Korea
| | - Andrea Manica
- Department of Zoology, University of Cambridge, Cambridge, United Kingdom
| | - Jeremy S Edwards
- Department of Chemistry and Chemical Biology, University of New Mexico Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, United States
| | - Dan Bolser
- Geromics Ltd., Cambridge, United Kingdom
| | - Sukyeon Kim
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Yuji Lee
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Changhan Yoon
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.,Department of Biomedical Engineering, College of Information-Bio Convergence Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | - Semin Lee
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.,Department of Biomedical Engineering, College of Information-Bio Convergence Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
| | | | - Neung Hwa Park
- Department of Internal Medicine, Ulsan University Hospital, University of Ulsan College of Medicine, Ulsan, South Korea
| | - Jong Bhak
- Korean Genomics Center (KOGIC), Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea; Department of Biomedical Engineering, College of Information-Bio Convergence Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea; Clinomics Inc., Ulsan, South Korea; Personal Genomics Institute (PGI), Genome Research Foundation (GRF), Osong, South Korea
| |
|
9
|
Jain A, Bhoyar RC, Pandhare K, Mishra A, Sharma D, Imran M, Senthivel V, Divakar MK, Rophina M, Jolly B, Batra A, Sharma S, Siwach S, Jadhao AG, Palande N, Jha GN, Ashrafi N, Mishra PK, A. K. V, Jain S, Dash D, Kumar NS, Vanlallawma A, Sarma R, Chhakchhuak L, Kalyanaraman S, Mahadevan R, Kandasamy S, B. M. P, Rajagopal RE, J. ER, P. ND, Bajaj A, Gupta V, Mathew S, Goswami S, Mangla M, Prakash S, Joshi K, S. S, Gajjar D, Soraisham R, Yadav R, Devi YS, Gupta A, Mukerji M, Ramalingam S, B. K. B, Scaria V, Sivasubbu S. IndiGenomes: a comprehensive resource of genetic variants from over 1000 Indian genomes. Nucleic Acids Res 2021; 49:D1225-D1232. [PMID: 33095885 PMCID: PMC7778947 DOI: 10.1093/nar/gkaa923] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 08/10/2020] [Revised: 10/01/2020] [Accepted: 10/22/2020] [Indexed: 12/15/2022] Open
Abstract
With the advent of next-generation sequencing, large-scale initiatives for mining whole genomes and exomes have been employed to better understand global or population-level genetic architecture. India encompasses more than 17% of the world population with extensive genetic diversity, but is under-represented in the global sequencing datasets. This gave us the impetus to perform and analyze the whole genome sequencing of 1029 healthy Indian individuals under the pilot phase of the 'IndiGen' program. We generated a compendium of 55,898,122 single allelic genetic variants from geographically distinct Indian genomes and calculated the allele frequency, allele count, and allele number, along with the number of heterozygous or homozygous individuals. In the present study, these variants were systematically annotated using publicly available population databases and can be accessed through a browsable online database named 'IndiGenomes' (http://clingen.igib.res.in/indigen/). The IndiGenomes database will help clinicians and researchers in exploring the genetic component underlying medical conditions. To date, this is the most comprehensive genetic variant resource for the Indian population and is made freely available for academic use. The resource has also been accessed extensively by the worldwide community since its launch.
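The per-variant summary statistics named in the abstract (allele count, allele number, allele frequency, and the number of heterozygous or homozygous individuals) can be illustrated with a minimal Python sketch. The genotype encoding (0, 1, or 2 copies of the alternate allele per diploid individual, `None` for a missing call) and the function name `summarize_variant` are assumptions made for illustration; this is not the IndiGen pipeline.

```python
from typing import Optional, Sequence


def summarize_variant(genotypes: Sequence[Optional[int]]) -> dict:
    """Summarize one biallelic variant across diploid individuals.

    Each genotype is the number of alternate-allele copies (0, 1, or 2),
    or None when the call is missing (an assumed encoding).
    """
    called = [g for g in genotypes if g is not None]
    allele_count = sum(called)       # AC: alternate alleles observed
    allele_number = 2 * len(called)  # AN: alleles in called genotypes
    return {
        "AC": allele_count,
        "AN": allele_number,
        "AF": allele_count / allele_number if allele_number else float("nan"),
        "n_het": sum(1 for g in called if g == 1),
        "n_hom_alt": sum(1 for g in called if g == 2),
    }


# Five individuals, one of them with a missing call
stats = summarize_variant([0, 1, 2, None, 1])
```

Missing calls are excluded from both the numerator and the denominator, so the frequency reflects only individuals who were actually genotyped at the site.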
Affiliation(s)
- Abhinav Jain
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Rahul C Bhoyar
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Kavita Pandhare
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Anushree Mishra
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Disha Sharma
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Mohamed Imran
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Vigneshwar Senthivel
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Mohit Kumar Divakar
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Mercy Rophina
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Bani Jolly
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Arushi Batra
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Sumit Sharma
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Sanjay Siwach
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Arun G Jadhao
- Department of Zoology, RTM Nagpur University, Nagpur, Maharashtra 440033, India
| | - Nikhil V Palande
- Department of Zoology, Shri Mathuradas Mohota College of Science, Nagpur, Maharashtra 440009, India
| | - Ganga Nath Jha
- Department of Anthropology, Vinoba Bhave University, Hazaribag, Jharkhand 825301, India
| | - Nishat Ashrafi
- Department of Anthropology, Vinoba Bhave University, Hazaribag, Jharkhand 825301, India
| | - Prashant Kumar Mishra
- Department of Biotechnology, Vinoba Bhave University, Hazaribag, Jharkhand 825301, India
| | - Vidhya A. K.
- Department of Biochemistry, Dr. Kongu Science and Art College, Erode, Tamil Nadu 638107, India
| | - Suman Jain
- Thalassemia and Sickle cell Society, Hyderabad, Telangana 500052, India
| | - Debasis Dash
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | | | - Andrew Vanlallawma
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram 796004, India
| | - Ranjan Jyoti Sarma
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram 796004, India
| | | | | | - Radha Mahadevan
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
| | - Sunitha Kandasamy
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
| | - Pabitha B. M.
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
| | | | - Ezhil Ramya J.
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
| | - Nirmala Devi P.
- TVMC, Tirunelveli Medical College, Tirunelveli, Tamil Nadu 627011, India
| | - Anjali Bajaj
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Vishu Gupta
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Samatha Mathew
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Sangam Goswami
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Mohit Mangla
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Savinitha Prakash
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Kandarp Joshi
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
| | - Sreedevi S.
- Department of Microbiology, St.Pious X Degree & PG College for Women, Hyderabad, Telangana 500076, India
| | - Devarshi Gajjar
- Department of Microbiology, The Maharaja Sayajirao University of Baroda, Vadodara, Gujarat 390002, India
| | - Ronibala Soraisham
- Department of Dermatology, Venereology and Leprology, Regional Institute of Medical Sciences, Imphal, Manipur 795004, India
| | - Rohit Yadav
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Yumnam Silla Devi
- CSIR- North East Institute of Science and Technology, Jorhat, Assam 785006, India
| | - Aayush Gupta
- Department of Dermatology, Dr. D.Y. Patil Medical College, Pune, Maharashtra 411018, India
| | - Mitali Mukerji
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Sivaprakash Ramalingam
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Binukumar B. K.
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Vinod Scaria
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| | - Sridhar Sivasubbu
- CSIR-Institute of Genomics and Integrative Biology, New Delhi 110025, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India
| |
|
10
|
Gutman D, Lidzbarsky G, Milman S, Gao T, Sin-Chan P, Gonzaga‐Jauregui C, Deelen J, Shuldiner AR, Barzilai N, Atzmon G. Similar burden of pathogenic coding variants in exceptionally long-lived individuals and individuals without exceptional longevity. Aging Cell 2020; 19:e13216. [PMID: 32860726 PMCID: PMC7576295 DOI: 10.1111/acel.13216] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/10/2019] [Revised: 06/22/2020] [Accepted: 07/12/2020] [Indexed: 12/13/2022] Open
Abstract
Centenarians (exceptionally long‐lived individuals—ELLI) are a unique segment of the population, exhibiting long human lifespan and healthspan, despite generally practicing similar lifestyle habits as their peers. We tested disease‐associated mutation burden in ELLI genomes by determining the burden of pathogenic variants reported in the ClinVar and HGMD databases using data from whole exome sequencing (WES) conducted in a cohort of ELLI, their offspring, and control individuals without antecedents of familial longevity (n = 1879), all descended from the founder population of Ashkenazi Jews. The burden of pathogenic variants did not differ between the three groups. Additional analyses of variant subtypes and variant effect predictor (VEP) biotype frequencies did not reveal a decrease of pathogenic or loss‐of‐function (LoF) variants in ELLI and offspring compared to the control group. Case–control pathogenic variant enrichment analyses conducted in ELLI and controls also did not identify significant differences in any of the variants between the groups, and polygenic risk scores failed to provide a predictive model. Interestingly, cancer and Alzheimer's disease‐associated variants were significantly depleted in ELLI compared to controls, suggesting slower accumulation of mutations. That said, polygenic risk score analysis failed to find any predictive variants among the functional variants tested. The high similarity in the burden of pathogenic variation between ELLI and individuals without familial longevity supports the notion that extension of lifespan and healthspan in ELLI is not a consequence of pathogenic variant depletion but rather a result of other genomic, epigenomic, or potentially nongenomic properties.
Affiliation(s)
- Danielle Gutman
- Faculty of Natural Sciences, University of Haifa, Haifa, Israel
| | | | - Sofiya Milman
- Department of Medicine, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Tina Gao
- Department of Medicine, Albert Einstein College of Medicine, Bronx, New York, USA
| | | | | | - Joris Deelen
- Max Planck Institute for Biology of Ageing, Cologne, Germany
- Molecular Epidemiology, Department of Biochemical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | | | - Nir Barzilai
- Department of Medicine, Albert Einstein College of Medicine, Bronx, New York, USA
- Genetic, Institute for Aging Research and the Diabetes Research Center, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Gil Atzmon
- Faculty of Natural Sciences, University of Haifa, Haifa, Israel
- Department of Medicine, Albert Einstein College of Medicine, Bronx, New York, USA
- Genetic, Institute for Aging Research and the Diabetes Research Center, Albert Einstein College of Medicine, Bronx, New York, USA
| | | |
|
11
|
Quick C, Anugu P, Musani S, Weiss ST, Burchard EG, White MJ, Keys KL, Cucca F, Sidore C, Boehnke M, Fuchsberger C. Sequencing and imputation in GWAS: Cost-effective strategies to increase power and genomic coverage across diverse populations. Genet Epidemiol 2020; 44:537-549. [PMID: 32519380 PMCID: PMC7449570 DOI: 10.1002/gepi.22326] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 10/07/2019] [Revised: 04/02/2020] [Accepted: 05/22/2020] [Indexed: 01/03/2023]
Abstract
A key aim for current genome-wide association studies (GWAS) is to interrogate the full spectrum of genetic variation underlying human traits, including rare variants, across populations. Deep whole-genome sequencing is the gold standard to fully capture genetic variation, but remains prohibitively expensive for large sample sizes. Array genotyping interrogates a sparser set of variants, which can be used as a scaffold for genotype imputation to capture a wider set of variants. However, imputation quality depends crucially on reference panel size and genetic distance from the target population. Here, we consider sequencing a subset of GWAS participants and imputing the rest using a reference panel that includes both sequenced GWAS participants and an external reference panel. We investigate how imputation quality and GWAS power are affected by the number of participants sequenced for admixed populations (African and Latino Americans) and European population isolates (Sardinians and Finns), and identify powerful, cost-effective GWAS designs given current sequencing and array costs. For populations that are well-represented in existing reference panels, we find that array genotyping alone is cost-effective and well-powered to detect common- and rare-variant associations. For poorly represented populations, sequencing a subset of participants is often most cost-effective, and can substantially increase imputation quality and GWAS power.
Affiliation(s)
- Corbin Quick
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, Michigan
| | - Pramod Anugu
- University of Mississippi Medical Center, Jackson, Mississippi
| | - Solomon Musani
- University of Mississippi Medical Center, Jackson, Mississippi
| | - Scott T. Weiss
- Harvard Medical School, Boston, Massachusetts
- Channing Department of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Partners HealthCare Personalized Medicine, Boston, Massachusetts
| | - Esteban G. Burchard
- Department of Medicine, University of California San Francisco, San Francisco, California
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California
| | - Marquitta J. White
- Department of Medicine, University of California San Francisco, San Francisco, California
| | - Kevin L. Keys
- Department of Medicine, University of California San Francisco, San Francisco, California
| | - Francesco Cucca
- Istituto di Ricerca Genetica e Biomedica (IRGB), CNR, Monserrato, Italy
- Dipartimento di Scienze Biomediche, Università di Sassari, Sassari, Italy
| | - Carlo Sidore
- Istituto di Ricerca Genetica e Biomedica (IRGB), CNR, Monserrato, Italy
| | - Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, Michigan
| | - Christian Fuchsberger
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, Michigan
- Department of Genetics and Pharmacology, Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy
| |
|
12
|
Ros-Freixedes R, Whalen A, Gorjanc G, Mileham AJ, Hickey JM. Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling. Genet Sel Evol 2020; 52:18. [PMID: 32248818 PMCID: PMC7132986 DOI: 10.1186/s12711-020-00537-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/29/2019] [Accepted: 03/27/2020] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND For assembling large whole-genome sequence datasets for routine use in research and breeding, the sequencing strategy should be adapted to the methods that will be used later for variant discovery and imputation. In this study, we used simulation to explore the impact that the sequencing strategy and level of sequencing investment have on the overall accuracy of imputation using hybrid peeling, a pedigree-based imputation method that is well suited for large livestock populations. METHODS We simulated marker array and whole-genome sequence data for 15 populations with simulated or real pedigrees that had different structures. In these populations, we evaluated the effect on imputation accuracy of seven methods for selecting which individuals to sequence, the generation of the pedigree to which the sequenced individuals belonged, the use of variable or uniform coverage, and the trade-off between the number of sequenced individuals and their sequencing coverage. For each population, we considered four levels of investment in sequencing that were proportional to the size of the population. RESULTS Imputation accuracy depended greatly on pedigree depth. The distribution of the sequenced individuals across the generations of the pedigree underlay the performance of the different methods used to select individuals to sequence and it was critical for achieving high imputation accuracy in both early and late generations. Imputation accuracy was highest with a uniform coverage across the sequenced individuals of 2× rather than variable coverage. An investment equivalent to the cost of sequencing 2% of the population at 2× provided high imputation accuracy. The gain in imputation accuracy from additional investment decreased with larger populations and higher levels of investment. However, to achieve the same imputation accuracy, a proportionally greater investment must be used in the smaller populations compared to the larger ones. 
CONCLUSIONS Suitable sequencing strategies for subsequent imputation with hybrid peeling involve sequencing ~2% of the population at a uniform coverage of 2×, distributed preferably across all generations of the pedigree, except for the few earliest generations that lack genotyped ancestors. Such sequencing strategies are beneficial for generating whole-genome sequence data in populations with deep pedigrees of closely related individuals.
Affiliation(s)
- Roger Ros-Freixedes
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
- Departament de Ciència Animal, Universitat de Lleida-Agrotecnio Center, Lleida, Spain
| | - Andrew Whalen
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
| | - Gregor Gorjanc
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
| | | | - John M. Hickey
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
| |
|
13
|
Ros-Freixedes R, Whalen A, Chen CY, Gorjanc G, Herring WO, Mileham AJ, Hickey JM. Accuracy of whole-genome sequence imputation using hybrid peeling in large pedigreed livestock populations. Genet Sel Evol 2020; 52:17. [PMID: 32248811 PMCID: PMC7132992 DOI: 10.1186/s12711-020-00536-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/16/2019] [Accepted: 03/27/2020] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND The coupling of appropriate sequencing strategies and imputation methods is critical for assembling large whole-genome sequence datasets from livestock populations for research and breeding. In this paper, we describe and validate the coupling of a sequencing strategy with the imputation method hybrid peeling in real animal breeding settings. METHODS We used data from four pig populations of different size (18,349 to 107,815 individuals) that were widely genotyped at densities between 15,000 and 75,000 markers genome-wide. Around 2% of the individuals in each population were sequenced (most of them at 1× or 2× and 37-92 individuals per population, totalling 284, at 15-30×). We imputed whole-genome sequence data with hybrid peeling. We evaluated the imputation accuracy by removing the sequence data of the 284 individuals with high coverage, using a leave-one-out design. We simulated data that mimicked the sequencing strategy used in the real populations to quantify the factors that affected the individual-wise and variant-wise imputation accuracies using regression trees. RESULTS Imputation accuracy was high for the majority of individuals in all four populations (median individual-wise dosage correlation: 0.97). Imputation accuracy was lower for individuals in the earliest generations of each population than for the rest, due to the lack of marker array data for themselves and their ancestors. The main factors that determined the individual-wise imputation accuracy were the genotyping status, the availability of marker array data for immediate ancestors, and the degree of connectedness to the rest of the population, but sequencing coverage of the relatives had no effect. The main factors that determined variant-wise imputation accuracy were the minor allele frequency and the number of individuals with sequencing coverage at each variant site. Results were validated with the empirical observations. 
CONCLUSIONS We demonstrate that the coupling of an appropriate sequencing strategy and hybrid peeling is a powerful strategy for generating whole-genome sequence data with high accuracy in large pedigreed populations where only a small fraction of individuals (2%) had been sequenced, mostly at low coverage. This is a critical step for the successful implementation of whole-genome sequence data for genomic prediction and fine-mapping of causal variants.
Affiliation(s)
- Roger Ros-Freixedes
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
- Departament de Ciència Animal, Universitat de Lleida-Agrotecnio Center, Lleida, Spain
| | - Andrew Whalen
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
| | - Ching-Yi Chen
- The Pig Improvement Company, Genus plc, 100 Bluegrass Commons Blvd Ste 2200, Hendersonville, TN 37075 USA
| | - Gregor Gorjanc
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
| | - William O. Herring
- The Pig Improvement Company, Genus plc, 100 Bluegrass Commons Blvd Ste 2200, Hendersonville, TN 37075 USA
| | | | - John M. Hickey
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK
| |
|
14
|
The GenomeAsia 100K Project enables genetic discoveries across Asia. Nature 2019; 576:106-111. [PMID: 31802016 PMCID: PMC7054211 DOI: 10.1038/s41586-019-1793-z] [Citation(s) in RCA: 264] [Impact Index Per Article: 44.0] [Received: 01/29/2018] [Accepted: 10/11/2019] [Indexed: 12/30/2022]
Abstract
The underrepresentation of non-Europeans in human genetic studies so far has limited the diversity of individuals in genomic datasets and led to reduced medical relevance for a large proportion of the world’s population. Population-specific reference genome datasets as well as genome-wide association studies in diverse populations are needed to address this issue. Here we describe the pilot phase of the GenomeAsia 100K Project. This includes a whole-genome sequencing reference dataset from 1,739 individuals of 219 population groups and 64 countries across Asia. We catalogue genetic variation, population structure, disease associations and founder effects. We also explore the use of this dataset in imputation, to facilitate genetic studies in populations across Asia and worldwide.
|
15
|
Karavani E, Zuk O, Zeevi D, Barzilai N, Stefanis NC, Hatzimanolis A, Smyrnis N, Avramopoulos D, Kruglyak L, Atzmon G, Lam M, Lencz T, Carmi S. Screening Human Embryos for Polygenic Traits Has Limited Utility. Cell 2019; 179:1424-1435.e8. [PMID: 31761530 PMCID: PMC6957074 DOI: 10.1016/j.cell.2019.10.033] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 05/21/2019] [Revised: 09/11/2019] [Accepted: 10/25/2019] [Indexed: 12/19/2022]
Abstract
The increasing proportion of variance in human complex traits explained by polygenic scores, along with progress in preimplantation genetic diagnosis, suggests the possibility of screening embryos for traits such as height or cognitive ability. However, the expected outcomes of embryo screening are unclear, which undermines discussion of associated ethical concerns. Here, we use theory, simulations, and real data to evaluate the potential gain of embryo screening, defined as the difference in trait value between the top-scoring embryo and the average embryo. The gain increases very slowly with the number of embryos but more rapidly with the variance explained by the score. Given current technology, the average gain due to screening would be ≈2.5 cm for height and ≈2.5 IQ points for cognitive ability. These mean values are accompanied by wide prediction intervals, and indeed, in large nuclear families, the majority of children top-scoring for height are not the tallest.
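The definition of gain used in this abstract (top-scoring embryo versus the average embryo) can be explored with a small Monte Carlo sketch in Python. It rests on a simplifying assumption that is not stated in the abstract: sibling embryos' polygenic scores are modelled as independent normal draws around the family mean with variance r²·σ²/2, where r² is the proportion of trait variance explained by the score and σ is the trait standard deviation. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def expected_gain(n_embryos: int, r2: float, trait_sd: float = 1.0,
                  n_families: int = 20000, seed: int = 0) -> float:
    """Estimate the mean gain: top embryo score minus the family's average score."""
    rng = random.Random(seed)
    # Assumed within-family SD of the score (half the score variance
    # is taken to segregate among siblings).
    score_sd = (r2 / 2) ** 0.5 * trait_sd
    gains = []
    for _ in range(n_families):
        scores = [rng.gauss(0.0, score_sd) for _ in range(n_embryos)]
        gains.append(max(scores) - statistics.fmean(scores))
    return statistics.fmean(gains)


# Compare the effect of more embryos with the effect of a better score
gain_5 = expected_gain(5, r2=0.25)
gain_10 = expected_gain(10, r2=0.25)
gain_5_better_score = expected_gain(5, r2=0.50)
```

Under these assumptions the gain grows with the expected maximum of n normal draws, which rises slowly in n, and with the square root of r², which is consistent with the qualitative pattern the abstract reports.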
Affiliation(s)
- Ehud Karavani
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
| | - Or Zuk
- Department of Statistics, The Hebrew University of Jerusalem, Jerusalem 9190501, Israel
| | - Danny Zeevi
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Nir Barzilai
- Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Genetics, Institute for Aging Research, Albert Einstein College of Medicine, Bronx, NY 10461, USA
| | - Nikos C Stefanis
- Department of Psychiatry, National and Kapodistrian University of Athens Medical School, Eginition Hospital, 115 28 Athens, Greece; University Mental Health Research Institute, 115 27 Athens, Greece; Neurobiology Research Institute, Theodor-Theohari Cozzika Foundation, 115 21 Athens, Greece
| | - Alex Hatzimanolis
- Department of Psychiatry, National and Kapodistrian University of Athens Medical School, Eginition Hospital, 115 28 Athens, Greece; Neurobiology Research Institute, Theodor-Theohari Cozzika Foundation, 115 21 Athens, Greece
| | - Nikolaos Smyrnis
- Department of Psychiatry, National and Kapodistrian University of Athens Medical School, Eginition Hospital, 115 28 Athens, Greece; University Mental Health Research Institute, 115 27 Athens, Greece
| | - Dimitrios Avramopoulos
- Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Leonid Kruglyak
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA; Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Gil Atzmon
- Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Genetics, Institute for Aging Research, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Biology, Faculty of Natural Sciences, University of Haifa, Haifa 3498838, Israel
| | - Max Lam
- Division of Psychiatry Research, Zucker Hillside Hospital, Glen Oaks, NY 11004, USA; Institute of Behavioral Science, Feinstein Institutes of Medical Research, Manhasset, NY 11030, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
| | - Todd Lencz
- Division of Psychiatry Research, Zucker Hillside Hospital, Glen Oaks, NY 11004, USA; Institute of Behavioral Science, Feinstein Institutes of Medical Research, Manhasset, NY 11030, USA; Department of Psychiatry, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY 11549, USA.
| | - Shai Carmi
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel.
| |
|
16
|
Preconception carrier screening yield: effect of variants of unknown significance in partners of carriers with clinically significant variants. Genet Med 2019; 22:646-653. [DOI: 10.1038/s41436-019-0676-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 05/20/2019] [Accepted: 09/27/2019] [Indexed: 02/04/2023] Open
|
17
|
Maroilley T, Tarailo-Graovac M. Uncovering Missing Heritability in Rare Diseases. Genes (Basel) 2019; 10:E275. [PMID: 30987386 PMCID: PMC6523881 DOI: 10.3390/genes10040275] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/01/2019] [Revised: 03/29/2019] [Accepted: 04/01/2019] [Indexed: 12/14/2022] Open
Abstract
The problem of 'missing heritability' affects both common and rare diseases, hindering discovery, diagnosis, and patient care. The 'missing heritability' concept has been mainly associated with common and complex diseases where promising modern technological advances, like genome-wide association studies (GWAS), were unable to uncover the complete genetic mechanism of the disease/trait. Although rare diseases (RDs) have low prevalence individually, collectively they are common. Furthermore, multi-level genetic and phenotypic complexity, when combined with the individual rarity of these conditions, poses an important challenge in the quest to identify causative genetic changes in RD patients. In recent years, high throughput sequencing has accelerated discovery and diagnosis in RDs. However, despite the several-fold increase (from ~10% using traditional to ~40% using genome-wide genetic testing) in finding genetic causes of these diseases in RD patients, the majority of RDs, as is the case in common diseases, are also facing the 'missing heritability' problem. This review outlines the key role of high throughput sequencing in uncovering genetics behind RDs, with a particular focus on genome sequencing. We review current advances and challenges of sequencing technologies, bioinformatics approaches, and resources.
Affiliation(s)
- Tatiana Maroilley
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada.
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada.
- Maja Tarailo-Graovac
- Departments of Biochemistry, Molecular Biology and Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada.
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada.
|
18
|
Zeevi DA, Zahdeh F, Kling Y, Carmi S, Altarescu G. Off the street phasing (OTSP): no hassle haplotype phasing for molecular PGD applications. J Assist Reprod Genet 2019; 36:727-739. [PMID: 30617673 PMCID: PMC6504987 DOI: 10.1007/s10815-018-1392-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/15/2018] [Accepted: 12/18/2018] [Indexed: 11/28/2022]
Abstract
PURPOSE: Pre-implantation genetic diagnosis (PGD) for molecular disorders requires the construction of parental haplotypes. Classically, haplotype resolution ("phasing") is obtained by genotyping multiple polymorphic markers in both parents and at least one additional relative. However, this process is time-consuming, and immediate family members are not always available. The recent availability of massive genomic data for many populations promises to eliminate the need to develop family-specific assays and to recruit additional family members. In this study, we aimed to validate population-assisted haplotype phasing for PGD. METHODS: Targeted sequencing of CFTR gene variants and ~1700 flanking polymorphic SNPs (± 2 Mb) was performed on 54 individuals from 12 PGD families of (a) full Ashkenazi (FA; n = 16), (b) mixed Ashkenazi (MA; n = 23 individuals with at least one Ashkenazi and one non-Ashkenazi grandparent), or (c) non-Ashkenazi (NA; n = 15) descent. Heterozygous genotype calls in each individual were phased using various whole genome reference panels and appropriate computational models. All computationally derived haplotype predictions were benchmarked against trio-based phasing. RESULTS: Using the Ashkenazi reference panel, phasing of FA was highly accurate (99.4% ± 0.2% accuracy); phasing of MA was less accurate (95.4% ± 4.5% accuracy); and phasing of NA was predictably low (83.4% ± 6.6% accuracy). Strikingly, for founder mutation carriers, our haplotyping approach facilitated near perfect phasing accuracy (99.9% ± 0.1% and 98.2% ± 2.8% accuracy for W1282X and delF508 carriers, respectively). CONCLUSIONS: Our results demonstrate the feasibility of replacing classical haplotype phasing with population-based phasing with uncompromised accuracy.
Affiliation(s)
- David A Zeevi
- Medical Genetics Institute, Shaare Zedek Medical Center (SZMC), Bayit Str. 12, P.O.Box 3235, 91031, Jerusalem, Israel.
- Fouad Zahdeh
- Medical Genetics Institute, Shaare Zedek Medical Center (SZMC), Bayit Str. 12, P.O.Box 3235, 91031, Jerusalem, Israel.
- Yehuda Kling
- Medical Genetics Institute, Shaare Zedek Medical Center (SZMC), Bayit Str. 12, P.O.Box 3235, 91031, Jerusalem, Israel.
- Shai Carmi
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel.
- Gheona Altarescu
- Medical Genetics Institute, Shaare Zedek Medical Center (SZMC), Bayit Str. 12, P.O.Box 3235, 91031, Jerusalem, Israel.
|
19
|
Mohammed Ismail W, Pagel KA, Pejaver V, Zhang SV, Casasa S, Mort M, Cooper DN, Hahn MW, Radivojac P. The sequencing and interpretation of the genome obtained from a Serbian individual. PLoS One 2018; 13:e0208901. [PMID: 30566479 PMCID: PMC6300249 DOI: 10.1371/journal.pone.0208901] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/24/2018] [Accepted: 11/26/2018] [Indexed: 02/07/2023]
Abstract
Recent genetic studies and whole-genome sequencing projects have greatly improved our understanding of human variation and clinically actionable genetic information. Smaller ethnic populations, however, remain underrepresented in both individual and large-scale sequencing efforts and hence present an opportunity to discover new variants of biomedical and demographic significance. This report describes the sequencing and analysis of a genome obtained from an individual of Serbian origin, introducing tens of thousands of previously unknown variants to the currently available pool. Ancestry analysis places this individual in close proximity to Central and Eastern European populations; i.e., closest to Croatian, Bulgarian and Hungarian individuals and, in terms of other Europeans, furthest from Ashkenazi Jewish, Spanish, Sicilian and Baltic individuals. Our analysis confirmed gene flow between Neanderthal and ancestral pan-European populations, with similar contributions to the Serbian genome as those observed in other European groups. Finally, to assess the burden of potentially disease-causing/clinically relevant variation in the sequenced genome, we utilized manually curated genotype-phenotype association databases and variant-effect predictors. We identified several variants that have previously been associated with severe early-onset disease that is not evident in the proband, as well as putatively impactful variants that could yet prove to be clinically relevant to the proband over the next decades. The presence of numerous private and low-frequency variants, along with the observed and predicted disease-causing mutations in this genome, exemplifies some of the global challenges of genome interpretation, especially in the context of under-studied ethnic groups.
Affiliation(s)
- Wazim Mohammed Ismail
- Department of Computer Science, Indiana University, Bloomington, Indiana, United States of America
- Kymberleigh A. Pagel
- Department of Computer Science, Indiana University, Bloomington, Indiana, United States of America
- Vikas Pejaver
- Department of Computer Science, Indiana University, Bloomington, Indiana, United States of America
- Simo V. Zhang
- Department of Computer Science, Indiana University, Bloomington, Indiana, United States of America
- Sofia Casasa
- Department of Biology, Indiana University, Bloomington, Indiana, United States of America
- Matthew Mort
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
- David N. Cooper
- Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
- Matthew W. Hahn
- Department of Computer Science, Indiana University, Bloomington, Indiana, United States of America
- Department of Biology, Indiana University, Bloomington, Indiana, United States of America
- Predrag Radivojac
- College of Computer and Information Science, Northeastern University, Boston, Massachusetts, United States of America
|