1
Ormond C, Ryan NM, Cap M, Byerley W, Corvin A, Heron EA. BICEP: Bayesian inference for rare genomic variant causality evaluation in pedigrees. Brief Bioinform 2024; 26:bbae624. [PMID: 39656772 PMCID: PMC11645550 DOI: 10.1093/bib/bbae624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 10/04/2024] [Accepted: 11/27/2024] [Indexed: 12/17/2024] Open
Abstract
Next-generation sequencing is widely applied to the investigation of pedigree data for gene discovery. However, identifying plausible disease-causing variants within a robust statistical framework is challenging. Here, we introduce BICEP: a Bayesian inference tool for rare variant causality evaluation in pedigree-based cohorts. BICEP calculates the posterior odds that a genomic variant is causal for a phenotype based on the variant cosegregation as well as a priori evidence such as deleteriousness and functional consequence. BICEP can correctly identify causal variants for phenotypes with both Mendelian and complex genetic architectures, outperforming existing methodologies. Additionally, BICEP can correctly down-weight common variants that are unlikely to be involved in phenotypic liability in the context of a pedigree, even if they have reasonable cosegregation patterns. The output metrics from BICEP allow for the quantitative comparison of variant causality within and across pedigrees, which is not possible with existing approaches.
Affiliation(s)
- Cathal Ormond
- Neuropsychiatric Genetics Research Group, Department of Psychiatry, Trinity Centre for Health Sciences, Trinity College Dublin, St James’s Hospital, Dublin 8, Ireland
- Niamh M Ryan
- Neuropsychiatric Genetics Research Group, Department of Psychiatry, Trinity Centre for Health Sciences, Trinity College Dublin, St James’s Hospital, Dublin 8, Ireland
- Mathieu Cap
- Neuropsychiatric Genetics Research Group, Department of Psychiatry, Trinity Centre for Health Sciences, Trinity College Dublin, St James’s Hospital, Dublin 8, Ireland
- William Byerley
- Department of Psychiatry and Behavioral Sciences, University of California, 1550 Fourth Street, San Francisco, CA 94158, United States
- Aiden Corvin
- Neuropsychiatric Genetics Research Group, Department of Psychiatry, Trinity Centre for Health Sciences, Trinity College Dublin, St James’s Hospital, Dublin 8, Ireland
- Elizabeth A Heron
- Neuropsychiatric Genetics Research Group, Department of Psychiatry, Trinity Centre for Health Sciences, Trinity College Dublin, St James’s Hospital, Dublin 8, Ireland
2
Ochs-Balcom HM, Preus L, Du Z, Elston RC, Teerlink CC, Jia G, Guo X, Cai Q, Long J, Ping J, Li B, Stram DO, Shu XO, Sanderson M, Gao G, Ahearn T, Lunetta KL, Zirpoli G, Troester MA, Ruiz-Narváez EA, Haddad SA, Figueroa J, John EM, Bernstein L, Hu JJ, Ziegler RG, Nyante S, Bandera EV, Ingles SA, Mancuso N, Press MF, Deming SL, Rodriguez-Gil JL, Yao S, Ogundiran TO, Ojengbede O, Bolla MK, Dennis J, Dunning AM, Easton DF, Michailidou K, Pharoah PDP, Sandler DP, Taylor JA, Wang Q, O’Brien KM, Weinberg CR, Kitahara CM, Blot W, Nathanson KL, Hennis A, Nemesure B, Ambs S, Sucheston-Campbell LE, Bensen JT, Chanock SJ, Olshan AF, Ambrosone CB, Olopade OI, the Ghana Breast Health Study Team, Conti DV, Palmer J, García-Closas M, Huo D, Zheng W, Haiman C. Novel breast cancer susceptibility loci under linkage peaks identified in African ancestry consortia. Hum Mol Genet 2024; 33:687-697. [PMID: 38263910 PMCID: PMC11000665 DOI: 10.1093/hmg/ddae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 01/25/2024] Open
Abstract
BACKGROUND Expansion of genome-wide association studies across population groups is needed to improve our understanding of shared and unique genetic contributions to breast cancer. We performed association and replication studies guided by a priori linkage findings from African ancestry (AA) relative pairs. METHODS We performed fixed-effect inverse-variance weighted meta-analysis under three significant AA breast cancer linkage peaks (3q26-27, 12q22-23, and 16q21-22) in 9241 AA cases and 10,193 AA controls. We examined associations with overall breast cancer as well as estrogen receptor (ER)-positive and ER-negative subtypes (193,132 SNPs). We replicated associations in the African-ancestry Breast Cancer Genetic Consortium (AABCG). RESULTS In AA women, we identified two associations on chr12q for overall breast cancer (rs1420647, OR = 1.15, $p = 2.50 \times 10^{-6}$; rs12322371, OR = 1.14, $p = 3.15 \times 10^{-6}$), and one for ER-negative breast cancer (rs77006600, OR = 1.67, $p = 3.51 \times 10^{-6}$). On chr3, we identified two associations with ER-negative disease (rs184090918, OR = 3.70, $p = 1.23 \times 10^{-5}$; rs76959804, OR = 3.57, $p = 1.77 \times 10^{-5}$) and on chr16q we identified an association with ER-negative disease (rs34147411, OR = 1.62, $p = 8.82 \times 10^{-6}$). In the replication study, the chr3 associations were significant and effect sizes were larger (rs184090918, OR: 6.66, 95% CI: 1.43, 31.01; rs76959804, OR: 5.24, 95% CI: 1.70, 16.16). CONCLUSION The two chr3 SNPs are upstream of open chromatin ENSR00000710716, a regulatory feature that is actively regulated in mammary tissues, providing evidence that variants in this chr3 region may have a regulatory role in our target organ. Our study provides support for breast cancer variant discovery using prioritization based on linkage evidence.
Affiliation(s)
- Heather M Ochs-Balcom
- Department of Epidemiology and Environmental Health, School of Public Health and Health Professions, University at Buffalo, 270 Farber Hall, Buffalo, NY 14214, United States
- Leah Preus
- Department of Epidemiology and Environmental Health, School of Public Health and Health Professions, University at Buffalo, 270 Farber Hall, Buffalo, NY 14214, United States
- Zhaohui Du
- Department of Preventive Population and Public Health Sciences, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
- Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave, N. Seattle, WA 98109, United States
- Robert C Elston
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH 44106, United States
- Craig C Teerlink
- Department of Internal Medicine, University of Utah School of Medicine, 30 North Mario Capecchi Dr, 3rd Floor North, Salt Lake City, UT 84112, United States
- Guochong Jia
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Qiuyin Cai
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Jie Ping
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Bingshan Li
- Department of Molecular Physiology and Biophysics, Vanderbilt University, 707 Light Hall 2215 Garland Avenue, Nashville, TN 37232, United States
- Daniel O Stram
- Department of Preventive Population and Public Health Sciences, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
- Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Maureen Sanderson
- Department of Family and Community Medicine, Meharry Medical College, 1005 Dr. DB Todd Jr, Blvd. Nashville, TN 37208, United States
- Guimin Gao
- Department of Public Health Sciences, University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637, United States
- Thomas Ahearn
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, 9609 Medical Center Drive, Bethesda, MD 20892, United States
- Kathryn L Lunetta
- Department of Biostatistics, Boston University, 715 Albany St, Boston, MA 02118, United States
- Gary Zirpoli
- Slone Epidemiology Center, Boston University, L-7, 72 East Concord Street, Boston, MA 02118, United States
- Melissa A Troester
- Department of Epidemiology, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 135 Dauer Drive, CB 7435, Chapel Hill, NC 27599, United States
- Edward A Ruiz-Narváez
- Department of Nutritional Sciences, University of Michigan School of Public Health, 1860 SPH I, 1415 Washington Heights, Ann Arbor, MI 48109, United States
- Stephen A Haddad
- Slone Epidemiology Center, Boston University, L-7, 72 East Concord Street, Boston, MA 02118, United States
- Jonine Figueroa
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, 9609 Medical Center Drive, Bethesda, MD 20892, United States
- Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh Medical School, 9 Little France Road, Edinburgh, EH16 4UX, United Kingdom
- Cancer Research UK Edinburgh Centre, Crewe Rd S, Edinburgh, EH4 2XR, United Kingdom
- Esther M John
- Department of Epidemiology & Population Health, Stanford University School of Medicine, 3145 Porter Dr, Suite E223, MC 5393, Palo Alto, CA 94304, United States
- Department of Medicine (Oncology), Stanford University School of Medicine, 291 Campus Drive Li Ka Shing Building, Stanford, CA 94305, United States
- Leslie Bernstein
- Division of Biomarkers of Early Detection and Prevention Department of Population Sciences, Beckman Research Institute of the City of Hope, City of Hope Comprehensive Cancer Center, 1500 East Duarte Road, Duarte, CA 91010, United States
- Jennifer J Hu
- Sylvester Comprehensive Cancer Center and Department of Public Health Sciences, University of Miami Miller School of Medicine, 1120 NW 14th St, CRB 1511, Miami, FL 33136, United States
- Regina G Ziegler
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, 9609 Medical Center Drive, Bethesda, MD 20892, United States
- Sarah Nyante
- Department of Radiology, School of Medicine, University of North Carolina at Chapel Hill, 130 Mason Farm Rd., Chapel Hill, NC 27599, United States
- Elisa V Bandera
- Cancer Epidemiology and Health Outcomes, Rutgers Cancer Institute of New Jersey, 120 Albany Street, Tower 2, 8th Floor, New Brunswick, NJ 08903, United States
- Sue A Ingles
- Department of Preventive Population and Public Health Sciences, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
- Nicholas Mancuso
- Department of Preventive Population and Public Health Sciences, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
- Michael F Press
- Department of Pathology, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, 1441 Eastlake Ave., Los Angeles, CA 90033, United States
- Sandra L Deming
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Jorge L Rodriguez-Gil
- Genomics, Development and Disease Section, Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, 31 Center Dr, Bethesda, MD 20894, United States
- Medical Scientist Training Program, School of Medicine and Public Health, University of Wisconsin-Madison, 750 Highland Ave., Madison, WI 53705, United States
- Song Yao
- Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, United States
- Temidayo O Ogundiran
- Department of Surgery, College of Medicine, University of Ibadan, Queen Elizabeth II Road, Ibadan, 200285, Nigeria
- Oladosu Ojengbede
- Center for Population and Reproductive Health, College of Medicine, University of Ibadan, UCH, Queen Elizabeth II Road, Ibadan, 200285, Nigeria
- Manjeet K Bolla
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, 2 Worts Causeway, Cambridge, CB1 8RN, United Kingdom
- Joe Dennis
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, 2 Worts Causeway, Cambridge, CB1 8RN, United Kingdom
- Alison M Dunning
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge, CB1 8RN, United Kingdom
- Douglas F Easton
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge, CB1 8RN, United Kingdom
- Kyriaki Michailidou
- Biostatistics Unit, The Cyprus Institute of Neurology & Genetics, Iroon Avenue 6, 2371 Ayius Dometios, Nicosia, Cyprus
- Paul D P Pharoah
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge, CB1 8RN, United Kingdom
- Dale P Sandler
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, PO Box 12233, Research Triangle Park, NC 27709, United States
- Jack A Taylor
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, PO Box 12233, Research Triangle Park, NC 27709, United States
- Qin Wang
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, 2 Worts Causeway, Cambridge, CB1 8RN, United Kingdom
- Katie M O’Brien
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, PO Box 12233, Research Triangle Park, NC 27709, United States
- Clarice R Weinberg
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, PO Box 12233, Research Triangle Park, NC 27709, United States
- Cari M Kitahara
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Drive, Bethesda, MD 20892, United States
- William Blot
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- International Epidemiology Institute, 1455 Research Boulevard, Rockville, MD 20850, United States
- Katherine L Nathanson
- Department of Medicine, Abramson Cancer Center, The Perelman School of Medicine at the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19140, United States
- Anselm Hennis
- Chronic Disease Research Centre and Faculty of Medical Sciences, University of the West Indies, Jemmotts Lane, Avalon, Bridgetown, Barbados
- Barbara Nemesure
- Department of Family, Population and Preventive Medicine, Stony Brook University, 100 Nicolls Road, Stony Brook, NY 11794, United States
- Stefan Ambs
- Laboratory of Human Carcinogenesis, National Cancer Institute, 37 Convent Drive, Bethesda, MD 20892, United States
- Lara E Sucheston-Campbell
- College of Pharmacy, The Ohio State University, 217 Lloyd M. Parks Hall, 500 West 12th Ave., Columbus, OH 43210, United States
- College of Veterinary Medicine, The Ohio State University, 1900 Coffey Road, Columbus, OH 43210, United States
- Jeannette T Bensen
- Department of Epidemiology, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 135 Dauer Drive, CB 7435, Chapel Hill, NC 27599, United States
- Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, 9609 Medical Center Drive, Bethesda, MD 20892, United States
- Andrew F Olshan
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, 170 Rosenau Hall, CB #7400, 135 Dauer Drive, Chapel Hill, NC 27599, United States
- Christine B Ambrosone
- Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, United States
- Olufunmilayo I Olopade
- Center for Clinical Cancer Genetics and Global Health, Department of Medicine, University of Chicago, 5841 S Maryland Avenue, Chicago, IL 60637, United States
- David V Conti
- Department of Preventive Population and Public Health Sciences, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
- Julie Palmer
- Slone Epidemiology Center, Boston University, L-7, 72 East Concord Street, Boston, MA 02118, United States
- Montserrat García-Closas
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, 9609 Medical Center Drive, Bethesda, MD 20892, United States
- Dezheng Huo
- Department of Public Health Sciences, University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637, United States
- Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Avenue, Nashville, TN 37203, United States
- Christopher Haiman
- Department of Preventive Population and Public Health Sciences, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
3
Kundu P, Chatterjee N. Logistic regression analysis of two-phase studies using generalized method of moments. Biometrics 2023; 79:241-252. [PMID: 34677824 DOI: 10.1111/biom.13584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2020] [Revised: 08/10/2021] [Accepted: 09/23/2021] [Indexed: 11/29/2022]
Abstract
Two-phase designs can reduce the cost of epidemiological studies by limiting the ascertainment of expensive covariates and/or exposures to an efficiently selected subset (phase-II) of a larger (phase-I) study. Efficient analysis of the resulting data set combining disparate information from phase-I and phase-II, however, can be complex. Most of the existing methods, including the semiparametric maximum-likelihood estimator, require the information in phase-I to be summarized into a fixed number of strata. In this paper, we describe a novel method for the analysis of two-phase studies where information from phase-I is summarized by parameters associated with a reduced logistic regression model of the disease outcome on available covariates. We then set up estimating equations for parameters associated with the desired extended logistic regression model, based on information on the reduced model parameters from phase-I and complete data available at phase-II after accounting for the nonrandom sampling design. We use the generalized method of moments to solve the overidentified estimating equations and develop the resulting asymptotic theory for the proposed estimator. Simulation studies show that the use of reduced parametric models, as opposed to summarizing data into strata, can lead to more efficient utilization of phase-I data. An application of the proposed method is illustrated using data from the U.S. National Wilms Tumor Study.
Affiliation(s)
- Prosenjit Kundu
- Department of Biostatistics, Bloomberg School of Public Health, The Johns Hopkins University, Baltimore, Maryland, USA
- Nilanjan Chatterjee
- Department of Biostatistics, Bloomberg School of Public Health; Department of Oncology, School of Medicine, The Johns Hopkins University, Baltimore, Maryland, USA
4
Wang YC, Wu Y, Choi J, Allington G, Zhao S, Khanfar M, Yang K, Fu PY, Wrubel M, Yu X, Mekbib KY, Ocken J, Smith H, Shohfi J, Kahle KT, Lu Q, Jin SC. Computational Genomics in the Era of Precision Medicine: Applications to Variant Analysis and Gene Therapy. J Pers Med 2022; 12:175. [PMID: 35207663 PMCID: PMC8878256 DOI: 10.3390/jpm12020175] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/30/2021] [Revised: 01/18/2022] [Accepted: 01/24/2022] [Indexed: 02/04/2023] Open
Abstract
Rapid methodological advances in statistical and computational genomics have enabled researchers to better identify and interpret both rare and common variants responsible for complex human diseases. As we continue to see an expansion of these advances in the field, it is now imperative for researchers to understand the resources and methodologies available for various data types and study designs. In this review, we provide an overview of recent methods for identifying rare and common variants and understanding their roles in disease etiology. Additionally, we discuss the strategy, challenge, and promise of gene therapy. As computational and statistical approaches continue to improve, we will have an opportunity to translate human genetic findings into personalized health care.
Affiliation(s)
- Yung-Chun Wang
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Yuchang Wu
- Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Julie Choi
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Garrett Allington
- Department of Pathology, Yale School of Medicine, New Haven, CT 06510, USA
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114, USA
- Shujuan Zhao
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Mariam Khanfar
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Kuangying Yang
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Po-Ying Fu
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Max Wrubel
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Xiaobing Yu
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Department of Computer Science & Engineering, Washington University, St. Louis, MO 63130, USA
- Kedous Y. Mekbib
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
- Jack Ocken
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
- Hannah Smith
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
- John Shohfi
- Department of Neurosurgery, Yale University School of Medicine, New Haven, CT 06510, USA
- Kristopher T. Kahle
- Department of Neurosurgery, Massachusetts General Hospital, Boston, MA 02114, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115, USA
- Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Qiongshi Lu
- Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Sheng Chih Jin
- Department of Genetics, School of Medicine, Washington University, St. Louis, MO 63110, USA
- Department of Pediatrics, School of Medicine, Washington University, St. Louis, MO 63110, USA
5
Lakhal-Chaieb L, Simard J, Bull S. Sequence kernel association test for survival outcomes in the presence of a non-susceptible fraction. Biostatistics 2021; 21:518-530. [PMID: 30590388 DOI: 10.1093/biostatistics/kxy075] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/11/2017] [Revised: 10/23/2018] [Accepted: 10/25/2018] [Indexed: 11/13/2022] Open
Abstract
In this work, we propose a single nucleotide polymorphism set association test for survival phenotypes in the presence of a non-susceptible fraction. We consider a mixture model with a logistic regression for the susceptibility indicator and a proportional hazards regression to model survival in the susceptible group. We propose a joint test to assess the significance of the genetic variant in both logistic and survival regressions simultaneously. We adopt the spirit of SKAT and conduct a variance-component test treating the genetic effects of multiple variants as random. We derive score-type test statistics, and we investigate several approaches to compute their $p$-values. The finite-sample properties of the proposed tests are assessed and compared to existing approaches by simulations and their use is illustrated through an application to ovarian cancer data from the Consortium of Investigators of Modifiers of BRCA1 and BRCA2.
Affiliation(s)
- Lajmi Lakhal-Chaieb
- Département de mathématiques et de statistique, Université Laval, 1045 de la médecine, Québec G1V 0A6, Canada
- Jacques Simard
- Département de médecine moléculaire, Chaire de recherche du Canada en oncogénétique, Université Laval, Québec G1V 0A6, Canada
- Shelley Bull
- Dalla Lana School of Public Health, University of Toronto, 6th floor, Health Sciences Building, 155 College Street, Toronto, Ontario M5T 3M7, Canada
- The Lunenfeld-Tanenbaum Research Institute, Sinai Health System, 60 Murray Street, Toronto, Ontario M5T 3L9, Canada
6
Alfares A, Alsubaie L, Aloraini T, Alaskar A, Althagafi A, Alahmad A, Rashid M, Alswaid A, Alothaim A, Eyaid W, Ababneh F, Albalwi M, Alotaibi R, Almutairi M, Altharawi N, Alsamer A, Abdelhakim M, Kafkas S, Mineta K, Cheung N, Abdallah AM, Büchmann-Møller S, Fukasawa Y, Zhao X, Rajan I, Hoehndorf R, Al Mutairi F, Gojobori T, Alfadhel M. What is the right sequencing approach? Solo VS extended family analysis in consanguineous populations. BMC Med Genomics 2020; 13:103. [PMID: 32680510 PMCID: PMC7368798 DOI: 10.1186/s12920-020-00743-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/08/2019] [Accepted: 06/19/2020] [Indexed: 02/04/2023] Open
Abstract
Background Choosing a testing strategy is crucial for genetics clinics and testing laboratories. In this study, we compared the hit rates of solo, trio, and trio plus testing, and of trio versus sibship testing. Finally, we studied the impact of extended family analysis, mainly in complex and unsolved cases. Methods Three cohorts were used for this analysis: one cohort to assess the hit rate between solo, trio and trio plus testing, another cohort to examine the impact of the testing strategy of sibship genome vs trio-based analysis, and a third cohort to test the impact of an extended family analysis of up to eight family members to lower the number of candidate variants. Results The hit rates in solo, trio and trio plus testing were 39, 40, and 41%, respectively. The total number of candidate variants in the sibship testing strategy was 117 compared to 59 in the trio-based analysis. In trio-based analysis, the average number of candidate variants was 1192 coding and 26,454 noncoding, and this number was lowered by 50–75% after adding additional family members, with up to two coding and 66 noncoding homozygous variants only, in families with eight family members. Conclusion There was no difference in the hit rate between solo and extended family testing. Trio-based analysis was a better approach than sibship testing, even in a consanguineous population. Finally, each additional family member helped to narrow down the number of variants by 50–75%. Our findings could help clinicians, researchers and testing laboratories select the most cost-effective and appropriate sequencing approach for their patients. Furthermore, extended family analysis is a very useful tool for complex cases with novel genes.
Affiliation(s)
- Ahmed Alfares
- Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Department of Pediatrics, College of Medicine, Qassim University, Qassim, Saudi Arabia
- Qassim University, Department of Pediatrics, Almulyda, Saudi Arabia
- Lamia Alsubaie
- Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Taghrid Aloraini
- Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Aljoharah Alaskar
- Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Azza Althagafi
- Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Ahmed Alahmad
- Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Mamoon Rashid
- King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Abdulrahman Alswaid
- Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Ali Alothaim
- Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Wafaa Eyaid
- Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Faroug Ababneh
- Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Mohammed Albalwi
- Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Raniah Alotaibi
- King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Mashael Almutairi
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Nouf Altharawi
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
| | - Alhanouf Alsamer
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
| | - Marwa Abdelhakim
- Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Senay Kafkas
- Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Katsuhiko Mineta
- Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Nicole Cheung
- King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
| | - Abdallah M Abdallah
- Department of Basic Medical Sciences, College of Medicine, QU Health, Qatar University, Doha, Qatar
| | - Stine Büchmann-Møller
- King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
| | - Yoshinori Fukasawa
- King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
| | - Xiang Zhao
- King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
| | - Issaac Rajan
- King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
| | - Robert Hoehndorf
- King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
| | - Fuad Al Mutairi
- Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia.,King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
| | - Takashi Gojobori
- Biological and Environmental Science and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Majid Alfadhel
- Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia.,King Abdullah International Medical Research Center, Riyadh, Saudi Arabia.,King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
| |
|
7
|
Jones RM, Melton PE, Pinese M, Rea AJ, Ingley E, Ballinger ML, Wood DJ, Thomas DM, Moses EK. Identification of novel sarcoma risk genes using a two-stage genome wide DNA sequencing strategy in cancer cluster families and population case and control cohorts. BMC MEDICAL GENETICS 2019; 20:69. [PMID: 31053105 PMCID: PMC6499942 DOI: 10.1186/s12881-019-0808-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/31/2018] [Accepted: 04/16/2019] [Indexed: 12/26/2022]
Abstract
Background Although familial clustering of cancers is relatively common, only a small proportion of familial cancer risk can be explained by known cancer predisposition genes. Methods In this study we employed a two-stage approach to identify candidate sarcoma risk genes. First, we conducted whole exome sequencing in three multigenerational cancer families ascertained through a sarcoma proband (n = 19) in order to prioritize candidate genes for validation in an independent case-control cohort of sarcoma patients using family-based association and segregation analysis. The second stage employed a burden analysis of rare variants within prioritized candidate genes identified from stage one in 560 sarcoma cases and 1144 healthy ageing controls, for which whole genome sequence was available. Results Variants from eight genes were identified in stage one. Following gene-based burden testing and after correction for multiple testing, two of these genes, ABCB5 and C16orf96, were determined to show statistically significant association with cancer. The ABCB5 gene was found to have a higher burden of putative regulatory variants (OR = 4.9, p-value = 0.007, q-value = 0.04) based on allele counts in sarcoma cases compared to controls. C16orf96 was found to have a significantly lower burden (OR = 0.58, p-value = 0.0004, q-value = 0.003) of regulatory variants in controls compared to sarcoma cases. Conclusions Based on these genetic association data we propose that ABCB5 and C16orf96 are novel candidate risk genes for sarcoma. Although neither of these two genes has been previously associated with sarcoma, ABCB5 has been shown to share clinical drug resistance associations with melanoma and leukaemia, and C16orf96 shares regulatory elements with genes that are involved with TNF-alpha mediated apoptosis in a p53/TP53-dependent manner. Future genetic studies in other family and population cohorts will be required for further validation of these novel findings.
Electronic supplementary material The online version of this article (10.1186/s12881-019-0808-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Rachel M Jones
- The Curtin UWA Centre for Genetic Origins of Health and Disease, Faculty of Health Sciences, Curtin University and Faculty of Health and Medical Sciences, M409 The University of Western Australia, 35 Stirling Hwy, Crawley, 6009, Western Australia.,Medical School, Faculty of Health and Medical Sciences, University of Western Australia, Crawley, Australia
| | - Phillip E Melton
- The Curtin UWA Centre for Genetic Origins of Health and Disease, Faculty of Health Sciences, Curtin University and Faculty of Health and Medical Sciences, M409 The University of Western Australia, 35 Stirling Hwy, Crawley, 6009, Western Australia.,School of Pharmacy and Biomedical Sciences, Faculty of Health Sciences, Curtin University, Bentley, Western Australia
| | - Mark Pinese
- Cancer Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | - Alexander J Rea
- The Curtin UWA Centre for Genetic Origins of Health and Disease, Faculty of Health Sciences, Curtin University and Faculty of Health and Medical Sciences, M409 The University of Western Australia, 35 Stirling Hwy, Crawley, 6009, Western Australia
| | - Evan Ingley
- School of Veterinary and Life Sciences, Murdoch University, Murdoch, Australia.,Harry Perkins Institute of Medical Research, Murdoch, Western Australia.,The Centre for Medical Research, The University of Western Australia, Crawley, Australia
| | - Mandy L Ballinger
- Cancer Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | | | - David J Wood
- Medical School, Faculty of Health and Medical Sciences, University of Western Australia, Crawley, Australia
| | - David M Thomas
- Cancer Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | - Eric K Moses
- The Curtin UWA Centre for Genetic Origins of Health and Disease, Faculty of Health Sciences, Curtin University and Faculty of Health and Medical Sciences, M409 The University of Western Australia, 35 Stirling Hwy, Crawley, 6009, Western Australia.,School of Pharmacy and Biomedical Sciences, Faculty of Health Sciences, Curtin University, Bentley, Western Australia.,School of Biomedical Sciences, Faculty of Health and Medical Sciences, The University of Western Australia, Crawley, Australia.
| |
|
8
|
Multivariate association test for rare variants controlling for cryptic and family relatedness. CAN J STAT 2019. [DOI: 10.1002/cjs.11475] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/07/2022]
|
9
|
Wang X, Zhang Z, Morris N, Cai T, Lee S, Wang C, Yu TW, Walsh CA, Lin X. Rare variant association test in family-based sequencing studies. Brief Bioinform 2018; 18:954-961. [PMID: 27677958 DOI: 10.1093/bib/bbw083] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 05/06/2016] [Indexed: 12/20/2022] Open
Abstract
The objective of this article is to introduce valid and robust methods for the analysis of rare variants for family-based exome chips, whole-exome sequencing or whole-genome sequencing data. Family-based designs provide unique opportunities to detect genetic variants that complement studies of unrelated individuals. Currently, limited methods and software tools have been developed to assist family-based association studies with rare variants, especially for analyzing binary traits. In this article, we address this gap by extending existing burden and kernel-based gene set association tests for population data to related samples, with a particular emphasis on binary phenotypes. The proposed approach blends the strengths of kernel machine methods and generalized estimating equations. Importantly, the efficient generalized kernel score test can be applied as a mega-analysis framework to combine studies with different designs. We illustrate the application of the proposed method using data from an exome sequencing study of autism. Methods discussed in this article are implemented in an R package 'gskat', which is available on CRAN and GitHub.
|
10
|
Espin-Garcia O, Craiu RV, Bull SB. Two-phase designs for joint quantitative-trait-dependent and genotype-dependent sampling in post-GWAS regional sequencing. Genet Epidemiol 2017; 42:104-116. [PMID: 29239496 PMCID: PMC5814750 DOI: 10.1002/gepi.22099] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 09/07/2017] [Revised: 10/23/2017] [Accepted: 10/23/2017] [Indexed: 11/09/2022]
Abstract
We evaluate two‐phase designs to follow up findings from a genome‐wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation‐maximization‐based inference under a semiparametric maximum likelihood formulation tailored for post‐GWAS inference. A GWAS‐SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT‐SNP‐dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme‐QT strata yields significant power improvements compared to marginal QT‐ or SNP‐based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure.
Affiliation(s)
- Osvaldo Espin-Garcia
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.,Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
| | - Radu V Craiu
- Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
| | - Shelley B Bull
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.,Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
| |
|
11
|
Bull SB, Andrulis IL, Paterson AD. Statistical challenges in high-dimensional molecular and genetic epidemiology. CAN J STAT 2017. [DOI: 10.1002/cjs.11342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Shelley B. Bull
- Lunenfeld-Tanenbaum Research Institute; Sinai Health System; Toronto Ontario, Canada M5T 3L9
- Dalla Lana School of Public Health; University of Toronto; Toronto, Ontario Canada M5T 3M7
| | - Irene L. Andrulis
- Lunenfeld-Tanenbaum Research Institute; Sinai Health System; Toronto Ontario, Canada M5T 3L9
- Department of Molecular Genetics; University of Toronto; Toronto, Ontario Canada M5S 1A8
| | - Andrew D. Paterson
- Dalla Lana School of Public Health; University of Toronto; Toronto, Ontario Canada M5T 3M7
- Genetics and Genome Biology Program; The Hospital for Sick Children; Toronto, Ontario Canada M5G 0A4
| |
|
12
|
Salomon MP, Li WLS, Edlund CK, Morrison J, Fortini BK, Win AK, Conti DV, Thomas DC, Duggan D, Buchanan DD, Jenkins MA, Hopper JL, Gallinger S, Le Marchand L, Newcomb PA, Casey G, Marjoram P. GWASeq: targeted re-sequencing follow up to GWAS. BMC Genomics 2016; 17:176. [PMID: 26940994 PMCID: PMC4776370 DOI: 10.1186/s12864-016-2459-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 01/22/2015] [Accepted: 02/09/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND For the last decade the conceptual framework of the Genome-Wide Association Study (GWAS) has dominated the investigation of human disease and other complex traits. While GWAS have been successful in identifying a large number of variants associated with various phenotypes, the overall amount of heritability explained by these variants remains small. This raises the question of how best to follow up on a GWAS, localize causal variants accounting for GWAS hits, and as a consequence explain more of the so-called "missing" heritability. Advances in high throughput sequencing technologies now allow for the efficient and cost-effective collection of vast amounts of fine-scale genomic data to complement GWAS. RESULTS We investigate these issues using a colon cancer dataset. After QC, our data consisted of 1993 cases and 899 controls. Using marginal tests of association, we identify 10 variants distributed among six targeted regions that are significantly associated with colorectal cancer, with eight of the variants being novel to this study. Additionally, we perform so-called 'SNP-set' tests of association and identify two sets of variants that implicate both common and rare variants in the etiology of colorectal cancer. CONCLUSIONS Here we present a large-scale targeted re-sequencing resource focusing on genomic regions implicated in colorectal cancer susceptibility previously identified in several GWAS, which aims to 1) provide fine-scale targeted sequencing data for fine-mapping and 2) provide data resources to address methodological questions regarding the design of sequencing-based follow-up studies to GWAS. Additionally, we show that this strategy successfully identifies novel variants associated with colorectal cancer susceptibility and can implicate both common and rare variants.
Affiliation(s)
- Matthew P Salomon
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.,Department of Molecular Oncology, John Wayne Cancer Institute at Providence Saint John's Health Center, Santa Monica, CA, USA.
| | - Wai Lok Sibon Li
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| | - Christopher K Edlund
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| | - John Morrison
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| | - Barbara K Fortini
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| | - Aung Ko Win
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Melbourne, VIC, Australia.
| | - David V Conti
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| | - Duncan C Thomas
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| | - David Duggan
- Translational Genomics Research Institute, Phoenix, AZ, USA.
| | - Daniel D Buchanan
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Melbourne, VIC, Australia.,Oncogenomics Group, Genetic Epidemiology Laboratory, Department of Pathology, The University of Melbourne, Parkville, Melbourne, VIC, Australia.
| | - Mark A Jenkins
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Melbourne, VIC, Australia.
| | - John L Hopper
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Melbourne, VIC, Australia.
| | - Steven Gallinger
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, ON, Canada.
| | | | - Polly A Newcomb
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
| | - Graham Casey
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| | - Paul Marjoram
- Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
| |
|
13
|
Abstract
For the first time in the history of human genetics research, it is now both technically feasible and economically affordable to screen individual genomes for novel disease-causing mutations at base-pair resolution using "next-generation sequencing" (NGS). One popular aim in many of today's NGS studies is genome resequencing (in part or whole) to identify DNA variants potentially accounting for the "missing heritability" problem observed in many genetically complex traits. Thus far, only relatively few projects have applied these powerful new technologies to search for novel Alzheimer's disease (AD) related sequence variants. In this review, I summarize the findings from the first NGS-based resequencing studies in AD and discuss their potential implications and limitations. Notable recent discoveries using NGS include the identification of rare susceptibility modifying alleles in APP, TREM2, and PLD3. Several other large-scale NGS projects are currently underway so that additional discoveries can be expected over the coming years.
|
14
|
Ding J, Li H. Comparison of robust tests for genetic association analysis incorporating uncertain genotype. COMMUN STAT-SIMUL C 2015. [DOI: 10.1080/03610918.2015.1091077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
15
|
Leclerc M, Simard J, Lakhal-Chaieb L. SNP Set Association Testing for Survival Outcomes in the Presence of Intrafamilial Correlation. Genet Epidemiol 2015; 39:406-14. [PMID: 26282997 DOI: 10.1002/gepi.21914] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/06/2015] [Revised: 06/04/2015] [Accepted: 06/17/2015] [Indexed: 11/06/2022]
Abstract
In this work, we propose a single nucleotide polymorphism (SNP) set association test for censored phenotypes in the presence of a family-based design. The proposed test is valid for both common and rare variants. A proportional hazards Cox model is specified for the marginal distribution of the trait and the familial dependence is modeled via a Gaussian copula. Censored values are treated as partially missing data and a multiple imputation procedure is proposed in order to compute the test statistics. The P-value is then deduced analytically. The finite-sample empirical properties of the proposed method are evaluated and compared to existing competitors by simulations and its use is illustrated using a breast cancer data set from the Consortium of Investigators of Modifiers of BRCA1 and BRCA2.
Affiliation(s)
- Martin Leclerc
- Département de mathématiques et de statistique, Université Laval, Québec, Canada
| | | | - Jacques Simard
- Department of Molecular Medicine, Canada Research Chair in Oncogenetics, Laval University & Genomics Centre, CHU de Québec Research Centre, Québec, Canada
| | - Lajmi Lakhal-Chaieb
- Département de mathématiques et de statistique, Université Laval, Québec, Canada
| |
|
16
|
Wang X, Biernacka JM. Assessing the effects of multiple markers in genetic association studies. Front Genet 2015; 6:66. [PMID: 25759719 PMCID: PMC4338793 DOI: 10.3389/fgene.2015.00066] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/08/2015] [Accepted: 02/09/2015] [Indexed: 11/13/2022] Open
Affiliation(s)
- Xuefeng Wang
- Department of Preventive Medicine, Stony Brook University Stony Brook, NY, USA
| | | |
|
17
|
Chen Z, Craiu RV, Bull SB. A note on the efficiencies of sampling strategies in two-stage Bayesian regional fine mapping of a quantitative trait. Genet Epidemiol 2014; 38:599-609. [PMID: 25132153 DOI: 10.1002/gepi.21845] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/21/2013] [Revised: 06/12/2014] [Accepted: 06/16/2014] [Indexed: 11/09/2022]
Abstract
In focused studies designed to follow up associations detected in a genome-wide association study (GWAS), investigators can proceed to fine-map a genomic region by targeted sequencing or dense genotyping of all variants in the region, aiming to identify a functional sequence variant. For the analysis of a quantitative trait, we consider a Bayesian approach to fine-mapping study design that incorporates stratification according to a promising GWAS tag SNP in the same region. Improved cost-efficiency can be achieved when the fine-mapping phase incorporates a two-stage design, with identification of a smaller set of more promising variants in a subsample taken in stage 1, followed by their evaluation in an independent stage 2 subsample. To avoid the potential negative impact of genetic model misspecification on inference we incorporate genetic model selection based on posterior probabilities for each competing model. Our simulation study shows that, compared to simple random sampling that ignores genetic information from GWAS, tag-SNP-based stratified sample allocation methods reduce the number of variants continuing to stage 2 and are more likely to promote the functional sequence variant into confirmation studies.
Affiliation(s)
- Zhijian Chen
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
| | | | | |
|
18
|
A statistical framework to guide sequencing choices in pedigrees. Am J Hum Genet 2014; 94:257-67. [PMID: 24507777 DOI: 10.1016/j.ajhg.2014.01.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 11/14/2013] [Accepted: 01/13/2014] [Indexed: 11/23/2022] Open
Abstract
The use of large pedigrees is an effective design for identifying rare functional variants affecting heritable traits. Cost-effective studies using sequence data can be achieved via pedigree-based genotype imputation in which some subjects are sequenced and missing genotypes are inferred on the remaining subjects. Because of high cost, it is important to carefully prioritize subjects for sequencing. Here, we introduce a statistical framework that enables systematic comparison among subject-selection choices for sequencing. We introduce a metric "local coverage," which allows the use of inferred inheritance vectors to measure genotype-imputation ability specifically in a region of interest, such as one with prior evidence of linkage. In the absence of linkage information, we can instead use a "genome-wide coverage" metric computed with the pedigree structure. These metrics enable the development of a method that identifies efficient selection choices for sequencing. As implemented in GIGI-Pick, this method also flexibly allows initial manual selection of subjects and optimizes selections within the constraint that only some subjects might be available for sequencing. In the present study, we used simulations to compare GIGI-Pick with PRIMUS, ExomePicks, and common ad hoc methods of selecting subjects. In genotype imputation of both common and rare alleles, GIGI-Pick substantially outperformed all other methods considered and had the added advantage of incorporating prior linkage information. We also used a real pedigree to demonstrate the utility of our approach in identifying causal mutations. Our work enables prioritization of subjects for sequencing to facilitate dissection of the genetic basis of heritable traits.
|