1
Yoshida S, Yamaguchi Y, Maruo K, Gosho M. Permutation-based global rank test with adaptive weights for multiple primary endpoints. Stat Methods Med Res 2025:9622802251334886. [PMID: 40368380 DOI: 10.1177/09622802251334886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2025]
Abstract
Multiple efficacy endpoints are investigated in clinical trials, and selecting the appropriate primary endpoints is key to the study's success. The global test is an analysis approach that can handle multiple endpoints without multiplicity adjustment. This test, which aggregates the statistics from multiple primary endpoints into a single statistic using weights for the statistical comparison, has been gaining increasing attention. A key consideration in the global test is the determination of the weights. In this study, we propose a novel global rank test in which the weights for each endpoint are estimated from the current study data to maximize the test statistic, and a permutation test is applied to control the type I error rate. Simulation studies comparing the proposed test with other global tests show that the proposed test controls the type I error rate at the nominal level, regardless of the number of primary endpoints and the correlations between endpoints. Additionally, the proposed test offers higher statistical power when the efficacy differs considerably between endpoints or when endpoints are moderately correlated, such as when the correlation coefficient is greater than or equal to 0.5.
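The general idea in this abstract (data-driven weights combined with a permutation null) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the per-endpoint statistic is a standardized difference in mean ranks, the weights are assumed nonnegative with unit L2 norm (so the maximized weighted sum reduces to the norm of the positive parts whenever at least one endpoint favors treatment), and all function names are hypothetical.

```python
import math
import random

def ranks(values):
    """Ranks 1..n; ties are broken arbitrarily (adequate for continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = float(rank)
    return r

def endpoint_z(rank_cols, group):
    """Standardized mean-rank difference (group 1 minus group 0) per endpoint."""
    n = len(group)
    n1 = sum(group)
    n0 = n - n1
    zs = []
    for col in rank_cols:
        mean1 = sum(r for r, g in zip(col, group) if g) / n1
        mean0 = sum(r for r, g in zip(col, group) if not g) / n0
        mean_all = sum(col) / n
        var = sum((r - mean_all) ** 2 for r in col) / (n - 1)
        zs.append((mean1 - mean0) / math.sqrt(var * (1 / n1 + 1 / n0)))
    return zs

def adaptive_stat(zs):
    """Assumed adaptive statistic: L2 norm of the positive parts of the z-values.

    Equals max over w >= 0, ||w||_2 = 1 of sum(w_k * z_k) when some z_k > 0.
    """
    return math.sqrt(sum(max(z, 0.0) ** 2 for z in zs))

def permutation_pvalue(data_cols, group, n_perm=2000, seed=0):
    """One-sided permutation p-value; the weights are re-derived in every permutation."""
    rng = random.Random(seed)
    rank_cols = [ranks(col) for col in data_cols]  # ranks do not depend on labels
    observed = adaptive_stat(endpoint_z(rank_cols, group))
    labels = list(group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if adaptive_stat(endpoint_z(rank_cols, labels)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because the weights are chosen after looking at the data, the statistic has no simple reference distribution. Recomputing the maximized statistic under every relabeling is what lets the permutation null account for that selection step.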
Affiliation(s)
- Satoshi Yoshida: Data Science, Astellas Pharma Inc., Tokyo, Japan; Graduate School of Comprehensive Human Sciences, University of Tsukuba, Ibaraki, Japan
- Yusuke Yamaguchi: Data Science, Astellas Pharma Global Development Inc., Northbrook, IL, USA
- Kazushi Maruo: Department of Biostatistics, Institute of Medicine, University of Tsukuba, Ibaraki, Japan
- Masahiko Gosho: Department of Biostatistics, Institute of Medicine, University of Tsukuba, Ibaraki, Japan
2
Geng LN, Bonilla H, Hedlin H, Jacobson KB, Tian L, Jagannathan P, Yang PC, Subramanian AK, Liang JW, Shen S, Deng Y, Shaw BJ, Botzheim B, Desai M, Pathak D, Jazayeri Y, Thai D, O’Donnell A, Mohaptra S, Leang Z, Reynolds GZM, Brooks EF, Bhatt AS, Shafer RW, Miglis MG, Quach T, Tiwari A, Banerjee A, Lopez RN, De Jesus M, Charnas LR, Utz PJ, Singh U. Nirmatrelvir-Ritonavir and Symptoms in Adults With Postacute Sequelae of SARS-CoV-2 Infection: The STOP-PASC Randomized Clinical Trial. JAMA Intern Med 2024; 184:1024-1034. [PMID: 38848477 PMCID: PMC11161857 DOI: 10.1001/jamainternmed.2024.2007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 03/30/2024] [Indexed: 06/09/2024]
Abstract
Importance: There is an urgent need to identify treatments for postacute sequelae of SARS-CoV-2 infection (PASC).
Objective: To assess the efficacy of a 15-day course of nirmatrelvir-ritonavir in reducing the severity of select PASC symptoms.
Design, Setting, and Participants: This was a 15-week blinded, placebo-controlled, randomized clinical trial conducted from November 2022 to September 2023 at Stanford University (California). The participants were adults with moderate to severe PASC symptoms of 3 months or longer duration.
Interventions: Participants were randomized 2:1 to treatment with oral nirmatrelvir-ritonavir (NMV/r, 300 mg and 100 mg) or with placebo-ritonavir (PBO/r) twice daily for 15 days.
Main Outcomes and Measures: The primary outcome was a pooled severity of 6 PASC symptoms (fatigue, brain fog, shortness of breath, body aches, gastrointestinal symptoms, and cardiovascular symptoms) based on a Likert scale score at 10 weeks. Secondary outcomes included symptom severity at different time points, symptom burden and relief, patient global measures, Patient-Reported Outcomes Measurement Information System (PROMIS) measures, orthostatic vital signs, and sit-to-stand test change from baseline.
Results: Of the 155 participants (median [IQR] age, 43 [34-54] years; 92 [59%] females), 102 were randomized to the NMV/r group and 53 to the PBO/r group. Nearly all participants (n = 153) had received the primary series for COVID-19 vaccination. Mean (SD) time between index SARS-CoV-2 infection and randomization was 17.5 (9.1) months. There was no statistically significant difference in the model-derived severity outcome pooled across the 6 core symptoms at 10 weeks between the NMV/r and PBO/r groups. No statistically significant between-group differences were found at 10 weeks in the Patient Global Impression of Severity or Patient Global Impression of Change scores, summative symptom scores, or change from baseline to 10 weeks in PROMIS fatigue, dyspnea, cognitive function, and physical function measures. Adverse event rates were similar in the NMV/r and PBO/r groups, and events were mostly of low grade.
Conclusions and Relevance: The results of this randomized clinical trial showed that a 15-day course of NMV/r in a population of patients with PASC was generally safe but did not demonstrate a significant benefit for improving select PASC symptoms in a mostly vaccinated cohort with protracted symptom duration. Further studies are needed to determine the role of antivirals in the treatment of PASC.
Trial Registration: ClinicalTrials.gov Identifier: NCT05576662.
Affiliation(s)
- Linda N. Geng: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Hector Bonilla: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Haley Hedlin: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Karen B. Jacobson: Department of Medicine, Stanford University School of Medicine, Stanford, California; Kaiser Permanente Northern California Division of Research, Oakland
- Lu Tian: Department of Biomedical Data Science, Stanford School of Medicine, Stanford, California
- Prasanna Jagannathan: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Phillip C. Yang: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Aruna K. Subramanian: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Jane W. Liang: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Sa Shen: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Yaowei Deng: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Blake J. Shaw: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Bren Botzheim: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Manisha Desai: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Divya Pathak: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Yasmin Jazayeri: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Daniel Thai: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Andrew O’Donnell: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Sukanya Mohaptra: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Zenita Leang: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Erin F. Brooks: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Ami S. Bhatt: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Robert W. Shafer: Department of Medicine, Stanford University School of Medicine, Stanford, California
- Mitchell G. Miglis: Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California
- Tom Quach: Stanford University, Stanford, California
- Anindita Banerjee: Pfizer Research and Development, Pfizer Inc, Cambridge, Massachusetts
- Rene N. Lopez: Clinical Research Collaborations COE, Worldwide Medical and Safety, Pfizer Inc, Groton, Connecticut
- Magdia De Jesus: Strategic Planning, Worldwide Medical and Safety, Pfizer Inc, New York, New York
- Lawrence R. Charnas: Clinical Research Collaborations COE, Worldwide Medical and Safety, Pfizer Inc, Groton, Connecticut
- Paul J. Utz: Department of Medicine, Stanford University School of Medicine, Stanford, California; Institute for Immunity, Transplantation and Infection, Stanford University, Stanford, California
- Upinder Singh: Department of Medicine, Stanford University School of Medicine, Stanford, California; Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, California
3
Deng Q, Song C, Lin S. An adaptive and robust method for multi-trait analysis of genome-wide association studies using summary statistics. Eur J Hum Genet 2024; 32:681-690. [PMID: 37237036 PMCID: PMC11153499 DOI: 10.1038/s41431-023-01389-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 05/01/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023]
Abstract
Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with human traits or diseases in the past decade. Nevertheless, much of the heritability of many traits is still unaccounted for. Commonly used single-trait analysis methods are conservative, while multi-trait methods improve statistical power by integrating association evidence across multiple traits. In contrast to individual-level data, GWAS summary statistics are usually publicly available, and thus methods using only summary statistics have greater usage. Although many methods have been developed for joint analysis of multiple traits using summary statistics, there are many issues, including inconsistent performance, computational inefficiency, and numerical problems when considering lots of traits. To address these challenges, we propose a multi-trait adaptive Fisher method for summary statistics (MTAFS), a computationally efficient method with robust power performance. We applied MTAFS to two sets of brain imaging derived phenotypes (IDPs) from the UK Biobank, including a set of 58 Volumetric IDPs and a set of 212 Area IDPs. Through annotation analysis, the underlying genes of the SNPs identified by MTAFS were found to exhibit higher expression and are significantly enriched in brain-related tissues. Together with results from a simulation study, MTAFS shows its advantage over existing multi-trait methods, with robust performance across a range of underlying settings. It controls type 1 error well and can efficiently handle a large number of traits.
Affiliation(s)
- Qiaolan Deng: Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA; Department of Statistics, College of Arts and Sciences, The Ohio State University, Columbus, OH, USA
- Chi Song: Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA
- Shili Lin: Department of Statistics, College of Arts and Sciences, The Ohio State University, Columbus, OH, USA
4
Xie H, Cao X, Zhang S, Sha Q. Joint analysis of multiple phenotypes for extremely unbalanced case-control association studies using multi-layer network. Bioinformatics 2023; 39:btad707. [PMID: 37991852 PMCID: PMC10697735 DOI: 10.1093/bioinformatics/btad707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2023] [Revised: 09/29/2023] [Accepted: 11/21/2023] [Indexed: 11/24/2023]
Abstract
MOTIVATION Genome-wide association studies (GWAS) are an essential tool for analyzing associations between phenotypes and single nucleotide polymorphisms (SNPs). Most binary phenotypes in large biobanks are extremely unbalanced, which leads to inflated type I error rates for many widely used association tests for joint analysis of multiple phenotypes. In this article, we first propose a novel method to construct a Multi-Layer Network (MLN) using individuals with at least one case status among all phenotypes. Then, we introduce a computationally efficient community detection method to group phenotypes into disjoint clusters based on the MLN. Finally, we propose a novel approach, MLN with Omnibus (MLN-O), to jointly analyze the association between phenotypes and a SNP. MLN-O uses the score test to test the association between each merged phenotype in a cluster and a SNP, then uses the Omnibus test to obtain an overall test statistic for the association between all phenotypes and a SNP. RESULTS We conduct extensive simulation studies, which show that the proposed approach can control type I error rates and is more powerful than some existing methods. We also apply the proposed method to a real data set in the UK Biobank. Using phenotypes in Chapter XIII (Diseases of the musculoskeletal system and connective tissue) in the UK Biobank, we find that MLN-O identifies more significant SNPs than the other methods we compare it with. AVAILABILITY AND IMPLEMENTATION https://github.com/Hongjing-Xie/Multi-Layer-Network-with-Omnibus-MLN-O.
Affiliation(s)
- Hongjing Xie: Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, United States
- Xuewei Cao: Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, United States
- Shuanglin Zhang: Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, United States
- Qiuying Sha: Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, United States
5
Kulminski AM, Feng F, Loiko E, Nazarian A, Loika Y, Culminskaya I. Prevailing Antagonistic Risks in Pleiotropic Associations with Alzheimer's Disease and Diabetes. J Alzheimers Dis 2023; 94:1121-1132. [PMID: 37355909 PMCID: PMC10666173 DOI: 10.3233/jad-230397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2023]
Abstract
BACKGROUND The lack of efficient preventive interventions against Alzheimer's disease (AD) calls for identifying efficient modifiable risk factors for AD. As diabetes shares many pathological processes with AD, including accumulation of amyloid plaques and neurofibrillary tangles, insulin resistance, and impaired glucose metabolism, diabetes is thought to be a potentially modifiable risk factor for AD. Mounting evidence suggests that links between AD and diabetes may be more complex than previously believed. OBJECTIVE To examine the pleiotropic architecture of AD and diabetes mellitus (DM). METHODS Univariate and pleiotropic analyses were performed following the discovery-replication strategy using individual-level data from 10 large-scale studies. RESULTS We report a potentially novel pleiotropic NOTCH2 gene, with a minor allele of rs5025718 associated with increased risks of both AD and DM. We confirm previously identified antagonistic associations of the same variants with the risks of AD and DM in the HLA and APOE gene clusters. We show multiple antagonistic associations of the same variants with AD and DM in the HLA cluster, which were not explained by the lead SNP in this cluster. Although the ɛ2 and ɛ4 alleles played a major role in the antagonistic associations with AD and DM in the APOE cluster, we identified non-overlapping SNPs in this cluster, which were adversely and beneficially associated with AD and DM independently of the ɛ2 and ɛ4 alleles. CONCLUSION This study emphasizes differences and similarities in the heterogeneous genetic architectures of AD and DM, which may differentiate the pathogenic mechanisms of these diseases.
Affiliation(s)
- Alexander M Kulminski: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27705, USA
- Fan Feng: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27705, USA
- Elena Loiko: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27705, USA
- Alireza Nazarian: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27705, USA
- Yury Loika: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27705, USA
- Irina Culminskaya: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27705, USA
6
Kulminski AM, Loiko E, Loika Y, Culminskaya I. Pleiotropic predisposition to Alzheimer's disease and educational attainment: insights from the summary statistics analysis. GeroScience 2022; 44:265-280. [PMID: 34743297 PMCID: PMC8572080 DOI: 10.1007/s11357-021-00484-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/06/2021] [Accepted: 10/28/2021] [Indexed: 12/25/2022]
Abstract
Epidemiological studies report beneficial associations of higher educational attainment (EDU) with Alzheimer's disease (AD). Prior genome-wide association studies (GWAS) also reported variants associated with AD and EDU separately. The analysis of pleiotropic associations with these phenotypes may shed light on EDU-related protection against AD. We performed pleiotropic meta-analyses using Fisher's method and omnibus test applied to summary statistics for single nucleotide polymorphisms (SNPs) associated with AD and EDU in large-scale univariate GWAS at suggestive-effect (5 × 10⁻⁸ < p < 0.1) and genome-wide (p ≤ 5 × 10⁻⁸) significance levels. We report 53 SNPs that attained p ≤ 5 × 10⁻⁸ at least in one of the pleiotropic meta-analyses and were reported in the univariate GWAS at 5 × 10⁻⁸ < p < 0.1. Of them, there were 46 pleiotropic SNPs according to Fisher's method. Additionally, Fisher's method identified 25 of 206 SNPs with pleiotropic effects, which attained p ≤ 5 × 10⁻⁸ in the univariate GWAS. We showed that a large fraction of the pleiotropic associations was affected by a counterintuitive phenomenon of antagonistic genetic heterogeneity, which explains the increase, rather than decrease, of the significance of the pleiotropic associations in the omnibus test. Functional enrichment analysis showed that apart from cancers, gene set harboring the non-pleiotropic SNPs was characterized by late-onset AD and neurodevelopmental disorders. The pleiotropic gene set was characterized by a broad spectrum of progressive neurological and neuromuscular diseases and immune-mediated conditions, including progressive motor neuropathy, multiple sclerosis, Parkinson's disease, and severe AD. Our results suggest that disentangling genes harboring variants with and without pleiotropic associations with AD and EDU is promising for dissecting heterogeneity in biological mechanisms of AD.
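For readers unfamiliar with Fisher's method mentioned in this abstract, a minimal sketch follows. It implements the textbook form, which assumes independent p-values: the statistic is -2 times the sum of log p-values, referred to a chi-square distribution with 2k degrees of freedom. Whether and how the paper adjusts for correlation between the AD and EDU summary statistics is not stated here, so this is not a reproduction of its analysis. The function names are hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function for even df, using the closed-form series."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total

def fisher_combined_pvalue(pvalues):
    """Combine k independent p-values: X = -2 * sum(log p) ~ chi-square(2k) under H0."""
    if not pvalues or any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even_df(stat, 2 * len(pvalues))
```

Two individually suggestive p-values can combine into a much smaller one, which is how a SNP below genome-wide significance in each univariate GWAS can still reach it in a pleiotropic meta-analysis.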
Affiliation(s)
- Alexander M Kulminski: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27708-0408, USA
- Elena Loiko: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27708-0408, USA
- Yury Loika: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27708-0408, USA
- Irina Culminskaya: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, 27708-0408, USA
7
Kulminski AM, Loika Y, Nazarian A, Culminskaya I. Quantitative and Qualitative Role of Antagonistic Heterogeneity in Genetics of Blood Lipids. J Gerontol A Biol Sci Med Sci 2020; 75:1811-1819. [PMID: 31566214 DOI: 10.1093/gerona/glz225] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/14/2019] [Indexed: 12/18/2022]
Abstract
Prevailing strategies in genome-wide association studies (GWAS) mostly rely on principles of medical genetics emphasizing the "one gene, one function, one phenotype" concept. Here, we performed GWAS of blood lipids leveraging a new systemic concept emphasizing complexity of genetic predisposition to such phenotypes. We focused on total cholesterol, low- and high-density lipoprotein cholesterols, and triglycerides available for 29,902 individuals of European ancestry from seven independent studies, men and women combined. To implement the new concept, we leveraged the inherent heterogeneity in genetic predisposition to such complex phenotypes and emphasized a new counterintuitive phenomenon of antagonistic genetic heterogeneity, which is characterized by misalignment of the directions of genetic effects and the phenotype correlation. This analysis identified 37 loci associated with blood lipids, but only one locus, FBXO33, was not reported in previous top GWAS. We, however, found a strong effect of antagonistic heterogeneity that led to profound (quantitative and qualitative) changes in the associations with blood lipids in most loci (25 of 37, or 68%). These changes suggested new roles for some genes whose functions were considered well established, such as GCKR, SIK3 (APOA1 locus), LIPC, and LIPG, among others. The antagonistic heterogeneity highlighted a new class of genetic associations emphasizing beneficial and adverse trade-offs in predisposition to lipids. Our results argue that rigorous analyses dissecting heterogeneity in genetic predisposition to complex traits such as lipids, beyond those implemented in current GWAS, are required to facilitate translation of genetic discoveries into health care.
Affiliation(s)
- Alexander M Kulminski: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
- Yury Loika: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
- Alireza Nazarian: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
- Irina Culminskaya: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, North Carolina
8
Bu D, Yang Q, Meng Z, Zhang S, Li Q. Truncated tests for combining evidence of summary statistics. Genet Epidemiol 2020; 44:687-701. [DOI: 10.1002/gepi.22330] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/10/2020] [Revised: 04/24/2020] [Accepted: 06/01/2020] [Indexed: 12/15/2022]
Affiliation(s)
- Deliang Bu: School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
- Qinglong Yang: School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, China
- Zhen Meng: LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Sanguo Zhang: School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
- Qizhai Li: School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China; LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
9
Zhao Y, Yu Q, Lake SL. A flexible multi-domain test with adaptive weights and its application to clinical trials. Pharm Stat 2019; 19:315-325. [PMID: 31886602 DOI: 10.1002/pst.1993] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/20/2018] [Revised: 09/30/2019] [Accepted: 11/21/2019] [Indexed: 12/20/2022]
Abstract
The design of a clinical trial is often complicated by the multi-systemic nature of the disease; a single endpoint often cannot capture the spectrum of potential therapeutic benefits. Multi-domain outcomes which take into account patient heterogeneity of disease presentation through measurements of multiple symptom/functional domains are an attractive alternative to a single endpoint. A multi-domain test with adaptive weights is proposed to synthesize the evidence of treatment efficacy over numerous disease domains. The test is a weighted sum of domain-specific test statistics with weights selected adaptively via a data-driven algorithm. The null distribution of the test statistic is constructed empirically through resampling and does not require estimation of the covariance structure of domain-specific test statistics. Simulations show that the proposed test controls the type I error rate, and has increased power over other methods such as the O'Brien and Wei-Lachin tests in scenarios reflective of clinical trial settings. Data from a clinical trial in a rare lysosomal storage disorder were used to illustrate the properties of the proposed test. As a strategy of combining marginal test statistics, the proposed test is flexible and readily applicable to a variety of clinical trial scenarios.
Affiliation(s)
- Yang Zhao: Gilead Sciences, Foster City, California
- Qifeng Yu: Sanofi R&D, Framingham, Massachusetts
10
Effect of non-normality and low count variants on cross-phenotype association tests in GWAS. Eur J Hum Genet 2019; 28:300-312. [PMID: 31582815 DOI: 10.1038/s41431-019-0514-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/03/2018] [Revised: 09/01/2019] [Accepted: 09/05/2019] [Indexed: 01/21/2023]
Abstract
Many complex human diseases, such as type 2 diabetes, are characterized by multiple underlying traits/phenotypes that have substantially shared genetic architecture. Multivariate analysis of correlated traits has the potential to increase the power of detecting underlying common genetic loci. Several cross-phenotype association methods have been proposed; some require individual-level data on traits and genotypes, while the others require only summary-level data. In this article, we explore whether non-normality of the multivariate trait distribution affects the inference from some of the existing multi-trait methods and how that effect depends on the allele count of the genetic variant being tested. We find that most of these tests are susceptible to biases that lead to spurious association signals. Even after controlling for confounders that may contribute to non-normality and then applying inverse normal transformation on the residuals of each trait, these tests may have inflated type I errors for variants with low minor allele counts (MACs). A likelihood ratio test of association based on the ordinal regression of individual-level genotype conditional on the traits seems to be the least biased and can maintain type I error when the MAC is reasonably large (e.g., MAC > 30). Application of these methods to publicly available summary statistics of eight amino acid traits on European samples seems to exhibit systematic inflation (especially for variants with low MAC), which is consistent with our findings from simulation experiments.
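The inverse normal transformation mentioned in this abstract can be sketched as a rank-based mapping to standard normal quantiles. The sketch below is an assumption-laden illustration rather than the paper's code: it uses the Blom offset (3/8), which is one common convention, breaks ties arbitrarily, and expects the input to already be residuals from a covariate-adjusted model. The function name is hypothetical.

```python
from statistics import NormalDist

def inverse_normal_transform(values, offset=0.375):
    """Rank-based inverse normal transform.

    Each value with rank r (1..n) maps to Phi^{-1}((r - c) / (n - 2c + 1)),
    where c is the offset (c = 3/8 gives the Blom formula). Ties are broken
    arbitrarily, which is adequate for continuous residuals.
    """
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        out[idx] = std_normal.inv_cdf((rank - offset) / (n - 2 * offset + 1))
    return out
```

The transform makes each trait's marginal distribution approximately normal, but it does not by itself guarantee joint normality across traits. That is consistent with the abstract's observation that inflation can persist for low-MAC variants even after the transform.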
11
van Rheenen W, Peyrot WJ, Schork AJ, Lee SH, Wray NR. Genetic correlations of polygenic disease traits: from theory to practice. Nat Rev Genet 2019; 20:567-581. [PMID: 31171865 DOI: 10.1038/s41576-019-0137-z] [Citation(s) in RCA: 210] [Impact Index Per Article: 35.0] [Indexed: 12/21/2022]
Abstract
The genetic correlation describes the genetic relationship between two traits and can contribute to a better understanding of the shared biological pathways and/or the causality relationships between them. The rarity of large family cohorts with recorded instances of two traits, particularly disease traits, has made it difficult to estimate genetic correlations using traditional epidemiological approaches. However, advances in genomic methodologies, such as genome-wide association studies, and widespread sharing of data now allow genetic correlations to be estimated for virtually any trait pair. Here, we review the definition, estimation, interpretation and uses of genetic correlations, with a focus on applications to human disease.
Affiliation(s)
- Wouter van Rheenen: Department of Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, Netherlands
- Wouter J Peyrot: Department of Psychiatry, Amsterdam UMC, VU University Medical Center, Amsterdam, Netherlands
- Andrew J Schork: Institute for Biological Psychiatry, Mental Health Services Snct. Hans, Roskilde, Denmark
- S Hong Lee: Australian Centre for Precision Health, University of South Australia Cancer Research Institute, University of South Australia, Adelaide, South Australia, Australia
- Naomi R Wray: Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia; Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
12
Kulminski AM, Loika Y, Huang J, Arbeev KG, Bagley O, Ukraintseva S, Yashin AI, Culminskaya I. Pleiotropic Meta-Analysis of Age-Related Phenotypes Addressing Evolutionary Uncertainty in Their Molecular Mechanisms. Front Genet 2019; 10:433. [PMID: 31134135 PMCID: PMC6524409 DOI: 10.3389/fgene.2019.00433] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/13/2018] [Accepted: 04/24/2019] [Indexed: 12/21/2022]
Abstract
Age-related phenotypes are characterized by genetic heterogeneity attributed to an uncertain role of evolution in establishing their molecular mechanisms. Here, we performed univariate and pleiotropic meta-analyses of 24 age-related phenotypes dealing with such evolutionary uncertainty and leveraging longitudinal information. Our analysis identified 237 novel single nucleotide polymorphisms (SNPs) in 199 loci with phenotype-specific (61 SNPs) and pleiotropic (176 SNPs) associations and replicated associations for 160 SNPs in 68 loci in a modest sample of 26,371 individuals from five longitudinal studies. Most pleiotropic associations (65.3%, 115 of 176 SNPs) were impacted by heterogeneity, with the natural-selection—free genetic heterogeneity as its inevitable component. This pleiotropic heterogeneity was dominated (93%, 107 of 115 SNPs) by antagonistic genetic heterogeneity, a phenomenon that is characterized by antagonistic directions of genetic effects for directly correlated phenotypes. Genetic association studies of age-related phenotypes addressing the evolutionary uncertainty in establishing their molecular mechanisms have power to substantially improve the efficiency of the analyses. A dominant form of heterogeneous pleiotropy, antagonistic genetic heterogeneity, provides unprecedented insight into the genetic origin of age-related phenotypes and side effects in medical care that is counter-intuitive in medical genetics but naturally expected when molecular mechanisms of age-related phenotypes are not due to direct evolutionary selection.
Affiliation(s)
- Alexander M Kulminski: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Yury Loika: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Jian Huang: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Konstantin G Arbeev: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Olivia Bagley: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Svetlana Ukraintseva: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Anatoliy I Yashin: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Irina Culminskaya: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States

13
Dimou NL, Pantavou KG, Braliou GG, Bagos PG. Multivariate Methods for Meta-Analysis of Genetic Association Studies. Methods Mol Biol 2019; 1793:157-182. [PMID: 29876897 DOI: 10.1007/978-1-4939-7868-7_11] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/02/2023]
Abstract
Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we present in brief univariate methods for meta-analysis and we then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg Equilibrium and gene-environment interactions are also presented. All available methods are enriched with practical applications, and methodologies that could be developed in the future are discussed. Links for all available software implementing multivariate meta-analysis methods are also provided.
Affiliation(s)
- Niki L Dimou: Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece; Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
- Katerina G Pantavou: Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
- Georgia G Braliou: Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
- Pantelis G Bagos: Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece

14
Guo X, Zhu J, Fan Q, He M, Wang X, Zhang H. A univariate perspective of multivariate genome-wide association analysis. Genet Epidemiol 2018; 42:470-479. [PMID: 29781551 DOI: 10.1002/gepi.22128] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/05/2017] [Revised: 03/26/2018] [Accepted: 03/30/2018] [Indexed: 01/11/2023]
Abstract
Multiple correlated phenotypes are frequently collected in genome-wide association studies (GWASs), and a systematic, simultaneous analysis of multiple phenotypes can integrate the signals from single phenotypes, therefore increasing the power of detecting genetic signals. However, fundamental questions remain open, including the conditions and reasons under which the multivariate analysis is beneficial and how a highly significant signal arises in the multivariate analysis. To understand these issues, we propose to decompose the multivariate model into a series of simple univariate models. This transformation offers a clearer quantitative analysis of the circumstances under which a multivariate approach can be beneficial for the bivariate phenotypes case. A real data analysis is employed to illustrate how to interpret the signals arising from multivariate GWASs.
Affiliation(s)
- Xiaobo Guo: Department of Statistical Science, School of Mathematics, Sun Yat-Sen University, Guangzhou, China; Southern China Center for Statistical Science, Sun Yat-Sen University, Guangzhou, China; Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, University of Melbourne, Melbourne, Victoria, Australia
- Junxian Zhu: Department of Statistical Science, School of Mathematics, Sun Yat-Sen University, Guangzhou, China; Southern China Center for Statistical Science, Sun Yat-Sen University, Guangzhou, China
- Qiao Fan: DUKE-National University of Singapore Graduate Medical School, Singapore, Singapore
- Mingguang He: Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, University of Melbourne, Melbourne, Victoria, Australia; State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, China
- Xueqin Wang: Department of Statistical Science, School of Mathematics, Sun Yat-Sen University, Guangzhou, China; Southern China Center for Statistical Science, Sun Yat-Sen University, Guangzhou, China; Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Heping Zhang: Department of Statistical Science, School of Mathematics, Sun Yat-Sen University, Guangzhou, China; Southern China Center for Statistical Science, Sun Yat-Sen University, Guangzhou, China; Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, United States of America

15
Schiffmann R, Bichet DG, Jovanovic A, Hughes DA, Giugliani R, Feldt-Rasmussen U, Shankar SP, Barisoni L, Colvin RB, Jennette JC, Holdbrook F, Mulberg A, Castelli JP, Skuban N, Barth JA, Nicholls K. Migalastat improves diarrhea in patients with Fabry disease: clinical-biomarker correlations from the phase 3 FACETS trial. Orphanet J Rare Dis 2018; 13:68. [PMID: 29703262 PMCID: PMC5923014 DOI: 10.1186/s13023-018-0813-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/29/2017] [Accepted: 04/18/2018] [Indexed: 12/23/2022]
Abstract
Background: Fabry disease is frequently characterized by gastrointestinal symptoms, including diarrhea. Migalastat is an orally-administered small molecule approved to treat the symptoms of Fabry disease in patients with amenable mutations. Methods: We evaluated minimal clinically important differences (MCID) in diarrhea based on the corresponding domain of the patient-reported Gastrointestinal Symptom Rating Scale (GSRS) in patients with Fabry disease and amenable mutations (N = 50) treated with migalastat 150 mg every other day or placebo during the phase 3 FACETS trial (NCT00925301). Results: After 6 months, significantly more patients receiving migalastat versus placebo experienced improvement in diarrhea based on a MCID of 0.33 (43% vs 11%; p = .02), including the subset with baseline diarrhea (71% vs 20%; p = .02). A decline in kidney peritubular capillary globotriaosylceramide inclusions correlated with diarrhea improvement; patients with a reduction > 0.1 were 5.6 times more likely to have an improvement in diarrhea than those without (p = .031). Conclusions: Migalastat was associated with a clinically meaningful improvement in diarrhea in patients with Fabry disease and amenable mutations. Reductions in kidney globotriaosylceramide may be a useful surrogate endpoint to predict clinical benefit with migalastat in patients with Fabry disease. Trial registration: NCT00925301; June 19, 2009.
Affiliation(s)
- Raphael Schiffmann: Baylor Scott & White Research Institute, Dallas, TX, USA; Institute of Metabolic Disease, 3812 Elm Street, Dallas, TX, 75226, USA
- Daniel G Bichet: Hôpital du Sacré-Coeur, University of Montreal, Montreal, Quebec, Canada
- Ana Jovanovic: Salford Royal Foundation Trust, Manchester, Greater Manchester, UK
- Suma P Shankar: Emory University School of Medicine, Atlanta, GA, USA; Present Address: UC Davis MIND Institute, Sacramento, CA, USA
- Laura Barisoni: Miller School of Medicine, University of Miami, Miami, FL, USA
- Robert B Colvin: Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- J Charles Jennette: School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Nina Skuban: Amicus Therapeutics, Inc., Cranbury, NJ, USA
- Jay A Barth: Amicus Therapeutics, Inc., Cranbury, NJ, USA
- Kathleen Nicholls: Department of Nephrology, Royal Melbourne Hospital, Parkville, VIC, Australia

16
Kulminski AM, Huang J, Loika Y, Arbeev KG, Bagley O, Yashkin A, Duan M, Culminskaya I. Strong impact of natural-selection-free heterogeneity in genetics of age-related phenotypes. Aging (Albany NY) 2018; 10:492-514. [PMID: 29615537 PMCID: PMC5892700 DOI: 10.18632/aging.101407] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/12/2017] [Accepted: 03/24/2018] [Indexed: 11/25/2022]
Abstract
A conceptual difficulty in genetics of age-related phenotypes that make individuals vulnerable to disease in post-reproductive life is genetic heterogeneity attributed to an undefined role of evolution in establishing their molecular mechanisms. Here, we performed univariate and pleiotropic genome-wide meta-analyses of 20 age-related phenotypes leveraging longitudinal information in a sample of 33,431 individuals and dealing with the natural-selection-free genetic heterogeneity. We identified 142 non-proxy single nucleotide polymorphisms (SNPs) with phenotype-specific (18 SNPs) and pleiotropic (124 SNPs) associations at genome-wide level. Univariate meta-analysis identified two novel (11.1%) and replicated 16 SNPs whereas pleiotropic meta-analysis identified 115 novel (92.7%) and nine replicated SNPs. Pleiotropic associations for most novel (93.9%) and all replicated SNPs were strongly impacted by the natural-selection-free genetic heterogeneity in its unconventional form of antagonistic heterogeneity, implying antagonistic directions of genetic effects for directly correlated phenotypes. Our results show that the common genome-wide approach is well adapted to handle homogeneous univariate associations within Mendelian framework whereas most associations with age-related phenotypes are more complex and well beyond that framework. Dissecting the natural-selection-free genetic heterogeneity is critical for gaining insights into genetics of age-related phenotypes and has substantial and as yet unexplored potential for improving efficiency of genome-wide analysis.
Affiliation(s)
- Alexander M. Kulminski: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Jian Huang: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Yury Loika: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Konstantin G. Arbeev: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Olivia Bagley: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Arseniy Yashkin: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Matt Duan: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA
- Irina Culminskaya: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708, USA

17
He L, Zhbannikov I, Arbeev KG, Yashin AI, Kulminski AM. A genetic stochastic process model for genome-wide joint analysis of biomarker dynamics and disease susceptibility with longitudinal data. Genet Epidemiol 2017; 41:620-635. [PMID: 28636232 PMCID: PMC5643257 DOI: 10.1002/gepi.22058] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/06/2017] [Revised: 05/06/2017] [Accepted: 05/17/2017] [Indexed: 12/31/2022]
Abstract
Unraveling the underlying biological mechanisms or pathways behind the effects of genetic variations on complex diseases remains one of the major challenges in the post-GWAS (where GWAS is genome-wide association study) era. To further explore the relationship between genetic variations, biomarkers, and diseases for elucidating underlying pathological mechanism, a huge effort has been placed on examining pleiotropic and gene-environmental interaction effects. We propose a novel genetic stochastic process model (GSPM) that can be applied to GWAS and jointly investigate the genetic effects on longitudinally measured biomarkers and risks of diseases. This model is characterized by more profound biological interpretation and takes into account the dynamics of biomarkers during follow-up when investigating the hazards of a disease. We illustrate the rationale and evaluate the performance of the proposed model through two GWAS. One is to detect single nucleotide polymorphisms (SNPs) having interaction effects on type 2 diabetes (T2D) with body mass index (BMI) and the other is to detect SNPs affecting the optimal BMI level for protecting from T2D. We identified multiple SNPs that showed interaction effects with BMI on T2D, including a novel SNP rs11757677 in the CDKAL1 gene (P = 5.77 × 10⁻⁷). We also found a SNP rs1551133 located on 2q14.2 that reversed the effect of BMI on T2D (P = 6.70 × 10⁻⁷). In conclusion, the proposed GSPM provides a promising and useful tool in GWAS of longitudinal data for interrogating pleiotropic and interaction effects to gain more insights into the relationship between genes, quantitative biomarkers, and risks of complex diseases.
Affiliation(s)
- Liang He: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708
- Ilya Zhbannikov: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708
- Konstantin G. Arbeev: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708
- Anatoliy I. Yashin: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708
- Alexander M. Kulminski: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC 27708

18
Abstract
For over a decade, genome-wide association studies (GWAS) have been a major tool for detecting genetic variants underlying complex traits. Recent studies have demonstrated that the same variant or gene can be associated with multiple traits, and such associations are termed cross-phenotype (CP) associations. CP association analysis can improve statistical power by searching for variants that contribute to multiple traits, which is often relevant to pleiotropy. In this chapter, we discuss existing statistical methods for analyzing association between a single marker and multivariate phenotypes, we introduce a general approach, CPASSOC, to detect the CP associations, and explain how to conduct the analysis in practice.
Affiliation(s)
- Xiaoyin Li: Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, 44106, USA
- Xiaofeng Zhu: Department of Population and Quantitative Health Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, 44106, USA

19
Lee S, Won S, Kim YJ, Kim Y, Kim BJ, Park T. Rare variant association test with multiple phenotypes. Genet Epidemiol 2016; 41:198-209. [PMID: 28039885 DOI: 10.1002/gepi.22021] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 07/29/2015] [Revised: 08/27/2016] [Accepted: 09/21/2016] [Indexed: 12/17/2022]
Abstract
Although genome-wide association studies (GWAS) have now discovered thousands of genetic variants associated with common traits, such variants cannot explain the large degree of "missing heritability," likely due to rare variants. The advent of next generation sequencing technology has allowed rare variant detection and association with common traits, often by investigating specific genomic regions for rare variant effects on a trait. Although multiple correlated phenotypes are often concurrently observed in GWAS, most studies analyze only single phenotypes, which may lessen statistical power. To increase power, multivariate analyses, which consider correlations between multiple phenotypes, can be used. However, few existing multivariate analyses can identify rare variants for assessing multiple phenotypes. Here, we propose Multivariate Association Analysis using Score Statistics (MAAUSS), to identify rare variants associated with multiple phenotypes, based on the widely used sequence kernel association test (SKAT) for a single phenotype. We applied MAAUSS to whole exome sequencing (WES) data from a Korean population of 1,058 subjects to discover genes associated with multiple traits of liver function. We then assessed validation of those genes by a replication study, using an independent dataset of 3,445 individuals. Notably, we detected the gene ZNF620 among five significant genes. We then performed a simulation study to compare MAAUSS's performance with existing methods. Overall, MAAUSS successfully conserved type 1 error rates and in many cases had a higher power than the existing methods. This study illustrates a feasible and straightforward approach for identifying rare variants correlated with multiple phenotypes, with likely relevance to missing heritability.
Affiliation(s)
- Selyeong Lee: Department of Statistics, Seoul National University, Seoul, Korea
- Sungho Won: Graduate School of Public Health, Seoul National University, Seoul, Korea
- Young Jin Kim: Division of Structural and Functional Genomics, Korean National Institute of Health, Osong, Chungchungbuk-do, Korea
- Yongkang Kim: Department of Statistics, Seoul National University, Seoul, Korea
- Bong-Jo Kim: Division of Structural and Functional Genomics, Korean National Institute of Health, Osong, Chungchungbuk-do, Korea
- Taesung Park: Department of Statistics, Seoul National University, Seoul, Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea

20
Brunel H, Massanet R, Martinez-Perez A, Ziyatdinov A, Martin-Fernandez L, Souto JC, Perera A, Soria JM. The Central Role of KNG1 Gene as a Genetic Determinant of Coagulation Pathway-Related Traits: Exploring Metaphenotypes. PLoS One 2016; 11:e0167187. [PMID: 28005926 PMCID: PMC5178993 DOI: 10.1371/journal.pone.0167187] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/25/2016] [Accepted: 11/09/2016] [Indexed: 01/12/2023]
Abstract
Traditional genetic studies of single traits may be unable to detect the pleiotropic effects involved in complex diseases. To detect the correlation that exists between several phenotypes involved in the same biological process, we introduce an original methodology to analyze sets of correlated phenotypes involved in the coagulation cascade in genome-wide association studies. The methodology consists of a two-stage process. First, we define new phenotypic meta-variables (linear combinations of the original phenotypes), named metaphenotypes, by applying Independent Component Analysis for the multivariate analysis of correlated phenotypes (i.e. the levels of coagulation pathway–related proteins). The resulting metaphenotypes integrate the information regarding the underlying biological process (i.e. thrombus/clot formation). Secondly, we take advantage of a family based Genome Wide Association Study to identify genetic elements influencing these metaphenotypes and consequently thrombosis risk. Our study utilized data from the GAIT Project (Genetic Analysis of Idiopathic Thrombophilia). We obtained 15 metaphenotypes, which showed significant heritabilities, ranging from 0.2 to 0.7. These results indicate the importance of genetic factors in the variability of these traits. We found 4 metaphenotypes that showed significant associations with SNPs. The most relevant were those mapped in a region near the HRG, FETUB and KNG1 genes. Our results are provocative since they show that the KNG1 locus plays a central role as a genetic determinant of the entire coagulation pathway and thrombus/clot formation. Integrating data from multiple correlated measurements through metaphenotypes is a promising approach to elucidate the hidden genetic mechanisms underlying complex diseases.
Affiliation(s)
- Helena Brunel: Unit of Genomics of Complex Diseases, Sant Pau Institute of Biomedical Research (IIB-Sant Pau), Barcelona, Spain
- Raimon Massanet: B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
- Angel Martinez-Perez: Unit of Genomics of Complex Diseases, Sant Pau Institute of Biomedical Research (IIB-Sant Pau), Barcelona, Spain
- Andrey Ziyatdinov: Unit of Genomics of Complex Diseases, Sant Pau Institute of Biomedical Research (IIB-Sant Pau), Barcelona, Spain
- Laura Martin-Fernandez: Unit of Genomics of Complex Diseases, Sant Pau Institute of Biomedical Research (IIB-Sant Pau), Barcelona, Spain
- Juan Carlos Souto: Thrombosis and Haemostasis Unit, Sant Pau Institute of Biomedical Research (IIB-Sant Pau), Barcelona, Spain
- Alexandre Perera: Thrombosis and Haemostasis Unit, Sant Pau Institute of Biomedical Research (IIB-Sant Pau), Barcelona, Spain
- José Manuel Soria: Unit of Genomics of Complex Diseases, Sant Pau Institute of Biomedical Research (IIB-Sant Pau), Barcelona, Spain

21
Bei Y, Hong P. Robust differential expression analysis by learning discriminant boundary in multi-dimensional space of statistical attributes. BMC Bioinformatics 2016; 17:541. [PMID: 27993137 PMCID: PMC5168810 DOI: 10.1186/s12859-016-1386-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2015] [Accepted: 11/26/2016] [Indexed: 11/10/2022]
Abstract
Background: Performing statistical tests is an important step in analyzing genome-wide datasets for detecting genomic features differentially expressed between conditions. Each type of statistical test has its own advantages in characterizing certain aspects of differences between population means and often assumes a relatively simple data distribution (e.g., Gaussian, Poisson, negative binomial, etc.), which may not be well met by the datasets of interest. Making insufficient distributional assumptions can lead to inferior results when dealing with complex differential expression patterns. Results: We propose to capture differential expression information more comprehensively by integrating multiple test statistics, each of which has relatively limited capacity to summarize the observed differential expression information. This work addresses a general application scenario, in which users want to detect as many differentially expressed features (DEFs) as possible while requiring the false discovery rate (FDR) to be lower than a cut-off. We treat each test statistic as a basic attribute, and model the detection of differentially expressed genomic features as learning a discriminant boundary in a multi-dimensional space of basic attributes. We mathematically formulated our goal as a constrained optimization problem aiming to maximize discoveries satisfying a user-defined FDR. An effective algorithm, Discriminant-Cut, has been developed to solve an instantiation of this problem. Extensive comparisons of Discriminant-Cut with 13 existing methods were carried out to demonstrate its robustness and effectiveness. Conclusions: We have developed a novel machine learning methodology for robust differential expression analysis, which can be a new avenue to significantly advance research on large-scale differential expression analysis.
Affiliation(s)
- Yuanzhe Bei: Computer Science Department, Brandeis University, Waltham, MA, 02453, USA
- Pengyu Hong: Computer Science Department, Brandeis University, Waltham, MA, 02453, USA

22
He L, Kernogitski Y, Kulminskaya I, Loika Y, Arbeev KG, Loiko E, Bagley O, Duan M, Yashkin A, Ukraintseva SV, Kovtun M, Yashin AI, Kulminski AM. Pleiotropic Meta-Analyses of Longitudinal Studies Discover Novel Genetic Variants Associated with Age-Related Diseases. Front Genet 2016; 7:179. [PMID: 27790247 PMCID: PMC5061751 DOI: 10.3389/fgene.2016.00179] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 08/09/2016] [Accepted: 09/21/2016] [Indexed: 01/31/2023]
Abstract
Age-related diseases may result from shared biological mechanisms in intrinsic processes of aging. Genetic effects on age-related diseases are often modulated by environmental factors due to their little contribution to fitness or are mediated through certain endophenotypes. Identification of genetic variants with pleiotropic effects on both common complex diseases and endophenotypes may reveal potential conflicting evolutionary pressures and deliver new insights into shared genetic contribution to healthspan and lifespan. Here, we performed pleiotropic meta-analyses of genetic variants using five NIH-funded datasets by integrating univariate summary statistics for age-related diseases and endophenotypes. We investigated three groups of traits: (1) endophenotypes such as blood glucose, blood pressure, lipids, hematocrit, and body mass index, (2) time-to-event outcomes such as the age-at-onset of diabetes mellitus (DM), cancer, cardiovascular diseases (CVDs) and neurodegenerative diseases (NDs), and (3) both combined. In addition to replicating previous findings, we identify seven novel genome-wide significant loci (< 5e-08), out of which five are low-frequency variants. Specifically, from Group 2, we find rs7632505 on 3q21.1 in SEMA5B, rs460976 on 21q22.3 (1 kb from TMPRSS2) and rs12420422 on 11q24.1 predominantly associated with a variety of CVDs, rs4905014 in ITPK1 associated with stroke and heart failure, rs7081476 on 10p12.1 in ANKRD26 associated with multiple diseases including DM, CVDs, and NDs. From Group 3, we find rs8082812 on 18p11.22 and rs1869717 on 4q31.3 associated with both endophenotypes and CVDs. Our follow-up analyses show that rs7632505, rs4905014, and rs8082812 have age-dependent effects on coronary heart disease or stroke. 
Functional annotation suggests that most of these SNPs are within regulatory regions or DNase clusters and in linkage disequilibrium with expression quantitative trait loci, implying their potential regulatory influence on the expression of nearby genes. Our mediation analyses suggest that the effects of some SNPs are mediated by specific endophenotypes. In conclusion, these findings indicate that loci with pleiotropic effects on age-related disorders tend to be enriched in genes involved in underlying mechanisms potentially related to nervous, cardiovascular and immune system functions, stress resistance, inflammation, ion channels and hematopoiesis, supporting the hypothesis of shared pathological role of infection, and inflammation in chronic age-related diseases.
Affiliation(s)
- Liang He: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA
- Alexander M. Kulminski: Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA

23
Detecting Heterogeneity in Population Structure Across the Genome in Admixed Populations. Genetics 2016; 204:43-56. [PMID: 27440868 DOI: 10.1534/genetics.115.184184] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 11/13/2015] [Accepted: 06/11/2016] [Indexed: 11/18/2022]
Abstract
The genetic structure of human populations is often characterized by aggregating measures of ancestry across the autosomal chromosomes. While it may be reasonable to assume that population structure patterns are similar genome-wide in relatively homogeneous populations, this assumption may not be appropriate for admixed populations, such as Hispanics and African-Americans, with recent ancestry from two or more continents. Recent studies have suggested that systematic ancestry differences can arise at genomic locations in admixed populations as a result of selection and nonrandom mating. Here, we propose a method, which we refer to as the chromosomal ancestry differences (CAnD) test, for detecting heterogeneity in population structure across the genome. CAnD can incorporate either local or chromosome-wide ancestry inferred from SNP genotype data to identify chromosomes harboring genomic regions with ancestry contributions that are significantly different than expected. In simulation studies with real genotype data from phase III of the HapMap Project, we demonstrate the validity and power of CAnD. We apply CAnD to the HapMap Mexican-American (MXL) and African-American (ASW) population samples; in this analysis the software RFMix is used to infer local ancestry at genomic regions, assuming admixing from Europeans, West Africans, and Native Americans. The CAnD test provides strong evidence of heterogeneity in population structure across the genome in the MXL sample ([Formula: see text]), which is largely driven by elevated Native American ancestry and deficit of European ancestry on the X chromosomes. Among the ASW, all chromosomes are largely African derived and no heterogeneity in population structure is detected in this sample.

24
Li J, Wei Z, Hakonarson H. Application of computational methods in genetic study of inflammatory bowel disease. World J Gastroenterol 2016; 22:949-960. [PMID: 26811639 PMCID: PMC4716047 DOI: 10.3748/wjg.v22.i3.949] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/02/2015] [Revised: 11/04/2015] [Accepted: 11/24/2015] [Indexed: 02/06/2023]
Abstract
Genetic factors play an important role in the etiology of inflammatory bowel disease (IBD). The launch of genome-wide association studies (GWAS) represents a landmark in the genetic study of human complex disease. Concurrently, computational methods have undergone rapid development during the past few years, which has led to the identification of numerous disease susceptibility loci. IBD is one of the successful examples of GWAS and related analyses. A total of 163 genetic loci and multiple signaling pathways have been identified as associated with IBD. Pleiotropic effects were found for many of these loci, and risk prediction models were built based on a broad spectrum of genetic variants. Important gene-gene and gene-environment interactions and key contributions of the gut microbiome are being discovered. Here we review the different types of analyses that have been applied to IBD genetic studies, discuss the computational methods for each type of analysis, and summarize the discoveries made in IBD research with the application of these methods.
25
Uno H, Tian L, Claggett B, Wei LJ. A versatile test for equality of two survival functions based on weighted differences of Kaplan-Meier curves. Stat Med 2015. [PMID: 26194988 DOI: 10.1002/sim.6591] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 11/05/2022]
Abstract
With censored event time observations, the logrank test is the most popular tool for testing the equality of two underlying survival distributions. Although this test is asymptotically distribution free, it may not be powerful when the proportional hazards assumption is violated. Various other novel testing procedures have been proposed, which generally are derived by assuming a class of specific alternative hypotheses with respect to the hazard functions. The test considered by Pepe and Fleming (1989) is based on a linear combination of weighted differences of the two Kaplan-Meier curves over time and is a natural tool to assess the difference of two survival functions directly. In this article, we take a similar approach but choose weights that are proportional to the observed standardized difference of the estimated survival curves at each time point. The new proposal automatically makes weighting adjustments empirically. The new test statistic is aimed at a one-sided general alternative hypothesis and is distributed with a short right tail under the null hypothesis but with a heavy tail under the alternative. The results from extensive numerical studies demonstrate that the new procedure performs well under various general alternatives, with the caveat of a minor inflation of the type I error rate when the sample size is small or the number of observed events is small. The survival data from a recent cancer comparative study are utilized for illustrating the implementation of the process.
Affiliation(s)
- Hajime Uno: Department of Biostatistics and Computational Biology and Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, U.S.A.
- Lu Tian: Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA 94305, U.S.A.
- Brian Claggett: Brigham and Women's Hospital, Division of Cardiovascular Medicine, Harvard Medical School, Boston, MA 02115, U.S.A.
- L J Wei: Department of Biostatistics, Harvard University, Boston, MA 02115, U.S.A.
26
Meta-analysis of correlated traits via summary statistics from GWASs with an application in hypertension. Am J Hum Genet 2015; 96:21-36. [PMID: 25500260 DOI: 10.1016/j.ajhg.2014.11.011] [Citation(s) in RCA: 284] [Impact Index Per Article: 28.4] [Received: 08/21/2014] [Accepted: 11/17/2014] [Indexed: 12/14/2022]
Abstract
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, whether correlated, independent, continuous, or binary, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10^-8) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10^-7) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study cross-phenotype (CP) associations by using summary statistics from GWASs of multiple phenotypes.
27
Xu L, Craiu RV, Derkach A, Paterson AD, Sun L. Using a Bayesian latent variable approach to detect pleiotropy in the Genetic Analysis Workshop 18 data. BMC Proc 2014; 8:S77. [PMID: 25519405 PMCID: PMC4143687 DOI: 10.1186/1753-6561-8-s1-s77] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/11/2023]
Abstract
Pleiotropy, which occurs when a single genetic factor influences multiple phenotypes, is present in many genetic studies of complex human traits. Longitudinal family data, such as the Genetic Analysis Workshop 18 data, combine the features of longitudinal studies in individuals and cross-sectional studies in families, thus providing richer information about the genetic and environmental factors associated with the trait of interest. We recently proposed a Bayesian latent variable methodology for the study of pleiotropy in the presence of longitudinal and family correlation. The purpose of this work is to evaluate the Bayesian latent variable method in a real data setting using the Genetic Analysis Workshop 18 blood pressure phenotypes and sequenced genotype data. To detect single-nucleotide polymorphisms with pleiotropic effects on both diastolic and systolic blood pressure, we focused on a set of 6 single-nucleotide polymorphisms from chromosome 3 that was reported in the literature to be significantly associated with either diastolic blood pressure or the binary hypertension trait. Our analysis suggests that both diastolic blood pressure and systolic blood pressure are associated with the latent hypertension severity variable, but the analysis did not find any of the 6 single-nucleotide polymorphisms to have a statistically significant pleiotropic effect on both diastolic blood pressure and systolic blood pressure.
Affiliation(s)
- Lizhen Xu: Department of Statistical Sciences, University of Toronto, Toronto, Ontario M5S 3G3, Canada
- Radu V Craiu: Department of Statistical Sciences, University of Toronto, Toronto, Ontario M5S 3G3, Canada
- Andriy Derkach: Department of Statistical Sciences, University of Toronto, Toronto, Ontario M5S 3G3, Canada
- Andrew D Paterson: Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto M5G 1X8, Canada; Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Ontario M5S 3G3, Canada
- Lei Sun: Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Ontario M5S 3G3, Canada; Department of Statistical Sciences, University of Toronto, Toronto, Ontario M5S 3G3, Canada
28
Wang K. Testing genetic association by regressing genotype over multiple phenotypes. PLoS One 2014; 9:e106918. [PMID: 25221983 PMCID: PMC4164437 DOI: 10.1371/journal.pone.0106918] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 06/13/2014] [Accepted: 07/26/2014] [Indexed: 02/03/2023]
Abstract
Complex disorders are typically characterized by multiple phenotypes. Analyzing these phenotypes jointly is expected to be more powerful than dealing with one of them at a time. A recent approach (O'Reilly et al. 2012) is to regress the genotype at a SNP marker on multiple phenotypes and apply the proportional odds model. In the current research, we introduce an explicit expression for the score test statistic and for the non-centrality parameter that determines its power. The same simulation studies as those reported in Galesloot et al. (2014) were conducted to assess its performance. We demonstrate by theoretical arguments and simulation studies that, despite its potential usefulness for multiple phenotypes, the proportional odds model method can be less powerful than regular methods for univariate traits. We also introduce an implementation of the proposed score statistic in an R package named iGasso.
Affiliation(s)
- Kai Wang: Department of Biostatistics, University of Iowa, Iowa City, Iowa, United States of America
29
Sinnott JA, Cai T. Omnibus risk assessment via accelerated failure time kernel machine modeling. Biometrics 2013; 69:861-73. [PMID: 24328713 PMCID: PMC3869038 DOI: 10.1111/biom.12098] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/01/2012] [Revised: 07/01/2013] [Accepted: 07/01/2013] [Indexed: 01/05/2023]
Abstract
Integrating genomic information with traditional clinical risk factors to improve the prediction of disease outcomes could profoundly change the practice of medicine. However, the large number of potential markers and possible complexity of the relationship between markers and disease make it difficult to construct accurate risk prediction models. Standard approaches for identifying important markers often rely on marginal associations or linearity assumptions and may not capture non-linear or interactive effects. In recent years, much work has been done to group genes into pathways and networks. Integrating such biological knowledge into statistical learning could potentially improve model interpretability and reliability. One effective approach is to employ a kernel machine (KM) framework, which can capture nonlinear effects if nonlinear kernels are used (Scholkopf and Smola, 2002; Liu et al., 2007, 2008). For survival outcomes, KM regression modeling and testing procedures have been derived under a proportional hazards (PH) assumption (Li and Luan, 2003; Cai, Tonini, and Lin, 2011). In this article, we derive testing and prediction methods for KM regression under the accelerated failure time (AFT) model, a useful alternative to the PH model. We approximate the null distribution of our test statistic using resampling procedures. When multiple kernels are of potential interest, it may be unclear in advance which kernel to use for testing and estimation. We propose a robust Omnibus Test that combines information across kernels, and an approach for selecting the best kernel for estimation. The methods are illustrated with an application in breast cancer.
Affiliation(s)
- Jennifer A. Sinnott: Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA
- Tianxi Cai: Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA
30
Solovieff N, Cotsapas C, Lee PH, Purcell SM, Smoller JW. Pleiotropy in complex traits: challenges and strategies. Nat Rev Genet 2013; 14:483-95. [PMID: 23752797 DOI: 10.1038/nrg3461] [Citation(s) in RCA: 751] [Impact Index Per Article: 62.6] [Indexed: 12/12/2022]
Abstract
Genome-wide association studies have identified many variants that each affect multiple traits, particularly across autoimmune diseases, cancers and neuropsychiatric disorders, suggesting that pleiotropic effects on human complex traits may be widespread. However, systematic detection of such effects is challenging and requires new methodologies and frameworks for interpreting cross-phenotype results. In this Review, we discuss the evidence for pleiotropy in contemporary genetic mapping studies, new and established analytical approaches to identifying pleiotropic effects, sources of spurious cross-phenotype effects and study design considerations. We also outline the molecular and clinical implications of such findings and discuss future directions of research.
Affiliation(s)
- Nadia Solovieff: Center for Human Genetics Research, Massachusetts General Hospital, 185 Cambridge Street, Boston, Massachusetts 02114, USA
31
Zhu W, Zhang H. A nonparametric regression method for multiple longitudinal phenotypes using multivariate adaptive splines. Frontiers of Mathematics in China: Selected Papers from Chinese Universities 2013; 8:731-743. [PMID: 25309585 PMCID: PMC4193387 DOI: 10.1007/s11464-012-0256-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/04/2023]
Abstract
In genetic studies of complex diseases, particularly mental illnesses and behavior disorders, two distinct characteristics have emerged in some data sets. First, genetic data sets are collected with a large number of phenotypes that are potentially related to the complex disease under study. Second, each phenotype is collected from the same subject repeatedly over time. In this study, we present a nonparametric regression approach to study multivariate and time-repeated phenotypes together by using the technique of multivariate adaptive regression splines for analysis of longitudinal data (MASAL), which makes it possible to identify genes, as well as gene-gene and gene-environment (including time) interactions, associated with the phenotypes of interest. Furthermore, we propose a permutation test to assess the associations between the phenotypes and selected markers. Through simulation, we demonstrate that our proposed approach has advantages over existing methods that examine each longitudinal phenotype separately or analyze summarized values of phenotypes by compressing them into one-time-point phenotypes. Application of the proposed method to the Framingham Heart Study illustrates that the use of multivariate longitudinal phenotypes enhanced the significance of the association test.
Affiliation(s)
- Wensheng Zhu: Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China; Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520-8034, USA
- Heping Zhang: Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520-8034, USA
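The permutation test mentioned in the abstract above can be illustrated generically. The sketch below is not the MASAL procedure: it assumes a single marker, a matrix of phenotypes with one column per trait, and a hypothetical test statistic (the largest absolute marker-phenotype correlation). The function name `permutation_pvalue` and all parameters are illustrative choices, not part of the cited work.

```python
import numpy as np

def permutation_pvalue(genotype, phenotypes, n_perm=999, seed=0):
    """Permutation p-value for association between one marker and several phenotypes.

    Assumed statistic (illustrative only): the maximum absolute Pearson
    correlation between the marker and each phenotype column.
    """
    rng = np.random.default_rng(seed)
    g = np.asarray(genotype, dtype=float)
    Y = np.asarray(phenotypes, dtype=float)  # shape (n_subjects, n_traits)
    Yc = Y - Y.mean(axis=0)                  # center each phenotype once

    def statistic(gv):
        gc = gv - gv.mean()
        denom = np.sqrt((gc @ gc) * (Yc ** 2).sum(axis=0))
        return float(np.max(np.abs(gc @ Yc / denom)))

    observed = statistic(g)
    # Shuffling genotypes across subjects breaks any marker-phenotype link
    # while preserving the correlation among the phenotypes themselves.
    exceed = sum(statistic(rng.permutation(g)) >= observed for _ in range(n_perm))
    # Adding 1 to numerator and denominator keeps the p-value strictly positive.
    return (1 + exceed) / (n_perm + 1)

# Simulated example: the first phenotype depends on the marker.
rng = np.random.default_rng(1)
g = rng.integers(0, 3, size=200)
Y = rng.normal(size=(200, 3))
Y[:, 0] += 0.8 * g
p_value = permutation_pvalue(g, Y)
```

Permuting whole subjects' genotypes, rather than individual phenotype values, is what lets a test of this kind handle correlated phenotypes without modeling their joint distribution.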
32
De G, Yip WK, Ionita-Laza I, Laird N. Rare variant analysis for family-based design. PLoS One 2013; 8:e48495. [PMID: 23341868 PMCID: PMC3546113 DOI: 10.1371/journal.pone.0048495] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Received: 11/29/2011] [Accepted: 10/01/2012] [Indexed: 12/21/2022]
Abstract
Genome-wide association studies have been able to identify disease associations with many common variants; however, most of the estimated genetic contribution explained by these variants appears to be very modest. Rare variants are thought to have larger effect sizes than common SNPs, but the effects of rare variants cannot be tested in the GWAS setting. Here we propose a novel method to test for association of rare variants obtained by sequencing in family-based samples by collapsing the standard family-based association test (FBAT) statistic over a region of interest. We also propose a suitable weighting scheme so that low-frequency SNPs that may be enriched in functional variants can be upweighted compared to common variants. Using simulations, we show that the family-based methods perform on par with the population-based methods under no population stratification. By construction, family-based tests are completely robust to population stratification; we show that our proposed methods remain valid even when population stratification is present.
Affiliation(s)
- Gourab De: Department of Biostatistics, Harvard University, Boston, MA, USA.
33
Ferguson J, Wheeler W, Fu Y, Prokunina-Olsson L, Zhao H, Sampson J. Statistical tests for detecting associations with groups of genetic variants: generalization, evaluation, and implementation. Eur J Hum Genet 2012; 21:680-6. [PMID: 23092956 DOI: 10.1038/ejhg.2012.220] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
With recent advances in sequencing, genotyping arrays, and imputation, GWAS now aim to identify associations with rare and uncommon genetic variants. Here, we describe and evaluate a class of statistics, generalized score statistics (GSS), that can test for an association between a group of genetic variants and a phenotype. GSS are a simple weighted sum of single-variant statistics and their cross-products. We show that the majority of statistics currently used to detect associations with rare variants are equivalent to choosing a specific set of weights within this framework. We then evaluate the power of various weighting schemes as a function of variant characteristics, such as MAF, the proportion associated with the phenotype, and the direction of effect. Ultimately, we find that two classical tests are robust and powerful, but details are provided as to when other GSS may perform favorably. The software package CRaVe is available at our website (http://dceg.cancer.gov/bb/tools/crave).
Affiliation(s)
- John Ferguson: Division of Biostatistics, Yale School of Public Health, New Haven, CT, USA
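The description of GSS as a weighted sum of single-variant statistics and their cross-products can be read as a quadratic form Q = z'Wz in the vector z of single-variant statistics. The sketch below works under that assumption and is not the CRaVe implementation; the function name `generalized_score_statistic` and the example values are hypothetical. It shows how two familiar choices fall out of the weight matrix: W = ww' gives a burden-type statistic (w'z)^2, and a diagonal W gives a weighted sum of squared statistics with no cross-products.

```python
import numpy as np

def generalized_score_statistic(z, W):
    """Quadratic form z' W z.

    Diagonal entries of W weight the squared statistics z_i^2;
    off-diagonal entries weight the cross-products z_i * z_j.
    """
    z = np.asarray(z, dtype=float)
    W = np.asarray(W, dtype=float)
    return float(z @ W @ z)

# Hypothetical single-variant statistics and per-variant weights.
z = np.array([1.2, -0.4, 2.1])
w = np.array([0.5, 0.3, 0.2])

# Burden-type choice: W = w w', so Q equals (w'z)^2 and signs can cancel.
burden = generalized_score_statistic(z, np.outer(w, w))

# Variance-component-type choice: diagonal W, so Q is a weighted sum of z_i^2
# and is insensitive to the direction of each variant's effect.
variance_component = generalized_score_statistic(z, np.diag(w ** 2))
```

The contrast between the two weight matrices mirrors the abstract's point that power depends on variant characteristics such as the direction of effect: opposing signs in z reduce the burden-type value but not the diagonal one.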
34
Campbell TB, Smeaton LM, Kumarasamy N, Flanigan T, Klingman KL, Firnhaber C, Grinsztejn B, Hosseinipour MC, Kumwenda J, Lalloo U, Riviere C, Sanchez J, Melo M, Supparatpinyo K, Tripathy S, Martinez AI, Nair A, Walawander A, Moran L, Chen Y, Snowden W, Rooney JF, Uy J, Schooley RT, De Gruttola V, Hakim JG. Efficacy and safety of three antiretroviral regimens for initial treatment of HIV-1: a randomized clinical trial in diverse multinational settings. PLoS Med 2012; 9:e1001290. [PMID: 22936892 PMCID: PMC3419182 DOI: 10.1371/journal.pmed.1001290] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.6] [Received: 12/02/2011] [Accepted: 07/05/2012] [Indexed: 11/19/2022]
Abstract
BACKGROUND: Antiretroviral regimens with simplified dosing and better safety are needed to maximize the efficiency of antiretroviral delivery in resource-limited settings. We investigated the efficacy and safety of antiretroviral regimens with once-daily compared to twice-daily dosing in diverse areas of the world. METHODS AND FINDINGS: 1,571 HIV-1-infected persons (47% women) from nine countries in four continents were assigned with equal probability to open-label antiretroviral therapy with efavirenz plus lamivudine-zidovudine (EFV+3TC-ZDV), atazanavir plus didanosine-EC plus emtricitabine (ATV+DDI+FTC), or efavirenz plus emtricitabine-tenofovir-disoproxil fumarate (DF) (EFV+FTC-TDF). ATV+DDI+FTC and EFV+FTC-TDF were hypothesized to be non-inferior to EFV+3TC-ZDV if the upper one-sided 95% confidence bound for the hazard ratio (HR) was ≤1.35 when 30% of participants had treatment failure. An independent monitoring board recommended stopping study follow-up prior to accumulation of 472 treatment failures. Comparing EFV+FTC-TDF to EFV+3TC-ZDV, during a median 184 wk of follow-up there were 95 treatment failures (18%) among 526 participants versus 98 failures among 519 participants (19%; HR 0.95, 95% CI 0.72-1.27; p = 0.74). Safety endpoints occurred in 243 (46%) participants assigned to EFV+FTC-TDF versus 313 (60%) assigned to EFV+3TC-ZDV (HR 0.64, CI 0.54-0.76; p<0.001) and there was a significant interaction between sex and regimen safety (HR 0.50, CI 0.39-0.64 for women; HR 0.79, CI 0.62-1.00 for men; p = 0.01). Comparing ATV+DDI+FTC to EFV+3TC-ZDV, during a median follow-up of 81 wk there were 108 failures (21%) among 526 participants assigned to ATV+DDI+FTC and 76 (15%) among 519 participants assigned to EFV+3TC-ZDV (HR 1.51, CI 1.12-2.04; p = 0.007). CONCLUSION: EFV+FTC-TDF had similar high efficacy compared to EFV+3TC-ZDV in this trial population, recruited in diverse multinational settings. Superior safety, especially in HIV-1-infected women, and once-daily dosing of EFV+FTC-TDF are advantageous for use of this regimen for initial treatment of HIV-1 infection in resource-limited countries. ATV+DDI+FTC had inferior efficacy and is not recommended as an initial antiretroviral regimen. TRIAL REGISTRATION: www.ClinicalTrials.gov NCT00084136.
Affiliation(s)
- Thomas B Campbell: Division of Infectious Diseases, Department of Medicine, University of Colorado School of Medicine, Aurora, United States of America.
35
Yang Q, Wang Y. Methods for Analyzing Multivariate Phenotypes in Genetic Association Studies. Journal of Probability and Statistics 2012; 2012:652569. [PMID: 24748889 PMCID: PMC3989935 DOI: 10.1155/2012/652569] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Indexed: 01/04/2023]
Abstract
Multivariate phenotypes are frequently encountered in genetic association studies. The purpose of analyzing multivariate phenotypes usually includes the discovery of novel genetic variants with pleiotropic effects, that is, variants affecting multiple phenotypes, and the ultimate goal of uncovering the underlying genetic mechanism. In recent years, new methods have been developed and existing statistical methods have been applied to such phenotypes. In this paper, we provide a review of the available methods for analyzing association between a single marker and a multivariate phenotype consisting of the same type of components (e.g., all continuous or all categorical) or different types of components (e.g., some continuous and others categorical). We also review causal inference methods designed to test whether the detected association with the multivariate phenotype truly reflects pleiotropy or whether the genetic marker exerts its effects on some phenotypes through affecting the others.
Affiliation(s)
- Qiong Yang: Department of Biostatistics, Boston University School of Public Health, 810 Mass Avenue, Boston, MA 02118, USA
- Yuanjia Wang: Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10027, USA
36
Shriner D. Moving toward System Genetics through Multiple Trait Analysis in Genome-Wide Association Studies. Front Genet 2012; 3:1. [PMID: 22303408 PMCID: PMC3266611 DOI: 10.3389/fgene.2012.00001] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Received: 11/04/2011] [Accepted: 01/01/2012] [Indexed: 02/05/2023]
Abstract
Association studies are a staple of genotype–phenotype mapping studies, whether they are based on single markers, haplotypes, candidate genes, genome-wide genotypes, or whole genome sequences. Although genetic epidemiological studies typically contain data collected on multiple traits which themselves are often correlated, most analyses have been performed on single traits. Here, I review several methods that have been developed to perform multiple trait analysis. These methods range from traditional multivariate models for systems of equations to recently developed graphical approaches based on network theory. The application of network theory to genetics is termed systems genetics and has the potential to address long-standing questions in genetics about complex processes such as coordinate regulation, homeostasis, and pleiotropy.
Affiliation(s)
- Daniel Shriner: Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, MD, USA
37
A methodology for multivariate phenotype-based genome-wide association studies to mine pleiotropic genes. BMC Systems Biology 2011; 5 Suppl 2:S13. [PMID: 22784570 PMCID: PMC3287479 DOI: 10.1186/1752-0509-5-s2-s13] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/25/2022]
Abstract
Background: Current Genome-Wide Association Studies (GWAS) are performed in a single-trait framework without considering genetic correlations between important disease traits. Hence, GWAS have limitations in discovering genetic risk factors with pleiotropic effects. Results: This work reports a novel data mining approach to discover patterns of multiple phenotypic associations over 52 anthropometric and biochemical traits in KARE and a new analytical scheme for GWAS of multivariate phenotypes defined by the discovered patterns. This methodology was applied to a GWAS for the multivariate phenotype highLDLhighTG, derived from the predicted patterns of the phenotypic associations. The patterns of the phenotypic associations were informative for drawing relations between plasma lipid levels, bone mineral density, and a cluster of common traits (obesity, hypertension, insulin resistance) related to Metabolic Syndrome (MS). A total of 15 SNPs in six genes (PAK7, C20orf103, NRIP1, BCL2, TRPM3, and NAV1) were identified as significantly associated with highLDLhighTG. Noteworthy findings were that the significant associations included a missense mutation (PAK7:R335P), a frameshift mutation (C20orf103) and SNPs in splicing sites (TRPM3). Conclusions: The six genes corresponded to rat and mouse quantitative trait loci (QTLs) that had shown associations with common traits such as the well-characterized MS and even tumor susceptibility. Our findings suggest that the six genes may play important roles in the pleiotropic effects on lipid metabolism and the MS, which increase the risk of Type 2 Diabetes and cardiovascular disease. The use of multivariate phenotypes can be advantageous in identifying genetic risk factors, accounting for pleiotropic effects when the multivariate phenotypes have a common etiological pathway.
38
Yang Q, Wu H, Guo CY, Fox CS. Analyze multivariate phenotypes in genetic association studies by combining univariate association tests. Genet Epidemiol 2010; 34:444-54. [PMID: 20583287 PMCID: PMC3090041 DOI: 10.1002/gepi.20497] [Citation(s) in RCA: 111] [Impact Index Per Article: 7.4] [Indexed: 12/12/2022]
Abstract
Multivariate phenotypes are frequently encountered in genome-wide association studies (GWAS). Such phenotypes contain more information than univariate phenotypes, but how best to exploit the information to increase the chance of detecting genetic variants with pleiotropic effects is not always clear. Moreover, when multivariate phenotypes contain a mixture of quantitative and qualitative measures, limited methods are applicable. In this paper, we first evaluated the approach originally proposed by O'Brien and by Wei and Johnson that combines the univariate test statistics, and then we proposed two extensions to that approach. The original and proposed approaches are applicable to a multivariate phenotype containing any type of components, including continuous, categorical and survival phenotypes, and applicable to samples consisting of families or unrelated samples. Simulation results suggested that all methods had valid type I error rates. Our extensions had better power than O'Brien's method with heterogeneous means among univariate test statistics, but were less powerful than O'Brien's with homogeneous means among individual test statistics. All approaches have shown a considerable increase in power compared to testing each component of a multivariate phenotype individually in some cases. We apply all the methods to GWAS of serum uric acid levels and gout with 550,000 single nucleotide polymorphisms in the Framingham Heart Study.
Affiliation(s)
- Qiong Yang: Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts 02118, USA.
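The idea of combining univariate test statistics, as in the approach attributed above to O'Brien, can be sketched as follows. This is a minimal illustration of one common form of the combination, assuming the univariate z-statistics are jointly normal under the null with a known correlation matrix R; it is not the authors' code, it does not cover their two extensions, and the function name `combined_z` is hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def combined_z(z, R):
    """Combine correlated z-statistics into one statistic.

    Uses T = 1' R^{-1} z / sqrt(1' R^{-1} 1), which is standard normal
    under the null if z ~ N(0, R). Returns (T, two-sided p-value).
    """
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    ones = np.ones_like(z)
    r_inv_ones = np.linalg.solve(R, ones)  # R^{-1} 1 without forming the inverse
    stat = float(r_inv_ones @ z) / sqrt(float(ones @ r_inv_ones))
    p_two_sided = erfc(abs(stat) / sqrt(2.0))  # equals 2 * (1 - Phi(|stat|))
    return stat, p_two_sided

# Hypothetical z-statistics for two phenotypes with correlation 0.5.
z = np.array([2.0, 1.5])
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])
stat, p = combined_z(z, R)
```

Because every component is weighted in the same direction, a statistic of this form gains most when the effects are similar across phenotypes, which is consistent with the abstract's remark that O'Brien's method is the more powerful one under homogeneous means.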
39
Crawford KW, Spritzler J, Kalayjian RC, Parsons T, Landay A, Pollard R, Stocker V, Lederman MM, Flexner C. Age-related changes in plasma concentrations of the HIV protease inhibitor lopinavir. AIDS Res Hum Retroviruses 2010; 26:635-43. [PMID: 20560793 DOI: 10.1089/aid.2009.0154] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 01/17/2023]
Abstract
The advent of highly active antiretroviral therapy in the treatment of HIV disease has substantially extended the lifespan of individuals infected with HIV, resulting in a growing population of older HIV-infected individuals. The efficacy and safety of antiretroviral agents in this population are important concerns. There have been relatively few studies assessing antiretroviral pharmacokinetics in older patients. Thirty-seven subjects aged 18-30 years and 40 subjects aged 45-79 years, naive to antiretroviral therapy, received lopinavir/ritonavir (400/100) bid, emtricitabine 200 mg qd, and stavudine 40 mg bid. Trough lopinavir concentrations were available for 44 subjects, collected at 24, 36, and 96 weeks. At week 24, older age was associated with higher lopinavir trough concentrations, and a trend was observed toward older age being associated with higher lopinavir trough concentrations when all time points were evaluated. In the young cohort, among subjects with two or more measurements, there was a trend toward increasing intrasubject trough lopinavir concentrations over time. Using a nonlinear, mixed-effects population pharmacokinetic model, age was negatively associated with lopinavir clearance after adjusting for adherence. Adherence was assessed by patient self-reports; older patients missed fewer doses than younger patients (p = 0.02). No difference in grade 3-4 toxicities was observed between the two age groups. Older patients have higher trough lopinavir concentrations and likely decreased lopinavir clearance. Age-related changes in the pharmacokinetics of antiretroviral drugs may be of increasing importance as the HIV-infected population ages and as older individuals comprise an increasing proportion of new diagnoses.
Affiliation(s)
- Keith W. Crawford
- Johns Hopkins University School of Medicine, Baltimore, Maryland
- Howard University College of Medicine, Washington, D.C.
- John Spritzler
- Harvard University School of Public Health, Boston, Massachusetts
- Robert C. Kalayjian
- MetroHealth Medical Center and Case Western Reserve University, Cleveland, Ohio
- Teresa Parsons
- Johns Hopkins University School of Medicine, Baltimore, Maryland
- Alan Landay
- Rush University Medical College, Chicago, Illinois
- Vicki Stocker
- Social and Scientific Systems, Inc., Silver Spring, Maryland
- Michael M. Lederman
- University Hospitals/Case Medical Center, Case Western Reserve University, Cleveland, Ohio
- Charles Flexner
- Johns Hopkins University School of Medicine, Baltimore, Maryland
|
40
|
Yu K, Wheeler W, Li Q, Bergen AW, Caporaso N, Chatterjee N, Chen J. A partially linear tree-based regression model for multivariate outcomes. Biometrics 2010; 66:89-96. [PMID: 19432770 PMCID: PMC2875329 DOI: 10.1111/j.1541-0420.2009.01235.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Abstract
In the genetic study of complex traits, especially behavior related ones, such as smoking and alcoholism, usually several phenotypic measurements are obtained for the description of the complex trait, but no single measurement can quantify fully the complicated characteristics of the symptom because of our lack of understanding of the underlying etiology. If those phenotypes share a common genetic mechanism, rather than studying each individual phenotype separately, it is more advantageous to analyze them jointly as a multivariate trait to enhance the power to identify associated genes. We propose a multilocus association test for the study of multivariate traits. The test is derived from a partially linear tree-based regression model for multiple outcomes. This novel tree-based model provides a formal statistical testing framework for the evaluation of the association between a multivariate outcome and a set of candidate predictors, such as markers within a gene or pathway, while accommodating adjustment for other covariates. Through simulation studies we show that the proposed method has an acceptable type I error rate and improved power over the univariate outcome analysis, which studies each component of the complex trait separately with multiple-comparison adjustment. A candidate gene association study of multiple smoking-related phenotypes is used to demonstrate the application and advantages of this new method. The proposed method is general enough to be used for the assessment of the joint effect of a set of multiple risk factors on a multivariate outcome in other biomedical research settings.
Affiliation(s)
- Kai Yu
- Division of Cancer Epidemiology and Genetics, NCI, Rockville, Maryland 20892, USA.
|
41
|
Shen YF, Zhu J. Power analysis of principal components regression in genetic association studies. J Zhejiang Univ Sci B 2010; 10:721-30. [PMID: 19816996 DOI: 10.1631/jzus.b0830866] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
Association analysis provides an opportunity to find genetic variants underlying complex traits. A principal components regression (PCR)-based approach was shown to outperform some competing approaches. However, a limitation of this method is that the principal components (PCs) selected from single nucleotide polymorphisms (SNPs) may be unrelated to the phenotype. In this article, we investigate the theoretical properties of such a method in more detail. We first derive the exact power function of the test based on PCR, and hence clarify the relationship between the test power and the degrees of freedom (DF). Next, we extend the PCR test to a general weighted PCs test, which provides a unified framework for understanding the properties of some related statistics. We then compare the performance of these tests. We also introduce several data-driven adaptive alternatives to overcome difficulties in the PCR approach. Finally, we illustrate our results using simulations based on real genotype data. The simulation study shows the risk of using the unsupervised rule to determine the number of PCs and demonstrates that there is no single uniformly most powerful method for detecting genetic variants.
Affiliation(s)
- Yan-feng Shen
- Department of Mathematics, Zhejiang University, Hangzhou 310027, China
|
42
|
Wang JY, Tai JJ. Robust Quantitative Trait Association Tests in the Parent-Offspring Triad Design: Conditional Likelihood-Based Approaches. Ann Hum Genet 2009; 73:231-44. [DOI: 10.1111/j.1469-1809.2008.00502.x] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
|
43
|
Abstract
In studies of complex disorders such as nicotine dependence, it is common that researchers assess multiple variables related to a disorder as well as other disorders that are potentially correlated with the primary disorder of interest. In this work, we refer to those variables and disorders broadly as multiple traits. The multiple traits may or may not have a common causal genetic variant. Intuitively, it may be more powerful to accommodate multiple traits in genetic studies, but the analysis of multiple traits is generally more complicated than the analysis of a single trait. Furthermore, it is not well documented how much power we may potentially gain by considering multiple traits. Our aim is to enhance our understanding of this important and practical issue. We considered a variety of correlation structures between traits and the disease locus. To focus on the effect of accommodating multiple traits, we examined genetic models that are relatively simple so that we can pinpoint the factors affecting the power. We conducted simulation studies to explore the performance of testing multiple traits simultaneously and the performance of testing a single trait at a time in family-based association studies. Our simulation results demonstrated that the performance of testing multiple traits simultaneously is better than that of testing each trait individually for almost all models considered. We also found that the power of association tests varies among the underlying models. The advantage of conducting a multiple traits test is minimized when some traits are influenced by the gene only through other traits; and it is maximized when there are causal relations between the traits and the gene, and among the traits themselves or when there are extraneous traits.
Affiliation(s)
- Wensheng Zhu
- Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520-8034
|
44
|
Tian L, Cai T, Pfeffer MA, Piankov N, Cremieux PY, Wei LJ. Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction. Biostatistics 2008; 10:275-81. [PMID: 18922759 DOI: 10.1093/biostatistics/kxn034] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Indexed: 12/21/2022]
Abstract
Recently, meta-analysis has been widely utilized to combine information across comparative clinical studies for evaluating drug efficacy or safety profile. When dealing with rather rare events, a substantial proportion of studies may not have any events of interest. Conventional methods either exclude such studies or add an arbitrary positive value to each cell of the corresponding 2 x 2 tables in the analysis. In this article, we present a simple, effective procedure to make valid inferences about the parameter of interest with all available data without artificial continuity corrections. We then use the procedure to analyze the data from 48 comparative trials involving rosiglitazone with respect to its possible cardiovascular toxicity.
Affiliation(s)
- Lu Tian
- Department of Preventive Medicine, Northwestern University, Chicago, IL 60611, USA.
|
45
|
Abstract
The commonly used two-sample tests of equal area-under-the-curve (AUC), where AUC is based on the linear trapezoidal rule, may have poor properties when observations are missing, even if they are missing completely at random (MCAR). We propose two tests: one that has good properties when data are MCAR and another that has good properties when the data are missing at random (MAR), provided that the pattern of missingness is monotonic. In addition, we discuss other non-parametric tests of hypotheses that are similar, but not identical, to the hypothesis of equal AUCs, but that often have better statistical properties than do AUC tests and may be more scientifically appropriate for many settings.
|
46
|
Zhou H, Wei LJ, Xu X, Xu X. Combining association tests across multiple genetic markers in case-control studies. Hum Hered 2007; 65:166-74. [PMID: 17940337 DOI: 10.1159/000109733] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/24/2007] [Accepted: 07/11/2007] [Indexed: 11/19/2022]
Abstract
In the search to detect genetic associations between complex traits and DNA variants, a common practice is to select a subset of single nucleotide polymorphisms (tag SNPs) in a gene or chromosomal region of interest. This allows study of untyped polymorphisms in this region through the phenomenon of linkage disequilibrium (LD). However, it is crucial in the analysis to utilize such multiple SNP markers efficiently. In this study, we present a robust testing approach (T(C)) that combines single-marker association test statistics or p values. This combination is based on the summation of single test statistics or p values, giving greater weight to those with lower p values. We compared the power of T(C) in identifying common trait loci, using tag SNPs within the same haplotype block in which the trait loci reside, with that of competing published tests in case-control settings. These competing tests included the Bonferroni procedure (T(B)), the simple permutation procedure (T(P)), the permutation procedure proposed by Hoh et al. (T(P-H)) and its revised version using 'deflated' statistics (T(P-H_def)), the traditional chi-square procedure (T(CHI)), the regression procedure (Hotelling's T² test) (T(R)), and the haplotype-based test (T(H)). Results of these comparisons show that our proposed combining procedure (T(C)) is preferred in all scenarios examined. We also apply this new test to a data set from a previously reported association study on airway responsiveness to methacholine.
Affiliation(s)
- Huanyu Zhou
- Program for Population Genetics, Harvard School of Public Health, Boston, MA 02115, USA
|
47
|
Xu X, Rakovski C, Xu X, Laird N. An efficient family-based association test using multiple markers. Genet Epidemiol 2006; 30:620-6. [PMID: 16868964 DOI: 10.1002/gepi.20174] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 11/06/2022]
Abstract
In genetic association studies, multiple markers are usually employed to cover a genomic region of interest for localizing a trait locus. In this report, we propose a novel multi-marker family-based association test (T(LC)) that linearly combines the single-marker test statistics using data-driven weights. We examine the type-I error rate in a numerical study and compare its power to identify a common trait locus, using tag single nucleotide polymorphisms (SNPs) within the same haplotype block in which the trait locus resides, with that of three competing tests: a global haplotype test (T(H)), a multi-marker test similar to Hotelling's T² test for population-based data (T(MM)), and a single-marker test with Bonferroni's correction for multiple testing (T(B)). The type-I error rate of T(LC) is well maintained in our numerical study. In all the scenarios we examined, T(LC) is the most powerful, followed by T(B); T(MM) and T(H) are the least powerful. T(H) and T(MM) have essentially the same power when parents are available. However, when both parents are missing, T(MM) is substantially more powerful than T(H). We also apply this new test to a data set from a previous association study on nicotine dependence.
Affiliation(s)
- Xin Xu
- Program for Population Genetics, Harvard School of Public Health, 665 Huntington Avenue, Boston, MA 02115, USA.
|
48
|
Delongchamp R, Lee T, Velasco C. A method for computing the overall statistical significance of a treatment effect among a group of genes. BMC Bioinformatics 2006; 7 Suppl 2:S11. [PMID: 17118132 PMCID: PMC1683577 DOI: 10.1186/1471-2105-7-s2-s11] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In studies that use DNA arrays to assess changes in gene expression, our goal is to evaluate the statistical significance of treatments on sets of genes. Genes can be grouped by a molecular function, a biological process, or a cellular component, e.g., gene ontology (GO) terms. The meaning of an affected GO group is often clearer than interpretations arising from a list of the statistically significant genes. RESULTS: Computer simulations demonstrated that correlations among genes invalidate many statistical methods that are commonly used to assign significance to GO terms. Ignoring these correlations overstates the statistical significance. Meta-analysis methods for combining p-values were modified to adjust for correlation. One of these methods is elaborated in the context of a comparison between two treatments. The form of the correlation adjustment depends upon the alternative hypothesis. CONCLUSION: Reliable corrections for the effect of correlations among genes on the significance level of a GO term can be constructed for an alternative hypothesis where all transcripts in the GO term increase (decrease) in response to treatment. For general alternatives, which allow some transcripts to increase and others to decrease, the bias of naïve significance calculations can be greatly decreased although not eliminated.
Affiliation(s)
- Robert Delongchamp
- Division of Biometry and Risk Assessment, National Center for Toxicological Research, Jefferson, Arkansas 72079 USA
- Taewon Lee
- Division of Biometry and Risk Assessment, National Center for Toxicological Research, Jefferson, Arkansas 72079 USA
- Cruz Velasco
- School of Public Health, LSU Health Science Center, New Orleans, LA 70112 USA
|