1
Hans N, Klein N, Faschingbauer F, Schneider M, Mayr A. Boosting distributional copula regression. Biometrics 2023; 79:2298-2310. [PMID: 36165288 DOI: 10.1111/biom.13765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Accepted: 09/15/2022] [Indexed: 11/28/2022]
Abstract
Capturing complex dependence structures between outcome variables (e.g., study endpoints) is of high relevance in contemporary biomedical data problems and medical research. Distributional copula regression provides a flexible tool to model the joint distribution of multiple outcome variables by disentangling the marginal response distributions and their dependence structure. In a regression setup, each parameter of the copula model, that is, the marginal distribution parameters and the copula dependence parameters, can be related to covariates via structured additive predictors. We propose a framework to fit distributional copula regression via model-based boosting, which is a modern estimation technique that incorporates useful features like an intrinsic variable selection mechanism, parameter shrinkage and the capability to fit regression models in high-dimensional data settings, that is, situations with more covariates than observations. Thus, model-based boosting does not only complement existing Bayesian and maximum-likelihood based estimation frameworks for this model class but rather enables unique intrinsic mechanisms that can be helpful in many applied problems. The performance of our boosting algorithm for copula regression models with continuous margins is evaluated in simulation studies that cover low- and high-dimensional data settings and situations with and without dependence between the responses. Moreover, distributional copula boosting is used to jointly analyze and predict the length and the weight of newborns conditional on sonographic measurements of the fetus before delivery together with other clinical variables.
Affiliation(s)
- Nicolai Hans: Chair of Statistics and Data Science, Humboldt-Universität zu Berlin, Berlin, Germany
- Nadja Klein: Chair of Statistics and Data Science, Humboldt-Universität zu Berlin, Berlin, Germany
- Florian Faschingbauer: Department of Obstetrics and Gynecology, University Hospital of Erlangen, Erlangen, Germany
- Michael Schneider: Department of Obstetrics and Gynecology, University Hospital of Erlangen, Erlangen, Germany
- Andreas Mayr: Department of Medical Biometrics, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Bonn, Germany
2
Deng X, Wang B, Fisher V, Peloso G, Cupples A, Liu CT. Genome-wide association study for multiple phenotype analysis. BMC Proc 2018; 12:55. [PMID: 30263053 PMCID: PMC6156845 DOI: 10.1186/s12919-018-0135-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/02/2022]
Abstract
Genome-wide association studies often collect multiple phenotypes for complex diseases. Multivariate joint analyses have higher power to detect genetic variants compared with the marginal analysis of each phenotype and are also able to identify loci with pleiotropic effects. We extend the unified score-based association test to incorporate family structure, apply different approaches to analyze multiple traits in GAW20 real samples, and compare the results. Through simulation studies, we confirm that the Type I error rate of the pedigree-based unified score association test is appropriately controlled. In marginal analysis of triglyceride levels, we found 1 subgenome-wide significant variant on chromosome 6. Joint analyses identified several suggestive genome-wide significant signals, with the pedigree-based unified score association test yielding the greatest number of significant results.
Affiliation(s)
- Xuan Deng: Department of Biostatistics, School of Public Health, Boston University, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02118 USA
- Biqi Wang: Department of Biostatistics, School of Public Health, Boston University, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02118 USA
- Virginia Fisher: Department of Biostatistics, School of Public Health, Boston University, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02118 USA
- Gina Peloso: Department of Biostatistics, School of Public Health, Boston University, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02118 USA
- Adrienne Cupples: Department of Biostatistics, School of Public Health, Boston University, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02118 USA
- Ching-Ti Liu: Department of Biostatistics, School of Public Health, Boston University, 801 Massachusetts Avenue 3rd Floor, Boston, MA 02118 USA
3
Wang X, Boekstegers F, Brinster R. Methods and results from the genome-wide association group at GAW20. BMC Genet 2018; 19:79. [PMID: 30255814 PMCID: PMC6157187 DOI: 10.1186/s12863-018-0649-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
BACKGROUND: This paper summarizes the contributions from the Genome-wide Association Study group (GWAS group) of the GAW20. The GWAS group contributions focused on topics such as association tests, phenotype imputation, and application of empirical kinships. The goals of the GWAS group contributions were varied. A real or a simulated data set based on the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study was employed by different methods. Different outcomes and covariates were considered, and quality control procedures varied throughout the contributions.
RESULTS: The consideration of heritability and family structure played a major role in some contributions. The inclusion of family information and adaptive weights based on data were found to improve power in genome-wide association studies. It was proven that gene-level approaches are more powerful than single-marker analysis. Other contributions focused on the comparison between pedigree-based kinship and empirical kinship matrices, and investigated similar results in heritability estimation, association mapping, and genomic prediction. A new approach for linkage mapping of triglyceride levels was able to identify a novel linkage signal.
CONCLUSIONS: This summary paper reports on promising statistical approaches and findings of the members of the GWAS group applied on real and simulated data which encompass the current topics of epigenetic and pharmacogenomics.
Affiliation(s)
- Xuexia Wang: University of North Texas, GAB 459, 1155 Union Circle #311430, Denton, TX 76203 USA
- Felix Boekstegers: Institute of Medical Biometry and Informatics, University of Heidelberg, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany
- Regina Brinster: Institute of Medical Biometry and Informatics, University of Heidelberg, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany
4
Abstract
Preterm birth is the single leading cause of mortality for neonates and children less than 5 years of age. Compared to other childhood diseases, such as infections, less progress in prevention of prematurity has been made. In large part, the continued high burden of prematurity results from the limited understanding of the mechanisms controlling normal birth timing in humans, and how individual genetic variation and environmental exposures disrupt these mechanisms to cause preterm birth. In this review, we summarize the outcomes and limitations from studies in model organisms for birth timing in humans, the evidence that genetic factors contribute to birth timing and risk for preterm birth, and recent genetic and genomic studies in women and infants that implicate specific genes and pathways. We conclude with discussing areas of potential high impact in understanding human parturition and preterm birth in the future.
Affiliation(s)
- Nagendra K Monangi: Division of Neonatology, Perinatal Institute, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 7009, Cincinnati, OH 45229; Center for Prevention of Preterm Birth, Cincinnati Children's Hospital Medical Center, Cincinnati, OH
- Heather M Brockway: Center for Prevention of Preterm Birth, Cincinnati Children's Hospital Medical Center, Cincinnati, OH
- Melissa House: Division of Neonatology, Perinatal Institute, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 7009, Cincinnati, OH 45229
- Ge Zhang: Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH
- Louis J Muglia: Division of Neonatology, Perinatal Institute, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 7009, Cincinnati, OH 45229; Center for Prevention of Preterm Birth, Cincinnati Children's Hospital Medical Center, Cincinnati, OH
5
Abstract
For many years, linkage analysis was the primary tool used for the genetic mapping of Mendelian and complex traits with familial aggregation. Linkage analysis was largely supplanted by the wide adoption of genome-wide association studies (GWASs). However, with the recent increased use of whole-genome sequencing (WGS), linkage analysis is again emerging as an important and powerful analysis method for the identification of genes involved in disease aetiology, often in conjunction with WGS filtering approaches. Here, we review the principles of linkage analysis and provide practical guidelines for carrying out linkage studies using WGS data.
Affiliation(s)
- Jurg Ott: [1] Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China. [2] Laboratory of Statistical Genetics, Rockefeller University, 1230 York Avenue, New York, New York 10065, USA
- Jing Wang: Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China
- Suzanne M Leal: Center for Statistical Genetics, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
6
No association between CTNNBL1 and episodic memory performance. Transl Psychiatry 2014; 4:e454. [PMID: 25268258 PMCID: PMC4203019 DOI: 10.1038/tp.2014.93] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/28/2014] [Revised: 05/01/2014] [Accepted: 05/21/2014] [Indexed: 11/09/2022]
Abstract
Polymorphisms in the gene encoding catenin-β-like 1 (CTNNBL1) were recently reported to be associated with verbal episodic memory performance--in particular, delayed verbal free recall assessed between 5 and 30 min after encoding--in a genome-wide association study on healthy young adults. To further examine the genetic effects of CTNNBL1, we tested for association between 455 single-nucleotide polymorphisms (SNPs) in or near CTNNBL1 and 14 measures of episodic memory performance from three different tasks in 1743 individuals. Probands were part of a population-based study of mentally healthy adult men and women, who were between 20 and 70 years old and were recruited as participants for the Berlin Aging Study II. Associations were assessed using linear regression analysis. Despite having sufficient power to detect the previously reported effect sizes, we found no evidence for statistically significant associations between the tested CTNNBL1 SNPs and any of the 14 measures of episodic memory. The previously reported effects of genetic polymorphisms in CTNNBL1 on episodic memory performance do not generalize to the broad range of tasks assessed in our cohort. If not altogether spurious, the effects may be limited to a very narrow phenotypic domain (that is, verbal delayed free recall between 5 and 30 min). More studies are needed to further clarify the role of CTNNBL1 in human memory.
7
Suo C, Toulopoulou T, Bramon E, Walshe M, Picchioni M, Murray R, Ott J. Analysis of multiple phenotypes in genome-wide genetic mapping studies. BMC Bioinformatics 2013; 14:151. [PMID: 23639181 PMCID: PMC3655878 DOI: 10.1186/1471-2105-14-151] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 12/06/2012] [Accepted: 04/27/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Complex traits may be defined by a range of different criteria. It would result in a loss of information to perform analyses simply on the basis of a final clinical dichotomized affected/unaffected variable.
RESULTS: We assess the performance of four alternative approaches for the analysis of multiple phenotypes in genetic association studies. We describe the four methods in detail and discuss their relative theoretical merits and disadvantages. Using simulation we demonstrate that PCA provides the greatest power when applied to both correlated phenotypes and with large numbers of phenotypes. The multivariate approach had low type I error only with independent phenotypes or small numbers of phenotypes. In this study, our application of the four methods to schizophrenia data provides converging evidence of the relative performance of the methods.
CONCLUSIONS: Via power analysis of simulated data and testing of experimental data, we conclude that PCA, creating one variable based on a linear combination of all the traits, performs optimally. We propose that our comparison will provide insight into the properties of the methods and help researchers to choose appropriate strategy in future experimental studies.
Affiliation(s)
- Chen Suo: Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China