1
Li M, Hu X, Li Y, Chen G, Ding CG, Tian X, Tian P, Xiang H, Pan X, Ding X, Xue W, Zheng J. Development and validation of a novel nomogram model for predicting delayed graft function in deceased donor kidney transplantation based on pre-transplant biopsies. BMC Nephrol 2024; 25:138. [PMID: 38641807 PMCID: PMC11031976 DOI: 10.1186/s12882-024-03557-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 03/21/2024] [Indexed: 04/21/2024] Open
Abstract
BACKGROUND Delayed graft function (DGF) is an important complication after kidney transplantation surgery. The present study aimed to develop and validate a nomogram for preoperative prediction of DGF on the basis of clinical and histological risk factors. METHODS The prediction model was constructed in a development cohort comprising 492 kidney transplant recipients from May 2018 to December 2019. Data regarding donor and recipient characteristics, pre-transplantation biopsy results, and machine perfusion parameters were collected, and univariate analysis was performed. The least absolute shrinkage and selection operator (LASSO) regression model was used for variable selection. The prediction model was developed by multivariate logistic regression analysis and presented as a nomogram. An external validation cohort comprising 105 transplantation cases from January 2020 to April 2020 was included in the analysis. RESULTS A total of 266 donors were included in the development cohort; 458 kidneys (93.1%) were preserved by hypothermic machine perfusion (HMP), and 96 (19.51%) of 492 recipients developed DGF. Twenty-eight variables measured before transplantation surgery were included in the LASSO regression model. The nomogram consisted of 12 variables from donor characteristics, pre-transplantation biopsy results, and machine perfusion parameters. Internal and external validation showed good discrimination and calibration of the nomogram, with areas under the curve (AUC) of 0.83 (95% CI, 0.78-0.88) and 0.87 (95% CI, 0.80-0.94), respectively. Decision curve analysis demonstrated that the nomogram was clinically useful. CONCLUSION A DGF predicting nomogram was developed that incorporated donor characteristics, pre-transplantation biopsy results, and machine perfusion parameters. This nomogram can be conveniently used for preoperative individualized prediction of DGF in kidney transplant recipients.
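The discrimination measure reported in this abstract, the AUC, can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The following is a minimal sketch of that pairwise definition, not the authors' code; the function name `auc_from_scores` and the toy data are hypothetical and used only for illustration.

```python
import numpy as np

def auc_from_scores(y_true, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    y_true: 0/1 outcome labels (1 = event, e.g. a hypothetical DGF indicator).
    scores: predicted risks from any model; higher means more likely an event.
    """
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one event and one non-event")
    # Compare every event score with every non-event score; ties count as half.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

# Toy data (illustrative only): three of the four event/non-event pairs are ordered correctly.
example_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)  # 0.75
```

The all-pairs comparison uses memory proportional to the number of events times the number of non-events, which is acceptable for cohorts of a few hundred patients; a rank-based formula would scale better for large datasets.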
Affiliation(s)
- Meihe Li
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Xiaojun Hu
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Yang Li
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Guozhen Chen
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Chen-Guang Ding
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Xiaohui Tian
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Puxun Tian
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Heli Xiang
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Xiaoming Pan
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Xiaoming Ding
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Wujun Xue
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
- Jin Zheng
- Department of Renal Transplantation, Nephropathy Hospital, The First Affiliated Hospital of Xi'an Jiaotong University, 710061, Xi'an, Shaanxi, China
2
Sun NA, Wang YU, Chu J, Han Q, Shen Y. Bayesian Approaches in Exploring Gene-environment and Gene-gene Interactions: A Comprehensive Review. Cancer Genomics Proteomics 2023; 20:669-678. [PMID: 38035701 PMCID: PMC10687732 DOI: 10.21873/cgp.20414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/30/2023] [Accepted: 11/01/2023] [Indexed: 12/02/2023] Open
Abstract
Rapid advancements in high-throughput biological techniques have facilitated the generation of high-dimensional omics datasets, which have provided a solid foundation for precision medicine and prognosis prediction. Nonetheless, the problem of missing heritability persists. To solve this problem, it is essential to explain the genetic structure of disease incidence risk and prognosis by incorporating interactions. The development of Bayesian theory has provided new approaches for building models for interaction identification and estimation. Several Bayesian models have been developed to improve model accuracy and to identify main effects, gene-environment (G×E) interactions, and gene-gene (G×G) interactions. Studies based on single-nucleotide polymorphisms (SNPs) are significant for the exploration of rare and common variants. Models based on the effect heredity principle and group-based models are relatively flexible and do not require strict constraints when dealing with the hierarchical structure between the main effect and interactions (M-I). These models offer good interpretability of biological mechanisms. Machine learning-based Bayesian approaches are highly competitive in improving prediction accuracy. These models provide insights into the mechanisms underlying the occurrence and progression of complex diseases, identify more reliable biomarkers, and achieve higher predictive accuracy. In this paper, we provide a comprehensive review of these Bayesian approaches.
Affiliation(s)
- N A Sun
- Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, P.R. China
- Y U Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, P.R. China
- Jiadong Chu
- Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, P.R. China
- Qiang Han
- Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, P.R. China
- Yueping Shen
- Department of Epidemiology and Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, P.R. China
3
Sajal IH, Biswas S. Bivariate quantitative Bayesian LASSO for detecting association of rare haplotypes with two correlated continuous phenotypes. Front Genet 2023; 14:1104727. [PMID: 36968609 PMCID: PMC10033866 DOI: 10.3389/fgene.2023.1104727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 02/21/2023] [Indexed: 03/12/2023] Open
Abstract
In genetic association studies, the multivariate analysis of correlated phenotypes offers statistical and biological advantages compared to analyzing one phenotype at a time. The joint analysis utilizes additional information contained in the correlation and avoids multiple testing. It also provides an opportunity to investigate and understand shared genetic mechanisms of multiple phenotypes. Bivariate logistic Bayesian LASSO (LBL) was proposed earlier to detect rare haplotypes associated with two binary phenotypes or one binary and one continuous phenotype jointly. There is currently no haplotype association test available that can handle multiple continuous phenotypes. In this study, by employing the framework of bivariate LBL, we propose bivariate quantitative Bayesian LASSO (QBL) to detect rare haplotypes associated with two continuous phenotypes. Bivariate QBL removes unassociated haplotypes by regularizing the regression coefficients and utilizing a latent variable to model correlation between two phenotypes. We carry out extensive simulations to investigate the performance of bivariate QBL and compare it with that of a standard (univariate) haplotype association test, Haplo.score (applied twice to two phenotypes individually). Bivariate QBL performs better than Haplo.score in all simulations with varying degrees of power gain. We analyze Genetic Analysis Workshop 19 exome sequencing data on systolic and diastolic blood pressures and detect several rare haplotypes associated with the two phenotypes.
Affiliation(s)
- Swati Biswas
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX, United States
4
Abstract
The stage of cancer is a discrete ordinal response that indicates the aggressiveness of disease and is often used by physicians to determine the type and intensity of treatment to be administered. For example, the FIGO stage in cervical cancer is based on the size and depth of the tumor as well as the level of spread. It may be of clinical relevance to identify molecular features from high-throughput genomic assays that are associated with the stage of cervical cancer to elucidate pathways related to tumor aggressiveness, identify improved molecular features that may be useful for staging, and identify therapeutic targets. High-throughput RNA-Seq data and corresponding clinical data (including stage) for cervical cancer patients have been made available through The Cancer Genome Atlas Project (TCGA). We recently described penalized Bayesian ordinal response models that can be used for variable selection for over-parameterized datasets, such as the TCGA-CESC dataset. Herein, we describe our ordinalbayes R package, available from the Comprehensive R Archive Network (CRAN), which enhances the runjags R package by enabling users to easily fit cumulative logit models when the outcome is ordinal and the number of predictors exceeds the sample size, P > N, such as for TCGA and other high-throughput genomic data. We demonstrate the use of this package by applying it to the TCGA cervical cancer dataset. Our ordinalbayes package can be used to fit models to high-dimensional datasets, and it effectively performs variable selection.
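The cumulative logit model named in this abstract turns a linear predictor and a set of ordered thresholds into one probability per ordinal category. The following is a minimal sketch of that mapping only, assuming the parameterization P(Y <= k | x) = expit(alpha_k - x·beta); it does not use or reproduce the ordinalbayes or runjags interfaces, and the function name `cumulative_logit_probs` and all numeric values are hypothetical.

```python
import numpy as np

def cumulative_logit_probs(x, beta, alpha):
    """Category probabilities under a cumulative logit (proportional odds) model.

    Assumed parameterization (sign conventions differ between texts):
        P(Y <= k | x) = expit(alpha_k - x @ beta),  k = 1, ..., K-1
    x:     predictor vector of length P
    beta:  coefficient vector of length P
    alpha: K-1 strictly increasing thresholds
    Returns an array of K probabilities that sum to 1.
    """
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if np.any(np.diff(alpha) <= 0):
        raise ValueError("thresholds must be strictly increasing")
    eta = x @ beta                                # linear predictor
    cum = 1.0 / (1.0 + np.exp(-(alpha - eta)))    # P(Y <= k) for k = 1..K-1
    cum = np.concatenate(([0.0], cum, [1.0]))     # add P(Y <= 0) = 0 and P(Y <= K) = 1
    return np.diff(cum)                           # P(Y = k) = P(Y <= k) - P(Y <= k-1)

# Hypothetical values: two predictors and four ordered categories (e.g. stages I-IV).
probs = cumulative_logit_probs(x=[0.5, -1.0], beta=[0.8, 0.3], alpha=[-1.0, 0.0, 1.5])
print(probs, probs.sum())
```

In the P > N setting described above, the coefficients would come from a penalized or Bayesian fit with most entries shrunk toward zero; the mapping from coefficients to category probabilities stays the same.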
5
Zhang Y, Archer KJ. Bayesian variable selection for high-dimensional data with an ordinal response: identifying genes associated with prognostic risk group in acute myeloid leukemia. BMC Bioinformatics 2021; 22:539. [PMID: 34727888 PMCID: PMC8565083 DOI: 10.1186/s12859-021-04432-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/01/2021] [Accepted: 10/04/2021] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Acute myeloid leukemia (AML) is a heterogeneous cancer of the blood, though specific recurring cytogenetic abnormalities in AML are strongly associated with attaining complete response after induction chemotherapy, remission duration, and survival. Therefore recurring cytogenetic abnormalities have been used to segregate patients into favorable, intermediate, and adverse prognostic risk groups. However, it is unclear how expression of genes is associated with these prognostic risk groups. We postulate that expression of genes monotonically associated with these prognostic risk groups may yield important insights into leukemogenesis. Therefore, in this paper we propose penalized Bayesian ordinal response models to predict prognostic risk group using gene expression data. We consider a double exponential prior, a spike-and-slab normal prior, a spike-and-slab double exponential prior, and a regression-based approach with variable inclusion indicators for modeling our high-dimensional ordinal response, prognostic risk group, and identify genes through hypothesis tests using Bayes factor. RESULTS Gene expression was ascertained using Affymetrix HG-U133Plus2.0 GeneChips for 97 favorable, 259 intermediate, and 97 adverse risk AML patients. When applying our penalized Bayesian ordinal response models, genes identified for model inclusion were consistent among the four different models. Additionally, the genes included in the models were biologically plausible, as most have been previously associated with either AML or other types of cancer. CONCLUSION These findings demonstrate that our proposed penalized Bayesian ordinal response models are useful for performing variable selection for high-dimensional genomic data and have the potential to identify genes relevantly associated with an ordinal phenotype.
Affiliation(s)
- Kellie J Archer
- Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA
6
Yuan X, Biswas S. Detecting rare haplotype association with two correlated phenotypes of binary and continuous types. Stat Med 2021; 40:1877-1900. [PMID: 33438281 DOI: 10.1002/sim.8877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2020] [Revised: 11/18/2020] [Accepted: 12/25/2020] [Indexed: 11/10/2022]
Abstract
Multiple correlated traits/phenotypes are often collected in genetic association studies and they may share a common genetic mechanism. Joint analysis of correlated phenotypes has well-known advantages over one-at-a-time analysis, including gain in power and better understanding of genetic etiology. However, when the phenotypes are of discordant types such as binary and continuous, the joint modeling is more challenging. Another research area of current interest is the discovery of rare genetic variants. Currently there is no method available for detecting association of rare (or common) haplotypes with multiple discordant phenotypes jointly. Our goal is to fill this gap specifically for two discordant phenotypes. We consider a rare haplotype association method for a binary phenotype, logistic Bayesian LASSO (univariate LBL), and its extension for two correlated binary phenotypes (bivariate LBL-2B). Under this framework, we propose a haplotype association test with binary and continuous phenotypes jointly (bivariate LBL-BC). Specifically, we use a latent variable to induce correlation between the two phenotypes. We carry out extensive simulations to investigate bivariate LBL-BC and compare it with univariate LBL and bivariate LBL-2B. In most settings, bivariate LBL-BC performs the best. In only two situations does bivariate LBL-BC have similar performance: when the two phenotypes are (1) weakly or not correlated and the target haplotype affects the binary phenotype only, and (2) strongly positively correlated and the target haplotype affects both phenotypes in the positive direction. Finally, we apply the method to a data set on lung cancer and nicotine dependence and detect several haplotypes, including a rare one.
Affiliation(s)
- Xiaochen Yuan
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas, USA
- Swati Biswas
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas, USA
7
Zhang Y, Archer KJ. Bayesian penalized cumulative logit model for high-dimensional data with an ordinal response. Stat Med 2020; 40:1453-1481. [PMID: 33336826 DOI: 10.1002/sim.8851] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/13/2019] [Revised: 11/23/2020] [Accepted: 11/23/2020] [Indexed: 01/15/2023]
Abstract
Many previous studies have identified associations between gene expression, measured using high-throughput genomic platforms, and quantitative or dichotomous traits. However, we note that health outcome and disease status measurements frequently appear on an ordinal scale, that is, the outcome is categorical but has inherent ordering. Identification of important genes may be useful for developing novel diagnostic and prognostic tools to predict or classify stage of disease. Gene expression data are usually high-dimensional, meaning that the number of genes is much larger than the sample size or number of patients. Herein we describe some existing frequentist methods for modeling an ordinal response in a high-dimensional predictor space. Following Tibshirani (1996), who described the LASSO estimate as the Bayesian posterior mode when the regression coefficients have independent Laplace priors, we propose a new approach for high-dimensional data with an ordinal response that is rooted in the Bayesian paradigm. We show that our proposed Bayesian approach outperforms existing frequentist methods through simulation studies. We then compare the performance of frequentist and Bayesian approaches using a study evaluating progression to hepatocellular carcinoma in hepatitis C infected patients.
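The link between the LASSO and Laplace priors that this abstract attributes to Tibshirani (1996) can be written out for the simpler linear-regression case. This is a sketch for intuition only: it assumes Gaussian errors with known variance, and the symbols y, X, β, σ², τ, and λ are notation introduced here rather than taken from the abstract. The model proposed in the paper replaces the Gaussian likelihood with a cumulative logit likelihood.

```latex
% Assumed model: y | \beta \sim N(X\beta, \sigma^2 I), with independent Laplace priors
\[
  p(\beta_j) = \frac{\tau}{2}\exp\bigl(-\tau\,\lvert\beta_j\rvert\bigr), \qquad j = 1, \dots, P .
\]
% Negative log posterior, up to an additive constant that does not depend on \beta
\[
  -\log p(\beta \mid y)
  = \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert_2^2
    + \tau \sum_{j=1}^{P} \lvert\beta_j\rvert + \text{const}.
\]
% Multiplying by \sigma^2 leaves the minimizer unchanged, so the posterior mode is the LASSO estimate
\[
  \hat{\beta}_{\text{mode}}
  = \arg\min_{\beta}\; \frac{1}{2}\,\lVert y - X\beta \rVert_2^2 + \lambda\,\lVert \beta \rVert_1,
  \qquad \lambda = \sigma^2 \tau .
\]
```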
Affiliation(s)
- Yiran Zhang
- College of Public Health, The Ohio State University, Columbus, Ohio, USA
- Kellie J Archer
- College of Public Health, The Ohio State University, Columbus, Ohio, USA
8
Zhang L, Papachristou C, Choudhary PK, Biswas S. A Bayesian Hierarchical Framework for Pathway Analysis in Genome-Wide Association Studies. Hum Hered 2020; 84:240-255. [PMID: 32966977 DOI: 10.1159/000508664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2019] [Accepted: 05/14/2020] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Pathway analysis allows joint consideration of multiple SNPs belonging to multiple genes, which in turn belong to a biologically defined pathway. This type of analysis is usually more powerful than single-SNP analyses for detecting joint effects of variants in a pathway. METHODS We develop a Bayesian hierarchical model by fully modeling the 3-level hierarchy, namely, SNP-gene-pathway that is naturally inherent in the structure of the pathways, unlike the currently used ad hoc ways of combining such information. We model the effects at each level conditional on the effects of the levels preceding them within the generalized linear model framework. To deal with the high dimensionality, we regularize the regression coefficients through an appropriate choice of priors. The model is fit using a combination of iteratively weighted least squares and expectation-maximization algorithms to estimate the posterior modes and their standard errors. A normal approximation is used for inference. RESULTS We conduct simulations to study the proposed method and find that our method has higher power than some standard approaches in several settings for identifying pathways with multiple modest-sized variants. We illustrate the method by analyzing data from two genome-wide association studies on breast and renal cancers. CONCLUSION Our method can be helpful in detecting pathway association.
Affiliation(s)
- Lei Zhang
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas, USA
- Pankaj K Choudhary
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas, USA
- Swati Biswas
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas, USA
9
Lemoine É, Dallaire F, Yadav R, Agarwal R, Kadoury S, Trudel D, Guiot MC, Petrecca K, Leblond F. Feature engineering applied to intraoperative in vivo Raman spectroscopy sheds light on molecular processes in brain cancer: a retrospective study of 65 patients. Analyst 2020; 144:6517-6532. [PMID: 31647061 DOI: 10.1039/c9an01144g] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 01/12/2023]
Abstract
Raman spectroscopy is a promising tool for neurosurgical guidance and cancer research. Quantitative analysis of the Raman signal from living tissues is, however, limited. Their molecular composition is convoluted and influenced by clinical factors, and access to data is limited. To ensure acceptance of this technology by clinicians and cancer scientists, we need to adapt the analytical methods to more closely model the Raman-generating process. Our objective is to use feature engineering to develop a new representation for spectral data specifically tailored for brain diagnosis that improves interpretability of the Raman signal while retaining enough information to accurately predict tissue content. The method consists of band fitting of Raman bands which consistently appear in the brain Raman literature, and the generation of new features representing the pairwise interaction between bands and the interaction between bands and patient age. Our technique was applied to a dataset of 547 in situ Raman spectra from 65 patients undergoing glioma resection. It showed superior predictive capacities to a principal component analysis dimensionality reduction. After analysis through a Bayesian framework, we were able to identify the oncogenic processes that characterize glioma: increased nucleic acid content, overexpression of type IV collagen and shift in the primary metabolic engine. Our results demonstrate how this mathematical transformation of the Raman signal allows the first biological, statistically robust analysis of in vivo Raman spectra from brain tissue.
Affiliation(s)
- Émile Lemoine
- Department of Engineering Physics, Polytechnique Montreal, Montreal, Quebec, Canada
10
Wang L, Lin D, Li Y. Exploiting gene-environment independence in haplotype-based inferences for population-based case-control studies with complex sampling. Stat Med 2020; 39:57-69. [PMID: 31746016 DOI: 10.1002/sim.8395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2018] [Revised: 09/05/2019] [Accepted: 09/20/2019] [Indexed: 11/07/2022]
Abstract
The use of complex sampling in population-based case-control studies is becoming more common. Although most single nucleotide polymorphism-based association studies with complex sampling account for the design complications, many haplotype-based genetic association studies with complex sampling tend to ignore them when estimating haplotype frequencies, regression coefficients, or both. In this article, we develop innovative one-step and two-step statistical methods that account for the design complications in haplotype-based association studies when cases and/or controls are sampled with complex sampling. Attracted by the efficiency advantage of the retrospective method, we explore the assumptions of Hardy-Weinberg equilibrium and gene-environment independence in the underlying population. Results of our simulation studies demonstrate superior performance of the proposed methods over selected existing methods under various complex sampling designs. An application of the proposed methods is illustrated using a population-based case-control study of kidney cancer.
Affiliation(s)
- Lingxiao Wang
- The Joint Program in Survey Methodology, University of Maryland, College Park, Maryland
- Daoying Lin
- Department of Mathematics, The University of Texas at Arlington, Arlington, Texas
- Yan Li
- The Joint Program in Survey Methodology, University of Maryland, College Park, Maryland
11
Yuan X, Biswas S. Bivariate logistic Bayesian LASSO for detecting rare haplotype association with two correlated phenotypes. Genet Epidemiol 2019; 43:996-1017. [PMID: 31544985 DOI: 10.1002/gepi.22258] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/12/2019] [Revised: 07/31/2019] [Accepted: 08/09/2019] [Indexed: 11/08/2022]
Abstract
In genetic association studies, joint modeling of related traits/phenotypes can utilize the correlation between them and thereby provide more power and uncover additional information about genetic etiology. Moreover, detecting rare genetic variants is of current scientific interest as a key to missing heritability. Logistic Bayesian LASSO (LBL) has been proposed recently to detect rare haplotype variants using case-control data, that is, a single binary phenotype. As there is currently no haplotype association method that can handle multiple binary phenotypes, we extend LBL to fill this gap. We develop a bivariate model by using a latent variable to induce correlation between the two outcomes. We carry out extensive simulations to investigate the bivariate LBL and compare it with the univariate LBL. The bivariate LBL performs better than or similar to the univariate LBL in most settings. It has the highest gain in power when a haplotype is associated with both traits and it affects at least one trait in a direction opposite to the direction of the correlation between the traits. We analyze two data sets, Genetic Analysis Workshop 19 sequence data on systolic and diastolic blood pressures and a genome-wide association data set on lung cancer and smoking, and detect several associated rare haplotypes.
Affiliation(s)
- Xiaochen Yuan
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas
- Swati Biswas
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas