Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Bonato V, Baladandayuthapani V, Broom BM, Sulman EP, Aldape KD, Do KA. Bayesian ensemble methods for survival prediction in gene expression data. Bioinformatics 2011;27:359-67. [PMID: 21148161 PMCID: PMC3031034 DOI: 10.1093/bioinformatics/btq660] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023] Open

Cited by Other Article(s)

Baizer L, Bures R, Nadkarni G, Reyes-Guzman C, Ladwa S, Cade B, Westover MB, Durmer J, de Zambotti M, Desai M, Parekh A, Si B, Fernandez-Mendoza J, Minor K, Mazzotti DR, Lee S, Katabi D, Kiss O, Spira AP, Morris J, Seixas A, Kioumourtzoglou MA, Bridges JFP, Brown M, Hale L, Purcell S. Big data approaches for novel mechanistic insights on sleep and circadian rhythms: a workshop summary. Sleep 2025;48:zsaf035. [PMID: 39945146 PMCID: PMC12163129 DOI: 10.1093/sleep/zsaf035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2024] [Revised: 02/01/2025] [Indexed: 02/19/2025] Open

Affiliation(s)

Lawrence Baizer National Center on Sleep Disorders Research, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
Regina Bures National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
Girish Nadkarni The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Carolyn Reyes-Guzman National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
Sweta Ladwa National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
Brian Cade Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Michael Brandon Westover Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Jeffrey Durmer Sleep & Circadian Science, Absolute Rest, Denver, CO, USA
Massimiliano de Zambotti Science, Ouraring Inc., San Francisco, CA, USA
Manisha Desai Quantitative Sciences Unit, Stanford University Medical School, Stanford, CA, USA
Ankit Parekh Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Bing Si School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
Julio Fernandez-Mendoza Penn State College of Medicine Sleep Research and Treatment Center, Pennsylvania State University College of Medicine, Hershey, PA, USA
Kelton Minor Department of Environmental Health Sciences, Columbia University Mailman School of Public Health, New York, NY, USA
Diego R Mazzotti Division of Medical Informatics, University of Kansas Medical Center, Kansas City, KS, USA
Soomi Lee Department of Human Development and Family Studies, Center for Healthy Aging, Pennsylvania State University, University Park, PA, USA
Dina Katabi MIT Center for Wireless Networks and Mobile Computing, Massachusetts Institute of Technology, Cambridge, MA, USA
Orsolya Kiss Center for Health Sciences, SRI International, Menlo Park, CA, USA
Adam P Spira Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Jonna Morris School of Nursing, University of Pittsburgh, Pittsburgh, PA, USA
Azizi Seixas Department of Psychiatry and Behavioral Sciences, University of Miami, Miami, FL, USA
Marianthi-Anna Kioumourtzoglou Department of Environmental Health Sciences, Columbia University Mailman School of Public Health, New York, NY, USA
John F P Bridges College of Medicine, Ohio State University, Columbus, OH, USA
Marishka Brown National Center on Sleep Disorders Research, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
Lauren Hale Renaissance School of Medicine, Stony Brook University, Stony Brook, NY, USA
Shaun Purcell Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA

Tran D, Nguyen H, Pham VD, Nguyen P, Nguyen Luu H, Minh Phan L, Blair DeStefano C, Jim Yeung SC, Nguyen T. A comprehensive review of cancer survival prediction using multi-omics integration and clinical variables. Brief Bioinform 2025;26:bbaf150. [PMID: 40221959 PMCID: PMC11994034 DOI: 10.1093/bib/bbaf150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2024] [Revised: 01/29/2025] [Accepted: 03/19/2025] [Indexed: 04/15/2025] Open

Abstract

Cancer is an umbrella term that includes a wide spectrum of disease severity, from those that are malignant, metastatic, and aggressive to benign lesions with very low potential for progression or death. The ability to prognosticate patient outcomes would facilitate management of various malignancies: patients whose cancer is likely to advance quickly would receive necessary treatment that is commensurate with the predicted biology of the disease. Former prognostic models based on clinical variables (age, gender, cancer stage, tumor grade, etc.), though helpful, cannot account for genetic differences, molecular etiology, tumor heterogeneity, and important host biological mechanisms. Therefore, recent prognostic models have shifted toward the integration of complementary information available in both molecular data and clinical variables to better predict patient outcomes: vital status (overall survival), metastasis (metastasis-free survival), and recurrence (progression-free survival). In this article, we review 20 survival prediction approaches that integrate multi-omics and clinical data to predict patient outcomes. We discuss their strategies for modeling survival time (continuous and discrete), the incorporation of molecular measurements and clinical variables into risk models (clinical and multi-omics data), how to cope with censored patient records, the effectiveness of data integration techniques, prediction methodologies, model validation, and assessment metrics. The goal is to inform life scientists of available resources, and to provide a complete review of important building blocks in survival prediction. At the same time, we thoroughly describe the pros and cons of each methodology, and discuss in depth the outstanding challenges that need to be addressed in future method development.

Goedhart JM, Klausch T, Janssen J, van de Wiel MA. Adaptive Use of Co-Data Through Empirical Bayes for Bayesian Additive Regression Trees. Stat Med 2025;44:e70004. [PMID: 39964672 PMCID: PMC11834989 DOI: 10.1002/sim.70004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2024] [Revised: 11/26/2024] [Accepted: 01/07/2025] [Indexed: 02/20/2025]

Sparapani RA, Maiers M, Spellman SR, Shaw BE, Laud PW, Devine SM, Logan BR. Optimal Donor Selection Across Multiple Outcomes For Hematopoietic Stem Cell Transplantation By Bayesian Nonparametric Machine Learning. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.05.09.24307134. [PMID: 38766030 PMCID: PMC11100939 DOI: 10.1101/2024.05.09.24307134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2024]

Wang S, Puggioni G, Wu J, Meador KJ, Caffrey A, Wyss R, Slaughter JL, Suzuki E, Ward KE, Lewkowitz AK, Wen X. Prenatal Exposure to Opioids and Neurodevelopmental Disorders in Children: A Bayesian Mediation Analysis. Am J Epidemiol 2024;193:308-322. [PMID: 37671942 PMCID: PMC11484615 DOI: 10.1093/aje/kwad183] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 06/08/2023] [Accepted: 09/02/2023] [Indexed: 09/07/2023] Open

Payne RD, Guha N, Mallick BK. A Bayesian survival treed hazards model using latent Gaussian processes. Biometrics 2024;80:ujad009. [PMID: 38364805 DOI: 10.1093/biomtc/ujad009] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2022] [Revised: 06/27/2023] [Accepted: 11/12/2023] [Indexed: 02/18/2024]

Li X, Logan BR, Hossain SMF, Moodie EEM. Dynamic Treatment Regimes Using Bayesian Additive Regression Trees for Censored Outcomes. LIFETIME DATA ANALYSIS 2024;30:181-212. [PMID: 37659991 PMCID: PMC10764602 DOI: 10.1007/s10985-023-09605-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/15/2022] [Accepted: 07/16/2023] [Indexed: 09/04/2023]

Sparapani R, Logan B, Maiers M, Laud P, McCulloch R. Nonparametric failure time: Time-to-event machine learning with heteroskedastic Bayesian additive regression trees and low information omnibus Dirichlet process mixtures. Biometrics 2023;79:3023-3037. [PMID: 36932826 PMCID: PMC10505620 DOI: 10.1111/biom.13857] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2022] [Accepted: 02/22/2023] [Indexed: 03/19/2023]

Zhang L, Arabameri A, Santosh M, Pal SC. Land subsidence susceptibility mapping: comparative assessment of the efficacy of the five models. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023:10.1007/s11356-023-27799-0. [PMID: 37266775 DOI: 10.1007/s11356-023-27799-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Received: 01/31/2023] [Accepted: 05/17/2023] [Indexed: 06/03/2023]

Salerno S, Li Y. High-Dimensional Survival Analysis: Methods and Applications. ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION 2023;10:25-49. [PMID: 36968638 PMCID: PMC10038209 DOI: 10.1146/annurev-statistics-032921-022127] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]

Dorie V, Perrett G, Hill JL, Goodrich B. Stan and BART for Causal Inference: Estimating Heterogeneous Treatment Effects Using the Power of Stan and the Flexibility of Machine Learning. ENTROPY (BASEL, SWITZERLAND) 2022;24:1782. [PMID: 36554187 PMCID: PMC9778579 DOI: 10.3390/e24121782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/15/2022] [Revised: 10/22/2022] [Accepted: 11/06/2022] [Indexed: 06/17/2023]

Semiparametric Survival Analysis of 30-Day Hospital Readmissions with Bayesian Additive Regression Kernel Model. STATS 2022. [DOI: 10.3390/stats5030038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Chu J, Sun N, Hu W, Chen X, Yi N, Shen Y. Bayesian hierarchical lasso Cox model: A 9-gene prognostic signature for overall survival in gastric cancer in an Asian population. PLoS One 2022;17:e0266805. [PMID: 35421138 PMCID: PMC9009599 DOI: 10.1371/journal.pone.0266805] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2021] [Accepted: 03/29/2022] [Indexed: 12/24/2022] Open

Abstract

OBJECTIVE

Gastric cancer (GC) is one of the most common tumour diseases worldwide and has poor survival, especially in the Asian population. Exploration based on biomarkers would be efficient for better diagnosis, prediction, and targeted therapy.

METHODS

Expression profiles were downloaded from the Gene Expression Omnibus (GEO) database. Survival-related genes were identified by gene set enrichment analysis (GSEA) and univariate Cox. Then, we applied a Bayesian hierarchical lasso Cox model for prognostic signature screening. Protein-protein interaction and Spearman analysis were performed. Kaplan–Meier and receiver operating characteristic (ROC) curve analysis were applied to evaluate the prediction performance. Multivariate Cox regression was used to identify prognostic factors, and a prognostic nomogram was constructed for clinical application.

RESULTS

With the Bayesian lasso Cox model, a 9-gene signature included TNFRSF11A, NMNAT1, EIF5A, NOTCH3, TOR2A, E2F8, PSMA5, TPMT, and KIF11 was established to predict overall survival in GC. Protein-protein interaction analysis indicated that E2F8 was likely related to KIF11. Kaplan–Meier analysis showed a significant difference between the high-risk and low-risk groups (P<0.001). Multivariate analysis demonstrated that the 9-gene signature was an independent predictor (HR = 2.609, 95% CI 2.017–3.370), and the C-index of the integrative model reached 0.75. Function enrichment analysis for different risk groups revealed the most significant enrichment pathway/term, including pyrimidine metabolism and respiratory electron transport chain.

CONCLUSION

Our findings suggested that a novel prognostic model based on a 9-gene signature was developed to predict GC patients in high-risk and improve prediction performance. We hope our model could provide a reference for risk classification and clinical decision-making.

Yan D, Cai S, Bai L, Du Z, Li H, Sun P, Cao J, Yi N, Liu SB, Tang Z. Integration of immune and hypoxia gene signatures improves the prediction of radiosensitivity in breast cancer. Am J Cancer Res 2022;12:1222-1240. [PMID: 35411250 PMCID: PMC8984882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2021] [Accepted: 02/22/2022] [Indexed: 06/14/2023] Open

Abstract

Immunity and hypoxia are two important factors that affect the response of cancer patients to radiotherapy. At the same time, considering the limited predictive value of a single predictive model and the uncertainty of grouping patients near the cutoff value, we developed and validated a combined model based on immune- and hypoxia-related gene expression profiles to predict the radiosensitivity of breast cancer patients. This study was based on breast cancer data from The Cancer Genome Atlas (TCGA). Spike-and-slab Lasso regression analysis was performed to select three immune-related genes and develop a radiosensitivity model. Lasso Cox regression modeling selected 11 hypoxia-related genes for development of radiosensitivity model. Three independent datasets (Molecular Taxonomy of Breast Cancer International Consortium [METABRIC], E-TABM-158, GSE103746) were used to validate the predictive value of radiosensitivity signatures. In the TCGA dataset, the 10-year survival probabilities of the immune radioresistant (IRR) and hypoxia radioresistant (HRR) groups were 0.189 (0.037, 0.973) and 0.477 (0.293, 0.776), respectively. The 10-year survival probabilities of the immune radiosensitive (IRS) and hypoxia radiosensitive (HRS) groups were 0.778 (0.676, 0.895) and 0.824 (0.723, 0.939), respectively. Based on these two gene signatures, we further constructed a combined model and divided all patients into three groups (IRS/HRS, mixed, IRR/HRR). We identified the IRS/HRS patients most likely to benefit from radiotherapy; the 10-year survival probability was 0.886 (0.806, 0.976). The 10-year survival probability of the IRR/HRR group was 0. In conclusion, a combined model integrating immune- and hypoxia-related gene signatures could effectively predict the radiosensitivity of breast cancer and more accurately identify radiosensitive and radioresistant patients than a single model.

Affiliation(s)

Derui Yan Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, Jiangsu, China; Suzhou Key Laboratory of Medical Biotechnology, Suzhou Vocational Health College, Suzhou 215009, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou 215123, Jiangsu, China
Shang Cai Department of Radiotherapy & Oncology, The Second Affiliated Hospital of Soochow University, Suzhou 215004, Jiangsu, China
Lu Bai Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, Jiangsu, China; Suzhou Key Laboratory of Medical Biotechnology, Suzhou Vocational Health College, Suzhou 215009, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou 215123, Jiangsu, China
Zixuan Du Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou 215123, Jiangsu, China
Huijun Li Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou 215123, Jiangsu, China
Peng Sun Department of Otolaryngology, The First Affiliated Hospital of Soochow University, Suzhou 215006, Jiangsu, China
Jianping Cao School of Radiation Medicine and Protection and Collaborative Innovation Center of Radiation Medicine of Jiangsu Higher Education Institutions, Soochow University, Suzhou 215031, Jiangsu, China
Nengjun Yi Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL 35294, USA
Song-Bai Liu Suzhou Key Laboratory of Medical Biotechnology, Suzhou Vocational Health College, Suzhou 215009, Jiangsu, China
Zaixiang Tang Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou 215123, Jiangsu, China; Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou 215123, Jiangsu, China

Alkindi KM, Mukherjee K, Pandey M, Arora A, Janizadeh S, Pham QB, Anh DT, Ahmadi K. Prediction of groundwater nitrate concentration in a semiarid region using hybrid Bayesian artificial intelligence approaches. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022;29:20421-20436. [PMID: 34735705 DOI: 10.1007/s11356-021-17224-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/22/2021] [Accepted: 10/21/2021] [Indexed: 06/13/2023]

Chu J, Sun NA, Hu W, Chen X, Yi N, Shen Y. The Application of Bayesian Methods in Cancer Prognosis and Prediction. Cancer Genomics Proteomics 2022;19:1-11. [PMID: 34949654 PMCID: PMC8717957 DOI: 10.21873/cgp.20298] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2021] [Revised: 11/24/2021] [Accepted: 11/30/2021] [Indexed: 11/10/2022] Open

Maity AK, Lee SC, Hu L, Bell-Pedersen D, Mallick BK, Sarkar TR. Circadian Gene Selection for Time-to-event Phenotype by Integrating CNV and RNAseq Data. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS : AN INTERNATIONAL JOURNAL SPONSORED BY THE CHEMOMETRICS SOCIETY 2021;212:104276. [PMID: 35068632 PMCID: PMC8775911 DOI: 10.1016/j.chemolab.2021.104276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Basak P, Linero A, Sinha D, Lipsitz S. Semiparametric analysis of clustered interval-censored survival data using soft Bayesian additive regression trees (SBART). Biometrics 2021;78:880-893. [PMID: 33864633 DOI: 10.1111/biom.13478] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2020] [Revised: 03/10/2021] [Accepted: 04/01/2021] [Indexed: 11/30/2022]

Spanbauer C, Sparapani R. Nonparametric machine learning for precision medicine with longitudinal clinical trials and Bayesian additive regression trees with mixed models. Stat Med 2021;40:2665-2691. [PMID: 33751659 DOI: 10.1002/sim.8924] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2020] [Revised: 12/14/2020] [Accepted: 02/07/2021] [Indexed: 11/11/2022]

Yu X, Yang Q, Wang D, Li Z, Chen N, Kong DX. Predicting lung adenocarcinoma disease progression using methylation-correlated blocks and ensemble machine learning classifiers. PeerJ 2021;9:e10884. [PMID: 33628643 PMCID: PMC7894106 DOI: 10.7717/peerj.10884] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2020] [Accepted: 01/12/2021] [Indexed: 01/20/2023] Open

Maity AK, Carroll RJ, Mallick BK. Integration of Survival and Binary Data for Variable Selection and Prediction: A Bayesian Approach. J R Stat Soc Ser C Appl Stat 2020;68:1577-1595. [PMID: 33311813 DOI: 10.1111/rssc.12377] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

Maity AK, Lee SC, Mallick BK, Sarkar TR. Bayesian structural equation modeling in multiple omics data with application to circadian genes. Bioinformatics 2020;36:3951-3958. [PMID: 32369552 DOI: 10.1093/bioinformatics/btaa286] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2019] [Revised: 03/30/2020] [Accepted: 04/27/2020] [Indexed: 11/13/2022] Open

Henderson NC, Louis TA, Rosner GL, Varadhan R. Individualized treatment effects with censored data via fully nonparametric Bayesian accelerated failure time models. Biostatistics 2020;21:50-68. [PMID: 30052809 PMCID: PMC8972560 DOI: 10.1093/biostatistics/kxy028] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2017] [Revised: 05/24/2018] [Accepted: 06/14/2018] [Indexed: 09/04/2023] Open

Tan YV, Roy J. Bayesian additive regression trees and the General BART model. Stat Med 2019;38:5048-5069. [PMID: 31460678 PMCID: PMC6800811 DOI: 10.1002/sim.8347] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2019] [Revised: 07/05/2019] [Accepted: 07/23/2019] [Indexed: 11/06/2022]

Maity AK, Bhattacharya A, Mallick BK, Baladandayuthapani V. Bayesian data integration and variable selection for pan-cancer survival prediction using protein expression data. Biometrics 2019;76:316-325. [PMID: 31393003 DOI: 10.1111/biom.13132] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2018] [Accepted: 07/19/2019] [Indexed: 12/20/2022]

Nethery RC, Mealli F, Dominici F. ESTIMATING POPULATION AVERAGE CAUSAL EFFECTS IN THE PRESENCE OF NON-OVERLAP: THE EFFECT OF NATURAL GAS COMPRESSOR STATION EXPOSURE ON CANCER MORTALITY. Ann Appl Stat 2019;13:1242-1267. [PMID: 31346355 PMCID: PMC6658123 DOI: 10.1214/18-aoas1231] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Abstract

Most causal inference studies rely on the assumption of overlap to estimate population or sample average causal effects. When data suffer from non-overlap, estimation of these estimands requires reliance on model specifications, due to poor data support. All existing methods to address non-overlap, such as trimming or down-weighting data in regions of poor data support, change the estimand so that inference cannot be made on the sample or the underlying population. In environmental health research settings, where study results are often intended to influence policy, population-level inference may be critical, and changes in the estimand can diminish the impact of the study results, because estimates may not be representative of effects in the population of interest to policymakers. Researchers may be willing to make additional, minimal modeling assumptions in order to preserve the ability to estimate population average causal effects. We seek to make two contributions on this topic. First, we propose a flexible, data-driven definition of propensity score overlap and non-overlap regions. Second, we develop a novel Bayesian framework to estimate population average causal effects with minor model dependence and appropriately large uncertainties in the presence of non-overlap and causal effect heterogeneity. In this approach, the tasks of estimating causal effects in the overlap and non-overlap regions are delegated to two distinct models, suited to the degree of data support in each region. Tree ensembles are used to non-parametrically estimate individual causal effects in the overlap region, where the data can speak for themselves. In the non-overlap region, where insufficient data support means reliance on model specification is necessary, individual causal effects are estimated by extrapolating trends from the overlap region via a spline model. The promising performance of our method is demonstrated in simulations. 
Finally, we utilize our method to perform a novel investigation of the causal effect of natural gas compressor station exposure on cancer outcomes. Code and data to implement the method and reproduce all simulations and analyses are available on GitHub (https://github.com/rachelnethery/overlap).

Hsu JBK, Chang TH, Lee GA, Lee TY, Chen CY. Identification of potential biomarkers related to glioma survival by gene expression profile analysis. BMC Med Genomics 2019;11:34. [PMID: 30894197 PMCID: PMC7402580 DOI: 10.1186/s12920-019-0479-6] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2018] [Accepted: 02/06/2019] [Indexed: 01/11/2023] Open

Abstract

BACKGROUND

Recent studies have proposed several gene signatures as biomarkers for different grades of gliomas from various perspectives. However, most of these genes can only be used appropriately for patients with specific grades of gliomas.

METHODS

In this study, we aimed to identify survival-relevant genes shared between glioblastoma multiforme (GBM) and lower-grade glioma (LGG), which could be used as potential biomarkers to classify patients into different risk groups. Cox proportional hazard regression model (Cox model) was used to extract relative genes, and effectiveness of genes was estimated against random forest regression. Finally, risk models were constructed with logistic regression.

RESULTS

We identified 104 key genes that were shared between GBM and LGG, which could be significantly correlated with patients' survival based on next-generation sequencing data obtained from The Cancer Genome Atlas for gene expression analysis. The effectiveness of these genes in the survival prediction of GBM and LGG was evaluated, and the average receiver operating characteristic curve (ROC) area under the curve values ranged from 0.7 to 0.8. Gene set enrichment analysis revealed that these genes were involved in eight significant pathways and 23 molecular functions. Moreover, the expressions of ten (CTSZ, EFEMP2, ITGA5, KDELR2, MDK, MICALL2, MAP2K3, PLAUR, SERPINE1, and SOCS3) of these genes were significantly higher in GBM than in LGG, and comparing their expression levels to those of the proposed control genes (TBP, IPO8, and SDHA) could have the potential capability to classify patients into high- and low-risk groups, which differ significantly in the overall survival. Signatures of candidate genes were validated, by multiple microarray datasets from Gene Expression Omnibus, to increase the robustness of using these potential prognostic factors. In both the GBM and LGG cohort study, most of the patients in the high-risk group had the IDH1 wild-type gene, and those in the low-risk group had IDH1 mutations. Moreover, most of the high-risk patients with LGG possessed a 1p/19q-noncodeletion.

CONCLUSION

In this study, we identified survival-relevant genes shared between GBM and LGG that enabled the classification of patients into high- and low-risk groups based on expression level analysis. Both risk groups could be correlated with well-known genetic variants, suggesting their potential prognostic value in clinical application.

Risser MD, Calder CA, Berrocal VJ, Berrett C. NONSTATIONARY SPATIAL PREDICTION OF SOIL ORGANIC CARBON: IMPLICATIONS FOR STOCK ASSESSMENT DECISION MAKING. Ann Appl Stat 2019;13:165-188. [PMID: 39220174 PMCID: PMC11364347 DOI: 10.1214/18-aoas1204] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/04/2024]

Bellot A, van der Schaar M. A Hierarchical Bayesian Model for Personalized Survival Predictions. IEEE J Biomed Health Inform 2019;23:72-80. [DOI: 10.1109/jbhi.2018.2832599] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Tang Z, Shen Y, Zhang X, Yi N. The spike-and-slab lasso Cox model for survival prediction and associated genes detection. Bioinformatics 2018;33:2799-2807. [PMID: 28472220 DOI: 10.1093/bioinformatics/btx300] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2017] [Accepted: 05/05/2017] [Indexed: 12/20/2022] Open

Cui Y, Li B, Li R. Decentralized Learning Framework of Meta-Survival Analysis for Developing Robust Prognostic Signatures. JCO Clin Cancer Inform 2017;1:1-13. [PMID: 30657395 PMCID: PMC6873986 DOI: 10.1200/cci.17.00077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open

Abstract

PURPOSE

A significant hurdle in developing reliable gene expression-based prognostic models has been the limited sample size, which can cause overfitting and false discovery. Combining data from multiple studies can enhance statistical power and reduce spurious findings, but how to address the biologic heterogeneity across different datasets remains a major challenge. Better meta-survival analysis approaches are needed.

MATERIAL AND METHODS

We presented a decentralized learning framework for meta-survival analysis without the need for data aggregation. Our method consisted of a series of proposals that together alleviated the influence of data heterogeneity and improved the performance of survival prediction. First, we transformed the gene expression profile of every sample into normalized percentile ranks to obtain platform-agnostic features. Second, we used Stouffer's meta-z approach in combination with Harrell's concordance index to prioritize and select genes to be included in the model. Third, we used survival discordance as a scale-independent model loss function. Instead of generating a merged dataset and training the model therein, we avoided comparing patients across datasets and individually evaluated the loss function on each dataset. Finally, we optimized the model by minimizing the joint loss function.
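The first two steps above can be illustrated with a minimal sketch. It assumes that percentile ranks are computed within each sample (ties sharing the average rank) and that per-dataset z-scores for a gene are already available; how the authors derive those z-scores from Harrell's concordance index is not stated in the abstract, so `stouffer_z` simply takes them as input. The function names are hypothetical and are not taken from the paper.

```python
import math


def percentile_ranks(values):
    """Map one sample's expression values to percentile ranks in (0, 1].

    Tied values share the average rank, so the result depends only on the
    ordering of the genes within the sample, not on the platform's scale.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / n
        i = j + 1
    return ranks


def stouffer_z(z_scores, weights=None):
    """Stouffer's method: Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2))."""
    if weights is None:
        weights = [1.0] * len(z_scores)  # assumption: equal weights by default
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator
```

Two samples measured on different scales but with the same gene ordering, for example `[10, 200, 30]` and `[0.1, 5.0, 0.3]`, map to identical rank vectors, which is the sense in which such features are platform-agnostic.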

RESULTS

Through comprehensive evaluation on 31 public microarray datasets containing 6,724 samples of several cancer types, we demonstrated that the proposed method has outperformed (1) single prognostic genes identified using conventional meta-analysis, (2) multigene signatures trained on single datasets, (3) multigene signatures trained on merged datasets as well as by other existing meta-analysis methods, and (4) clinically applicable, established multigene signatures.

CONCLUSION

The decentralized learning approach can be used to effectively perform meta-analysis of gene expression data and to develop robust multigene prognostic signatures.

Morris JS, Baladandayuthapani V. Statistical Contributions to Bioinformatics: Design, Modeling, Structure Learning, and Integration. Stat Modelling 2017;17:245-289. [PMID: 29129969 PMCID: PMC5679480 DOI: 10.1177/1471082x17698255] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/24/2022]

Kindo BP, Wang H, Peña EA. Multinomial probit Bayesian additive regression trees. Stat (Int Stat Inst) 2016;5:119-131. [PMID: 27330743 DOI: 10.1002/sta4.110] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]

Sparapani RA, Logan BR, McCulloch RE, Laud PW. Nonparametric survival analysis using Bayesian Additive Regression Trees (BART). Stat Med 2016;35:2741-53. [PMID: 26854022 DOI: 10.1002/sim.6893] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 03/18/2015] [Revised: 01/11/2016] [Accepted: 01/12/2016] [Indexed: 11/06/2022]

Bayesian methods for proteomic biomarker development. EuPA Open Proteomics 2015. [DOI: 10.1016/j.euprot.2015.08.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/13/2023]

Zou M, Liu Z, Zhang XS, Wang Y. NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data. Bioinformatics 2015;31:3330-8. [PMID: 26092859 DOI: 10.1093/bioinformatics/btv374] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 03/09/2015] [Accepted: 06/14/2015] [Indexed: 12/26/2022] Open

Abstract

MOTIVATION

In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by genomic profiles or clinical data that are censored, small in sample size, and high-dimensional. Sophisticated models and algorithms are therefore urgently needed.

RESULTS

In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motivated by the connection between the AUC score used to evaluate classification accuracy and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem to a binary classification problem. An optimization model is then formulated to directly maximize AUC while minimizing the number of selected features, in order to construct a predictor in the nearest centroid classifier framework. NCC-AUC performs well in validation on both genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meaning. NCC-AUC also separates low- and high-risk cohorts more significantly than the widely used Cox model (Cox proportional-hazards regression model) and L1-Cox model (L1-penalized Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Furthermore, on an independent clinical dataset, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and L1-Cox model in grouping patients into high- and low-risk categories.
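To make the link between AUC and Harrell's concordance index concrete, the following is a minimal sketch of the standard concordance index for right-censored data. It is not the NCC-AUC optimization itself, and the function name is hypothetical. A pair of patients is treated as comparable when the patient with the shorter time has an observed event; the index is the fraction of comparable pairs in which that patient also has the higher predicted risk, with tied risk scores counted as one half. Counting correctly ordered pairs in this way mirrors the pairwise interpretation of AUC.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if the time is censored
    risk_scores: predicted risk, where higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored time cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to a risk score that orders every comparable pair correctly, the same scale on which AUC is read.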

CONCLUSION

In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panel from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis.

AVAILABILITY AND IMPLEMENTATION

NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC.

CONTACT

ywang@amss.ac.cn

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Zhou L, Xu Q, Wang H. Rotation survival forest for right censored data. PeerJ 2015;3:e1009. [PMID: 26082863 PMCID: PMC4465950 DOI: 10.7717/peerj.1009] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 03/24/2015] [Accepted: 05/19/2015] [Indexed: 11/20/2022] Open

García V, Salvador Sánchez J. Mapping microarray gene expression data into dissimilarity spaces for tumor classification. Inf Sci (N Y) 2015. [DOI: 10.1016/j.ins.2014.09.064] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 10/24/2022]

Tiong KL, Chang KC, Yeh KT, Liu TY, Wu JH, Hsieh PH, Lin SH, Lai WY, Hsu YC, Chen JY, Chang JG, Shieh GS. CSNK1E/CTNNB1 are synthetic lethal to TP53 in colorectal cancer and are markers for prognosis. Neoplasia 2014;16:441-50. [PMID: 24947187 PMCID: PMC4198690 DOI: 10.1016/j.neo.2014.04.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 03/31/2014] [Revised: 04/25/2014] [Accepted: 04/29/2014] [Indexed: 02/03/2023] Open

Affiliation(s)

Khong-Loon Tiong Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan, R.O.C.; Institute of Biomedical Informatics, National Yang-Ming University, Taipei 112, Taiwan, R.O.C.
Kuo-Ching Chang Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan, R.O.C.
Kun-Tu Yeh Department of Pathology, Changhua Christian Hospital, Changhua 505, Taiwan, R.O.C.; Department of Pathology, School of Medicine, Chung Shan Medical University, Taichung 402, Taiwan, R.O.C.
Ting-Yuan Liu Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 807, Taiwan, R.O.C.
Jia-Hong Wu Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan, R.O.C.
Ping-Heng Hsieh Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan, R.O.C.
Shu-Hui Lin Department of Pathology, Changhua Christian Hospital, Changhua 505, Taiwan, R.O.C.; Jen-Teh Junior College of Medicine, Nursing Management, Miaoli 356, Taiwan, R.O.C.
Wei-Yun Lai Molecular Medicine Program, Taiwan International Graduate Program, Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan, R.O.C.; Institute of Biochemistry and Molecular Biology, School of Life Sciences, National Yang-Ming University, Taipei 112, Taiwan, R.O.C.
Yu-Chin Hsu Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan, R.O.C.
Jeou-Yuan Chen Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan, R.O.C.
Jan-Gowth Chang Department of Laboratory Medicine, and Center of RNA Biology and Clinical Application, China Medical University Hospital, China Medical University, Taichung 404, Taiwan, R.O.C.
Grace S Shieh Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan, R.O.C.; Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan, R.O.C.

Zhang L, Baladandayuthapani V, Mallick BK, Manyam GC, Thompson PA, Bondy ML, Do KA. Bayesian hierarchical structured variable selection methods with application to MIP studies in breast cancer. J R Stat Soc Ser C Appl Stat 2014;63:595-620. [PMID: 25705056 DOI: 10.1111/rssc.12053] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 11/25/2022]

Park C, Ahn J, Kim H, Park S. Integrative gene network construction to analyze cancer recurrence using semi-supervised learning. PLoS One 2014;9:e86309. [PMID: 24497942 PMCID: PMC3908883 DOI: 10.1371/journal.pone.0086309] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 07/03/2013] [Accepted: 12/09/2013] [Indexed: 12/17/2022] Open

Thamrin SA, McGree JM, Mengersen KL. Modelling survival data to account for model uncertainty: a single model or model averaging? SpringerPlus 2013;2:665. [PMID: 24386617 PMCID: PMC3877415 DOI: 10.1186/2193-1801-2-665] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/02/2013] [Accepted: 11/18/2013] [Indexed: 11/10/2022]

Zhang L, Mallick BK. Inferring gene networks from discrete expression data. Biostatistics 2013;14:708-22. [PMID: 23873894 DOI: 10.1093/biostatistics/kxt021] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open

Lai Y, Hayashida M, Akutsu T. Survival analysis by penalized regression and matrix factorization. ScientificWorldJournal 2013;2013:632030. [PMID: 23737722 PMCID: PMC3655687 DOI: 10.1155/2013/632030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/17/2013] [Accepted: 04/03/2013] [Indexed: 11/18/2022] Open

Wang W, Baladandayuthapani V, Morris JS, Broom BM, Manyam G, Do KA. iBAG: integrative Bayesian analysis of high-dimensional multiplatform genomics data. Bioinformatics 2012;29:149-59. [PMID: 23142963 PMCID: PMC3546799 DOI: 10.1093/bioinformatics/bts655] [Citation(s) in RCA: 102] [Impact Index Per Article: 7.8] [Indexed: 01/20/2023]