1
Huang P, Cai M, Lu X, McKennan C, Wang J. Accurate estimation of rare cell type fractions from tissue omics data via hierarchical deconvolution. bioRxiv 2023:2023.03.15.532820. PMID: 36993280; PMCID: PMC10055056; DOI: 10.1101/2023.03.15.532820.
Abstract
Bulk transcriptomics in tissue samples reflects the average expression levels across different cell types and is highly influenced by cellular fractions. As such, it is critical to estimate cellular fractions both to deconfound differential expression analyses and to infer cell type-specific differential expression. Since experimentally counting cells is infeasible in most tissues and studies, in silico cellular deconvolution methods have been developed as an alternative. However, existing methods are designed for tissues consisting of clearly distinguishable cell types and have difficulty estimating highly correlated or rare cell types. To address this challenge, we propose Hierarchical Deconvolution (HiDecon), which uses single-cell RNA sequencing references and a hierarchical cell type tree, modeling the similarities among cell types and cell differentiation relationships, to estimate cellular fractions in bulk data. By coordinating cell fractions across layers of the hierarchical tree, cellular fraction information is passed up and down the tree, which helps correct estimation biases by pooling information across related cell types. The flexible hierarchical tree structure also enables estimating rare cell fractions by splitting the tree to higher resolutions. Through simulations and real data applications with measured cellular fractions as ground truth, we demonstrate that HiDecon significantly outperforms existing methods and accurately estimates cellular fractions.
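As background to reference-based deconvolution (a generic baseline, not the HiDecon algorithm itself), cellular fractions can be estimated by regressing a bulk profile on a cell-type signature matrix under non-negativity and renormalizing. A minimal Python sketch with toy, illustrative numbers:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# Toy signature matrix: 50 marker genes x 3 cell types (illustrative values,
# not a real single-cell reference).
S = rng.gamma(2.0, 1.0, size=(50, 3))
true_frac = np.array([0.6, 0.3, 0.1])             # ground-truth fractions
bulk = S @ true_frac + rng.normal(0.0, 0.01, 50)  # simulated bulk profile

coef, _ = nnls(S, bulk)     # non-negative least squares fit
frac = coef / coef.sum()    # renormalise so fractions sum to one
print(np.round(frac, 2))
```

With clearly distinguishable (uncorrelated) signatures this recovers the fractions well; the abstract's point is precisely that correlated or rare cell types break such flat estimators, motivating the hierarchical tree.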
Affiliation(s)
- Penghui Huang
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Manqi Cai
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Xinghua Lu
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Chris McKennan
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA, USA
- Jiebiao Wang
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
2
Calle ML, Pujolassos M, Susin A. coda4microbiome: compositional data analysis for microbiome cross-sectional and longitudinal studies. BMC Bioinformatics 2023; 24:82. PMID: 36879227; PMCID: PMC9990256; DOI: 10.1186/s12859-023-05205-3.
Abstract
BACKGROUND One of the main challenges of microbiome analysis is its compositional nature, which, if ignored, can lead to spurious results. Addressing the compositional structure of microbiome data is particularly critical in longitudinal studies, where abundances measured at different times can correspond to different sub-compositions. RESULTS We developed coda4microbiome, a new R package for analyzing microbiome data within the Compositional Data Analysis (CoDA) framework in both cross-sectional and longitudinal studies. The aim of coda4microbiome is prediction: the method is designed to identify a model (microbial signature) containing the minimum number of features with the maximum predictive power. The algorithm relies on the analysis of log-ratios between pairs of components, and variable selection is addressed through penalized regression on the "all-pairs log-ratio model", the model containing all possible pairwise log-ratios. For longitudinal data, the algorithm infers dynamic microbial signatures by performing penalized regression over a summary of the log-ratio trajectories (the area under these trajectories). In both cross-sectional and longitudinal studies, the inferred microbial signature is expressed as the (weighted) balance between two groups of taxa: those that contribute positively to the microbial signature and those that contribute negatively. The package provides several graphical representations that facilitate interpretation of the analysis and of the identified microbial signatures. We illustrate the new method with data from a Crohn's disease study (cross-sectional) and from the developing microbiome of infants (longitudinal). CONCLUSIONS coda4microbiome is a new algorithm for the identification of microbial signatures in both cross-sectional and longitudinal studies. The algorithm is implemented as an R package, available on CRAN ( https://cran.r-project.org/web/packages/coda4microbiome/ ) and accompanied by a vignette with a detailed description of the functions. The project website contains several tutorials: https://malucalle.github.io/coda4microbiome/.
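The "all-pairs log-ratio model" can be sketched outside R as well. The Python fragment below (toy data; coda4microbiome itself is an R package) builds the pairwise log-ratio design matrix on which the penalized regression would then run:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
counts = rng.gamma(2.0, 1.0, size=(20, 5))         # toy abundances: 20 samples x 5 taxa
comp = counts / counts.sum(axis=1, keepdims=True)  # closure: each row sums to one

# All-pairs log-ratio design matrix: one column per taxon pair (i, j).
pairs = list(combinations(range(comp.shape[1]), 2))
Z = np.column_stack([np.log(comp[:, i] / comp[:, j]) for i, j in pairs])
print(Z.shape)  # C(5, 2) = 10 pairwise log-ratio columns
```

Log-ratios are invariant to the closure operation, which is what makes penalized regression on `Z` respect the compositional structure.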
Affiliation(s)
- M Luz Calle
- Biosciences Department, Faculty of Sciences, Technology and Engineering, University of Vic - Central University of Catalonia, Carrer de La Laura, 13, 08500, Vic, Spain
- Meritxell Pujolassos
- Biosciences Department, Faculty of Sciences, Technology and Engineering, University of Vic - Central University of Catalonia, Carrer de La Laura, 13, 08500, Vic, Spain
- Antoni Susin
- Mathematical Department, UPC-Barcelona Tech, Barcelona, Spain
3
Xia X, Zhang Y, Wei Y, Wang MH. Statistical Methods for Disease Risk Prediction with Genotype Data. Methods Mol Biol 2023; 2629:331-347. PMID: 36929084; DOI: 10.1007/978-1-0716-2986-4_15.
Abstract
Single-nucleotide polymorphisms (SNPs) are the basic units for understanding the heritability of complex traits. One attractive application of susceptibility SNPs is to construct prediction models for assessing disease risk. Here, we introduce prediction methods for human traits using SNP data, including polygenic risk scores (PRS), linear mixed models (LMMs), penalized regressions, and methods for controlling population stratification.
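A polygenic risk score is, at its core, a weighted sum of effect-allele counts. A minimal sketch with simulated genotypes and arbitrary illustrative weights (real PRS weights come from GWAS summary statistics):

```python
import numpy as np

rng = np.random.default_rng(2)
genotypes = rng.integers(0, 3, size=(100, 8))  # 100 individuals x 8 SNPs, coded 0/1/2
weights = rng.normal(0.0, 0.2, size=8)         # illustrative per-SNP effect sizes
                                               # (real weights: GWAS log-odds ratios)

prs = genotypes @ weights                      # one risk score per individual
print(prs.shape)
```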
Affiliation(s)
- Xiaoxuan Xia
- JC School of Public Health and Primary Care, the Chinese University of Hong Kong (CUHK), Shatin, Hong Kong
- Department of Statistics, the Chinese University of Hong Kong (CUHK), Shatin, Hong Kong
- Yingying Wei
- Department of Statistics, the Chinese University of Hong Kong (CUHK), Shatin, Hong Kong
- Maggie Haitian Wang
- JC School of Public Health and Primary Care, the Chinese University of Hong Kong (CUHK), Shatin, Hong Kong
- CUHK Shenzhen Institute, Shenzhen, China
4
Jardillier R, Koca D, Chatelain F, Guyon L. Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening. BMC Cancer 2022; 22:1045. PMID: 36199072; PMCID: PMC9533541; DOI: 10.1186/s12885-022-10117-1.
Abstract
BACKGROUND Prediction of patient survival from tumor molecular '-omics' data is a key step toward personalized medicine. Cox models applied to RNA profiling datasets are popular for clinical outcome prediction. However, these models operate in a "high-dimensional" setting, as the number p of covariates (gene expressions) greatly exceeds the number n of patients and the number e of events. Thus, pre-screening together with penalization methods is widely used for dimension reduction. METHODS In the present paper, (i) we benchmark the performance of the lasso penalization and three variants (ridge, elastic net, adaptive elastic net) on 16 cancers from TCGA after pre-screening, (ii) we propose a bi-dimensional pre-screening procedure based on both gene variability and p-values from single-variable Cox models to predict survival, and (iii) we compare our results with iterative sure independence screening (ISIS). RESULTS First, we show that integrating mRNA-seq data with clinical data improves predictions over clinical data alone. Second, our bi-dimensional pre-screening procedure yields modest improvements in the C-index and/or the integrated Brier score while excluding genes irrelevant for prediction. We demonstrate that the different penalization methods reach comparable prediction performance, with slight differences among datasets. Finally, we provide advice for the case of multi-omics data integration. CONCLUSIONS Tumor profiles convey more prognostic information than clinical variables such as stage for many cancer subtypes. Lasso and ridge penalizations perform similarly to elastic net penalizations for Cox models in high dimensions. Pre-screening the top 200 genes in terms of single-variable Cox model p-values is a practical way to reduce dimension, which may be particularly useful when integrating multi-omics data.
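The bi-dimensional pre-screening idea can be illustrated with a toy stand-in: rank genes by variance and by marginal association with the outcome, and keep only genes high on both dimensions. Here absolute correlation with log survival time replaces single-variable Cox p-values, so this is a sketch of the mechanism, not the authors' procedure:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 80, 500
X = rng.normal(size=(n, p))                 # toy expression matrix
X[:, :3] *= 2.0                             # make 3 informative genes more variable
log_time = X[:, :3] @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

# Dimension 1: gene variability.
var_rank = np.argsort(-X.var(axis=0))

# Dimension 2: marginal association with the outcome (|correlation| with log
# survival time, a crude stand-in for single-variable Cox p-values).
corr = np.abs([np.corrcoef(X[:, j], log_time)[0, 1] for j in range(p)])
assoc_rank = np.argsort(-corr)

# Keep genes ranked in the top k on BOTH dimensions.
k = 50
keep = np.intersect1d(var_rank[:k], assoc_rank[:k])
print(len(keep), sorted(keep.tolist())[:3])
```

The penalized Cox model would then be fitted on the retained columns only.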
Affiliation(s)
- Rémy Jardillier
- IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
- GIPSA-lab, Institute of Engineering University Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Grenoble, France
- Dzenis Koca
- IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
- Florent Chatelain
- GIPSA-lab, Institute of Engineering University Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Grenoble, France
- Laurent Guyon
- IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
5
Abstract
The graph fused lasso, which includes the one-dimensional fused lasso as a special case, is widely used to reconstruct signals that are piecewise constant on a graph, meaning that nodes connected by an edge tend to have identical values. We consider testing for a difference in the means of two connected components estimated using the graph fused lasso. A naive procedure, such as a z-test for a difference in means, will not control the selective Type I error, since the hypothesis being tested is itself a function of the data. In this work, we propose a new test for this task that controls the selective Type I error and conditions on less information than existing approaches, leading to substantially higher power. We illustrate our approach in simulation and on datasets of drug overdose death rates and teenage birth rates in the contiguous United States. Our approach yields more discoveries on both datasets. Supplementary materials for this article are available online.
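Why the naive z-test fails can be seen in a small simulation: on pure noise, selecting the split that maximizes the test statistic and then testing that same split with a standard z-test rejects far more often than the nominal 5% level (an illustrative sketch of the selective Type I error problem, not the authors' method):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, reps, alpha = 40, 500, 0.05
rejections = 0
for _ in range(reps):
    y = rng.normal(size=n)  # pure noise: there is no true difference anywhere
    # Data-driven selection: choose the split maximising the z-statistic itself.
    best = max(
        ((y[:c].mean() - y[c:].mean()) / np.sqrt(1 / c + 1 / (n - c))
         for c in range(1, n)),
        key=abs,
    )
    # A naive z-test on the selected split ignores how the split was chosen.
    if 2 * stats.norm.sf(abs(best)) < alpha:
        rejections += 1
rate = rejections / reps
print(rate)  # far above the nominal 0.05
```

Selective tests fix this by conditioning on the selection event rather than treating the split as fixed in advance.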
Affiliation(s)
- Yiqun Chen
- Department of Biostatistics, University of Washington, Seattle, WA
- Sean Jewell
- Department of Statistics, University of Washington, Seattle, WA
- Daniela Witten
- Department of Biostatistics, University of Washington, Seattle, WA
- Department of Statistics, University of Washington, Seattle, WA
6
Kammer M, Dunkler D, Michiels S, Heinze G. Evaluating methods for Lasso selective inference in biomedical research: a comparative simulation study. BMC Med Res Methodol 2022; 22:206. PMID: 35883041; PMCID: PMC9316707; DOI: 10.1186/s12874-022-01681-y.
Abstract
Background Variable selection for regression models plays a key role in the analysis of biomedical data. However, inference after selection is not covered by classical frequentist theory, which assumes a fixed set of covariates in the model. This leads to over-optimistic selection and replicability issues. Methods We compared proposals for selective inference targeting the submodel parameters of the Lasso and its extension, the adaptive Lasso: sample splitting, selective inference conditional on the Lasso selection (SI), and universally valid post-selection inference (PoSI). We studied the properties of the proposed selective confidence intervals available via R software packages using a neutral simulation study inspired by real data commonly seen in biomedical studies. Furthermore, we present an exemplary application of these methods to a publicly available dataset to discuss their practical usability. Results Frequentist properties of selective confidence intervals by the SI method were generally acceptable, but the claimed selective coverage levels were not attained in all scenarios, in particular with the adaptive Lasso. The actual coverage of the extremely conservative PoSI method exceeded the nominal levels, and this method also required the greatest computational effort. Sample splitting achieved acceptable actual selective coverage levels, but the method is inefficient and leads to less accurate point estimates. The choice of inference method had a large impact on the resulting interval estimates, so the user must be acutely aware of the goal of inference in order to interpret and communicate the results. Conclusions Despite violating nominal coverage levels in some scenarios, selective inference conditional on the Lasso selection is our recommended approach for most cases. If simplicity is strongly favoured over efficiency, then sample splitting is an alternative. If only a few predictors undergo variable selection (i.e., up to 5) or the avoidance of false-positive claims of significance is a concern, then the conservative PoSI approach may be useful. For the adaptive Lasso, SI should be avoided; only PoSI and sample splitting are recommended. In summary, we find selective inference useful for assessing the uncertainties in the importance of individual selected predictors for future applications. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01681-y.
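Of the compared approaches, sample splitting is the simplest to sketch: select with the Lasso on one half of the data, then compute classical confidence intervals on the untouched half. A self-contained toy version with a minimal coordinate-descent Lasso (illustrative only, not the benchmarked R packages):

```python
import numpy as np
from scipy import stats

def lasso_cd(X, y, lam, n_iter=200):
    """Minimal coordinate-descent lasso for (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = X[:, j] @ r / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / (X[:, j] @ X[:, j] / n)
    return beta

rng = np.random.default_rng(5)
n, p = 200, 30
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(size=n)

half = n // 2
sel = np.flatnonzero(lasso_cd(X[:half], y[:half], lam=0.3) != 0)  # selection half

Xs, ys = X[half:], y[half:]                    # inference on the untouched half
betahat, *_ = np.linalg.lstsq(Xs[:, sel], ys, rcond=None)
resid = ys - Xs[:, sel] @ betahat
dof = len(ys) - len(sel)
se = np.sqrt((resid @ resid / dof) * np.diag(np.linalg.inv(Xs[:, sel].T @ Xs[:, sel])))
tq = stats.t.ppf(0.975, dof)
ci = np.column_stack([betahat - tq * se, betahat + tq * se])
print(dict(zip(sel.tolist(), np.round(betahat, 2))))
```

Because the inference half never saw the selection step, the resulting intervals are classically valid for the selected submodel, at the cost of halving the sample for each task.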
Affiliation(s)
- Michael Kammer
- Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Division of Nephrology and Dialysis, Department for Internal Medicine III, Medical University of Vienna, Vienna, Austria
- Daniela Dunkler
- Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Stefan Michiels
- Service de Biostatistique et d'Epidémiologie, Gustave Roussy, Oncostat U1018, INSERM, University Paris-Saclay, labeled Ligue Contre le Cancer, Villejuif, France
- Georg Heinze
- Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
7
Yoo JE, Rho M. Large-Scale Survey Data Analysis with Penalized Regression: A Monte Carlo Simulation on Missing Categorical Predictors. Multivariate Behav Res 2022; 57:642-657. PMID: 33703972; DOI: 10.1080/00273171.2021.1891856.
Abstract
With the advent of the big data era, machine learning methods have evolved and proliferated. This study focused on penalized regression, a machine learning procedure that builds interpretable prediction models. In particular, penalized regression coupled with large-scale data can explore hundreds or thousands of variables in one statistical model without convergence problems and identify as-yet uninvestigated important predictors. As one of the first Monte Carlo simulation studies to investigate predictive modeling with missing categorical predictors in the context of social science research, this study endeavored to emulate real large-scale social science data. Likert-scaled variables were simulated, as well as multiple-category and count variables. Because categorical predictors were included in the modeling, penalized regression methods that consider the grouping effect, such as group Mnet, were employed. We also examined the applicability of the simulation conditions with the real large-scale dataset that the simulation study referenced. In particular, the study presents selection counts of variables after multiple iterations of modeling, to account for the bias resulting from data-splitting in model validation. Selection counts turned out to be a necessary tool when variable selection is of research interest. Efforts to utilize large-scale data to the fullest appear to offer a valid approach to mitigating the effect of nonignorable missingness. Overall, penalized regression, which assumes linearity, is a viable method for analyzing large-scale social science survey data.
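Selection counts can be sketched directly: refit a penalized model on many random subsamples and tally how often each predictor survives. The sketch below uses a plain lasso from scikit-learn on toy continuous data rather than group Mnet on categorical survey data, so it illustrates only the counting idea:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, p = 300, 40
X = rng.normal(size=(n, p))
y = 1.0 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(size=n)

# Selection counts: tally how often each predictor survives the penalty
# across repeated random subsamples, rather than trusting a single split.
n_rep = 50
counts = np.zeros(p)
for _ in range(n_rep):
    idx = rng.choice(n, size=n // 2, replace=False)
    counts += Lasso(alpha=0.1).fit(X[idx], y[idx]).coef_ != 0
print(counts[:4])
```

Stable predictors are selected in nearly every subsample, while predictors picked up by one lucky split accumulate low counts.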
Affiliation(s)
- Jin Eun Yoo
- Department of Education, Korea National University of Education
- Minjeong Rho
- Department of Education, Korea National University of Education
8
Ni A, Song C. Variable Selection for Time-to-Event Data. Methods Mol Biol 2021; 2194:61-76. PMID: 32926362; DOI: 10.1007/978-1-0716-0849-4_5.
Abstract
With the increasing availability of large-scale biomedical and -omics data, researchers are offered unprecedented opportunities to discover novel biomarkers for clinical outcomes. At the same time, they face great challenges in accurately identifying important biomarkers from numerous candidates. Many novel statistical methodologies have been developed to tackle these challenges over the last couple of decades. When the clinical outcome is time-to-event data, special statistical methods are needed due to the presence of censoring. In this article, we review some of the most commonly used modern statistical methodologies for variable selection with time-to-event data. The reviewed methods fall into three broad categories: filter-test-based methods, penalized regression methods, and machine learning methods.
Affiliation(s)
- Ai Ni
- Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA
- Chi Song
- Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA
9
Nené NR, Barrett J, Jones A, Evans I, Reisel D, Timms JF, Paprotka T, Leimbach A, Franchi D, Colombo N, Bjørge L, Zikan M, Cibula D, Widschwendter M. DNA methylation signatures to predict the cervicovaginal microbiome status. Clin Epigenetics 2020; 12:180. PMID: 33228781; PMCID: PMC7686703; DOI: 10.1186/s13148-020-00966-7.
Abstract
Background The composition of the microbiome plays an important role in human health and disease. Whether there is a direct association between the cervicovaginal microbiome and the host’s epigenome is largely unexplored.
Results Here we analyzed a total of 448 cervicovaginal smear samples and studied both the DNA methylome of the host and the microbiome, using the Illumina EPIC array and next-generation sequencing, respectively. We found that CpGs hypo-methylated in samples with non-lactobacilli-dominated (O-type) communities are strongly associated with gastrointestinal differentiation, and that a signature of 819 CpGs discriminated lactobacilli-dominated (L-type) from O-type samples with an area under the receiver operating characteristic curve (AUC) of 0.84 (95% CI 0.77-0.90) in an independent validation set. Performance improved further in samples with more than 50% epithelial cells (AUC 0.87) and was even higher in women younger than 50 years of age (AUC 0.91). In a subset of 96 women, buccal but not blood cell DNA showed the same trend as the cervicovaginal samples in discriminating women with L-type from O-type cervicovaginal communities. Conclusions These findings strongly support the view that the epithelial epigenome plays an essential role in hosting specific microbial communities.
Affiliation(s)
- Nuno R Nené
- Department of Women's Cancer, EGA Institute for Women's Health, University College London, London, UK
- Department of Mathematics, University College London, London, UK
- James Barrett
- Department of Women's Cancer, EGA Institute for Women's Health, University College London, London, UK
- European Translational Oncology Prevention and Screening (EUTOPS) Institute, 6060 Hall in Tirol, Austria
- Research Institute for Biomedical Aging Research, Universität Innsbruck, 6020 Innsbruck, Austria
- Allison Jones
- Department of Women's Cancer, EGA Institute for Women's Health, University College London, London, UK
- Iona Evans
- Department of Women's Cancer, EGA Institute for Women's Health, University College London, London, UK
- Daniel Reisel
- Department of Women's Cancer, EGA Institute for Women's Health, University College London, London, UK
- John F Timms
- Department of Women's Cancer, EGA Institute for Women's Health, University College London, London, UK
- Nicoletta Colombo
- Istituto Europeo di Oncologia, IRCCS, Milan, Italy
- University of Milano-Bicocca, Milan, Italy
- Line Bjørge
- Department of Obstetrics and Gynecology, Haukeland University Hospital, Bergen, Norway
- Centre for Cancer Biomarkers (CCBIO), Department of Clinical Science, University of Bergen, Bergen, Norway
- Michal Zikan
- Hospital Na Bulovce, Prague, Czech Republic
- Department of Obstetrics and Gynecology, General University Hospital in Prague, First Faculty of Medicine, Charles University, Prague, Czech Republic
- David Cibula
- Department of Obstetrics and Gynecology, General University Hospital in Prague, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Martin Widschwendter
- Department of Women's Cancer, EGA Institute for Women's Health, University College London, London, UK
- European Translational Oncology Prevention and Screening (EUTOPS) Institute, 6060 Hall in Tirol, Austria
- Research Institute for Biomedical Aging Research, Universität Innsbruck, 6020 Innsbruck, Austria
10
Barbosa S, Khalfallah O, Forhan A, Galera C, Heude B, Glaichenhaus N, Davidovic L. Serum cytokines associated with behavior: A cross-sectional study in 5-year-old children. Brain Behav Immun 2020; 87:377-387. PMID: 31923553; DOI: 10.1016/j.bbi.2020.01.005.
Abstract
Nearly 10% of 5-year-old children experience social, emotional or behavioral problems and are at increased risk of developing mental disorders later in life. While animal and human studies have demonstrated that cytokines can regulate brain function, it is unclear whether individual cytokines are associated with specific behavioral dimensions in population-based pediatric samples. Here, we used data and biological samples from 786 mother-child pairs participating in the French national mother-child cohort EDEN. At the age of 5, children were assessed for behavioral difficulties using the Strengths and Difficulties Questionnaire (SDQ) and had their serum collected. Serum samples were analyzed for levels of well-characterized effector and regulatory cytokines. We then used a penalized logistic regression method (elastic net) to investigate associations between serum cytokine levels and each of the five SDQ-assessed behavioral dimensions, after adjustment for relevant covariates and confounders, including psychosocial variables. We found that interleukin (IL)-6, IL-7, and IL-15 were associated with increased odds of problems in prosocial behavior, emotions, and peer relationships, respectively. In contrast, eight cytokines were associated with decreased odds of problems in one dimension: IL-8, IL-10, and IL-17A with emotional problems, tumor necrosis factor (TNF)-α with conduct problems, C-C motif chemokine ligand (CCL)2 with hyperactivity/inattention, C-X-C motif chemokine ligand (CXCL)10 with peer problems, and CCL3 and IL-16 with abnormal prosocial behavior. Without implying causation, these associations support the notion that cytokines regulate brain function and behavior, and they provide a rationale for launching longitudinal studies.
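The modeling step described, elastic-net-penalized logistic regression of a behavioral outcome on cytokine levels plus covariates, can be sketched as follows (toy simulated data, not the EDEN cohort):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 400
cytokines = np.log(rng.lognormal(size=(n, 10)))  # toy log serum cytokine levels
covars = rng.normal(size=(n, 3))                 # toy covariates / confounders
X = np.column_stack([cytokines, covars])

# Simulate one behavioral dimension driven by two cytokines only.
logit = 0.8 * X[:, 0] - 0.8 * X[:, 1]
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

# Elastic net = mix of l1 (selection) and l2 (shrinkage) penalties.
fit = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=0.5, max_iter=5000).fit(X, y)
print(np.round(fit.coef_[0, :2], 2))
```

The l1 component zeroes out uninformative cytokines while the l2 component stabilizes coefficients among correlated ones.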
Affiliation(s)
- Susana Barbosa
- Université Côte d'Azur, Centre National de la Recherche Scientifique, Institut de Pharmacologie Moléculaire et Cellulaire, Valbonne, France
- Olfa Khalfallah
- Université Côte d'Azur, Centre National de la Recherche Scientifique, Institut de Pharmacologie Moléculaire et Cellulaire, Valbonne, France
- Anne Forhan
- Université de Paris, Institut National de la Santé et de la Recherche Médicale, Institut National de la Recherche Agronomique, Centre de Recherche en Épidémiologie et Statistiques, Paris, France
- Cédric Galera
- University Bordeaux Segalen, Charles Perrens Hospital, Child and Adolescent Psychiatry Department, Bordeaux, France
- Barbara Heude
- Université de Paris, Institut National de la Santé et de la Recherche Médicale, Institut National de la Recherche Agronomique, Centre de Recherche en Épidémiologie et Statistiques, Paris, France
- Nicolas Glaichenhaus
- Université Côte d'Azur, Centre National de la Recherche Scientifique, Institut de Pharmacologie Moléculaire et Cellulaire, Valbonne, France
- Laetitia Davidovic
- Université Côte d'Azur, Centre National de la Recherche Scientifique, Institut de Pharmacologie Moléculaire et Cellulaire, Valbonne, France
11
Abstract
We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where subsets of samples, representing for example disease subtypes, may differ with respect to underlying regression models. In the high-dimensional setting, estimating a different model for each subgroup is challenging due to limited sample sizes. Focusing on the case in which subgroup-specific models may be expected to be similar but not necessarily identical, we treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an $\ell_1$ term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer's disease, amyotrophic lateral sclerosis, and cancer datasets. These examples demonstrate the gains joint estimation can offer in prediction as well as in providing subgroup-specific sparsity patterns.
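One common way to write such a joint objective, consistent with the description above but in illustrative notation that need not match the authors' exact formulation, is:

```latex
\min_{\beta_1,\dots,\beta_G} \;
\sum_{g=1}^{G} \bigl\lVert y_g - X_g \beta_g \bigr\rVert_2^2
\;+\; \lambda_1 \sum_{g=1}^{G} \lVert \beta_g \rVert_1
\;+\; \lambda_2 \sum_{g < g'} \lVert \beta_g - \beta_{g'} \rVert_1
```

Here $g$ indexes subgroups, the $\lambda_1$ term gives global sparsity, and the $\lambda_2$ fusion term shrinks subgroup-specific coefficients toward each other, sharing information when subgroups are similar.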
Affiliation(s)
- Frank Dondelinger
- Lancaster Medical School, Lancaster University, Furness College, Bailrigg, Lancaster, UK
- Sach Mukherjee
- Statistics and Machine Learning, German Center for Neurodegenerative Diseases (DZNE), Sigmund-Freud-Straße 27, Bonn, Germany
12
Sheng A, Ghosh SK. Effects of Proportional Hazard Assumption on Variable Selection Methods for Censored Data. Stat Biopharm Res 2020; 12:199-209. PMID: 34040695; DOI: 10.1080/19466315.2019.1694578.
Abstract
The Cox proportional hazards (PH) model is widely used to determine the effects of risk factors and treatments (covariates) on the survival time of subjects, which may be right-censored. The selection of covariates depends crucially on the specific form of the conditional hazard model, which is often assumed to be PH, accelerated failure time (AFT), or proportional odds (PO). However, we show that none of these semi-parametric models allows for crossing survival functions, and hence such strong assumptions may adversely affect the selection of variables. Moreover, the commonly used PH assumption may also be violated when there is a delayed effect of the risk factors. Taking these modeling assumptions into account, this study examines the effect of the PH assumption on covariate selection when the data-generating model may be non-PH. In particular, variable selection under two alternative models is explored: (i) the penalized PH model (using the elastic-net penalty) and (ii) a linear-spline-based hazard regression model. We apply these models to the ACTG-175 dataset and to simulated datasets with survival times generated from the Weibull and log-normal distributions. We also examine the effect on covariate selection of stratifying the analysis on the off-treatment indicator.
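For reference, standard forms of the three semi-parametric assumptions discussed, with baseline hazard $h_0$, baseline survival $S_0$, and covariate vector $x$:

```latex
\text{PH:}\quad h(t \mid x) = h_0(t)\, e^{x^\top \beta}
\qquad
\text{AFT:}\quad \log T = x^\top \beta + \epsilon
\qquad
\text{PO:}\quad \frac{1 - S(t \mid x)}{S(t \mid x)}
             = \frac{1 - S_0(t)}{S_0(t)}\, e^{x^\top \beta}
```

In each case the covariate effect enters through a single monotone transformation of a baseline function, which is why none of these families can produce survival curves that cross.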
Affiliation(s)
- Alvin Sheng
- Department of Statistics, North Carolina State University
- Sujit K Ghosh
- Department of Statistics, North Carolina State University
13
Wang F, Mukherjee S, Richardson S, Hill SM. High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking. Stat Comput 2019; 30:697-719. PMID: 32132772; PMCID: PMC7026376; DOI: 10.1007/s11222-019-09914-9.
Abstract
Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well developed, the relative efficacy of different approaches in finite-sample settings, as encountered in practice, remains incompletely understood. There is therefore a need for empirical investigations in this area that can offer practical insight and guidance to users. In this paper, we present a large-scale comparison of penalized regression methods. We distinguish between three related goals: prediction, variable selection and variable ranking. Our results span more than 2300 data-generating scenarios, including both synthetic and semisynthetic data (real covariates and simulated responses), allowing us to systematically consider the influence of various factors (sample size, dimensionality, sparsity, signal strength and multicollinearity). We consider several widely used approaches (Lasso, Adaptive Lasso, Elastic Net, Ridge Regression, SCAD, the Dantzig Selector and Stability Selection). We find considerable variation in performance between methods. Our results support a "no panacea" view, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods. The study allows us to make some recommendations as to which approaches may be most (or least) suitable given the goal and some data characteristics. Our empirical results complement existing theory and provide a resource to compare methods across a range of scenarios and metrics.
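As a reminder of the contrast between the penalization families compared here: ridge has a closed-form solution that shrinks all coefficients smoothly, whereas the lasso-type methods additionally set coefficients exactly to zero. A toy sketch of the ridge shrinkage effect (illustrative values throughout):

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 100, 20
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + rng.normal(size=n)

# Ridge has a closed form; a larger lam shrinks every coefficient toward zero.
def ridge(lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

b_ols, b_ridge = ridge(0.0), ridge(50.0)
print(round(b_ols[0], 2), round(b_ridge[0], 2))
```

How much such shrinkage helps or hurts prediction versus selection is exactly the kind of finite-sample question the benchmark addresses.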
Affiliation(s)
- Fan Wang
- MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
- Sach Mukherjee
- German Centre for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Steven M. Hill
- MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
14
Velten B, Huber W. Adaptive penalization in high-dimensional regression and classification with external covariates using variational Bayes. Biostatistics 2019; 22:348-364. [PMID: 31596468 PMCID: PMC8036004 DOI: 10.1093/biostatistics/kxz034] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2018] [Revised: 06/27/2019] [Accepted: 08/14/2019] [Indexed: 12/18/2022] Open
Abstract
Penalization schemes like Lasso or ridge regression are routinely used to regress a response of interest on a high-dimensional set of potential predictors. Despite being decisive, the question of the relative strength of penalization is often glossed over and only implicitly determined by the scale of individual predictors. At the same time, additional information on the predictors is available in many applications but left unused. Here, we propose to make use of such external covariates to adapt the penalization in a data-driven manner. We present a method that differentially penalizes feature groups defined by the covariates and adapts the relative strength of penalization to the information content of each group. Using techniques from the Bayesian tool-set our procedure combines shrinkage with feature selection and provides a scalable optimization scheme. We demonstrate in simulations that the method accurately recovers the true effect sizes and sparsity patterns per feature group. Furthermore, it leads to an improved prediction performance in situations where the groups have strong differences in dynamic range. In applications to data from high-throughput biology, the method enables re-weighting the importance of feature groups from different assays. Overall, using available covariates extends the range of applications of penalized regression, improves model interpretability and can improve prediction performance.
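The paper above adapts group-wise penalty strength via variational Bayes. As a much simpler hedged sketch of the underlying idea (differentially penalizing feature groups), one can emulate a weighted lasso by rescaling columns, since penalizing coefficient j with weight w_j is equivalent to an ordinary lasso on X_j / w_j. The two groups and their weights below are hypothetical, not the paper's data-driven estimates:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 120, 20
X = rng.standard_normal((n, p))
beta = np.concatenate([np.full(10, 1.5), np.zeros(10)])  # group A informative, group B noise
y = X @ beta + rng.standard_normal(n)

# Hypothetical group-specific penalty weights: a feature with weight w is
# penalized by lambda * w, implemented by rescaling its column by 1 / w.
weights = np.concatenate([np.full(10, 0.5), np.full(10, 2.0)])
X_scaled = X / weights           # column j scaled by 1 / weights[j]

fit = Lasso(alpha=0.1).fit(X_scaled, y)
beta_hat = fit.coef_ / weights   # map coefficients back to the original scale
n_selected_A = int(np.count_nonzero(beta_hat[:10]))
n_selected_B = int(np.count_nonzero(beta_hat[10:]))
```

The lightly penalized informative group retains more features than the heavily penalized noise group, which is the qualitative behavior the adaptive method learns from the external covariates automatically.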
Affiliation(s)
- Britta Velten
- Genome Biology Unit, European Molecular Biology Laboratory, Meyerhofstr. 1, 69117 Heidelberg, Germany
- Wolfgang Huber
- Genome Biology Unit, European Molecular Biology Laboratory, Meyerhofstr. 1, 69117 Heidelberg, Germany
15
Abstract
Penalized regression methods are an attractive tool for high-dimensional data analysis, but their widespread adoption has been hampered by the difficulty of applying inferential tools. In particular, the question "How reliable is the selection of those features?" has proved difficult to address. In part, this difficulty arises from defining false discoveries in the classical, fully conditional sense, which is possible in low dimensions but does not scale well to high-dimensional settings. Here, we consider the analysis of marginal false discovery rates (mFDRs) for penalized regression methods. Restricting attention to the mFDR permits straightforward estimation of the number of selections that would likely have occurred by chance alone, and therefore provides a useful summary of selection reliability. Theoretical analysis and simulation studies demonstrate that this approach is quite accurate when the correlation among predictors is mild, and only slightly conservative when the correlation is stronger. Finally, the practical utility of the proposed method and its considerable advantages over other approaches are illustrated using gene expression data from The Cancer Genome Atlas and genome-wide association study data from the Myocardial Applied Genomics Network.
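The paper's mFDR estimator itself is not reproduced here; as a loose, hedged analogue of "the number of selections that would likely have occurred by chance alone", the sketch below permutes the response to destroy all real signal and counts how many features a lasso still selects. This permutation device is a simpler stand-in for the paper's analytical estimator, shown only to make the idea concrete, and all simulation settings are assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p = 100, 40
X = rng.standard_normal((n, p))
y = X[:, :4] @ np.full(4, 1.0) + rng.standard_normal(n)  # 4 true predictors

alpha = 0.5
observed = int(np.count_nonzero(Lasso(alpha=alpha).fit(X, y).coef_))

# Permuting y breaks every real X-y association, so any feature selected on
# permuted data was selected "by chance alone".
chance = [
    int(np.count_nonzero(Lasso(alpha=alpha).fit(X, rng.permutation(y)).coef_))
    for _ in range(20)
]
expected_by_chance = float(np.mean(chance))
```

Comparing the observed selection count against the chance baseline gives a crude summary of selection reliability at a given penalty level.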
16
Garcia-Carretero R, Barquero-Perez O, Mora-Jimenez I, Soguero-Ruiz C, Goya-Esteban R, Ramos-Lopez J. Identification of clinically relevant features in hypertensive patients using penalized regression: a case study of cardiovascular events. Med Biol Eng Comput 2019; 57:2011-2026. [PMID: 31346948 DOI: 10.1007/s11517-019-02007-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2019] [Accepted: 06/24/2019] [Indexed: 12/18/2022]
Abstract
Appropriate management of hypertensive patients relies on the accurate identification of clinically relevant features. However, traditional statistical methods may ignore important information in datasets or overlook possible interactions among features. Machine learning may improve the prediction accuracy and interpretability of regression models by identifying the most relevant features in hypertensive patients. We sought the most relevant features for prediction of cardiovascular (CV) events in a hypertensive population. We used the penalized regression models least absolute shrinkage and selection operator (LASSO) and elastic net (EN) to obtain the most parsimonious and accurate models. The clinical parameters and laboratory biomarkers were collected from the clinical records of 1,471 patients receiving care at Mostoles University Hospital. The outcome was the development of major adverse CV events. Cox proportional hazards regression was performed alone and with penalized regression analyses (LASSO and EN), producing three models. The modeling was performed using 10-fold cross-validation to fit the penalized models. The three predictive models were compared and statistically analyzed to assess their classification accuracy, sensitivity, specificity, discriminative power, and calibration accuracy. The standard Cox model identified five relevant features, while LASSO and EN identified only three (age, LDL cholesterol, and kidney function). The accuracies of the models (prediction vs. observation) were 0.767 (Cox model), 0.754 (LASSO), and 0.764 (EN), and the areas under the curve were 0.694, 0.670, and 0.673, respectively. However, pairwise comparison of performance yielded no statistically significant differences. All three calibration curves showed close agreement between the predicted and observed probabilities of the development of a CV event. 
Although the performance was similar for all three models, both penalized regression analyses produced models with good fit and fewer features than the Cox predictive model at the same accuracy. This case study shows that penalized regularization techniques can provide predictive models for CV risk assessment that are parsimonious, highly interpretable, generalizable, and well calibrated. For clinicians, a parsimonious model can be useful where available data are limited, as it offers a simple but efficient way to model the impact of the different features on the prediction of CV events. Management of these features may lower the risk for a CV event.
Graphical Abstract: In a clinical setting, with numerous biological and laboratory features and incomplete datasets, traditional statistical methods may ignore important information and overlook possible interactions among features. Our aim was to identify the most relevant features for predicting cardiovascular events in a hypertensive population, using three different regression approaches for feature selection, thereby improving the prediction accuracy and interpretability of regression models.
Affiliation(s)
- Rafael Garcia-Carretero
- Internal Medicine Department, Mostoles University Hospital, Calle Rio Jucar, s/n, 28935, Mostoles, Madrid, Spain; Rey Juan Carlos University, Móstoles, Spain
- Oscar Barquero-Perez
- Department of Signal Theory and Communications and Telematics Systems and Computing, Rey Juan Carlos University, Móstoles, Spain
- Inmaculada Mora-Jimenez
- Department of Signal Theory and Communications and Telematics Systems and Computing, Rey Juan Carlos University, Móstoles, Spain
- Cristina Soguero-Ruiz
- Department of Signal Theory and Communications and Telematics Systems and Computing, Rey Juan Carlos University, Móstoles, Spain
- Rebeca Goya-Esteban
- Department of Signal Theory and Communications and Telematics Systems and Computing, Rey Juan Carlos University, Móstoles, Spain
- Javier Ramos-Lopez
- Department of Signal Theory and Communications and Telematics Systems and Computing, Rey Juan Carlos University, Móstoles, Spain
17
Abstract
Biomarkers play important roles in early diagnosis and treatment planning for cancer patients, and their importance is growing. With advances in high-throughput molecular profiling technology for various types of molecules, such as DNA, RNA, proteins, and metabolites, it is now possible to perform massive profiling analyses that accelerate the discovery of novel biomolecules. Because no single marker is sufficiently accurate for clinical use, cancer biomarkers are developed in the form of multiple-biomarker panels. Among the various types of molecular biomarkers, microRNA (miRNA) has recently emerged as a promising class of cancer biomarker. MiRNAs are small noncoding RNAs that regulate gene expression. This chapter overviews the process of identifying biomarker panels from miRNA profiles, focusing on statistical methods. Molecular cancer biomarkers are introduced first, and the methods section reviews the workflow from sample design to miRNA profiling. Statistical methods for biomarker development are then presented according to three typical purposes of molecular biomarkers: tumor subtype classification, early detection, and prediction of treatment response or patient prognosis. Example R code is provided for selected methods.
18
Abstract
BACKGROUND Accurate gene regulatory networks can be used to explain the emergence of different phenotypes, disease mechanisms, and other biological functions. Many methods have been proposed to infer networks from gene expression data but have been hampered by problems such as low sample size, inaccurate constraints, and incomplete characterizations of regulatory dynamics. Since expression regulation is dynamic, time-course data can be used to infer causality, but these datasets tend to be short or sparsely sampled. In addition, temporal methods typically assume that the expression of a gene at a time point depends on the expression of other genes at only the immediately preceding time point, while other methods include additional time points without any constraints to account for their temporal distance. These limitations can contribute to inaccurate networks with many missing and anomalous links. RESULTS We adapted the time-lagged Ordered Lasso, a regularized regression method with temporal monotonicity constraints, for de novo reconstruction. We also developed a semi-supervised method that embeds prior network information into the Ordered Lasso to discover novel regulatory dependencies in existing pathways. R code is available at https://github.com/pn51/laggedOrderedLassoNetwork . CONCLUSIONS We evaluated these approaches on simulated data for a repressilator, time-course data from past DREAM challenges, and a HeLa cell cycle dataset to show that they can produce accurate networks subject to the dynamics and assumptions of the time-lagged Ordered Lasso regression.
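The Ordered Lasso's lag-monotonicity constraint is the paper's contribution and is not reproduced here; the hedged sketch below shows only the shared preprocessing step — building a time-lagged design matrix from an expression time course — and fits a plain lasso to recover a planted lag-1 regulatory link. The sizes and the planted coefficient are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
T, genes, max_lag = 200, 3, 2
expr = rng.standard_normal((T, genes))
# Planted regulatory link: gene 0 at time t depends on gene 1 at time t-1.
expr[1:, 0] += 0.9 * expr[:-1, 1]

# Stack lagged copies of every gene as predictors for gene 0.
rows = []
for t in range(max_lag, T):
    rows.append(expr[t - max_lag:t, :][::-1].ravel())  # lag 1 first, then lag 2
X = np.asarray(rows)
y = expr[max_lag:, 0]

fit = Lasso(alpha=0.05).fit(X, y)
coefs = fit.coef_.reshape(max_lag, genes)  # coefs[l-1, g] = effect of gene g at lag l
```

A nonzero entry at (lag 1, gene 1) recovers the planted edge; the Ordered Lasso would additionally force each gene's coefficient magnitudes to be non-increasing in the lag.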
Affiliation(s)
- Phan Nguyen
- Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL, USA
- Rosemary Braun
- Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL, USA
- Biostatistics Division, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
19
Klau S, Jurinovic V, Hornung R, Herold T, Boulesteix AL. Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data. BMC Bioinformatics 2018; 19:322. [PMID: 30208855 DOI: 10.1186/s12859-018-2344-6] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2018] [Accepted: 08/29/2018] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND The inclusion of high-dimensional omics data in prediction models has become a well-studied topic in recent decades. However, most existing methods do not account for the possibility that the covariates available in a dataset comprise different types of variables, even though in many scenarios the variables can be structured in blocks of different types, e.g., clinical, transcriptomic, and methylation data. To date, only a few computationally intensive approaches make use of block structures of this kind. RESULTS In this paper we present priority-Lasso, an intuitive and practical analysis strategy for building Lasso-based prediction models that takes such block structures into account. It requires the definition of a priority order for the blocks of data. Lasso models are then fitted successively, one block at a time, and the fitted values from each step are included as an offset in the fit for the next block. We apply priority-Lasso in different settings to an acute myeloid leukemia (AML) dataset consisting of clinical variables, cytogenetics, gene mutations, and expression variables, and compare its performance on an independent validation dataset to that of standard Lasso models. CONCLUSION The results show that priority-Lasso keeps pace with Lasso in terms of prediction accuracy. Variables in higher-priority blocks are favored over variables in lower-priority blocks, which yields easily usable and transportable models for clinical practice.
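The successive-offset mechanism described above can be sketched for a Gaussian outcome, where including a fixed offset is equivalent to regressing the residuals of the previous step. This is a simplified two-block illustration on assumed synthetic data, not the priorityLasso R package:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(4)
n = 150
clinical = rng.standard_normal((n, 5))    # high-priority block
omics = rng.standard_normal((n, 100))     # low-priority block
y = clinical[:, 0] * 2.0 + omics[:, 0] * 1.0 + rng.standard_normal(n)

# Step 1: fit the high-priority block alone.
fit1 = LassoCV(cv=5).fit(clinical, y)
offset = fit1.predict(clinical)

# Step 2: fit the next block with the previous fit as a fixed offset.
# For a Gaussian outcome, an offset is equivalent to modeling the residuals.
fit2 = LassoCV(cv=5).fit(omics, y - offset)
```

Variables in earlier blocks are favored by construction, because later blocks can only explain whatever signal the earlier fit left over.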
20
Abstract
We compare alternative computing strategies for solving the constrained lasso problem. As its name suggests, the constrained lasso extends the widely-used lasso to handle linear constraints, which allow the user to incorporate prior information into the model. In addition to quadratic programming, we employ the alternating direction method of multipliers (ADMM) and also derive an efficient solution path algorithm. Through both simulations and benchmark data examples, we compare the different algorithms and provide practical recommendations in terms of efficiency and accuracy for various sizes of data. We also show that, for an arbitrary penalty matrix, the generalized lasso can be transformed to a constrained lasso, while the converse is not true. Thus, our methods can also be used for estimating a generalized lasso, which has wide-ranging applications. Code for implementing the algorithms is freely available in both the Matlab toolbox SparseReg and the Julia package ConstrainedLasso. Supplementary materials for this article are available online.
Affiliation(s)
- Brian R Gaines
- Department of Statistics, North Carolina State University
- Juhyun Kim
- Department of Biostatistics, University of California, Los Angeles (UCLA)
- Hua Zhou
- Department of Biostatistics, University of California, Los Angeles (UCLA)
21
Martínez-Ávila JC, García Bartolomé A, García I, Dapía I, Tong HY, Díaz L, Guerra P, Frías J, Carcás Sansuan AJ, Borobia AM. Pharmacometabolomics applied to zonisamide pharmacokinetic parameter prediction. Metabolomics 2018; 14:70. [PMID: 30830352 DOI: 10.1007/s11306-018-1365-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/21/2018] [Accepted: 04/25/2018] [Indexed: 10/16/2022]
Abstract
INTRODUCTION Zonisamide is a new-generation anticonvulsant antiepileptic drug metabolized primarily in the liver, with subsequent elimination via the renal route. OBJECTIVES Our objective was to evaluate the utility of pharmacometabolomics in the detection of zonisamide metabolites that could be related to its disposition and therefore to its efficacy and toxicity. METHODS This study was nested within a bioequivalence clinical trial with 28 healthy volunteers. Each participant received a single dose of zonisamide on two separate occasions (period 1 and period 2), with a washout period between them. Blood samples were obtained from all volunteers at baseline in each period, before any medication was administered, for metabolomics analysis. RESULTS After Lasso regression was applied, age, height, branched-chain amino acids, steroids, triacylglycerols, diacyl glycerophosphoethanolamine, glycerophospholipids susceptible to methylation, phosphatidylcholines with 20:4 FA (arachidonic acid) and cholesterol ester, and lysophosphatidylcholine were retained in both periods. CONCLUSION To our knowledge, this is the only research study to date that has attempted to link basal metabolomic status with the pharmacokinetic parameters of zonisamide.
Affiliation(s)
- J C Martínez-Ávila
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- A García Bartolomé
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- I García
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- I Dapía
- Medical and Molecular Genetics Institute (INGEMM), La Paz University Hospital, Rare Diseases Networking Biomedical Research Center (CIBERER), ISCIII, Madrid, Spain
- Hoi Y Tong
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- L Díaz
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- P Guerra
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- J Frías
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- A J Carcás Sansuan
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
- A M Borobia
- Clinical Pharmacology Department, La Paz University Hospital, School of Medicine, IdiPAZ, Autonomous University of Madrid, Madrid, Spain
22
Abstract
Despite overwhelming data on predictors of inpatient mortality, it is unclear which variables are most instructive for predicting the mortality of patients in departments of internal medicine. This study aims to identify the most informative predictors of inpatient mortality and to build an individual-level prediction model given a constellation of patient characteristics. We developed the prediction model using a penalized method, least absolute shrinkage and selection operator (LASSO) regression, in a cohort of adult patients admitted to any of 5 departments of internal medicine over 3.5 years. We integrated data from electronic health records that included clinical, epidemiological, administrative, and laboratory variables, and evaluated the prediction model on a validation sample. Of 10,788 patients hospitalized during the study period, 874 (8.1%) died during admission. The strongest predictors of inpatient mortality were prior admission within 3 months, malignant morbidity, serum creatinine level, hypoalbuminemia at hospital admission, and an admitting diagnosis of sepsis, pneumonia, malignant neoplastic disease, or cerebrovascular disease. The C-statistic of the risk prediction model was 89.4% (95% CI 88.4-90.4%), a better predictive performance than that of a multivariate stepwise logistic regression model. On the independent (validation) dataset, the AUC was 85.7% (95% CI 84.1-87.3%). Using penalized regression, this prediction model identifies the most informative predictors of inpatient mortality and illustrates the potential value and feasibility of a tool that can aid physicians in decision-making.
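As a hedged sketch of the modeling strategy described above (an L1-penalized logistic model for a binary in-hospital outcome, evaluated by discrimination), the code below fits a lasso-type logistic regression on synthetic data with a rare-ish outcome. The event rate, signal structure, and penalty strength are all assumptions, not values from the study:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
n, p = 600, 30
X = rng.standard_normal((n, p))
# Low intercept gives a rare-ish outcome, loosely mimicking an 8% event rate.
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1] - 2.5
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# L1-penalized logistic regression: C is the inverse penalty strength.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
n_kept = int(np.count_nonzero(clf.coef_))
```

The L1 penalty both shrinks coefficients and drops uninformative predictors, yielding a sparser model than unpenalized stepwise selection.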
Affiliation(s)
- Naama Schwartz
- Research Authority, Emek Medical Center, Clalit Health Services, Afula, Israel
- Ali Sakhnini
- Department of Medicine D, Emek Medical Center, Clalit Health Services, 21 Rabin Avenue, 18341, Afula, Israel
- Naiel Bisharat
- Department of Medicine D, Emek Medical Center, Clalit Health Services, 21 Rabin Avenue, 18341, Afula, Israel
- Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
23
Wu Y, Cook RJ. Variable selection and prediction in biased samples with censored outcomes. Lifetime Data Anal 2018; 24:72-93. [PMID: 28215038 DOI: 10.1007/s10985-017-9392-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/29/2016] [Accepted: 02/07/2017] [Indexed: 06/06/2023]
Abstract
With the increasing availability of large prospective disease registries, scientists studying the course of chronic conditions often have access to multiple data sources, with each source generated based on its own entry conditions. The different entry conditions of the various registries may be explicitly based on the response process of interest, in which case the statistical analysis must recognize the unique truncation schemes. Moreover, intermittent assessment of individuals in the registries can lead to interval-censored times of interest. We consider the problem of selecting important prognostic biomarkers from a large set of candidates when the event times of interest are truncated and right- or interval-censored. Methods for penalized regression are adapted to handle truncation via a Turnbull-type complete data likelihood. An expectation-maximization algorithm is described which is empirically shown to perform well. Inverse probability weights are used to adjust for the selection bias when assessing predictive accuracy based on individuals whose event status is known at a time of interest. Application to the motivating study of the development of psoriatic arthritis in patients with psoriasis in both the psoriasis cohort and the psoriatic arthritis cohort illustrates the procedure.
Affiliation(s)
- Ying Wu
- Institute of Statistics, Nankai University, Tianjin, China
- Richard J Cook
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
24
Lee JW, Punshon T, Moen EL, Karagas MR, Gui J. Penalized estimation of sparse concentration matrices based on prior knowledge with applications to placenta elemental data. Comput Biol Chem 2017; 71:219-223. [PMID: 29153892 DOI: 10.1016/j.compbiolchem.2017.10.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2017] [Revised: 10/29/2017] [Accepted: 10/30/2017] [Indexed: 10/18/2022]
Abstract
Identifying patterns of association or dependency among high-dimensional biological datasets with sparse precision matrices remains a challenge. In this paper, we introduce a weighted sparse Gaussian graphical model that can incorporate prior knowledge to infer the network structure of trace element concentrations, including essential elements as well as toxic metals and metalloids, measured in human placentas. We present a weighted L1-penalized regularization procedure for estimating the sparse precision matrix in the setting of Gaussian graphical models. First, we use simulation models to demonstrate that the proposed method yields a better estimate of the precision matrix than procedures that fail to account for prior knowledge of the network structure. We then apply the method to estimate sparse element concentration matrices of placental biopsies from the New Hampshire Birth Cohort Study. Because the chemical architecture of the elements is complex, the method was used to infer their dependency structures using prior knowledge of their biological roles.
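The paper's estimator adds prior-knowledge weights to the graphical lasso penalty; scikit-learn ships only the unweighted graphical lasso, so the sketch below shows that baseline on a simulated chain-structured network (a weighted variant would replace the scalar alpha with an entry-specific penalty matrix, which `GraphicalLasso` does not expose). The chain structure and all numbers are illustrative assumptions:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(6)
p = 6
# Ground-truth sparse precision matrix: a chain graph 0-1-2-3-4-5.
prec = np.eye(p)
for i in range(p - 1):
    prec[i, i + 1] = prec[i + 1, i] = 0.4
cov = np.linalg.inv(prec)
X = rng.multivariate_normal(np.zeros(p), cov, size=500)

model = GraphicalLasso(alpha=0.05).fit(X)
est = model.precision_
# Off-diagonal entries shrunk to (near) zero indicate conditional independence.
edges = np.abs(est[np.triu_indices(p, k=1)]) > 1e-3
```

Nonzero off-diagonal entries of the estimated precision matrix correspond to edges of the dependency network, which is the object the placental-element analysis reports.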
Affiliation(s)
- Jai Woo Lee
- Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH, United States
- Tracy Punshon
- Department of Biological Sciences, Dartmouth College, Hanover, NH, United States
- Erika L Moen
- Department of Epidemiology, Geisel School of Medicine, Lebanon, NH, United States; The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine, Lebanon, NH, United States
- Margaret R Karagas
- Department of Epidemiology, Geisel School of Medicine, Lebanon, NH, United States
- Jiang Gui
- Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH, United States; Department of Biomedical Data Science, Geisel School of Medicine, Lebanon, NH, United States
25
Shen R, Luo L, Jiang H. Identification of gene pairs through penalized regression subject to constraints. BMC Bioinformatics 2017; 18:466. [PMID: 29100492 DOI: 10.1186/s12859-017-1872-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2017] [Accepted: 10/17/2017] [Indexed: 02/07/2023] Open
Abstract
Background This article concerns the identification of gene pairs or combinations of gene pairs associated with a biological phenotype or clinical outcome, allowing predictive models to be built that are not only robust to normalization but also easily validated and measured by qPCR techniques. However, with a small number of biological samples and a large number of genes, this problem entails high computational complexity and challenges the statistical accuracy of identification. Results In this paper, we propose a parsimonious model representation and develop efficient algorithms for identification. In particular, we derive an equivalent model subject to a sum-to-zero constraint in penalized linear regression and establish the correspondence between nonzero coefficients in the two models. Most importantly, this reduces the model complexity of the traditional approach from quadratic to linear order in the number of candidate genes, while overcoming the difficulty of model nonidentifiability. Computationally, we develop an algorithm using the alternating direction method of multipliers (ADMM) to handle the constraint. Numerically, we demonstrate that the proposed method outperforms the traditional method in statistical accuracy, and that our ADMM algorithm is more computationally efficient than a coordinate descent algorithm with a local search. Finally, we illustrate the proposed method on a prostate cancer dataset, identifying gene pairs associated with pre-operative prostate-specific antigen. Conclusion Our findings demonstrate the feasibility and utility of using gene pairs as biomarkers.
26
Sedighi Maman Z, Alamdar Yazdi MA, Cavuoto LA, Megahed FM. A data-driven approach to modeling physical fatigue in the workplace using wearable sensors. Appl Ergon 2017; 65:515-529. [PMID: 28259238 DOI: 10.1016/j.apergo.2017.02.001] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/04/2016] [Revised: 01/28/2017] [Accepted: 02/01/2017] [Indexed: 05/14/2023]
Abstract
Wearable sensors are currently being used to manage fatigue in professional athletics and in the transportation and mining industries. In manufacturing, physical fatigue is a challenging ergonomic and safety issue, since it lowers productivity and increases the incidence of accidents; it therefore must be managed. This study has two main goals. First, we examine the use of wearable sensors to detect the occurrence of physical fatigue in simulated manufacturing tasks. Second, we estimate the level of physical fatigue over time. To achieve these goals, sensory data were recorded for eight healthy participants. Penalized logistic and multiple linear regression models were used for physical fatigue detection and level estimation, respectively. Important features from the five sensor locations were selected using the least absolute shrinkage and selection operator (LASSO), a popular variable selection methodology. The results show that the LASSO model performed well for both detecting physical fatigue and modeling its level. The modeling approach is not specific to a participant or workload regime and thus can be adopted for other applications.
Affiliation(s)
- Zahra Sedighi Maman
- Department of Industrial and Systems Engineering, Auburn University, AL 36849, USA
- Lora A Cavuoto
- Department of Industrial and Systems Engineering, University at Buffalo, NY 14260, USA
27
Lee HS, Krischer JP. A new framework for prediction and variable selection for uncommon events in a large prospective cohort study. Model Assist Stat Appl 2017; 12:227-237. [PMID: 29075164 PMCID: PMC5654558 DOI: 10.3233/mas-170397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
When prediction is a goal, validation using data outside of the prediction effort is desirable. Typically, data are split into two parts: one for development and one for validation. This approach becomes less attractive when predicting uncommon events, however, as it substantially reduces power. For predicting uncommon events within a large prospective cohort study, we propose the use of a nested case-control design as an alternative to a full cohort analysis. By including all cases but only a subset of the non-cases, this design is expected to produce results similar to the full cohort analysis. In our framework, variable selection is conducted and a prediction model is fit on the selected variables in the case-control cohort. The fraction of true negative predictions (specificity) of the fitted model in the case-control cohort is then compared to that in the rest of the cohort (the non-cases) for validation. In addition, we propose an iterative variable selection procedure that uses random forests for missing data imputation, as well as a strategy for valid classification. Our framework is illustrated with an application featuring high-dimensional variable selection in a large prospective cohort study.
Affiliation(s)
- Hye-Seung Lee
- Health Informatics Institute, 3650 Spectrum Blvd., Suite 100, University of South Florida, Tampa, Florida 33612
- Jeffrey P Krischer
- Health Informatics Institute, 3650 Spectrum Blvd., Suite 100, University of South Florida, Tampa, Florida 33612
28
Ternès N, Rotolo F, Michiels S. Robust estimation of the expected survival probabilities from high-dimensional Cox models with biomarker-by-treatment interactions in randomized clinical trials. BMC Med Res Methodol 2017; 17:83. [PMID: 28532387; PMCID: PMC5441049; DOI: 10.1186/s12874-017-0354-0]
Abstract
Background: Thanks to advances in genomics and targeted treatments, more and more biomarker-based prediction models are being developed to predict the potential benefit from treatments in randomized clinical trials. Although the methodological framework for the development and validation of prediction models in a high-dimensional setting is becoming well established, no clear guidance yet exists on how to estimate expected survival probabilities in a penalized model with biomarker-by-treatment interactions. Methods: Based on a parsimonious biomarker selection in a penalized high-dimensional Cox model (lasso or adaptive lasso), we propose a unified framework to: estimate internally the predictive accuracy metrics of the developed model (using double cross-validation); estimate the individual survival probabilities at a given timepoint; construct confidence intervals thereof (analytical or bootstrap); and visualize them graphically (pointwise or smoothed with splines). We compared these strategies through a simulation study covering scenarios with and without biomarker effects. We applied the strategies to a large randomized phase III clinical trial that evaluated the effect of adding trastuzumab to chemotherapy in 1574 early breast cancer patients, for whom the expression of 462 genes was measured. Results: In our simulations, penalized regression models using the adaptive lasso estimated the survival probability of new patients with low bias and standard error, and bootstrapped confidence intervals had empirical coverage probability close to the nominal level across very different scenarios. The double cross-validation performed on the training data set closely mimicked the predictive accuracy of the selected models in external validation data. We also propose a useful visual representation of the expected survival probabilities using splines. In the breast cancer trial, the adaptive lasso penalty selected a prediction model with 4 clinical covariates, the main effects of 98 biomarkers, and 24 biomarker-by-treatment interactions, but there was high variability of the expected survival probabilities, with very large confidence intervals. Conclusion: Based on our simulations, we propose a unified framework for: developing a prediction model with biomarker-by-treatment interactions in a high-dimensional setting and validating it in the absence of external data; accurately estimating the expected survival probability of future patients with associated confidence intervals; and graphically visualizing the developed prediction model. All the methods are implemented in the R package biospear, publicly available on CRAN.
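Since the full survival machinery is beyond a short example, the sketch below shows only the adaptive lasso building block named in the abstract, on a linear-model stand-in: ridge estimates supply the adaptive weights, and a plain lasso on rescaled columns is equivalent to the weighted penalty. All data and tuning values are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(2)
n, p = 150, 30
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[[0, 3, 7]] = [2.0, -1.5, 1.0]      # a sparse set of true effects
y = X @ beta + rng.normal(size=n)

# stage 1: ridge gives initial estimates used as adaptive weights
init = Ridge(alpha=1.0).fit(X, y).coef_
w = np.abs(init) + 1e-8                 # weight w_j; penalty becomes sum |b_j| / w_j

# stage 2: a plain lasso on rescaled columns is equivalent to the weighted penalty
Xw = X * w
fit = Lasso(alpha=0.1).fit(Xw, y)
beta_alasso = fit.coef_ * w             # map coefficients back to the original scale
support = np.flatnonzero(beta_alasso)
```

Covariates with small initial estimates get heavily penalized and drop out, while strong effects are only lightly shrunk, which is the parsimony property the paper exploits.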
Affiliation(s)
- Nils Ternès
- Service de Biostatistique et d'Epidémiologie, Gustave Roussy, B2M, RdC, 114 rue Edouard-Vaillant, 94805 Villejuif, France; CESP, Fac. de médecine - Univ. Paris-Sud, Fac. de médecine - UVSQ, INSERM, Université Paris-Saclay, 94805 Villejuif, France
- Federico Rotolo
- Service de Biostatistique et d'Epidémiologie, Gustave Roussy, B2M, RdC, 114 rue Edouard-Vaillant, 94805 Villejuif, France; CESP, Fac. de médecine - Univ. Paris-Sud, Fac. de médecine - UVSQ, INSERM, Université Paris-Saclay, 94805 Villejuif, France
- Stefan Michiels
- Service de Biostatistique et d'Epidémiologie, Gustave Roussy, B2M, RdC, 114 rue Edouard-Vaillant, 94805 Villejuif, France; CESP, Fac. de médecine - Univ. Paris-Sud, Fac. de médecine - UVSQ, INSERM, Université Paris-Saclay, 94805 Villejuif, France
29
Zhai J, Hsu CH, Daye ZJ. Ridle for sparse regression with mandatory covariates with application to the genetic assessment of histologic grades of breast cancer. BMC Med Res Methodol 2017; 17:12. [PMID: 28122498; PMCID: PMC5267467; DOI: 10.1186/s12874-017-0291-y]
Abstract
Background: Many questions in statistical genomics can be formulated as variable selection of candidate biological factors for modeling a trait or quantity of interest. Often, additional covariates describing clinical, demographic, or experimental effects must be included a priori as mandatory covariates, while allowing selection among a large number of candidate or optional variables. As genomic studies routinely require mandatory covariates, it is of interest to propose principled variable selection methods that can incorporate them. Methods: In this article, we propose the ridge-lasso hybrid estimator (ridle), a new penalized regression method that simultaneously estimates coefficients of mandatory covariates while allowing selection for others. The ridle provides a principled approach to mitigate the effects of multicollinearity among the mandatory covariates and possible dependency between mandatory and optional variables. We provide detailed empirical and theoretical studies to evaluate our method, and we develop an efficient algorithm for the ridle. Software, based on efficient Fortran code with R-language wrappers, is publicly and freely available at https://sites.google.com/site/zhongyindaye/software. Results: The ridle is useful when mandatory predictors are known to be significant from prior knowledge or must be kept for additional analysis. Both theoretical and comprehensive simulation studies have shown the ridle to be advantageous when mandatory covariates are correlated with irrelevant optional predictors or are highly correlated among themselves. A microarray gene expression analysis of the histologic grades of breast cancer identified 24 genes, of which 2 were selected only by the ridle among current methods and were found to be associated with tumor grade. Conclusions: We proposed the ridle as a principled sparse regression method for the selection of optional variables while incorporating mandatory ones. Results suggest that the ridle is advantageous when mandatory covariates are correlated with irrelevant optional predictors or are highly correlated among themselves.
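The ridle objective (ridge penalty on the mandatory block, L1 penalty on the optional block) can be sketched with a small proximal-gradient solver. This is a minimal reimplementation of the idea, not the authors' Fortran code, and all tuning constants are illustrative.

```python
import numpy as np

def ridle(X_m, X_o, y, lam1=0.1, lam2=0.1, n_iter=1000):
    """Sketch of a ridge-lasso hybrid: ridge penalty on the mandatory block X_m,
    L1 (selection) penalty on the optional block X_o, fit by proximal gradient."""
    n = len(y)
    X = np.hstack([X_m, X_o])
    pm = X_m.shape[1]
    b = np.zeros(X.shape[1])
    # Lipschitz constant of the smooth part (squared loss + ridge term)
    L = np.linalg.eigvalsh(X.T @ X / n)[-1] + lam2
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        grad[:pm] += lam2 * b[:pm]               # ridge gradient, mandatory block only
        b = b - grad / L
        # soft-thresholding (prox of the L1 norm) on the optional block only
        b[pm:] = np.sign(b[pm:]) * np.maximum(np.abs(b[pm:]) - lam1 / L, 0.0)
    return b[:pm], b[pm:]

# toy data: 2 mandatory covariates (always kept) and 20 optional candidates
rng = np.random.default_rng(3)
n = 200
X_m = rng.normal(size=(n, 2))
X_o = rng.normal(size=(n, 20))
y = X_m @ np.array([1.0, -1.0]) + 1.5 * X_o[:, 0] + rng.normal(size=n)
b_m, b_o = ridle(X_m, X_o, y)
```

The ridge part shrinks but never zeroes the mandatory coefficients, while soft-thresholding drives most irrelevant optional coefficients exactly to zero.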
Affiliation(s)
- Jing Zhai
- Epidemiology and Biostatistics Department, University of Arizona, Tucson, USA
- Chiu-Hsieh Hsu
- Epidemiology and Biostatistics Department, University of Arizona, Tucson, USA
- Z John Daye
- Epidemiology and Biostatistics Department, University of Arizona, Tucson, USA
30
Moradi E, Hallikainen I, Hänninen T, Tohka J. Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease. Neuroimage Clin 2016; 13:415-427. [PMID: 28116234; PMCID: PMC5233798; DOI: 10.1016/j.nicl.2016.12.011]
Abstract
Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory and is widely used for cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), making RAVLT an effective early marker for detecting AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to study comprehensively to what extent the RAVLT scores are predictable from structural magnetic resonance imaging (MRI) data using machine learning approaches, and to find the most important brain regions for the estimation of RAVLT scores. To this end, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided a highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) scores in a dataset of 806 AD, mild cognitive impairment (MCI), or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for estimating RAVLT from MRI data. The top predictors were medial temporal lobe structures and the amygdala for RAVLT Immediate, and the angular gyrus, hippocampus, and amygdala for RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted from either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.
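A minimal analogue of this prediction pipeline, using scikit-learn's elastic net with cross-validated correlation as the performance measure, might look like the following; the synthetic "voxel" features stand in for gray matter density maps and all sizes are illustrative.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(4)
n, p = 120, 300                         # more "voxels" than subjects, as in imaging data
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:10] = 1.0                         # a sparse set of informative regions
score = X @ beta + rng.normal(size=n)   # synthetic stand-in for an RAVLT score

# elastic net with cross-validated penalties; performance summarized by the
# cross-validated correlation between estimated and observed scores
enet = ElasticNetCV(l1_ratio=0.5, cv=5, random_state=0)
pred = cross_val_predict(enet, X, score, cv=5)
r = float(np.corrcoef(pred, score)[0, 1])
```

Reporting the correlation between out-of-fold predictions and observed scores mirrors the cross-validated R values quoted in the abstract.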
Affiliation(s)
- Elaheh Moradi
- Institute of Biosciences and Medical Technology, University of Tampere, Tampere, Finland
- Ilona Hallikainen
- University of Eastern Finland, Institute of Clinical Medicine, Department of Neurology, Kuopio, Finland
- Tuomo Hänninen
- Neurocenter, Neurology, Kuopio University Hospital, Kuopio, Finland
- Jussi Tohka
- Department of Bioengineering and Aerospace Engineering, Universidad Carlos III de Madrid, Leganes, Spain
- Instituto de Investigación Sanitaria Gregorio Marañon, Madrid, Spain
- University of Eastern Finland, AI Virtanen Institute for Molecular Sciences, Kuopio, Finland
31
Abstract
BACKGROUND The case-crossover design is an attractive alternative to the classical case-control design that can be used to study the onset of acute events when the risk factors of interest vary in time. By comparing exposures within cases at different time periods, the case-crossover design does not rely on control subjects, who can be difficult to acquire. However, with the standard method of maximum likelihood, the resulting risk estimates can be heavily biased when the prevalence of risk factors is very low (or very high). METHODS To overcome the problem of low risk factor prevalence, penalized conditional logistic regression via the lasso (least absolute shrinkage and selection operator) has been proposed in the literature, along with related methods such as the Firth correction. We apply and compare several penalized regression approaches in the context of a case-crossover analysis of the European Study of Severe Cutaneous Adverse Reactions (EuroSCAR; 1997-2001). RESULTS Out of 30 drugs, standard methods correctly classified only 17 drugs (including some highly implausible risk estimates), while penalized methods correctly classified 22 drugs. CONCLUSION Penalized methods generally yield better risk classifications and much more plausible risk estimates for the EuroSCAR study than standard methods. As these techniques can be easily implemented using available R packages, we encourage routine use of penalized conditional logistic regression for case-crossover data.
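For 1:1 matched data (one hazard period and one referent period per case), the conditional likelihood coincides with an intercept-free logistic model on within-pair exposure differences, so a lasso-penalized fit can be sketched with standard tools. The data below are synthetic, not EuroSCAR; the random sign-flip only creates the two classes the software expects and leaves the likelihood unchanged.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n_pairs, n_drugs = 400, 12
# binary drug exposures in the hazard period (case) and a referent period
x_case = rng.binomial(1, 0.15, size=(n_pairs, n_drugs))
x_case[:, 0] |= rng.binomial(1, 0.25, size=n_pairs)   # drug 0 raises risk
x_ctrl = rng.binomial(1, 0.15, size=(n_pairs, n_drugs))

# for 1:1 matching, the conditional likelihood equals an intercept-free
# logistic model on within-pair differences; flipping a random half of the
# pairs creates the two classes without changing that likelihood
d = x_case - x_ctrl
lab = np.ones(n_pairs)
flip = rng.random(n_pairs) < 0.5
d[flip] *= -1
lab[flip] = 0.0

clf = LogisticRegression(penalty="l1", C=0.5, fit_intercept=False, solver="liblinear")
clf.fit(d, lab)
coef = clf.coef_.ravel()
```

The L1 penalty shrinks the coefficients of rarely discordant drugs toward (and often exactly to) zero, which is how penalization stabilizes estimates for low-prevalence exposures.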
Affiliation(s)
- Sam Doerken
- Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg, Germany
- Maja Mockenhaupt
- Dokumentationszentrum schwerer Hautreaktionen (dZh), Medical Center, University of Freiburg, Freiburg, Germany
- Luigi Naldi
- USC di Dermatologia, Azienda Ospedaliero Papa Giovanni XXIII, Bergamo, Italy
- Martin Schumacher
- Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg, Germany
- Peggy Sekula
- Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Freiburg, Germany
32
Ojeda FM, Müller C, Börnigen D, Trégouët DA, Schillert A, Heinig M, Zeller T, Schnabel RB. Comparison of Cox Model Methods in A Low-dimensional Setting with Few Events. Genomics Proteomics Bioinformatics 2016; 14:235-43. [PMID: 27224515; PMCID: PMC4996851; DOI: 10.1016/j.gpb.2016.03.006]
Abstract
Prognostic models based on survival data frequently make use of the Cox proportional hazards model. Developing reliable Cox models with few events relative to the number of predictors can be challenging, even in low-dimensional datasets with many more observations than variables. In such a setting, we examined the performance of several methods for estimating a Cox model: (i) the full model using all available predictors, estimated by standard techniques; (ii) backward elimination (BE); (iii) ridge regression; (iv) the least absolute shrinkage and selection operator (lasso); and (v) the elastic net. Based on a prospective cohort of patients with manifest coronary artery disease (CAD), we performed a simulation study to compare the predictive accuracy, calibration, and discrimination of these approaches. The candidate predictors for incident cardiovascular events included clinical variables, biomarkers, and a selection of genetic variants associated with CAD. The penalized methods, i.e., ridge, lasso, and elastic net, showed comparable performance in terms of predictive accuracy, calibration, and discrimination, and outperformed BE and the full model. Excessive shrinkage was observed in some cases for the penalized methods, mostly in the simulation scenarios with the lowest ratio of the number of events to the number of variables. We conclude that in similar settings these three penalized methods can be used interchangeably. The full model and backward elimination are not recommended in rare event scenarios.
Affiliation(s)
- Francisco M Ojeda
- Department of General and Interventional Cardiology, University Heart Center Hamburg-Eppendorf, 20246 Hamburg, Germany.
- Christian Müller
- Department of General and Interventional Cardiology, University Heart Center Hamburg-Eppendorf, 20246 Hamburg, Germany; German Center for Cardiovascular Research (DZHK), Hamburg/Kiel/Luebeck, Germany
- Daniela Börnigen
- Department of General and Interventional Cardiology, University Heart Center Hamburg-Eppendorf, 20246 Hamburg, Germany; German Center for Cardiovascular Research (DZHK), Hamburg/Kiel/Luebeck, Germany
- David-Alexandre Trégouët
- Sorbonne Universités, Université Pierre et Marie Curie Paris 06, Institut National pour la Santé et la Recherche Médicale (INSERM), Unité Mixte de Recherche en Santé (UMR_S) 1166, F-75013 Paris, France; Institute for Cardiometabolism and Nutrition (ICAN), F-75013 Paris, France
- Arne Schillert
- Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, 23562 Lübeck, Germany; German Center for Cardiovascular Research (DZHK), Hamburg/Kiel/Luebeck, Germany
- Matthias Heinig
- Institute of Computational Biology, German Research Center for Environmental Health, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Tanja Zeller
- Department of General and Interventional Cardiology, University Heart Center Hamburg-Eppendorf, 20246 Hamburg, Germany; German Center for Cardiovascular Research (DZHK), Hamburg/Kiel/Luebeck, Germany
- Renate B Schnabel
- Department of General and Interventional Cardiology, University Heart Center Hamburg-Eppendorf, 20246 Hamburg, Germany; German Center for Cardiovascular Research (DZHK), Hamburg/Kiel/Luebeck, Germany
33
Zhao LP, Bolouri H. Object-oriented regression for building predictive models with high dimensional omics data from translational studies. J Biomed Inform 2016; 60:431-45. [PMID: 26972839; PMCID: PMC5097461; DOI: 10.1016/j.jbi.2016.03.001]
Abstract
Maturing omics technologies enable researchers to routinely generate high-dimensional omics data (HDOD) in translational clinical studies. In the field of oncology, The Cancer Genome Atlas (TCGA) has funded researchers to generate different types of omics data on a common set of biospecimens with accompanying clinical data and has made the data available for the research community to mine. One important application, and the focus of this manuscript, is building predictive models for prognostic outcomes based on HDOD. To complement prevailing regression-based approaches, we propose an object-oriented regression (OOR) methodology to identify exemplars specified by HDOD patterns and to assess their associations with a prognostic outcome. By computing patients' similarities to these exemplars, the OOR-based predictive model produces a risk estimate from a patient's HDOD. The primary advantages of OOR are twofold: reducing the penalty of high dimensionality and retaining interpretability for clinical practitioners. To illustrate its utility, we apply OOR to gene expression data from non-small cell lung cancer patients in TCGA and build a predictive model for prognostic survivorship among stage I patients, i.e., we stratify these patients by their prognostic survival risks beyond histological classifications. Identifying these high-risk patients helps oncologists to develop effective treatment protocols and post-treatment disease management plans. Using the TCGA data, the total sample is divided into training and validation data sets. After building the predictive model in the training set, we compute risk scores from it and validate the association of the risk scores with the prognostic outcome in the validation data (P-value = 0.015).
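A toy sketch of the similarity-to-exemplar idea follows. How OOR actually chooses exemplars is specific to the paper; here k-means centroids serve as an assumed stand-in, with correlation as the similarity measure and an ordinary logistic model on the resulting low-dimensional representation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n, p, k = 150, 100, 4
X = rng.normal(size=(n, p))            # stand-in for high-dimensional omics profiles
risk = (X[:, 0] + rng.normal(size=n) > 0).astype(int)

# step 1 (assumed stand-in): pick k exemplar profiles, here k-means centroids
exemplars = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).cluster_centers_

# step 2: represent each patient by similarity (correlation) to each exemplar
sim = np.corrcoef(X, exemplars)[:n, n:]          # n x k similarity matrix

# step 3: an ordinary low-dimensional model on similarities gives the risk score
clf = LogisticRegression().fit(sim, risk)
```

Regardless of how large p is, the regression sees only k similarity features, which is the dimension-reduction benefit the abstract describes.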
Affiliation(s)
- Lue Ping Zhao
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, United States; Department of Biostatistics and Epidemiology, University of Washington School of Public Health, Seattle, WA, United States.
- Hamid Bolouri
- Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, WA, United States
34
Abstract
Penalized regression methods, such as L1 regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In contrast to the frequentist literature, little is known about the properties of such priors and the convergence and concentration of the corresponding posterior distribution. In this article, we propose a new class of Dirichlet-Laplace priors, which possess optimal posterior concentration and lead to efficient posterior computation. Finite sample performance of Dirichlet-Laplace priors relative to alternatives is assessed in simulated and real data examples.
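The global-local mixture representation makes the prior easy to simulate. Below is one draw from a Dirichlet-Laplace prior using the hierarchy described in the abstract (psi_j ~ Exp(1/2), phi ~ Dirichlet(a, ..., a), tau ~ Gamma(pa, 1/2)); the choice a = 1/p is one setting discussed in this literature, and the code is a prior-simulation sketch, not a posterior sampler.

```python
import numpy as np

def draw_dirichlet_laplace(p, a, rng):
    """One draw from the DL_a prior via its global-local Gaussian scale mixture:
    psi_j ~ Exp(1/2), phi ~ Dirichlet(a, ..., a), tau ~ Gamma(p*a, rate 1/2),
    theta_j | psi, phi, tau ~ N(0, psi_j * phi_j**2 * tau**2)."""
    psi = rng.exponential(scale=2.0, size=p)     # Exp(rate=1/2) has mean 2
    phi = rng.dirichlet(np.full(p, a))
    tau = rng.gamma(shape=p * a, scale=2.0)      # Gamma(p*a, rate=1/2)
    return rng.normal(size=p) * np.sqrt(psi) * phi * tau

rng = np.random.default_rng(7)
p = 1000
theta = draw_dirichlet_laplace(p, a=1.0 / p, rng=rng)
# small a concentrates the Dirichlet weights: most coordinates land near zero
# while a few can be large, mimicking a sparse coefficient vector
```

Unlike a two-component point-mass mixture, every conditional here is a standard continuous distribution, which is what makes posterior computation with such priors tractable.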
Affiliation(s)
- Anirban Bhattacharya
- Department of Statistics, Texas A&M University, Department of Statistics, Florida State University, Department of Statistics, Harvard University, Department of Statistical Science, Duke University
- Debdeep Pati
- Department of Statistics, Texas A&M University, Department of Statistics, Florida State University, Department of Statistics, Harvard University, Department of Statistical Science, Duke University
- Natesh S. Pillai
- Department of Statistics, Texas A&M University, Department of Statistics, Florida State University, Department of Statistics, Harvard University, Department of Statistical Science, Duke University
- David B. Dunson
- Department of Statistics, Texas A&M University, Department of Statistics, Florida State University, Department of Statistics, Harvard University, Department of Statistical Science, Duke University
35
Ha MJ, Sun W, Xie J. PenPC: A two-step approach to estimate the skeletons of high-dimensional directed acyclic graphs. Biometrics 2015; 72:146-55. [PMID: 26406114; DOI: 10.1111/biom.12415]
Abstract
Estimation of the skeleton of a directed acyclic graph (DAG) is of great importance for understanding the underlying DAG, and causal effects can be assessed from the skeleton when the DAG is not identifiable. We propose a novel two-step method named PenPC to estimate the skeleton of a high-dimensional DAG. We first estimate the nonzero entries of a concentration matrix using penalized regression, and then correct the difference between the concentration matrix and the skeleton by evaluating a set of conditional independence hypotheses. For high-dimensional problems where the number of vertices p is polynomial or exponential in the sample size n, we study the asymptotic properties of PenPC on two types of graphs: traditional random graphs, where all vertices have the same expected number of neighbors, and scale-free graphs, where a few vertices may have a large number of neighbors. As illustrated by extensive simulations and applications to gene expression data of cancer patients, PenPC has higher sensitivity and specificity than the state-of-the-art PC-stable algorithm.
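A much-simplified sketch of the two steps on a toy chain graph: node-wise lasso for the nonzero pattern of the concentration matrix (the moral graph), followed by pruning with order-1 partial-correlation tests. The real PenPC searches larger conditioning sets and uses a different penalty; everything below is illustrative.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(8)
n, p = 300, 6
# chain DAG 0 -> 1 -> ... -> 5; its skeleton is the set of consecutive pairs
X = np.zeros((n, p))
X[:, 0] = rng.normal(size=n)
for j in range(1, p):
    X[:, j] = 0.8 * X[:, j - 1] + rng.normal(size=n)

# step 1: node-wise lasso recovers the nonzero pattern of the concentration
# matrix (the moral graph), which may contain extra edges
moral = set()
for j in range(p):
    others = [k for k in range(p) if k != j]
    coef = LassoCV(cv=5).fit(X[:, others], X[:, j]).coef_
    moral |= {frozenset((j, k)) for k, c in zip(others, coef) if abs(c) > 1e-3}

# step 2: drop an edge if the pair is independent given some single node
# (order-1 partial correlation with a Fisher z test)
def keeps_edge(i, j, z_crit=2.58):
    for k in set(range(p)) - {i, j}:
        r = np.corrcoef(X[:, [i, j, k]].T)
        pc = (r[0, 1] - r[0, 2] * r[1, 2]) / np.sqrt((1 - r[0, 2] ** 2) * (1 - r[1, 2] ** 2))
        z = 0.5 * np.log((1 + pc) / (1 - pc)) * np.sqrt(n - 4)
        if abs(z) < z_crit:
            return False        # conditionally independent: not a skeleton edge
    return True

skeleton = {e for e in moral if keeps_edge(min(e), max(e))}
```

On a chain, conditioning on the middle node separates non-adjacent pairs, so step 2 removes exactly the kind of spurious moral-graph edges step 1 can introduce.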
Affiliation(s)
- Min Jin Ha
- Department of Biostatistics, MD Anderson Cancer Center, Houston, Texas, 77030, U.S.A
- Wei Sun
- Department of Biostatistics, Department of Genetics, UNC Chapel Hill, North Carolina, 27514, U.S.A.; Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Jichun Xie
- Department of Biostatistics & Bioinformatics, Duke University, Durham, North Carolina, 27708, U.S.A
36
Sabourin JA, Valdar W, Nobel AB. A permutation approach for selecting the penalty parameter in penalized model selection. Biometrics 2015; 71:1185-94. [PMID: 26243050; DOI: 10.1111/biom.12359]
Abstract
We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO-penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus and can be applied in a variety of structural settings, including generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of real biomedical data sets in which permutation selection is compared with selection based on cross-validation (CV), the Bayesian information criterion (BIC), scaled sparse linear regression, and a method based on recently developed testing procedures for the LASSO.
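For centered, standardized data the smallest penalty that zeroes every lasso coefficient has the closed form max_j |x_j'y|/n, so the permutation idea can be sketched directly: compute this null quantity over many permutations of the response and use a summary of it as the selected penalty. The median and all problem sizes below are illustrative choices, not the paper's recommendations.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(9)
n, p = 100, 50
X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize columns
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

def lambda_zero(X, y):
    """Smallest penalty at which the lasso (in sklearn's 1/(2n) scaling)
    sets every coefficient to zero."""
    yc = y - y.mean()
    return np.max(np.abs(X.T @ yc)) / len(y)

# permutation selection sketch: the null distribution of lambda_zero under
# permuted responses, summarized (here by the median) as the chosen penalty
lams = [lambda_zero(X, rng.permutation(y)) for _ in range(200)]
lam_perm = float(np.median(lams))

fit = Lasso(alpha=lam_perm).fit(X, y)
selected = np.flatnonzero(fit.coef_)
```

Because the penalty is calibrated to what pure noise can produce, variables surviving it carry signal beyond chance association.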
Affiliation(s)
- Jeremy A Sabourin
- Department of Genetics, University of North Carolina at Chapel Hill, North Carolina, U.S.A.; Genometrics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, Maryland, U.S.A
- William Valdar
- Department of Genetics, University of North Carolina at Chapel Hill, North Carolina, U.S.A.; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, North Carolina, U.S.A
- Andrew B Nobel
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, North Carolina, U.S.A.; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, North Carolina, U.S.A.; Department of Biostatistics, University of North Carolina at Chapel Hill, North Carolina, U.S.A
37
Neely ML, Bondell HD, Tzeng JY. A penalized likelihood approach for investigating gene-drug interactions in pharmacogenetic studies. Biometrics 2015; 71:529-37. [PMID: 25604216; DOI: 10.1111/biom.12259]
Abstract
Pharmacogenetics investigates the relationship between heritable genetic variation and variation in how individuals respond to drug therapies. Often, gene-drug interactions play a primary role in this response, and identifying these effects can aid the development of individualized treatment regimes. Haplotypes can hold key information for understanding the association between genetic variation and drug response. However, the standard approach for haplotype-based association analysis does not directly address the research questions dictated by individualized medicine: a complementary post-hoc analysis is required, which is usually underpowered after adjusting for multiple comparisons and may lead to seemingly contradictory conclusions. In this work, we propose a penalized likelihood approach that overcomes the drawbacks of the standard approach and yields the desired personalized output. We demonstrate the utility of our method by applying it to the Scottish Randomized Trial in Ovarian Cancer. We also conducted simulation studies showing that the proposed penalized method has comparable or greater power than the standard approach and maintains low Type I error rates for both binary and quantitative drug responses. The largest performance gains are seen when the haplotype frequency is low, the differences in effect sizes are small, or the true relationship among the drugs is more complex.
Affiliation(s)
- Megan L Neely
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, 27705, U.S.A
- Howard D Bondell
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, 27695, U.S.A
- Jung-Ying Tzeng
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, 27695, U.S.A.; Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, 27695, U.S.A
38
Chaturvedi N, de Menezes RX, Goeman JJ. Fused lasso algorithm for Cox' proportional hazards and binomial logit models with application to copy number profiles. Biom J 2014; 56:477-92. [PMID: 24496763; DOI: 10.1002/bimj.201200241]
Abstract
This paper presents an efficient algorithm, based on a combination of Newton-Raphson and gradient ascent, for using the fused lasso regression method to construct a genome-based classifier. The characteristic structure of copy number data suggests that feature selection should take genomic location into account to produce more interpretable genome-based classifiers. The fused lasso penalty, an extension of the lasso penalty, encourages sparsity of the coefficients and of their differences by penalizing the L1-norm of both at the same time, thereby using genomic location. The major advantage of the algorithm over other existing fused lasso optimization techniques is its ability to handle binomial as well as survival responses efficiently. We apply our algorithm to two publicly available datasets to predict survival and binary outcomes.
Affiliation(s)
- Nimisha Chaturvedi
- Epidemiology and Biostatistics, Vrije Universiteit Medical Center, Amsterdam, The Netherlands; Netherlands Bioinformatics Centre, Geert Grooteplein 28, 6525 GA Nijmegen, The Netherlands
39
Song R, Yi F, Zou H. On Varying-coefficient Independence Screening for High-dimensional Varying-coefficient Models. Stat Sin 2014; 24:1735-1752. [PMID: 25484548; PMCID: PMC4251601]
Abstract
Varying-coefficient models have been widely used in longitudinal data analysis, nonlinear time series, survival analysis, and other settings. They are natural nonparametric extensions of the classical linear models in many contexts, retaining good interpretability while allowing exploration of the dynamic nature of the model. Recently, penalized estimators have been used to fit varying-coefficient models to high-dimensional data. In this paper, we propose a new computationally attractive algorithm called IVIS for fitting varying-coefficient models in ultra-high dimensions. The algorithm first fits a gSCAD-penalized varying-coefficient model using a subset of covariates selected by a new varying-coefficient independence screening (VIS) technique, for which the sure screening property is established. The proposed algorithm then iterates between a greedy conditional VIS step and a gSCAD-penalized fitting step. Simulations and a real data analysis demonstrate that IVIS has very competitive performance for moderate sample sizes and high dimensions.
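A bare-bones version of the marginal screening step: for each covariate, fit the response on a basis expansion in the index variable multiplied by that covariate, and rank covariates by residual sum of squares. A cubic polynomial stands in for the B-spline basis, and the gSCAD fitting and iteration of IVIS are omitted; all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(10)
n, p = 400, 50
u = rng.uniform(size=n)                   # index variable (e.g., time or exposure)
X = rng.normal(size=(n, p))
# true model: two varying coefficients, the remaining covariates inactive
y = np.sin(2 * np.pi * u) * X[:, 0] + (1 + u) * X[:, 1] + rng.normal(size=n)

B = np.vander(u, 4)                       # cubic polynomial basis in u

def marginal_rss(j):
    """Least-squares fit of y on b(u) * x_j; smaller RSS = stronger marginal signal."""
    Z = B * X[:, [j]]                     # basis columns multiplied by covariate j
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return float(np.sum((y - Z @ coef) ** 2))

rank = np.argsort([marginal_rss(j) for j in range(p)])
screened = set(rank[:10].tolist())        # keep the 10 strongest covariates
```

Each marginal fit is a tiny least-squares problem, which is why screening of this kind scales to ultra-high dimensions before any penalized fitting is attempted.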
Affiliation(s)
- Rui Song
- North Carolina State University and University of Minnesota
- Feng Yi
- North Carolina State University and University of Minnesota
- Hui Zou
- North Carolina State University and University of Minnesota

40
Pan W, Shen X, Liu B. Cluster Analysis: Unsupervised Learning via Supervised Learning with a Non-convex Penalty. J Mach Learn Res 2013;14:1865. [PMID: 24358018; PMCID: PMC3866036]
Abstract
Cluster analysis is widely used in many fields. Traditionally, clustering is regarded as unsupervised learning because it lacks the class label or quantitative response variable present in supervised learning such as classification and regression. Here we formulate clustering as penalized regression with grouping pursuit. Beyond the novel use of a non-convex group penalty and its unique operating characteristics in the proposed clustering method, a main advantage of this formulation is that it allows borrowing well-established results from classification and regression, such as model selection criteria for choosing the number of clusters, a difficult problem in cluster analysis. In particular, we propose using generalized cross-validation (GCV) based on generalized degrees of freedom (GDF) to select the number of clusters. We use a few simple numerical examples to compare the proposed method with some existing approaches, demonstrating its promising performance.
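The "clustering as penalized regression" formulation can be sketched in a few lines of Python. This brute-force toy gives every observation its own mean and charges a flat (non-convex) cost per pair of distinct cluster centers; the paper instead uses a truncated-L1 grouping penalty with an efficient algorithm, so treat this only as an illustration of the objective.

```python
def partitions(items):
    """Enumerate all set partitions of a list (feasible only for tiny n)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield [[first]] + part

def objective(x, part, lam):
    """Penalized-regression view of clustering: each point i gets a fitted
    mean mu_i (the mean of its block), the loss is sum (x_i - mu_i)^2, and
    a flat cost is charged for every pair of distinct centers."""
    loss = sum((x[i] - sum(x[j] for j in block) / len(block)) ** 2
               for block in part for i in block)
    k = len(part)
    return loss + lam * k * (k - 1) / 2

def cluster(x, lam):
    """Return the partition minimizing the penalized objective."""
    return min(partitions(list(range(len(x)))), key=lambda p: objective(x, p, lam))
```

Because the number of distinct fitted means equals the number of clusters, model selection criteria from regression (here, the penalty level `lam`; in the paper, GCV/GDF) directly control the cluster count.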
Affiliation(s)
- Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455
- Xiaotong Shen
- School of Statistics, University of Minnesota, Minneapolis, MN 55455
- Binghui Liu
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455
- School of Statistics, University of Minnesota, Minneapolis, MN 55455

41
Liu J, Wang K, Ma S, Huang J. Accounting for linkage disequilibrium in genome-wide association studies: A penalized regression method. Stat Interface 2013;6:99-115. [PMID: 25258655; PMCID: PMC4172344; DOI: 10.4310/sii.2013.v6.n1.a10]
Abstract
Penalized regression methods are becoming increasingly popular in genome-wide association studies (GWAS) for identifying genetic markers associated with disease. However, standard penalized methods such as the LASSO do not take into account the possible linkage disequilibrium between adjacent markers. We propose a novel penalized approach for GWAS using a dense set of single nucleotide polymorphisms (SNPs). The proposed method uses the minimax concave penalty (MCP) for marker selection and incorporates linkage disequilibrium (LD) information by penalizing the difference of the genetic effects at adjacent SNPs with high correlation. A coordinate descent algorithm is derived to implement the proposed method; it is efficient for a large number of SNPs. A multi-split method is used to calculate the p-values of the selected SNPs for assessing their significance. We refer to the proposed penalty function as the smoothed MCP and the proposed approach as the SMCP method. The performance of the SMCP method is evaluated and compared with the LASSO and MCP approaches through simulation studies, which demonstrate that the proposed method is more accurate in selecting associated SNPs. Its applicability to real data is illustrated using heterogeneous stock mice data and a rheumatoid arthritis data set.
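A small Python sketch of the penalty components may help. The MCP formula below is standard; the smoothing term is one plausible reading of the abstract (correlation-weighted squared differences of absolute adjacent effects) rather than the paper's exact definition, and the weighting by `r[j]` is an illustrative assumption.

```python
def mcp(t, lam, gamma):
    """Minimax concave penalty (MCP) at t: lam*|t| - t^2/(2*gamma) while
    |t| <= gamma*lam, then constant at gamma*lam^2/2, so large effects
    are not over-shrunk."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return gamma * lam * lam / 2.0

def smcp(beta, r, lam1, lam2, gamma=3.0):
    """Illustrative smoothed MCP: MCP sparsity on each coefficient plus a
    correlation-weighted smoothing term on adjacent absolute effects.
    r[j] is the correlation between SNP j and SNP j+1 (assumed weighting)."""
    sparsity = sum(mcp(b, lam1, gamma) for b in beta)
    smooth = lam2 * sum(abs(r[j]) * (abs(beta[j]) - abs(beta[j + 1])) ** 2
                        for j in range(len(beta) - 1))
    return sparsity + smooth
```

Under this form, two adjacent SNPs in strong LD with equal-magnitude effects incur no smoothing cost, which is the sense in which LD information is incorporated.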
Affiliation(s)
- Jin Liu
- School of Public Health, Yale University, New Haven, CT 06520, USA
- Kai Wang
- Department of Biostatistics, University of Iowa, Iowa City, IA 52242, USA
- Shuangge Ma
- School of Public Health, Yale University, New Haven, CT 06520, USA
- Jian Huang
- Department of Statistics and Actuarial Science, Department of Biostatistics, University of Iowa, Iowa City, IA 52242, USA

42
Abstract
In this article, we present a selective overview of some recent developments in Bayesian model and variable selection methods for high dimensional linear models. While most reviews in the literature cover conventional methods, we focus on recently developed methods that have proven successful for high dimensional variable selection. First, we give a brief overview of the traditional model selection criteria (viz. Mallows' Cp, AIC, BIC, DIC), followed by a discussion of some recently developed methods (viz. EBIC, regularization) that have occupied the minds of many statisticians. Then, we review high dimensional Bayesian methods with particular emphasis on Bayesian regularization methods, which have been used extensively in recent years. We conclude by briefly addressing the asymptotic behaviors of Bayesian variable selection methods for high dimensional linear models under different regularity conditions.
Affiliation(s)
- Himel Mallick
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
- Nengjun Yi
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA

43
Abstract
The semiparametric partially linear model allows flexible modeling of covariate effects on the response variable in regression. It combines the flexibility of nonparametric regression with the parsimony of linear regression. Existing estimation methods for this model crucially assume that it is known a priori which covariates have a linear effect and which do not; in applied work, this is rarely known in advance. We consider estimation in partially linear models without this assumption. We propose a semiparametric regression pursuit method for identifying the covariates with a linear effect. Our proposed method is a penalized regression approach using a group minimax concave penalty. Under suitable conditions we show that the proposed approach is model-pursuit consistent, meaning that with high probability it correctly determines which covariates have a linear effect and which do not. The performance of the proposed method is evaluated using simulation studies, which support our theoretical results. A real data example is used to illustrate the application of the proposed method.
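The group minimax concave penalty used for this structure pursuit can be sketched as follows. This is an illustrative Python rendering, not the authors' implementation: MCP is applied to the Euclidean norm of each coefficient group, where a group holds the basis coefficients of one covariate's nonlinear part, so a group driven exactly to zero classifies that covariate as linear.

```python
import math

def mcp(t, lam, gamma):
    """Minimax concave penalty at t >= 0."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return gamma * lam * lam / 2.0

def group_mcp(groups, lam, gamma=3.0):
    """Group MCP: MCP applied to the Euclidean norm of each group of
    coefficients. A zero group norm means the covariate's nonlinear part
    vanishes, i.e. the covariate is assigned a linear effect."""
    return sum(mcp(math.sqrt(sum(b * b for b in g)), lam, gamma) for g in groups)
```

Because the penalty acts on whole groups, selection happens at the level of "linear vs. nonlinear effect" rather than individual basis coefficients.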
44
Abstract
We develop an approach to tuning penalized regression variable selection methods by calculating the sparsest estimator contained in a confidence region of a specified level. Because confidence intervals/regions are widely understood, tuning penalized regression methods in this way is intuitive and accessible to scientists and practitioners. More importantly, our work shows that tuning to a fixed confidence level often performs better than tuning via the common methods based on AIC, BIC, or cross-validation (CV) over a wide range of sample sizes and levels of sparsity. Additionally, we prove that tuning with a sequence of confidence levels converging to one yields asymptotic selection consistency, and that a simple two-stage procedure achieves an oracle property. The confidence region based tuning parameter is easily calculated using output from existing penalized regression computer packages. Our work also shows how to map any penalty parameter to a corresponding confidence coefficient. This mapping facilitates comparisons of tuning parameter selection methods such as AIC, BIC and CV, and reveals that the resulting tuning parameters correspond to confidence levels that are extremely low and can vary greatly across data sets. Supplemental materials for the article are available online.
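The core idea, picking the sparsest estimator still inside a confidence region, can be sketched in Python. This is a simplification of the paper's procedure: soft-thresholding the OLS fit stands in for a lasso path (exact only for orthonormal designs), and `radius` is a stand-in for the chi-square quantile that would define the region at a chosen confidence level.

```python
import numpy as np

def sparsest_in_region(X, y, lams, radius):
    """Walk a soft-thresholding path from sparsest (largest lambda) to
    densest, and return the first estimate whose residual sum of squares
    lies inside the confidence region {beta : RSS(beta) <= RSS(OLS) + radius}."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss_ols = float(np.sum((y - X @ beta_ols) ** 2))
    for lam in sorted(lams, reverse=True):  # largest lambda = sparsest first
        beta = np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0)
        if np.sum((y - X @ beta) ** 2) <= rss_ols + radius:
            return beta
    return beta_ols
```

Raising the confidence level enlarges `radius` and admits sparser estimates, which is the mapping between penalty parameters and confidence coefficients the abstract describes.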
Affiliation(s)
- Funda Gunes
- Department of Statistics, North Carolina State University