1
Djordjilović V, Ponzi E, Nøst TH, Thoresen M. penalizedclr: an R package for penalized conditional logistic regression for integration of multiple omics layers. BMC Bioinformatics 2024; 25:226. [PMID: 38937668 PMCID: PMC11212437 DOI: 10.1186/s12859-024-05850-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 06/20/2024] [Indexed: 06/29/2024]
Abstract
BACKGROUND: The matched case-control design, until recently used mostly in epidemiological studies, is becoming customary in biomedical applications as well. For instance, in omics studies, it is quite common to compare cancer and healthy tissue from the same patient. Furthermore, researchers today routinely collect data from various and variable sources that they wish to relate to the case-control status. This highlights the need to develop and implement statistical methods that can take these tendencies into account.
RESULTS: We present an R package, penalizedclr, that provides an implementation of the penalized conditional logistic regression model for analyzing matched case-control studies. It allows for different penalties for different blocks of covariates, and it is therefore particularly useful in the presence of multi-source omics data. Both L1 and L2 penalties are implemented. Additionally, the package implements stability selection for variable selection in the considered regression model.
CONCLUSIONS: The proposed method fills a gap in the available software for fitting high-dimensional conditional logistic regression models accounting for the matched design and block structure of predictors/features. The output consists of a set of selected variables that are significantly associated with case-control status. These variables can then be investigated in terms of functional interpretation or validation in further, more targeted studies.
Affiliation(s)
- Vera Djordjilović
  - Department of Economics, Ca' Foscari University of Venice, Venice, Italy
  - Department of Biostatistics, University of Oslo, Oslo, Norway
- Erica Ponzi
  - Department of Biostatistics, University of Oslo, Oslo, Norway
- Therese Haugdahl Nøst
  - Department of Public Health and Nursing, Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Community Medicine, Faculty of Health Sciences, The Arctic University of Norway, Tromsø, Norway
- Magne Thoresen
  - Department of Biostatistics, University of Oslo, Oslo, Norway
2
Zhao Z, Zobolas J, Zucknick M, Aittokallio T. Tutorial on survival modeling with applications to omics data. Bioinformatics 2024; 40:btae132. [PMID: 38445722 PMCID: PMC10973942 DOI: 10.1093/bioinformatics/btae132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 02/22/2024] [Accepted: 03/04/2024] [Indexed: 03/07/2024]
Abstract
MOTIVATION: Identification of genomic, molecular and clinical markers prognostic of patient survival is important for developing personalized disease prevention, diagnostic and treatment approaches. Modern omics technologies have made it possible to investigate the prognostic impact of markers at multiple molecular levels, including genomics, epigenomics, transcriptomics, proteomics and metabolomics, and how these potential risk factors complement clinical characterization of patient outcomes for survival prognosis. However, the massive sizes of the omics datasets, along with their correlation structures, pose challenges for studying relationships between the molecular information and patients' survival outcomes.
RESULTS: We present a general workflow for survival analysis that is applicable to high-dimensional omics data as inputs when identifying survival-associated features and validating survival models. In particular, we focus on the commonly used Cox-type penalized regressions and hierarchical Bayesian models for feature selection in survival analysis, which are especially useful for high-dimensional data, but the framework is applicable more generally.
AVAILABILITY AND IMPLEMENTATION: A step-by-step R tutorial using The Cancer Genome Atlas survival and omics data for the execution and evaluation of survival models has been made available at https://ocbe-uio.github.io/survomics.
Affiliation(s)
- Zhi Zhao
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Department of Biostatistics, Faculty of Medicine, University of Oslo, Oslo 0372, Norway
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo 0310, Norway
- John Zobolas
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Department of Biostatistics, Faculty of Medicine, University of Oslo, Oslo 0372, Norway
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo 0310, Norway
- Manuela Zucknick
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Department of Biostatistics, Faculty of Medicine, University of Oslo, Oslo 0372, Norway
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Research Support Services, Oslo University Hospital, Oslo 0372, Norway
- Tero Aittokallio
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Department of Biostatistics, Faculty of Medicine, University of Oslo, Oslo 0372, Norway
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo 0310, Norway
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki FI-00014, Finland
3
Giliberto M, Santana LM, Holien T, Misund K, Nakken S, Vodak D, Hovig E, Meza-Zepeda LA, Coward E, Waage A, Taskén K, Skånland SS. Mutational analysis and protein profiling predict drug sensitivity in multiple myeloma cell lines. Front Oncol 2022; 12:1040730. [PMID: 36523963 PMCID: PMC9745900 DOI: 10.3389/fonc.2022.1040730] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2022] [Accepted: 10/31/2022] [Indexed: 12/03/2023]
Abstract
INTRODUCTION: Multiple myeloma (MM) is a heterogeneous disease where cancer-driver mutations and aberrant signaling may lead to disease progression and drug resistance. Drug responses vary greatly, and there is an unmet need for biomarkers that can guide precision cancer medicine in this disease.
METHODS: To identify potential predictors of drug sensitivity, we applied integrated data from drug sensitivity screening, mutational analysis and functional signaling pathway profiling in 9 cell line models of MM. We studied the sensitivity to 33 targeted drugs and their association with the mutational status of cancer-driver genes and activity level of signaling proteins.
RESULTS: We found that sensitivity to mitogen-activated protein kinase kinase 1 (MEK1) and phosphatidylinositol-3 kinase (PI3K) inhibitors correlated with mutations in NRAS/KRAS, and PI3K family genes, respectively. Phosphorylation status of MEK1 and protein kinase B (AKT) correlated with sensitivity to MEK and PI3K inhibition, respectively. In addition, we found that enhanced phosphorylation of proteins, including Tank-binding kinase 1 (TBK1), as well as high expression of B cell lymphoma 2 (Bcl-2), correlated with low sensitivity to MEK inhibitors.
DISCUSSION: Taken together, this study shows that mutational status and signaling protein profiling might be used in further studies to predict drug sensitivities and identify resistance markers in MM.
Affiliation(s)
- Mariaserena Giliberto
  - Department of Cancer Immunology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - K.G. Jebsen Centre for B Cell Malignancies, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Leonardo Miranda Santana
  - Department of Cancer Immunology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - K.G. Jebsen Centre for B Cell Malignancies, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
  - Oslo Centre for Biostatistics and Epidemiology, University of Oslo, Oslo, Norway
- Toril Holien
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Hematology, St. Olav’s University Hospital, Trondheim, Norway
  - Department of Immunology and Transfusion Medicine, St. Olav’s University Hospital, Trondheim, Norway
- Kristine Misund
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Sigve Nakken
  - Norwegian Cancer Genomics Consortium, Oslo University Hospital, Oslo, Norway
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Cell Reprogramming, Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Daniel Vodak
  - Norwegian Cancer Genomics Consortium, Oslo University Hospital, Oslo, Norway
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - Genomics Core Facility, Department of Core Facilities, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Eivind Hovig
  - Norwegian Cancer Genomics Consortium, Oslo University Hospital, Oslo, Norway
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - Center for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
- Leonardo A. Meza-Zepeda
  - Norwegian Cancer Genomics Consortium, Oslo University Hospital, Oslo, Norway
  - Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - Genomics Core Facility, Department of Core Facilities, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Eivind Coward
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
  - Bioinformatics Core Facility, Norwegian University of Science and Technology, Trondheim, Norway
- Anders Waage
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Hematology, St. Olav’s University Hospital, Trondheim, Norway
  - Department of Immunology and Transfusion Medicine, St. Olav’s University Hospital, Trondheim, Norway
- Kjetil Taskén
  - Department of Cancer Immunology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - K.G. Jebsen Centre for B Cell Malignancies, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Sigrid S. Skånland
  - Department of Cancer Immunology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
  - K.G. Jebsen Centre for B Cell Malignancies, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
4
Zhao Z, Wang S, Zucknick M, Aittokallio T. Tissue-specific identification of multi-omics features for pan-cancer drug response prediction. iScience 2022; 25:104767. [PMID: 35992090 PMCID: PMC9385562 DOI: 10.1016/j.isci.2022.104767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Revised: 06/28/2022] [Accepted: 07/11/2022] [Indexed: 11/29/2022]
Abstract
Current statistical models for drug response prediction and biomarker identification fall short in leveraging the shared and unique information from various cancer tissues and multi-omics profiles. We developed the mix-lasso model, which introduces an additional sample group penalty term to capture tissue-specific effects of features on pan-cancer response prediction. The mix-lasso model takes into account both the similarity between drug responses (i.e., multi-task learning) and the heterogeneity between multi-omics data (multi-modal learning). When applied to a large-scale pharmacogenomics dataset from the Cancer Therapeutics Response Portal, mix-lasso enabled accurate drug response predictions and identification of tissue-specific predictive features in the presence of various degrees of missing data, drug-drug correlations, and high-dimensional and correlated genomic and molecular features that often hinder the use of statistical approaches in drug response modeling. Compared to the tree lasso model, mix-lasso identified a smaller number of tissue-specific features, hence making the model more interpretable and stable for drug discovery applications.
Highlights:
- Pan-cancer cell lines provide a test bench for exploring gene-drug relationships
- Multi-omics data were integrated with pharmacological profiles for joint modeling
- Mix-lasso identifies tissue-specific biomarkers predictive of multi-drug responses
- Mix-lasso provides a small number of stable features for drug discovery applications
Affiliation(s)
- Zhi Zhao
  - Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Norway
  - Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Norway
- Shixiong Wang
  - Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Norway
- Manuela Zucknick
  - Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Norway
  - Corresponding author
- Tero Aittokallio
  - Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Norway
  - Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Norway
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Finland
  - Corresponding author