1
|
Font A, Domenech M, Ramirez JL, Marqués M, Benítez R, Ruiz de Porras V, Gago JL, Carrato C, Sant F, Lopez H, Castellano D, Malats N, Calle ML, Real FX. Predictive signature of response to neoadjuvant chemotherapy in muscle-invasive bladder cancer integrating mRNA expression, taxonomic subtypes, and clinicopathological features. Front Oncol 2023; 13:1155244. [PMID: 37588099 PMCID: PMC10426739 DOI: 10.3389/fonc.2023.1155244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 07/10/2023] [Indexed: 08/18/2023] Open
Abstract
Background and objective Neoadjuvant chemotherapy (NAC) followed by cystectomy is the standard of care in muscle-invasive bladder cancer (MIBC). Pathological response has been associated with longer survival, but no currently available clinicopathological variables can identify patients likely to respond, highlighting the need for predictive biomarkers. We sought to identify a predictive signature of response to NAC integrating clinical score, taxonomic subtype, and gene expression.
Material and methods From 1994 to 2014, pre-treatment tumor samples were collected from MIBC patients (stage T2-4N0/+M0) at two Spanish hospitals. A clinical score was determined based on stage, hydronephrosis and histology. Taxonomic subtypes (BASQ, luminal, and mixed) were identified by immunohistochemistry. A custom set of 41 genes involved in DNA damage repair and immune response was analyzed in 84 patients with the NanoString nCounter platform. Genes related to pathological response were identified by LASSO penalized logistic regression. NAC consisted of cisplatin/methotrexate/vinblastine until 2000, after which most patients received cisplatin/gemcitabine. The capacity of the integrated signature to predict pathological response was assessed with AUC. Overall survival (OS) and disease-specific survival (DSS) were analyzed with the Kaplan-Meier method.
Results LASSO selected eight genes to be included in the signature (RAD51, IFNγ, CHEK1, CXCL9, c-MET, KRT14, HERC2, FOXA1). The highest predictive accuracy was observed with the inclusion in the model of only three genes (RAD51, IFNγ, CHEK1). The integrated clinical-taxonomic-gene expression signature including these three genes had a higher predictive ability (AUC=0.71) than only clinical score plus taxonomic subtype (AUC=0.58) or clinical score alone (AUC=0.56). This integrated signature was also significantly associated with OS (p=0.02) and DSS (p=0.02).
Conclusions We have identified a predictive signature for response to NAC in MIBC patients that integrates the expression of three genes with clinicopathological characteristics and taxonomic subtypes. Prospective studies to validate these results are ongoing.
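The AUC comparison reported in this abstract can be illustrated with a minimal sketch. The labels and scores below are made-up toy values, not study data, and `auc` is a hypothetical helper written for this illustration. It computes the area under the ROC curve as the proportion of responder/non-responder pairs in which the responder receives the higher score, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # responder ranked above non-responder
            elif p == n:
                wins += 0.5      # ties count one half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = pathological response, 0 = no response.
labels = [0, 0, 1, 1]
uninformative_score = [0.5, 0.5, 0.5, 0.5]   # a score that cannot separate the groups
informative_score = [0.1, 0.4, 0.35, 0.8]    # a score that mostly separates them

auc_uninformative = auc(labels, uninformative_score)
auc_informative = auc(labels, informative_score)
```

An AUC of 0.5 corresponds to a score with no discriminative ability, which is why values such as 0.56 and 0.58 indicate weak prediction and 0.71 a moderate improvement.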
Affiliation(s)
- Albert Font
- Medical Oncology Department, Institut Català d’Oncologia, Hospital Universitari Germans Trias i Pujol, Badalona, Barcelona, Spain
- Badalona Applied Research Group in Oncology (B-ARGO), Germans Trias i Pujol Research Institute (IGTP), Badalona, Barcelona, Spain
| | - Montserrat Domenech
- Medical Oncology Department, Althaia Xarxa Assistencial Universitària de Manresa, Manresa, Spain
| | - Jose Luis Ramirez
- Hematology Service, Institut Català d'Oncologia (ICO) Badalona-Hospital Germans Trias i Pujol, Lymphoid Neoplasms Group, Josep Carreras Leukemia Research Institute (IJC), Badalona, Spain
| | - Miriam Marqués
- Epithelial Carcinogenesis Group, Spanish National Cancer Research Centre (CNIO) and CIBERONC, Madrid, Spain
| | - Raquel Benítez
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Madrid, Spain
| | - Vicenç Ruiz de Porras
- Medical Oncology Department, Institut Català d’Oncologia, Hospital Universitari Germans Trias i Pujol, Badalona, Barcelona, Spain
- Badalona Applied Research Group in Oncology (B-ARGO), Germans Trias i Pujol Research Institute (IGTP), Badalona, Barcelona, Spain
| | - José L. Gago
- Urology Department, Hospital Universitari Germans Trias I Pujol, Badalona, Barcelona, Spain
| | - Cristina Carrato
- Pathology Department, Hospital Universitari Germans Trias I Pujol, Badalona, Barcelona, Spain
| | - Francesc Sant
- Pathology Department, Althaia Xarxa Assistencial Universitària de Manresa, Manresa, Spain
| | - Hector Lopez
- Urology Department, Althaia Xarxa Assistencial Universitària de Manresa, Manresa, Spain
| | - Daniel Castellano
- Medical Oncology Department, University Hospital 12 de Octubre, Madrid, Spain
| | - Nuria Malats
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Madrid, Spain
| | - M. Luz Calle
- Biosciences Department, Faculty of Sciences, Technology, University of Vic-Central University of Catalonia, Vic, Barcelona, Spain
| | - Francisco X. Real
- Epithelial Carcinogenesis Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Centre for Biomedical Research in Cancer Network (CIBERONC), Madrid, Spain
- Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
|
2
|
Calle ML, Pujolassos M, Susin A. coda4microbiome: compositional data analysis for microbiome cross-sectional and longitudinal studies. BMC Bioinformatics 2023; 24:82. [PMID: 36879227 PMCID: PMC9990256 DOI: 10.1186/s12859-023-05205-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/21/2022] [Accepted: 02/22/2023] [Indexed: 03/08/2023] Open
Abstract
BACKGROUND One of the main challenges of microbiome analysis is its compositional nature, which, if ignored, can lead to spurious results. Addressing the compositional structure of microbiome data is particularly critical in longitudinal studies, where abundances measured at different times can correspond to different sub-compositions.
RESULTS We developed coda4microbiome, a new R package for analyzing microbiome data within the Compositional Data Analysis (CoDA) framework in both cross-sectional and longitudinal studies. The aim of coda4microbiome is prediction; more specifically, the method is designed to identify a model (microbial signature) containing the minimum number of features with the maximum predictive power. The algorithm relies on the analysis of log-ratios between pairs of components, and variable selection is addressed through penalized regression on the "all-pairs log-ratio model", the model containing all possible pairwise log-ratios. For longitudinal data, the algorithm infers dynamic microbial signatures by performing penalized regression over the summary of the log-ratio trajectories (the area under these trajectories). In both cross-sectional and longitudinal studies, the inferred microbial signature is expressed as the (weighted) balance between two groups of taxa, those that contribute positively to the microbial signature and those that contribute negatively. The package provides several graphical representations that facilitate the interpretation of the analysis and the identified microbial signatures. We illustrate the new method with data from a Crohn's disease study (cross-sectional data) and from a study of the developing microbiome of infants (longitudinal data).
CONCLUSIONS coda4microbiome is a new algorithm for the identification of microbial signatures in both cross-sectional and longitudinal studies. The algorithm is implemented as an R package that is available at CRAN ( https://cran.r-project.org/web/packages/coda4microbiome/ ) and is accompanied by a vignette with a detailed description of the functions. The website of the project contains several tutorials: https://malucalle.github.io/coda4microbiome/.
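The "all-pairs log-ratio model" described in this abstract can be sketched in a few lines. This is an illustration in Python and not the coda4microbiome R API: the function names, the toy abundances, and the coefficient values are all assumptions made for the example. It shows two properties that follow from the description: pairwise log-ratios do not change when counts are rescaled to proportions, and any set of coefficients on pairwise log-ratios collapses to per-taxon coefficients that sum to zero, which is what allows the signature to be read as a balance between a positive and a negative group of taxa.

```python
import math
from itertools import combinations


def all_pairs_log_ratios(x):
    """Return {(i, j): log(x[i] / x[j])} for all i < j; x must be strictly positive."""
    if any(v <= 0 for v in x):
        raise ValueError("log-ratios need strictly positive values")
    return {(i, j): math.log(x[i] / x[j]) for i, j in combinations(range(len(x)), 2)}


def log_contrast_coefficients(pair_coefs, n_taxa):
    """Collapse coefficients on pairwise log-ratios into one coefficient per taxon."""
    c = [0.0] * n_taxa
    for (i, j), b in pair_coefs.items():
        c[i] += b   # log x_i enters the log-ratio with a plus sign
        c[j] -= b   # log x_j enters the log-ratio with a minus sign
    return c


# Hypothetical abundances for four taxa, as raw counts and as proportions.
counts = [10.0, 40.0, 25.0, 25.0]
total = sum(counts)
proportions = [v / total for v in counts]

lr_counts = all_pairs_log_ratios(counts)
lr_props = all_pairs_log_ratios(proportions)

# Hypothetical sparse coefficients, of the kind a penalized regression might return.
pair_coefs = {(0, 1): 0.8, (2, 3): -0.3}
taxon_coefs = log_contrast_coefficients(pair_coefs, len(counts))
```

In this toy case taxa 0 and 3 receive positive coefficients and taxa 1 and 2 negative ones, so the signature is the weighted balance of the first group against the second.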
Affiliation(s)
- M Luz Calle
- Biosciences Department, Faculty of Sciences, Technology and Engineering, University of Vic - Central University of Catalonia, Carrer de La Laura, 13, 08500, Vic, Spain.
| | - Meritxell Pujolassos
- Biosciences Department, Faculty of Sciences, Technology and Engineering, University of Vic - Central University of Catalonia, Carrer de La Laura, 13, 08500, Vic, Spain
| | - Antoni Susin
- Mathematical Department, UPC-Barcelona Tech, Barcelona, Spain
|
3
|
Contreras-Rodriguez O, Arnoriaga-Rodríguez M, Miranda-Olivos R, Blasco G, Biarnés C, Puig J, Rivera-Pinto J, Calle ML, Pérez-Brocal V, Moya A, Coll C, Ramió-Torrentà L, Soriano-Mas C, Fernandez-Real JM. Obesity status and obesity-associated gut dysbiosis effects on hypothalamic structural covariance. Int J Obes (Lond) 2022; 46:30-38. [PMID: 34471225 PMCID: PMC8748191 DOI: 10.1038/s41366-021-00953-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/18/2021] [Revised: 08/03/2021] [Accepted: 08/18/2021] [Indexed: 02/07/2023]
Abstract
BACKGROUND Functional connectivity alterations in the lateral and medial hypothalamic networks have been associated with the development and maintenance of obesity, but the possible impact on the structural properties of these networks remains largely unexplored. Also, obesity-related gut dysbiosis may delineate specific hypothalamic alterations within obese conditions. We aim to assess the effects of obesity, and of obesity combined with gut dysbiosis, on the structural covariance differences in hypothalamic networks, executive functioning, and depressive symptoms.
METHODS Medial (MH) and lateral (LH) hypothalamic structural covariance alterations were identified in 57 subjects with obesity compared to 47 subjects without obesity. Gut dysbiosis in the subjects with obesity was defined by the presence of high (n = 28) and low (n = 29) values in a BMI-associated microbial signature, and post hoc comparisons between these groups were used as a proxy to explore the role of obesity-related gut dysbiosis on the hypothalamic measurements, executive function, and depressive symptoms.
RESULTS Structural covariance alterations between the MH and the striatum, lateral prefrontal, cingulate, insula, and temporal cortices are congruent with previously reported functional connectivity disruptions in obesity conditions. MH structural covariance decreases encompassed postcentral parietal cortices in the subjects with obesity and gut dysbiosis, but increases with subcortical nuclei involved in coding food-related hedonic information in the subjects with obesity without gut dysbiosis. Alterations in the structural covariance of the LH in the subjects with obesity and gut dysbiosis encompassed increases with frontolimbic networks, but decreases with the lateral orbitofrontal cortex in the subjects with obesity without gut dysbiosis. Subjects with obesity and gut dysbiosis showed higher executive dysfunction and depressive symptoms.
CONCLUSIONS Obesity-related gut dysbiosis is linked to specific structural covariance alterations in hypothalamic networks relevant to the integration of somatic-visceral information, and emotion regulation.
Affiliation(s)
- O Contreras-Rodriguez
- Department of Psychiatry, Bellvitge University Hospital-IDIBELL, and CIBERSam-17 and CIBERObn (CB06/03/0034), Barcelona, Spain.
- Department of Radiology-Medical Imaging (IDI), Girona Biomedical Research Institute (IdIBGi), Josep Trueta University Hospital, Girona, Spain.
- Department of Psychiatry and Legal Medicine, Universitat Autònoma de Barcelona, Barcelona, Spain.
- Health Institute Carlos III (ISCIII), Barcelona, Spain.
| | - M Arnoriaga-Rodríguez
- Health Institute Carlos III (ISCIII), Barcelona, Spain
- Department of Diabetes, Endocrinology and Nutrition-UDEN, and CIBERObn (CB06/03/0010), Girona, Spain
- Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain
| | - R Miranda-Olivos
- Department of Psychiatry, Bellvitge University Hospital-IDIBELL, and CIBERSam-17 and CIBERObn (CB06/03/0034), Barcelona, Spain
| | - G Blasco
- Department of Radiology-Medical Imaging (IDI), Girona Biomedical Research Institute (IdIBGi), Josep Trueta University Hospital, Girona, Spain
| | - C Biarnés
- Department of Radiology-Medical Imaging (IDI), Girona Biomedical Research Institute (IdIBGi), Josep Trueta University Hospital, Girona, Spain
| | - J Puig
- Department of Radiology-Medical Imaging (IDI), Girona Biomedical Research Institute (IdIBGi), Josep Trueta University Hospital, Girona, Spain
| | - J Rivera-Pinto
- IrsiCaixa AIDS Research Institute, Badalona, Spain
- Biosciences Department, Faculty of Sciences and Technology, University of Vic-Central University of Catalonia, VIC, Badalona, Spain
| | - M L Calle
- Biosciences Department, Faculty of Sciences and Technology, University of Vic-Central University of Catalonia, VIC, Badalona, Spain
| | - V Pérez-Brocal
- Department of Genomics and Health, Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain, and CIBEResp- CB06/02/0050, Madrid, Spain
| | - A Moya
- Department of Genomics and Health, Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain, and CIBEResp- CB06/02/0050, Madrid, Spain
- Institute for Integrative Systems Biology (I2SysBio), The University of Valencia and The Spanish National Research Council (CSIC-UVEG), Valencia, Spain
| | - C Coll
- Neuroimmunology and Multiple Sclerosis Unit, Department of Neurology, Girona Biomedical Research Institute (IdIBGi), Dr. Josep Trueta University Hospital, Girona, Spain
| | - L Ramió-Torrentà
- Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain
- Neuroimmunology and Multiple Sclerosis Unit, Department of Neurology, Girona Biomedical Research Institute (IdIBGi), Dr. Josep Trueta University Hospital, Girona, Spain
| | - C Soriano-Mas
- Department of Psychiatry, Bellvitge University Hospital-IDIBELL, and CIBERSam-17 and CIBERObn (CB06/03/0034), Barcelona, Spain
- Department of Psychobiology and Methodology of Health Sciences, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - J M Fernandez-Real
- Health Institute Carlos III (ISCIII), Barcelona, Spain.
- Department of Diabetes, Endocrinology and Nutrition-UDEN, and CIBERObn (CB06/03/0010), Girona, Spain.
- Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain.
|
4
|
Susin A, Wang Y, Lê Cao KA, Calle ML. Variable selection in microbiome compositional data analysis. NAR Genom Bioinform 2020; 2:lqaa029. [PMID: 33575585 PMCID: PMC7671404 DOI: 10.1093/nargab/lqaa029] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 11/30/2019] [Revised: 03/13/2020] [Accepted: 04/29/2020] [Indexed: 12/25/2022] Open
Abstract
Though variable selection is one of the most relevant tasks in microbiome analysis, e.g. for the identification of microbial signatures, many studies still rely on methods that ignore the compositional nature of microbiome data. The applicability of compositional data analysis methods has been hampered by the availability of software and the difficulty in interpreting their results. This work is focused on three methods for variable selection that acknowledge the compositional structure of microbiome data: selbal, a forward selection approach for the identification of compositional balances, and clr-lasso and coda-lasso, two penalized regression models for compositional data analysis. This study highlights the link between these methods and brings out some limitations of the centered log-ratio transformation for variable selection. In particular, the fact that it is not subcompositionally consistent makes the microbial signatures obtained from clr-lasso not readily transferable. Coda-lasso is computationally efficient and suitable when the focus is the identification of the most associated microbial taxa. Selbal stands out when the goal is to obtain a parsimonious model with optimal prediction performance, but it is computationally greedy. We provide a reproducible vignette for the application of these methods that will enable researchers to fully leverage their potential in microbiome studies.
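The remark in this abstract that the centered log-ratio (clr) transformation is not subcompositionally consistent can be made concrete with a small sketch. The abundances are hypothetical, and the code is an illustration in Python, not the vignette the authors provide. The clr value of a taxon depends on every other taxon in the sample, so it shifts when one taxon is dropped, whereas the log-ratio between two taxa stays the same.

```python
import math


def clr(x):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical abundances for taxa A, B, C, D in one sample.
full = [10.0, 40.0, 25.0, 25.0]
# The same sample when taxon D is not measured (a subcomposition).
sub = full[:3]

clr_full = clr(full)
clr_sub = clr(sub)

# The clr value of taxon A changes between the full composition and the subcomposition...
clr_shift_taxon_a = clr_sub[0] - clr_full[0]

# ...whereas the log-ratio between taxa A and B is identical in both.
log_ratio_full = math.log(full[0] / full[1])
log_ratio_sub = math.log(sub[0] / sub[1])
```

This is the sense in which a signature built on clr-transformed variables is tied to the particular set of taxa it was fitted on.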
Affiliation(s)
- Antoni Susin
- Mathematical Department, UPC-Barcelona Tech, 08028 Barcelona, Spain
| | - Yiwen Wang
- Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC 3010, Australia
| | - Kim-Anh Lê Cao
- Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC 3010, Australia
| | - M Luz Calle
- Biosciences Department, Faculty of Sciences and Technology, University of Vic—Central University of Catalonia, Carrer de la Laura, 13, 08500 Vic, Spain
|
5
|
Altarriba-Bartés A, Calle ML, Susín A, Gonçalves B, Vives M, Sampaio J, Peña J. Analysis of the winning probability and the scoring actions in the American professional soccer championship. [Análisis de la probabilidad de ganar y de las acciones que conducen al gol en la liga americana de fútbol profesional]. Rev int cienc deporte 2020. [DOI: 10.5232/ricyde2020.05906] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
|
6
|
Abstract
Understanding the role of the microbiome in human health and how it can be modulated is becoming increasingly relevant for preventive medicine and for the medical management of chronic diseases. The development of high-throughput sequencing technologies has boosted microbiome research through the study of microbial genomes and allowing a more precise quantification of microbiome abundances and function. Microbiome data analysis is challenging because it involves high-dimensional structured multivariate sparse data and because of its compositional nature. In this review we outline some of the procedures that are most commonly used for microbiome analysis and that are implemented in R packages. We place particular emphasis on the compositional structure of microbiome data. We describe the principles of compositional data analysis and distinguish between standard methods and those that fit into compositional data analysis.
Affiliation(s)
- M Luz Calle
- Biosciences Department, Faculty of Science and Technology, University of Vic - Central University of Catalonia, Vic 08500, Spain
|
7
|
López de Maturana E, Alonso L, Alarcón P, Martín-Antoniano IA, Pineda S, Piorno L, Calle ML, Malats N. Challenges in the Integration of Omics and Non-Omics Data. Genes (Basel) 2019; 10:genes10030238. [PMID: 30897838 PMCID: PMC6471713 DOI: 10.3390/genes10030238] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 01/02/2019] [Revised: 03/05/2019] [Accepted: 03/14/2019] [Indexed: 11/16/2022] Open
Abstract
Omics data integration is already a reality. However, few omics-based algorithms show enough predictive ability to be implemented into clinics or public health domains. Clinical/epidemiological data tend to explain most of the variation of health-related traits, and its joint modeling with omics data is crucial to increase the algorithm’s predictive ability. Only a small number of published studies performed a “real” integration of omics and non-omics (OnO) data, mainly to predict cancer outcomes. Challenges in OnO data integration regard the nature and heterogeneity of non-omics data, the possibility of integrating large-scale non-omics data with high-throughput omics data, the relationship between OnO data (i.e., ascertainment bias), the presence of interactions, the fairness of the models, and the presence of subphenotypes. These challenges demand the development and application of new analysis strategies to integrate OnO data. In this contribution we discuss different attempts of OnO data integration in clinical and epidemiological studies. Most of the reviewed papers considered only one type of omics data set, mainly RNA expression data. All selected papers incorporated non-omics data in a low-dimensionality fashion. The integrative strategies used in the identified papers adopted three modeling methods: Independent, conditional, and joint modeling. This review presents, discusses, and proposes integrative analytical strategies towards OnO data integration.
Affiliation(s)
- Evangelina López de Maturana
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Lola Alonso
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Pablo Alarcón
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Isabel Adoración Martín-Antoniano
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Silvia Pineda
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Lucas Piorno
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - M Luz Calle
- Biosciences Department, University of Vic-Central University of Catalonia, Carrer de la Laura 13, 08570 Vic, Spain.
| | - Núria Malats
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
|
8
|
Abstract
High-throughput sequencing technologies have revolutionized microbiome research by allowing the relative quantification of microbiome composition and function in different environments. In this work we focus on the identification of microbial signatures, groups of microbial taxa that are predictive of a phenotype of interest. We do this by acknowledging the compositional nature of the microbiome and the fact that it carries relative information. Thus, instead of defining a microbial signature as a linear combination in real space corresponding to the abundances of a group of taxa, we consider microbial signatures given by the geometric means of data from two groups of taxa whose relative abundances, or balance, are associated with the response variable of interest. In this work we present selbal, a greedy stepwise algorithm for selection of balances or microbial signatures that preserves the principles of compositional data analysis. We illustrate the algorithm with 16S rRNA abundance data from a Crohn's microbiome study and an HIV microbiome study. IMPORTANCE We propose a new algorithm for the identification of microbial signatures. These microbial signatures can be used for diagnosis, prognosis, or prediction of therapeutic response based on an individual's specific microbiota.
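The notion of a balance between two groups of taxa, as described in this abstract, can be sketched as the log of the ratio of their geometric means. The code below is an illustration in Python and not the selbal implementation: the function names and the toy abundances are assumptions, and the scaling factor sqrt(k+ k- / (k+ + k-)) is the normalization conventionally used for balances in compositional data analysis, assumed here rather than taken from the abstract. The greedy stepwise selection of which taxa enter each group is not shown.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(x, numerator, denominator):
    """Balance between two groups of parts of the composition x.

    numerator and denominator are lists of indices into x.
    The scaling factor is the conventional normalization (an assumption here).
    """
    if not numerator or not denominator:
        raise ValueError("both groups must contain at least one taxon")
    k_plus, k_minus = len(numerator), len(denominator)
    scale = math.sqrt(k_plus * k_minus / (k_plus + k_minus))
    g_plus = geometric_mean([x[i] for i in numerator])
    g_minus = geometric_mean([x[i] for i in denominator])
    return scale * math.log(g_plus / g_minus)


# Hypothetical abundances for four taxa in one sample.
sample = [4.0, 4.0, 1.0, 1.0]
rescaled = [10.0 * v for v in sample]   # same composition at a different sequencing depth

b_sample = balance(sample, numerator=[0, 1], denominator=[2, 3])
b_rescaled = balance(rescaled, numerator=[0, 1], denominator=[2, 3])
```

Because the balance depends only on ratios, it takes the same value for `sample` and `rescaled`, which is the sense in which it uses only the relative information carried by the data.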
Affiliation(s)
- J Rivera-Pinto
- irsiCaixa AIDS Research Institute, Badalona, Spain
- Universitat de Vic-Universitat Central de Catalunya, Vic, Spain
| | - J J Egozcue
- Universitat Politècnica de Catalunya, Barcelona, Spain
| | | | - R Paredes
- irsiCaixa AIDS Research Institute, Badalona, Spain
- Universitat de Vic-Universitat Central de Catalunya, Vic, Spain
- Universitat Autónoma de Barcelona, Barcelona, Spain
- Infectious Diseases Service, Hospital Germans Trias i Pujol, Badalona, Spain
| | - M Noguera-Julian
- irsiCaixa AIDS Research Institute, Badalona, Spain
- Universitat de Vic-Universitat Central de Catalunya, Vic, Spain
- Universitat Autónoma de Barcelona, Barcelona, Spain
| | - M L Calle
- Universitat de Vic-Universitat Central de Catalunya, Vic, Spain
|
9
|
Oriol-Tordera B, Llano A, Ganoza C, Cate S, Hildebrand W, Sanchez J, Calle ML, Brander C, Olvera A. Impact of HLA-DRB1 allele polymorphisms on control of HIV infection in a Peruvian MSM cohort. HLA 2017; 90:234-237. [PMID: 28677168 DOI: 10.1111/tan.13085] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 02/10/2017] [Revised: 04/06/2017] [Accepted: 06/15/2017] [Indexed: 11/27/2022]
Abstract
Associations between HLA class II polymorphisms and HIV control were assessed in a Peruvian MSM cohort. Among 233 treatment naïve HIV+ individuals, DRB1*13:02 was linked to elevated viral loads (P = .044) while DRB1*12:01 showed significantly lower viral set points (P = .015) and restricted a dominant T cell response to HIV Gag p24 (P = .038). The present work contributes to a better knowledge of the Peruvian immunogenetics and supports the important role of HLA class II restricted T cells in HIV control.
Affiliation(s)
- B Oriol-Tordera
- IrsiCaixa AIDS Research Institute, Hospital Universitari Germans Trias i Pujol, Barcelona, Spain
| | - A Llano
- IrsiCaixa AIDS Research Institute, Hospital Universitari Germans Trias i Pujol, Barcelona, Spain
| | - C Ganoza
- Asociación Civil IMPACTA Salud y Educacion, Lima, Peru
| | - S Cate
- Department of Microbiology and Immunology, University of Oklahoma Health Sciences Center, Oklahoma, Oklahoma
| | - W Hildebrand
- Department of Microbiology and Immunology, University of Oklahoma Health Sciences Center, Oklahoma, Oklahoma
| | - J Sanchez
- Asociación Civil IMPACTA Salud y Educacion, Lima, Peru
- Centro de Investigaciones Tecnológicas, Biomédicas y Medioambientales, Lima, Peru
- Department of Global Health, University of Washington, Seattle, Washington
| | - M L Calle
- Faculty of Medicine, University of Vic-Central University of Catalonia, Barcelona, Spain
| | - C Brander
- IrsiCaixa AIDS Research Institute, Hospital Universitari Germans Trias i Pujol, Barcelona, Spain
- Faculty of Medicine, University of Vic-Central University of Catalonia, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - A Olvera
- IrsiCaixa AIDS Research Institute, Hospital Universitari Germans Trias i Pujol, Barcelona, Spain
|
10
|
Gallego V, Luz Calle M, Oller R. Kernel-Based Measure of Variable Importance for Genetic Association Studies. Int J Biostat 2017. [PMID: 28628480 DOI: 10.1515/ijb-2016-0087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
The identification of genetic variants that are associated with disease risk is an important goal of genetic association studies. Standard approaches perform univariate analysis where each genetic variant, usually a Single Nucleotide Polymorphism (SNP), is tested for association with disease status. Though many genetic variants have been identified and validated so far using this univariate approach, for most complex diseases a large part of their genetic component is still unknown, the so-called missing heritability. We propose a Kernel-based measure of variable importance (KVI) that provides the contribution of a SNP, or a group of SNPs, to the joint genetic effect of a set of genetic variants. KVI can be used for ranking genetic markers individually, sets of markers that form blocks of linkage disequilibrium, or sets of genetic variants that lie in a gene or a genetic pathway. We prove that, unlike the univariate analysis, KVI captures the relationship with other genetic variants in the analysis, even when measured at the individual level for each genetic variable separately. This is especially relevant and powerful for detecting genetic interactions. We illustrate the results with data from an Alzheimer's disease study and show through simulations that the rankings based on KVI improve on those based on two measures of importance provided by the Random Forest. We also show with a simulation study that KVI is very powerful for detecting genetic interactions.
|
11
|
Vilor-Tejedor N, Gonzalez JR, Calle ML. Efficient and Powerful Method for Combining P-Values in Genome-Wide Association Studies. IEEE/ACM Trans Comput Biol Bioinform 2016; 13:1100-1106. [PMID: 28055892 DOI: 10.1109/tcbb.2015.2509977] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]
Abstract
The goal of Genome-wide Association Studies (GWAS) is the identification of genetic variants, usually single nucleotide polymorphisms (SNPs), that are associated with disease risk. However, SNPs detected so far with GWAS for most common diseases only explain a small proportion of their total heritability. Gene set analysis (GSA) has been proposed as an alternative to single-SNP analysis with the aim of improving the power of genetic association studies. Nevertheless, most GSA methods rely on expensive computational procedures that make their implementation in GWAS unfeasible. We propose a new GSA method, referred to as globalEVT, which uses extreme value theory to derive gene-level p-values. GlobalEVT dramatically reduces the computational requirements compared to other GSA approaches. In addition, this new approach improves the power by allowing different inheritance models for each genetic variant, as illustrated in the simulation study performed, and allows for correlation between the SNPs. Real data analysis of an attention-deficit/hyperactivity disorder (ADHD) study illustrates the importance of using GSA approaches for exploring new susceptibility genes. Specifically, the globalEVT method is able to detect genes related to Cyclophilin A-like domain proteins, which are known to play an important role in the mechanisms of ADHD development.
Affiliation(s)
- Natalia Vilor-Tejedor
- Center for Research in Environmental Epidemiology, Universitat Pompeu Fabra and CIBER Epidemiología y Salud Pública, C/Doctor Aiguader 88, Barcelona, Spain
| | - Juan R Gonzalez
- Center for Research in Environmental Epidemiology, Universitat Pompeu Fabra and CIBER Epidemiología y Salud Pública, C/Doctor Aiguader 88, Barcelona, Spain
| | - M Luz Calle
- Department of Systems Biology, Bioinformatics and Medical Statistics Group, Universitat de Vic-Universitat Central de Catalunya, C. Sagrada Familia 7, Vic, Spain
|
12
|
Hedegaard J, Lamy P, Nordentoft I, Algaba F, Høyer S, Ulhøi BP, Vang S, Reinert T, Hermann GG, Mogensen K, Thomsen MBH, Nielsen MM, Marquez M, Segersten U, Aine M, Höglund M, Birkenkamp-Demtröder K, Fristrup N, Borre M, Hartmann A, Stöhr R, Wach S, Keck B, Seitz AK, Nawroth R, Maurer T, Tulic C, Simic T, Junker K, Horstmann M, Harving N, Petersen AC, Calle ML, Steyerberg EW, Beukers W, van Kessel KEM, Jensen JB, Pedersen JS, Malmström PU, Malats N, Real FX, Zwarthoff EC, Ørntoft TF, Dyrskjøt L. Comprehensive Transcriptional Analysis of Early-Stage Urothelial Carcinoma. Cancer Cell 2016; 30:27-42. [PMID: 27321955 DOI: 10.1016/j.ccell.2016.05.004] [Citation(s) in RCA: 420] [Impact Index Per Article: 52.5] [Received: 11/26/2015] [Revised: 02/18/2016] [Accepted: 05/13/2016] [Indexed: 01/01/2023]
Abstract
Non-muscle-invasive bladder cancer (NMIBC) is a heterogeneous disease with widely different outcomes. We performed a comprehensive transcriptional analysis of 460 early-stage urothelial carcinomas and showed that NMIBC can be subgrouped into three major classes with basal- and luminal-like characteristics and different clinical outcomes. Large differences in biological processes such as the cell cycle, epithelial-mesenchymal transition, and differentiation were observed. Analysis of transcript variants revealed frequent mutations in genes encoding proteins involved in chromatin organization and cytoskeletal functions. Furthermore, mutations in well-known cancer driver genes (e.g., TP53 and ERBB2) were primarily found in high-risk tumors, together with APOBEC-related mutational signatures. The identification of subclasses in NMIBC may offer better prognostication and treatment selection based on subclass assignment.
Affiliation(s)
- Jakob Hedegaard
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Philippe Lamy
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Iver Nordentoft
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Ferran Algaba
- Section of Pathology, Fundació Puigvert, University Autonoma de Barcelona, Barcelona 08025, Spain
- Søren Høyer
- Department of Pathology, Aarhus University Hospital, Aarhus 8000, Denmark
- Søren Vang
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Thomas Reinert
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Gregers G Hermann
- Department of Urology, Frederiksberg Hospital, Frederiksberg 2000, Denmark
- Karin Mogensen
- Department of Urology, Frederiksberg Hospital, Frederiksberg 2000, Denmark
- Mirari Marquez
- Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Ulrika Segersten
- Department of Surgical Sciences, Uppsala University, Uppsala 75185, Sweden
- Mattias Aine
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, Lund 22100, Sweden
- Mattias Höglund
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, Lund 22100, Sweden
- Niels Fristrup
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Michael Borre
- Department of Urology, Aarhus University Hospital, Aarhus 8200, Denmark
- Arndt Hartmann
- Institute of Pathology, University Hospital Erlangen, Friedrich Alexander-University Erlangen-Nürnberg, Erlangen 91054, Germany
- Robert Stöhr
- Institute of Pathology, University Hospital Erlangen, Friedrich Alexander-University Erlangen-Nürnberg, Erlangen 91054, Germany
- Sven Wach
- Department of Urology, University Hospital Erlangen, Friedrich Alexander-University Erlangen-Nürnberg, Erlangen 91054, Germany
- Bastian Keck
- Department of Urology, University Hospital Erlangen, Friedrich Alexander-University Erlangen-Nürnberg, Erlangen 91054, Germany
- Anna Katharina Seitz
- Department of Urology, Klinikum rechts der Isar der Technischen Universität München, Munich 81675, Germany
- Roman Nawroth
- Department of Urology, Klinikum rechts der Isar der Technischen Universität München, Munich 81675, Germany
- Tobias Maurer
- Department of Urology, Klinikum rechts der Isar der Technischen Universität München, Munich 81675, Germany
- Cane Tulic
- Faculty of Medicine, Clinic of Urology, Clinical Centre of Serbia, University of Belgrade, 11000 Belgrade, Serbia
- Tatjana Simic
- Faculty of Medicine, Institute of Medical and Clinical Biochemistry, University of Belgrade, 11000 Belgrade, Serbia
- Kerstin Junker
- Department of Urology, Saarland University, Homburg 66421, Germany
- Marcus Horstmann
- Department of Urology, Friedrich-Schiller-University Jena, Jena 07737, Germany
- Niels Harving
- Department of Urology, Aalborg University Hospital, Aalborg 9000, Denmark
- M Luz Calle
- Systems Biology Department, University of Vic, Vic, Barcelona 08500, Spain
- Ewout W Steyerberg
- Department of Public Health, Erasmus Medical Centre, 3015 CE Rotterdam, the Netherlands
- Willemien Beukers
- Department of Pathology, Erasmus Medical Centre, 3015 CE Rotterdam, the Netherlands
- Kim E M van Kessel
- Department of Pathology, Erasmus Medical Centre, 3015 CE Rotterdam, the Netherlands
- Jakob Skou Pedersen
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Per-Uno Malmström
- Department of Surgical Sciences, Uppsala University, Uppsala 75185, Sweden
- Núria Malats
- Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Francisco X Real
- Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain; Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Ellen C Zwarthoff
- Department of Pathology, Erasmus Medical Centre, 3015 CE Rotterdam, the Netherlands
- Torben Falck Ørntoft
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark
- Lars Dyrskjøt
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus 8200, Denmark.
|
13
|
Porta N, Calle ML, Lewis R, Snape M, Hendron C, James N, Huddart R, Hall E. Dynamic prediction methods in the BC2001 clinical trial. Trials 2015. [PMCID: PMC4660330 DOI: 10.1186/1745-6215-16-s2-p144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
14
|
Vilor-Tejedor N, Calle ML. Global adaptive rank truncated product method for gene-set analysis in association studies. Biom J 2014; 56:901-11. [PMID: 25082012 DOI: 10.1002/bimj.201300192] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/13/2013] [Revised: 02/18/2014] [Accepted: 04/18/2014] [Indexed: 11/10/2022]
Abstract
Gene set analysis (GSA) aims to assess the overall association of a set of genetic variants with a phenotype and has the potential to detect subtle effects of variants in a gene or a pathway that might be missed when assessed individually. We present a new implementation of the Adaptive Rank Truncated Product method (ARTP) for analyzing the association of a set of Single Nucleotide Polymorphisms (SNPs) in a gene or pathway. The new implementation, referred to as globalARTP, improves the original one by allowing the different SNPs in the set to have different modes of inheritance. We perform a simulation study for exploring the power of the proposed methodology in a set of scenarios with different numbers of causal SNPs with different effect sizes. Moreover, we show the advantage of using the gene set approach in the context of an Alzheimer's disease case-control study where we explore the endocytosis pathway. The new method is implemented in the R function globalARTP of the globalGSA package available at http://cran.r-project.org.
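The rank truncated product idea behind ARTP can be sketched in a few lines. The Python snippet below is only an illustration under stated assumptions: it is not the globalARTP R function, the function name `rank_truncated_product` and the p-values are hypothetical, and it omits the permutation step and the adaptive choice of truncation point that the full method relies on.

```python
import math


def rank_truncated_product(p_values, k):
    """Log of the product of the k smallest p-values in a SNP set.

    More negative values indicate a stronger joint signal. In practice the
    statistic is calibrated by permutation, which this sketch does not do.
    """
    if not 1 <= k <= len(p_values):
        raise ValueError("k must be between 1 and the number of p-values")
    smallest = sorted(p_values)[:k]
    return sum(math.log(p) for p in smallest)


# Hypothetical single-SNP p-values for one gene.
snp_p_values = [0.5, 0.01, 0.2, 0.04]
statistic_k2 = rank_truncated_product(snp_p_values, 2)
```

Working on the log scale avoids numerical underflow when many small p-values are multiplied.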
Affiliation(s)
- Natalia Vilor-Tejedor
- Centre for Research in Environmental Epidemiology (CREAL), C. Doctor Aiguader, 88, 08003-Barcelona, Spain; Department of Experimental and Health Sciences, Pompeu Fabra University (UPF), Barcelona, Spain; CIBER Epidemiologia y Salud Publica (CIBERESP), Barcelona, Spain
- M Luz Calle
- Department of Systems Biology, Bioinformatics and Medical Statistics Group, Universitat de Vic - Universitat Central de Catalunya, C. Sagrada Familia, 7, 08570-Vic, Spain
|
15
|
Pineda S, Milne RL, Calle ML, Rothman N, López de Maturana E, Herranz J, Kogevinas M, Chanock SJ, Tardón A, Márquez M, Guey LT, García-Closas M, Lloreta J, Baum E, González-Neira A, Carrato A, Navarro A, Silverman DT, Real FX, Malats N. Genetic variation in the TP53 pathway and bladder cancer risk. A comprehensive analysis. PLoS One 2014; 9:e89952. [PMID: 24818791 PMCID: PMC4018346 DOI: 10.1371/journal.pone.0089952] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/29/2013] [Accepted: 01/24/2014] [Indexed: 12/11/2022] Open
Abstract
INTRODUCTION Germline variants in TP63 have been consistently associated with several tumors, including bladder cancer, indicating the importance of the TP53 pathway in cancer genetic susceptibility. However, variants in other related genes, including TP53 rs1042522 (Arg72Pro), still present controversial results. We carried out an in-depth assessment of associations between common germline variants in the TP53 pathway and bladder cancer risk. MATERIAL AND METHODS We investigated 184 tagSNPs from 18 genes in 1,058 cases and 1,138 controls from the Spanish Bladder Cancer/EPICURO Study. Cases were patients newly diagnosed with bladder cancer during 1998-2001. Hospital controls were matched to cases by age, gender, and area. SNPs were genotyped in blood DNA using Illumina Golden Gate and TaqMan assays. Cases were subphenotyped according to stage/grade and tumor p53 expression. We applied classical tests to assess individual SNP associations and Least Absolute Shrinkage and Selection Operator (LASSO)-penalized logistic regression analysis to assess multiple SNPs simultaneously. RESULTS Based on classical analyses, SNPs in BAK1 (1), IGF1R (5), P53AIP1 (1), PMAIP1 (2), SERPINB5 (3), TP63 (3), and TP73 (1) showed significant associations at p-value≤0.05. However, no evidence of association, either with overall risk or with specific disease subtypes, was observed after correction for multiple testing (p-value≥0.8). LASSO selected the SNP rs6567355 in SERPINB5 with 83% reproducibility. This SNP provided an OR = 1.21, 95%CI 1.05-1.38, p-value = 0.006, and a corrected p-value = 0.5 when controlling for over-estimation. DISCUSSION We found no strong evidence that common variants in the TP53 pathway are associated with bladder cancer susceptibility. Our study suggests that it is unlikely that TP53 Arg72Pro is implicated in urothelial carcinoma of the bladder (UCB) in white Europeans. SERPINB5 and TP63 variation deserve further exploration in extended studies.
Affiliation(s)
- Silvia Pineda
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
- Roger L. Milne
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
- M. Luz Calle
- Systems Biology Department, University of Vic, Vic, Spain
- Nathaniel Rothman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland, United States of America
- Jesús Herranz
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
- Manolis Kogevinas
- Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- Institut Municipal d'Investigació Mèdica – Hospital del Mar, Barcelona, Spain
- Stephen J. Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland, United States of America
- Adonina Tardón
- Department of Preventive Medicine, Universidad de Oviedo, Oviedo, Spain
- Mirari Márquez
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
- Lin T. Guey
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
- Montserrat García-Closas
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland, United States of America
- Josep Lloreta
- Institut Municipal d'Investigació Mèdica – Hospital del Mar, Barcelona, Spain
- Departament de Patologia, Hospital del Mar – IMAS, Barcelona, Spain
- Erin Baum
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
- Alfredo Carrato
- Servicio de Oncología, Hospital Universitario de Elche, Elche, Spain
- Servicio de Oncología, Hospital Universitario Ramon y Cajal, Madrid, Spain
- Arcadi Navarro
- Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- Institut de Biologia Evolutiva (UPF-CSIC), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Instituto Nacional de Bioinformática, Barcelona, Spain
- Debra T. Silverman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland, United States of America
- Francisco X. Real
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
- Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- Núria Malats
- Spanish National Cancer Research Center (CNIO), Madrid, Spain
|
16
|
de Maturana EL, Chanock SJ, Picornell AC, Rothman N, Herranz J, Calle ML, García-Closas M, Marenne G, Brand A, Tardón A, Carrato A, Silverman DT, Kogevinas M, Gianola D, Real FX, Malats N. Whole genome prediction of bladder cancer risk with the Bayesian LASSO. Genet Epidemiol 2014; 38:467-76. [PMID: 24796258 DOI: 10.1002/gepi.21809] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 12/13/2013] [Revised: 03/05/2014] [Accepted: 03/20/2014] [Indexed: 11/11/2022]
Abstract
To build a predictive model for urothelial carcinoma of the bladder (UCB) risk combining both genomic and nongenomic data, 1,127 cases and 1,090 controls from the Spanish Bladder Cancer/EPICURO study were genotyped using the HumanHap 1M SNP array. After quality control filters, genotypes from 475,290 variants were available. Nongenomic information comprised age, gender, region, and smoking status. Three Bayesian threshold models were implemented including: (1) only genomic information, (2) only nongenomic data, and (3) both sources of information. The three models were applied to the whole population, to only nonsmokers, to male smokers, and to extreme phenotypes to potentiate the UCB genetic component. The area under the ROC curve allowed evaluating the predictive ability of each model in a 10-fold cross-validation scenario. Smoking status showed the highest predictive ability of UCB risk (AUCtest = 0.62). On the other hand, the AUC of all genetic variants was poorer (0.53). When the extreme phenotype approach was applied, the predictive ability of the genomic model improved 15%. This study represents a first attempt to build a predictive model for UCB risk combining both genomic and nongenomic data and applying state-of-the-art statistical approaches. However, the lack of genetic relatedness among individuals, the complexity of UCB etiology, as well as a relatively small statistical power, may explain the low predictive ability for UCB risk. The study confirms the difficulty of predicting complex diseases using genetic data, and suggests the limited translational potential of findings from this type of data into public health interventions.
|
17
|
Porta N, Calle ML, Malats N, Gómez G. A dynamic model for the risk of bladder cancer progression. Stat Med 2011; 31:287-300. [PMID: 22161505 DOI: 10.1002/sim.4433] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/04/2010] [Accepted: 09/15/2011] [Indexed: 11/05/2022]
Abstract
We propose a multistate modeling approach to describe the observed evolution of patients diagnosed with non-muscle-invasive bladder cancer. On the basis of data from the Spanish Bladder Cancer/EPICURO study, we fit a multistate model taking into account the disease-related events of interest (recurrence, progression, and disease-related deaths) as well as competing deaths due to other causes. We then develop a dynamic predictive process for bladder cancer progression, which allows the risk of a patient to be updated whenever new information on his or her evolution is available. Using specific measures of prospective accuracy in the presence of competing risks, we show that the proposed dynamic model improves prediction accuracy and provides a more personalized management of bladder cancer patients.
Affiliation(s)
- Núria Porta
- Universitat Politècnica de Catalunya, Jordi Girona 1-3, 08034, Barcelona, Spain.
|
18
|
Calle ML, Urrea V, Boulesteix AL, Malats N. AUC-RF: a new strategy for genomic profiling with random forest. Hum Hered 2011; 72:121-32. [PMID: 21996641 DOI: 10.1159/000330778] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.3] [Received: 02/18/2011] [Accepted: 07/11/2011] [Indexed: 12/23/2022] Open
Abstract
OBJECTIVE Genomic profiling, the use of genetic variants at multiple loci simultaneously for the prediction of disease risk, requires the selection of a set of genetic variants that best predicts disease status. The goal of this work was to provide a new selection algorithm for genomic profiling. METHODS We propose a new algorithm for genomic profiling based on optimizing the area under the receiver operating characteristic curve (AUC) of the random forest (RF). The proposed strategy implements a backward elimination process based on the initial ranking of variables. RESULTS AND CONCLUSIONS We demonstrate the advantage of using the AUC instead of the classification error as a measure of predictive accuracy of RF. In particular, we show that the use of the classification error is especially inappropriate when dealing with unbalanced data sets. The new procedure for variable selection and prediction, namely AUC-RF, is illustrated with data from a bladder cancer study and also with simulated data. The algorithm is publicly available as an R package, named AUCRF, at http://cran.r-project.org/.
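The point about unbalanced data sets can be made concrete with a toy calculation. The Python sketch below is not the AUCRF package or its backward elimination procedure; the helper names and the 5-versus-95 sample are hypothetical. It only shows that a classifier with no discriminative ability can still report a low classification error when one class dominates, whereas its AUC stays at 0.5.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparisons."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # A case scoring above a control counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def classification_error(predictions, labels):
    """Fraction of individuals assigned to the wrong class."""
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical unbalanced sample: 5 cases and 95 controls.
labels = [1] * 5 + [0] * 95
# A trivial rule that gives everyone the same score and predicts "control".
constant_scores = [0.0] * 100
constant_predictions = [0] * 100

trivial_error = classification_error(constant_predictions, labels)
trivial_auc = auc(constant_scores, labels)
```

Here the trivial rule misclassifies only the five cases, so its error looks small, while the AUC correctly signals the absence of discrimination.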
Affiliation(s)
- M Luz Calle
- Systems Biology Department, University of Vic, Spain. malu.calle@uvic.cat
|
19
|
Cattaert T, Calle ML, Dudek SM, Mahachie John JM, Van Lishout F, Urrea V, Ritchie MD, Van Steen K. Model-based multifactor dimensionality reduction for detecting epistasis in case-control data in the presence of noise. Ann Hum Genet 2010; 75:78-89. [PMID: 21158747 DOI: 10.1111/j.1469-1809.2010.00604.x] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.6] [Indexed: 01/17/2023]
Abstract
Analyzing the combined effects of genes and/or environmental factors on the development of complex diseases is a great challenge from both the statistical and computational perspective, even using a relatively small number of genetic and nongenetic exposures. Several data-mining methods have been proposed for interaction analysis; among them, the Multifactor Dimensionality Reduction (MDR) method has proven its utility in a variety of theoretical and practical settings. Model-Based Multifactor Dimensionality Reduction (MB-MDR), a relatively new MDR-based technique that is able to unify the best of both nonparametric and parametric worlds, was developed to address some of the remaining concerns that go along with an MDR analysis. These include the restriction to univariate, dichotomous traits, the absence of flexible ways to adjust for lower order effects and important confounders, and the difficulty in highlighting epistatic effects when too many multilocus genotype cells are pooled into two new genotype groups. We investigate the empirical power of MB-MDR to detect gene-gene interactions in the absence of any noise and in the presence of genotyping error, missing data, phenocopy, and genetic heterogeneity. Power is generally higher for MB-MDR than for MDR, in particular in the presence of genetic heterogeneity, phenocopy, or low minor allele frequencies.
Affiliation(s)
- Tom Cattaert
- Montefiore Institute, University of Liege, Belgium
|
20
|
Calle ML, Urrea V, Malats N, Van Steen K. mbmdr: an R package for exploring gene-gene interactions associated with binary or quantitative traits. Bioinformatics 2010; 26:2198-9. [PMID: 20595460 DOI: 10.1093/bioinformatics/btq352] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.4] [Indexed: 11/15/2022]
Abstract
SUMMARY We describe mbmdr, an R package for implementing the model-based multifactor dimensionality reduction (MB-MDR) method. MB-MDR has been proposed by Calle et al. as a dimension reduction method for exploring gene-gene interactions in case-control association studies. It is an extension of the popular multifactor dimensionality reduction (MDR) method of Ritchie et al., allowing a more flexible definition of risk cells. In MB-MDR, risk categories are defined using a regression model, which allows adjustment for covariates and main effects. In addition to the classical low-risk and high-risk categories, MB-MDR considers a third category of indeterminate or non-informative cells. An important improvement of the current mbmdr algorithm over the original MB-MDR formulation in Calle et al., and also over the classical MDR approach, is the extension of the methodology to different outcome types. While MB-MDR was initially proposed for binary traits in the context of case-control studies, the mbmdr package provides options to analyze either binary or quantitative traits for unrelated individuals. AVAILABILITY http://cran.r-project.org/.
Affiliation(s)
- M Luz Calle
- Department of Systems Biology, Universitat de Vic.
|
21
|
Cattaert T, Urrea V, Naj AC, De Lobel L, De Wit V, Fu M, Mahachie John JM, Shen H, Calle ML, Ritchie MD, Edwards TL, Van Steen K. FAM-MDR: a flexible family-based multifactor dimensionality reduction technique to detect epistasis using related individuals. PLoS One 2010; 5:e10304. [PMID: 20421984 PMCID: PMC2858665 DOI: 10.1371/journal.pone.0010304] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 01/12/2010] [Accepted: 03/01/2010] [Indexed: 12/05/2022] Open
Abstract
We propose a novel multifactor dimensionality reduction method for epistasis detection in small or extended pedigrees, FAM-MDR. It combines features of the Genome-wide Rapid Association using Mixed Model And Regression approach (GRAMMAR) with Model-Based MDR (MB-MDR). We focus on continuous traits, although the method is general and can be used for outcomes of any type, including binary and censored traits. When comparing FAM-MDR with Pedigree-based Generalized MDR (PGMDR), which is a generalization of Multifactor Dimensionality Reduction (MDR) to continuous traits and related individuals, FAM-MDR was found to outperform PGMDR in terms of power, in most of the considered simulated scenarios. Additional simulations revealed that PGMDR does not appropriately deal with multiple testing and consequently gives rise to overly optimistic results. FAM-MDR adequately deals with multiple testing in epistasis screens and is in contrast rather conservative, by construction. Furthermore, simulations show that correcting for lower order (main) effects is of utmost importance when claiming epistasis. As Type 2 Diabetes Mellitus (T2DM) is a complex phenotype likely influenced by gene-gene interactions, we applied FAM-MDR to examine data on glucose area-under-the-curve (GAUC), an endophenotype of T2DM for which multiple independent genetic associations have been observed, in the Amish Family Diabetes Study (AFDS). This application reveals that FAM-MDR makes more efficient use of the available data than PGMDR and can deal with multi-generational pedigrees more easily. In conclusion, we have validated FAM-MDR and compared it to PGMDR, the current state-of-the-art MDR method for family data, using both simulations and a practical dataset. FAM-MDR is found to outperform PGMDR in that it handles the multiple testing issue more correctly, has increased power, and efficiently uses all available information.
Affiliation(s)
- Tom Cattaert
- Montefiore Institute, University of Liège, Liège, Belgium.
|
22
|
Abstract
The goal of this article (letter to the editor) is to emphasize the value of exploring ranking stability when using the importance measures, mean decrease accuracy (MDA) and mean decrease Gini (MDG), provided by Random Forest. We illustrate with a real and a simulated example that ranks based on the MDA are unstable to small perturbations of the dataset and ranks based on the MDG provide more robust results.
|
23
|
Abstract
Interval censoring is encountered in many practical situations when the event of interest cannot be observed and it is only known to have occurred within a time window. The theory for the analysis of interval-censored data has been developed over the past three decades and several reviews have been written. However, it is still a common practice in medical and reliability studies to simplify the interval censoring structure of the data into a more standard right censoring situation by, for instance, imputing the midpoint of the censoring interval. The availability of software for right censoring might well be the main reason for this simplifying practice. In contrast, several methods have been developed to deal with interval-censored data and the corresponding algorithms to make the procedures feasible are scattered across the statistical software or remain behind the personal computers of many researchers. The purpose of this tutorial is to present, in a pedagogical and unified manner, the methodology and the available software for analyzing interval-censored data. The paper covers frequentist non-parametric, parametric and semiparametric estimating approaches, non-parametric tests for comparing survival curves and a section on simulation of interval-censored data. The methods and the software are described using the data from a dental study.
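The midpoint simplification mentioned in the abstract can be written out explicitly, which makes clear what information it discards. The Python sketch below is an assumption-laden illustration, not code from the tutorial: the function name `midpoint_impute`, the `(left, right)` tuple encoding, and the use of `None` for a right-censored observation are all hypothetical choices.

```python
def midpoint_impute(intervals):
    """Reduce interval-censored observations to (time, event) pairs.

    Each observation is (left, right): the event is only known to have
    occurred between the two inspection times. right=None marks a
    right-censored observation with last follow-up at `left`.
    """
    imputed = []
    for left, right in intervals:
        if right is None:
            # Still event-free at the last visit: keep as right-censored.
            imputed.append((left, 0))
        else:
            # Pretend the event happened exactly at the interval midpoint.
            imputed.append(((left + right) / 2.0, 1))
    return imputed


# Hypothetical visit times in months.
observations = [(0, 6), (12, 18), (24, None)]
simplified = midpoint_impute(observations)
```

After this step the width of each interval is lost, so a 6-month and a 60-month window centered on the same midpoint become indistinguishable; methods designed for interval-censored data keep that uncertainty in the likelihood.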
Affiliation(s)
- Guadalupe Gómez
- Departament d’Estadística i I.O., Universitat Politècnica de Catalunya, Spain
- M Luz Calle
- Departament de Biologia de Sistemes, Universitat de Vic, Vic, Spain
- Ramon Oller
- Departament d’Economia, Matemàtica i Informàtica, Universitat de Vic, Vic, Spain
- Klaus Langohr
- Programa de Recerca en Neuropsicofarmacologia, Institut Municipal d’Investigació Mèdica, Spain
|
24
|
Calle ML, Urrea V, Vellalta G, Malats N, Steen KV. Improving strategies for detecting genetic patterns of disease susceptibility in association studies. Stat Med 2009; 27:6532-46. [PMID: 18837071 DOI: 10.1002/sim.3431] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.3] [Indexed: 01/01/2023]
Abstract
The analysis of gene interactions and epistatic patterns of susceptibility is especially important for investigating complex diseases such as cancer characterized by the joint action of several genes. This work is motivated by a case-control study of bladder cancer, aimed at evaluating the role of both genetic and environmental factors in bladder carcinogenesis. In particular, the analysis of the inflammation pathway is of interest, for which information on a total of 282 SNPs in 108 genes involved in the inflammatory response is available. Detecting and interpreting interactions with such a large number of polymorphisms is a great challenge from both the statistical and the computational perspectives. In this paper we propose a two-stage strategy for identifying relevant interactions: (1) the use of a synergy measure among interacting genes and (2) the use of the model-based multifactor dimensionality reduction method (MB-MDR), a model-based version of the MDR method, which allows adjustment for confounders.
Affiliation(s)
- M L Calle
- Department of Systems Biology, Universitat de Vic, Carrer de la Sagrada Família, 7-08500 Vic, Spain.
|
25
|
Affiliation(s)
- Ramon Oller
- Departament d'Economia, Matemàtica i Informàtica, Universitat de Vic; ES-08500 Vic, Spain
- Guadalupe Gomez
- Departament d'Estadística i Investigació Operativa, Universitat Politècnica de Catalunya; ES-08028 Barcelona, Spain
- M. Luz Calle
- Departament d'Informàtica i Matemàtica, Universitat de Vic; ES-08500 Vic, Spain
|
26
|
Calle ML, Sánchez-Espigares JA. [Classification trees and regression in biomedical research]. Med Clin (Barc) 2007; 129:702-6. [PMID: 18021613 DOI: 10.1157/13112516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Affiliation(s)
- M Luz Calle
- Grup de Recerca en Bioinformàtica i Estadística Mèdica, Departament de Biologia de Sistemes, Escola Politècnica Superior, Universitat de Vic, Sagrada Família 7, 08500 Vic, Barcelona, Spain.
|
27
|
Hough G, Luz Calle M, Serrat C, Curia A. Number of consumers necessary for shelf life estimations based on survival analysis statistics. Food Qual Prefer 2007. [DOI: 10.1016/j.foodqual.2007.01.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Indexed: 10/23/2022]
|
28
|
de Arquer GR, Peña R, Cabrera C, Coma G, Ruiz-Hernandez R, Guerola R, Clotet B, Ruiz L, Esté JA, Calle ML, Bofill M. Skewed expression and up-regulation of the IL-12 and IL-18 receptors in resting and activated CD4 T cells from HIV-1-infected patients. J Leukoc Biol 2007; 82:72-8. [PMID: 17403771 DOI: 10.1189/jlb.1106698] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
IL-12 and IL-18 synergistically induce the production of IFN-gamma by resting and activated T cells. To evaluate whether this induction was affected in HIV-1-infected patients, PBMC or isolated CD4 T cells were cultured with IL-12 plus IL-18, anti-CD3 plus anti-CD28, or PHA for 72 h. Cell samples were labeled daily to assess the levels of IL-12 receptor beta1 (IL-12Rbeta1), IL-12Rbeta2, and IL-18Ralpha. Culture supernatants were analyzed for the presence of Th1- and Th2-related cytokines by ELISA or cytometric bead array and analyzed by flow cytometry. A twofold increase in the percentage of resting CD4 T cells expressing IL-12Rbeta1 and IL-18Ralpha was observed in HIV-1-infected patients when compared with cells from HIV-1-negative donors. Higher IL-12Rbeta1 and IL-18Ralpha expression correlated (r=0.87; P<0.007) with increased production of IFN-gamma by isolated CD4 T cells in the presence of IL-12 and IL-18. Moreover, exogenous IL-12 and IL-18 up-regulated IL-12Rbeta2 to a level twice as high in CD4 T cells from HIV-1-positive individuals as in controls. Conversely, upon activation with anti-CD3 and anti-CD28 antibodies, only 25% of the CD4+ T cells from HIV-1 patients showed an increase in IL-12Rbeta2, compared with 50% in healthy controls. Furthermore, the percentage of IL-12Rbeta1-positive cells correlated inversely with the CD4 nadir of patients, suggesting that deregulation of the IL-12 and IL-18 pathways may play a role in the immunopathogenesis of HIV-1 infection.
Affiliation(s)
- Guillermo Robert de Arquer
- Fundació IrsiCaixa, HIVACAT, Hospital Universitari Germans Trias i Pujol, Ctra Canyet sn, 08916 Badalona, Spain
|
29
|
|
30
|
Abstract
We analyse the elapsed time between intravenous (IV) drug initiation and HIV infection in a cohort of 972 injecting drug users attending a hospital detoxification unit. We use the time of seroconversion instead of the time of HIV infection because the date of HIV infection is rarely known and the gap between these two times is negligible (around one to three months). Although seroconversion time cannot be determined exactly, it can be inferred at least to within an interval. This seroconversion interval is determined from the dates of HIV antibody tests, if available. The data is consequently interval-censored. We estimate the distribution function of the elapsed time from IV drug initiation to seroconversion as well as the risk of seroconversion by means of a non-parametric Bayesian approach. The analysis is conducted according to the following four calendar periods: before or at 1980; between 1981 and 1985; between 1986 and 1991; after or at 1992 where the IV drug use was initiated. The methodology used is based on an alternating conditional sampling algorithm. The Bayesian approach allows not only the incorporation of prior beliefs about the distribution function, but also the analysis of the risk of seroconversion without assuming restrictive parametric models. Furthermore, the estimator for the distribution function is smooth and thus differences between groups can be easily interpreted.
Affiliation(s)
- G Gómez
- Dept. d'Estadística, Universitat Politècnica de Catalunya, Edifici U Campus Sud, Pau Gargallo, 5, 08028-Barcelona, Spain.
|
31
|
|