Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wessels LFA, Reinders MJT, Hart AAM, Veenman CJ, Dai H, He YD, van't Veer LJ. A protocol for building and evaluating predictors of disease state based on microarray data. Bioinformatics 2005;21:3755-62. [PMID: 15817694 DOI: 10.1093/bioinformatics/bti429] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.6] [Indexed: 11/12/2022] Open

Cited by Other Article(s)

Bouwmeester R, Richardson K, Denny R, Wilson ID, Degroeve S, Martens L, Vissers JPC. Predicting ion mobility collision cross sections and assessing prediction variation by combining conventional and data driven modeling. Talanta 2024;274:125970. [PMID: 38621320 DOI: 10.1016/j.talanta.2024.125970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 03/01/2024] [Accepted: 03/20/2024] [Indexed: 04/17/2024]

Rentroia-Pacheco B, Tjien-Fooh FJ, Quattrocchi E, Kobic A, Wever R, Bellomo D, Meves A, Hieken TJ. Clinicopathologic models predicting non-sentinel lymph node metastasis in cutaneous melanoma patients: Are they useful for patients with a single positive sentinel node? J Surg Oncol 2021;125:516-524. [PMID: 34735719 PMCID: PMC8799494 DOI: 10.1002/jso.26736] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/02/2021] [Revised: 10/20/2021] [Accepted: 10/27/2021] [Indexed: 12/03/2022]

Gumaei A, Sammouda R, Al-Rakhami M, AlSalman H, El-Zaart A. Feature selection with ensemble learning for prostate cancer diagnosis from microarray gene expression. Health Informatics J 2021;27:1460458221989402. [PMID: 33570011 DOI: 10.1177/1460458221989402] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/15/2022]

Eggermont AMM, Bellomo D, Arias-Mejias SM, Quattrocchi E, Sominidi-Damodaran S, Bridges AG, Lehman JS, Hieken TJ, Jakub JW, Murphree DH, Pittelkow MR, Sluzevich JC, Cappel MA, Bagaria SP, Perniciaro C, Tjien-Fooh FJ, Rentroia-Pacheco B, Wever R, van Vliet MH, Dwarkasing J, Meves A. Identification of stage I/IIA melanoma patients at high risk for disease relapse using a clinicopathologic and gene expression model. Eur J Cancer 2020;140:11-18. [PMID: 33032086 PMCID: PMC7655519 DOI: 10.1016/j.ejca.2020.08.029] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/16/2020] [Revised: 07/09/2020] [Accepted: 08/16/2020] [Indexed: 12/25/2022]

Bellomo D, Bridges AG, Hieken TJ, Meves A. Reply to E. K. Bartlett et al and A. H. R. Varey et al. JCO Precis Oncol 2020;4:992-994. [PMID: 32914042 PMCID: PMC7480899 DOI: 10.1200/po.20.00289] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 07/26/2020] [Indexed: 11/20/2022] Open

Profiling of the known-unknown Passiflora variant complement by liquid chromatography - Ion mobility - Mass spectrometry. Talanta 2020;221:121311. [PMID: 33076047 DOI: 10.1016/j.talanta.2020.121311] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/09/2020] [Revised: 06/16/2020] [Accepted: 06/18/2020] [Indexed: 01/01/2023]

Liu XY, Wang S, Zhang H, Zhang H, Yang ZY, Liang Y. Novel Regularization Method for Biomarker Selection and Cancer Classification. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020;17:1329-1340. [PMID: 30716046 DOI: 10.1109/tcbb.2019.2897301] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/09/2023]

Bellomo D, Arias-Mejias SM, Ramana C, Heim JB, Quattrocchi E, Sominidi-Damodaran S, Bridges AG, Lehman JS, Hieken TJ, Jakub JW, Pittelkow MR, DiCaudo DJ, Pockaj BA, Sluzevich JC, Cappel MA, Bagaria SP, Perniciaro C, Tjien-Fooh FJ, van Vliet MH, Dwarkasing J, Meves A. Model Combining Tumor Molecular and Clinicopathologic Risk Factors Predicts Sentinel Lymph Node Metastasis in Primary Cutaneous Melanoma. JCO Precis Oncol 2020;4:319-334. [PMID: 32405608 PMCID: PMC7220172 DOI: 10.1200/po.19.00206] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Indexed: 01/01/2023] Open

Abstract

Purpose

More than 80% of patients who undergo sentinel lymph node (SLN) biopsy have no nodal metastasis. Here we describe a model that combines clinicopathologic and molecular variables to identify patients with thin and intermediate thickness melanomas who may forgo the SLN biopsy procedure due to their low risk of nodal metastasis.

Patients and Methods

Genes with functional roles in melanoma metastasis were discovered by analysis of next generation sequencing data and case control studies. We then used PCR to quantify gene expression in diagnostic biopsy tissue across a prospectively designed archival cohort of 754 consecutive thin and intermediate thickness primary cutaneous melanomas. Outcome of interest was SLN biopsy metastasis within 90 days of melanoma diagnosis. A penalized maximum likelihood estimation algorithm was used to train logistic regression models in a repeated cross validation scheme to predict the presence of SLN metastasis from molecular, clinical and histologic variables.

Results

Expression of genes with roles in epithelial-to-mesenchymal transition (glia derived nexin, growth differentiation factor 15, integrin β3, interleukin 8, lysyl oxidase homolog 4, TGFβ receptor type 1 and tissue-type plasminogen activator) and melanosome function (melanoma antigen recognized by T cells 1) was associated with SLN metastasis. Models that considered only clinicopathologic or only gene expression variables were outperformed by a model that combined molecular variables with the clinicopathologic predictors Breslow thickness and patient age (AUC, 0.82; 95% CI, 0.78-0.86; SLN biopsy reduction rate of 42% at a negative predictive value of 96%).
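As a reading aid for the two figures quoted in these results, the sketch below shows how a negative predictive value and a biopsy reduction rate could be computed from classification counts. It assumes that the reduction rate means the fraction of all patients whom the model classifies as low risk (and who would therefore forgo the biopsy); that definition, the function names, and the counts are illustrative assumptions and are not taken from the study.

```python
def negative_predictive_value(true_neg, false_neg):
    """Share of model-negative (low-risk) patients who truly have no nodal metastasis."""
    return true_neg / (true_neg + false_neg)


def biopsy_reduction_rate(true_neg, false_neg, total_patients):
    """Assumed definition: share of all patients classified as low risk by the model."""
    return (true_neg + false_neg) / total_patients


# Hypothetical counts for illustration only (not data from the cited article).
total_patients = 1000
true_neg = 384   # classified low risk, no SLN metastasis
false_neg = 16   # classified low risk, SLN metastasis present

npv = negative_predictive_value(true_neg, false_neg)
reduction = biopsy_reduction_rate(true_neg, false_neg, total_patients)
print(f"NPV = {npv:.2f}, reduction rate = {reduction:.2f}")
```

The two quantities trade off against each other: lowering the risk threshold classifies fewer patients as low risk, which typically raises the negative predictive value but lowers the reduction rate.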

Conclusion

A combined model including clinicopathologic and gene expression variables improved the identification of melanoma patients who may forgo the SLN biopsy procedure due to their low risk of nodal metastasis.

Rodrigues V, Deusdado S. Deterministic Classifiers Accuracy Optimization for Cancer Microarray Data. Practical Applications of Computational Biology and Bioinformatics, 13th International Conference 2020. [DOI: 10.1007/978-3-030-23873-5_19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/22/2022]

Nye LC, Williams JP, Munjoma NC, Letertre MP, Coen M, Bouwmeester R, Martens L, Swann JR, Nicholson JK, Plumb RS, McCullagh M, Gethings LA, Lai S, Langridge JI, Vissers JP, Wilson ID. A comparison of collision cross section values obtained via travelling wave ion mobility-mass spectrometry and ultra high performance liquid chromatography-ion mobility-mass spectrometry: Application to the characterisation of metabolites in rat urine. J Chromatogr A 2019;1602:386-396. [DOI: 10.1016/j.chroma.2019.06.056] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/29/2019] [Revised: 06/24/2019] [Accepted: 06/26/2019] [Indexed: 01/01/2023]

Allahyar A, Ubels J, de Ridder J. A data-driven interactome of synergistic genes improves network-based cancer outcome prediction. PLoS Comput Biol 2019;15:e1006657. [PMID: 30726216 PMCID: PMC6380593 DOI: 10.1371/journal.pcbi.1006657] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/13/2018] [Revised: 02/19/2019] [Accepted: 11/20/2018] [Indexed: 12/13/2022] Open

Abstract

Robustly predicting outcome for cancer patients from gene expression is an important challenge on the road to better personalized treatment. Network-based outcome predictors (NOPs), which consider the cellular wiring diagram in the classification, hold much promise to improve performance, stability and interpretability of identified marker genes. Problematically, reports on the efficacy of NOPs are conflicting and, for instance, suggest that utilizing random networks performs on par with networks that describe biologically relevant interactions. In this paper we turn the prediction problem around: instead of using a given biological network in the NOP, we aim to identify the network of genes that truly improves outcome prediction. To this end, we propose SyNet, a gene network constructed ab initio from synergistic gene pairs derived from survival-labelled gene expression data. To obtain SyNet, we evaluate synergy for all 69 million pairwise combinations of genes, resulting in a network that is specific to the dataset and phenotype under study and can be used in a NOP model. We evaluated SyNet and 11 other networks on a compendium dataset of >4000 survival-labelled breast cancer samples. For this purpose, we used cross-study validation, which more closely emulates real world application of these outcome predictors. We find that SyNet is the only network that truly improves performance, stability and interpretability in several existing NOPs. We show that SyNet overlaps significantly with existing gene networks, and can be confidently predicted (~85% AUC) from graph-topological descriptions of these networks, in particular the breast tissue-specific network. Due to its data-driven nature, SyNet is not biased to well-studied genes and thus facilitates post-hoc interpretation. We find that SyNet is highly enriched for known breast cancer genes and genes related to, for example, histological grade and tamoxifen resistance, suggestive of a role in determining breast cancer outcome.

Yan S, Zhang L, Song C. Applying a new maximum local asymmetry feature analysis method to improve near-term breast cancer risk prediction. Phys Med Biol 2018;63:205010. [PMID: 30255850 DOI: 10.1088/1361-6560/aae452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]

Häberle L, Hack CC, Heusinger K, Wagner F, Jud SM, Uder M, Beckmann MW, Schulz-Wendtland R, Wittenberg T, Fasching PA. Using automated texture features to determine the probability for masking of a tumor on mammography, but not ultrasound. Eur J Med Res 2017;22:30. [PMID: 28854966 PMCID: PMC5577694 DOI: 10.1186/s40001-017-0270-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/08/2017] [Accepted: 08/11/2017] [Indexed: 01/04/2023] Open

Abstract

BACKGROUND

Tumors in radiologically dense breasts were overlooked on mammograms more often than tumors in low-density breasts. A fast, reproducible, and automated method of assessing percentage mammographic density (PMD) would be desirable to support decisions on whether ultrasonography should be provided for women in addition to mammography in diagnostic mammography units. PMD assessment has still not been included in clinical routine work, as there are issues of interobserver variability and the procedure is quite time-consuming. This study investigated whether fully automatically generated texture features of mammograms can replace time-consuming semi-automatic PMD assessment to predict a patient's risk of having an invasive breast tumor that is visible on ultrasound but masked on mammography (mammography failure).

METHODS

This observational study included 1334 women with invasive breast cancer treated at a hospital-based diagnostic mammography unit. Ultrasound was available for the entire cohort as part of routine diagnosis. Computer-based threshold PMD assessments ("observed PMD") were carried out and 363 texture features were obtained from each mammogram. Several variable selection and regression techniques (univariate selection, lasso, boosting, random forest) were applied to predict PMD from the texture features. Each set of predicted PMD values was used as a new predictor for masking in a logistic regression model together with clinical predictors. These four logistic regression models with predicted PMD were compared among themselves and with a logistic regression model with observed PMD. The most accurate masking prediction was determined by cross-validation.

RESULTS

About 120 of the 363 texture features were selected for predicting PMD. Density predictions with boosting were the best substitute for observed PMD to predict masking. Overall, the corresponding logistic regression model performed better (cross-validated AUC, 0.747) than one without mammographic density (0.734), but less well than the one with the observed PMD (0.753). However, in patients with an assigned mammography failure risk >10%, covering about half of all masked tumors, the boosting-based model performed at least as accurately as the original PMD model.

CONCLUSION

Automatically generated texture features can replace semi-automatically determined PMD in a prediction model for mammography failure, such that more than 50% of masked tumors could be discovered.

Affiliation(s)

Lothar Häberle University Breast Center for Franconia, Department of Gynecology and Obstetrics, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany; Biostatistics Unit, Department of Gynecology and Obstetrics, Erlangen University Hospital, Erlangen, Germany
Carolin C Hack University Breast Center for Franconia, Department of Gynecology and Obstetrics, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
Katharina Heusinger University Breast Center for Franconia, Department of Gynecology and Obstetrics, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
Florian Wagner Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
Sebastian M Jud University Breast Center for Franconia, Department of Gynecology and Obstetrics, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
Michael Uder University Breast Center for Franconia, Institute of Radiology, Comprehensive Cancer Center EMN, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Erlangen, Germany
Matthias W Beckmann University Breast Center for Franconia, Department of Gynecology and Obstetrics, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
Rüdiger Schulz-Wendtland University Breast Center for Franconia, Institute of Radiology, Comprehensive Cancer Center EMN, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Erlangen, Germany
Thomas Wittenberg Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
Peter A Fasching University Breast Center for Franconia, Department of Gynecology and Obstetrics, Erlangen University Hospital, Friedrich Alexander University of Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany; Division Hematology/Oncology, Department of Medicine, David Geffen School of Medicine, University of California at Los Angeles, Los Angeles, CA, USA

Naue J, Hoefsloot HCJ, Mook ORF, Rijlaarsdam-Hoekstra L, van der Zwalm MCH, Henneman P, Kloosterman AD, Verschure PJ. Chronological age prediction based on DNA methylation: Massive parallel sequencing and random forest regression. Forensic Sci Int Genet 2017;31:19-28. [PMID: 28841467 DOI: 10.1016/j.fsigen.2017.07.015] [Citation(s) in RCA: 105] [Impact Index Per Article: 15.0] [Received: 07/10/2017] [Revised: 07/26/2017] [Accepted: 07/30/2017] [Indexed: 01/24/2023]

Abstract

The use of DNA methylation (DNAm) to obtain additional information in forensic investigations has proven to be a promising and growing field of interest. Prediction of the chronological age based on age-dependent changes in the DNAm of specific CpG sites within the genome is one such potential application. Here we present an age-prediction tool for whole blood based on massive parallel sequencing (MPS) and a random forest machine learning algorithm. MPS allows accurate DNAm determination of pre-selected markers and neighboring CpG sites to identify the best age-predictive markers for the age-prediction tool. Fifteen age-dependent markers of different loci were initially chosen based on publicly available 450K microarray data, and 13 were finally selected for the age tool based on MPS (DDO, ELOVL2, F5, GRM2, HOXC4, KLF14, LDB2, MEIS1-AS3, NKIRAS2, RPA2, SAMD10, TRIM59, ZYG11A). Whole blood samples of 208 individuals were used for training of the algorithm and a further 104 individuals were used for model evaluation (age 18-69). In the case of KLF14, LDB2, SAMD10, and GRM2, neighboring CpG sites and not the initial 450K sites were chosen for the final model. Cross-validation of the training set yielded a mean absolute deviation (MAD) of 3.21 years and a root-mean-square error (RMSE) of 3.97 years. Evaluation of model performance using the test set showed a comparable result (MAD 3.16 years, RMSE 3.93 years). A reduced model based on only the top 4 markers (ELOVL2, F5, KLF14, and TRIM59) resulted in an RMSE of 4.19 years and a MAD of 3.24 years for the test set (cross-validation training set: RMSE 4.63 years, MAD 3.64 years). The amplified region was additionally investigated for occurrence of SNPs in case of an aberrant DNAm result, which in some cases can be an indication of a deviation in DNAm. Our approach uncovered well-known DNAm age-dependent markers, as well as additional new age-dependent sites for improvement of the model, and allowed the creation of a reliable and accurate epigenetic tool for age prediction without restriction to a linear change in DNAm with age.
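For readers comparing the error figures quoted in this abstract, the following minimal sketch computes the two measures it names, the mean absolute deviation (MAD) and the root-mean-square error (RMSE), between predicted and chronological ages. The function names and the ages are hypothetical and are not data from the study.

```python
import math


def mean_absolute_deviation(predicted, actual):
    """Average of |predicted - actual| over all samples."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_square_error(predicted, actual):
    """Square root of the average squared difference over all samples."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Hypothetical ages in years, for illustration only.
chronological_age = [25.0, 40.0, 61.0, 33.0]
predicted_age = [28.0, 37.0, 65.0, 33.0]

mad = mean_absolute_deviation(predicted_age, chronological_age)
rmse = root_mean_square_error(predicted_age, chronological_age)
print(f"MAD = {mad:.2f} years, RMSE = {rmse:.2f} years")
```

Because squaring weights large errors more heavily, the RMSE is never smaller than the MAD on the same data, which matches the pattern in every pair of values reported above (for example, 3.97 versus 3.21 years).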

Applying a new bilateral mammographic density segmentation method to improve accuracy of breast cancer risk prediction. Int J Comput Assist Radiol Surg 2017;12:1819-1828. [PMID: 28726117 DOI: 10.1007/s11548-017-1648-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/21/2017] [Accepted: 07/12/2017] [Indexed: 10/19/2022]

Häberle L, Hein A, Rübner M, Schneider M, Ekici AB, Gass P, Hartmann A, Schulz-Wendtland R, Beckmann MW, Lo WY, Schroth W, Brauch H, Fasching PA, Wunderle M. Predicting Triple-Negative Breast Cancer Subtype Using Multiple Single Nucleotide Polymorphisms for Breast Cancer Risk and Several Variable Selection Methods. Geburtshilfe Frauenheilkd 2017;77:667-678. [PMID: 28757654 DOI: 10.1055/s-0043-111602] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/07/2017] [Revised: 05/15/2017] [Accepted: 05/16/2017] [Indexed: 12/22/2022] Open

Abstract

INTRODUCTION

Studies of triple-negative breast cancer have recently been extending the inclusion criteria and incorporating additional molecular markers into the selection criteria, opening up scope for targeted therapies. The screening phases required for studies of this type are often prolonged, since the process of determining the molecular subtype and carrying out additional biomarker assessment is time-consuming. Parameters such as germline genotypes capable of predicting the molecular subtype before it becomes available from pathology might be helpful for treatment planning and optimizing the timing and cost of screening phases. This appears to be feasible, as rapid and low-cost genotyping methods are becoming increasingly available. The aim of this study was to identify single nucleotide polymorphisms (SNPs) for breast cancer risk capable of predicting triple negativity, in addition to clinical predictors, in breast cancer patients.

METHODS

This cross-sectional observational study included 1271 women with invasive breast cancer who were treated at a university hospital. A total of 76 validated breast cancer risk SNPs were successfully genotyped. Univariate associations between each SNP and triple negativity were explored using logistic regression analyses. Several variable selection and regression techniques were applied to identify a set of SNPs that together improve the prediction of triple negativity in addition to the clinical predictors of age at diagnosis and body mass index (BMI). The most accurate prediction method was determined by cross-validation.

RESULTS

The SNP rs10069690 (TERT, CLPTM1L) was the only significant SNP (corrected p = 0.02) after correction of p values for multiple testing in the univariate analyses. This SNP and three additional SNPs from the genes RAD51B, CCND1, and FGFR2 were selected for prediction of triple negativity. The addition of these SNPs to clinical predictors increased the cross-validated area under the curve (AUC) from 0.618 to 0.625. Age at diagnosis was the strongest predictor, stronger than any genetic characteristics.

CONCLUSION

Prediction of triple-negative breast cancer can be improved if SNPs associated with breast cancer risk are added to a prediction rule based on age at diagnosis and BMI. This finding could be used for prescreening purposes in complex molecular therapy studies for triple-negative breast cancer.

Affiliation(s)

Lothar Häberle Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany; Biostatistics Unit, Department of Gynecology and Obstetrics, Erlangen University Hospital, Erlangen, Germany
Alexander Hein Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Matthias Rübner Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Michael Schneider Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Arif B Ekici Institute of Human Genetics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Paul Gass Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Arndt Hartmann Institute of Pathology, Erlangen University Hospital, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Rüdiger Schulz-Wendtland Institute of Diagnostic Radiology, Erlangen University Hospital, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Matthias W Beckmann Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Wing-Yee Lo Dr. Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany
Werner Schroth Dr. Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany
Hiltrud Brauch Dr. Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany; German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
Peter A Fasching Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
Marius Wunderle Department of Gynecology and Obstetrics, Erlangen University Hospital, University Breast Center for Franconia, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany

Lottaz C, Gronwald W, Spang R, Engelmann JC. High-Dimensional Profiling for Computational Diagnosis. Methods Mol Biol 2017;1526:205-229. [PMID: 27896744 DOI: 10.1007/978-1-4939-6613-4_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]

InFlo: a novel systems biology framework identifies cAMP-CREB1 axis as a key modulator of platinum resistance in ovarian cancer. Oncogene 2016;36:2472-2482. [PMID: 27819677 PMCID: PMC5415943 DOI: 10.1038/onc.2016.398] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/27/2016] [Revised: 08/23/2016] [Accepted: 09/18/2016] [Indexed: 01/05/2023]

Abstract

Characterizing the complex interplay of cellular processes in cancer would enable the discovery of key mechanisms underlying its development and progression. Published approaches to decipher driver mechanisms do not explicitly model tissue-specific changes in pathway networks and the regulatory disruptions related to genomic aberrations in cancers. We therefore developed InFlo, a novel systems biology approach for characterizing complex biological processes using a unique multidimensional framework integrating transcriptomic, genomic and/or epigenomic profiles for any given cancer sample. We show that InFlo robustly characterizes tissue-specific differences in activities of signalling networks on a genome scale using unique probabilistic models of molecular interactions on a per-sample basis. Using large-scale multi-omics cancer datasets, we show that InFlo exhibits higher sensitivity and specificity in detecting pathway networks associated with specific disease states when compared to published pathway network modelling approaches. Furthermore, InFlo's ability to infer the activity of unmeasured signalling network components was also validated using orthogonal gene expression signatures. We then evaluated multi-omics profiles of primary high-grade serous ovarian cancer tumours (N=357) to delineate mechanisms underlying resistance to frontline platinum-based chemotherapy. InFlo was the only algorithm to identify hyperactivation of the cAMP-CREB1 axis as a key mechanism associated with resistance to platinum-based therapy, a finding that we subsequently experimentally validated. We confirmed that inhibition of CREB1 phosphorylation potently sensitized resistant cells to platinum therapy and was effective in killing ovarian cancer stem cells that contribute to both platinum resistance and tumour recurrence. Thus, we propose InFlo to be a scalable, widely applicable, and robust integrative network modelling framework for the discovery of evidence-based biomarkers and therapeutic targets.

CAFÉ-Map: Context Aware Feature Mapping for mining high dimensional biomedical data. Comput Biol Med 2016;79:68-79. [PMID: 27764717 DOI: 10.1016/j.compbiomed.2016.10.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/09/2016] [Revised: 10/05/2016] [Accepted: 10/10/2016] [Indexed: 12/18/2022]

Hamm A, Prenen H, Van Delm W, Di Matteo M, Wenes M, Delamarre E, Schmidt T, Weitz J, Sarmiento R, Dezi A, Gasparini G, Rothé F, Schmitz R, D'Hoore A, Iserentant H, Hendlisz A, Mazzone M. Tumour-educated circulating monocytes are powerful candidate biomarkers for diagnosis and disease follow-up of colorectal cancer. Gut 2016;65:990-1000. [PMID: 25814648 DOI: 10.1136/gutjnl-2014-308988] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 12/08/2014] [Accepted: 03/06/2015] [Indexed: 12/23/2022]

Abstract

OBJECTIVE

Cancer immunology is a growing field of research whose aim is to develop innovative therapies and diagnostic tests. Starting from the hypothesis that immune cells promptly respond to harmful stimuli, we used peripheral blood monocytes in order to characterise a distinct gene expression profile and to evaluate its potential as a candidate diagnostic biomarker in patients with colorectal cancer (CRC), a still unmet clinical need.

DESIGN

We performed a case-control study including 360 peripheral blood monocyte samples from four European oncological centres and defined a gene expression profile specific to CRC. The robustness of the genetic profile and disease specificity were assessed in an independent setting.

RESULTS

This screen returned 43 putative diagnostic markers, which we refined and validated in the confirmative multicentric analysis to 23 genes with outstanding diagnostic accuracy (area under the curve (AUC)=0.99 (0.99 to 1.00), Se=100.0% (100.0% to 100.0%), Sp=92.9% (78.6% to 100.0%) in multiple-gene receiver operating characteristic analysis). The diagnostic accuracy was robustly maintained in prospectively collected independent samples (AUC=0.95 (0.85 to 1.00), Se=92.6% (81.5% to 100.0%), Sp=92.3% (76.9% to 100.0%)). This monocyte signature was expressed at early disease onset, remained robust over the course of disease progression, and was specific for the monocytic fraction of mononuclear cells. The gene modulation was induced specifically by soluble factors derived from transformed colon epithelium in comparison to normal colon or other cancer histotypes. Moreover, expression changes were plastic and reversible, as they were abrogated upon withdrawal of these tumour-released factors. Consistently, the modified set of genes reverted to normal expression upon curative treatment and was specific for CRC.

CONCLUSIONS

Our study is the first to demonstrate monocyte plasticity in response to tumour-released soluble factors. The identified distinct signature in tumour-educated monocytes might be used as a candidate biomarker in CRC diagnosis and harbours the potential for disease follow-up and therapeutic monitoring.

Affiliation(s)

Alexander Hamm: Laboratory of Molecular Oncology and Angiogenesis, Vesalius Research Center, VIB, Leuven, Belgium; Laboratory of Molecular Oncology and Angiogenesis, Department of Oncology, Vesalius Research Center, KU Leuven, Leuven, Belgium
Hans Prenen: Digestive Oncology, University Hospitals Leuven and Department of Oncology, KU Leuven, Leuven, Belgium
Wouter Van Delm: Nucleomics Core, VIB, Leuven, Belgium
Mario Di Matteo: Laboratory of Molecular Oncology and Angiogenesis, Vesalius Research Center, VIB, Leuven, Belgium; Laboratory of Molecular Oncology and Angiogenesis, Department of Oncology, Vesalius Research Center, KU Leuven, Leuven, Belgium
Mathias Wenes: Laboratory of Molecular Oncology and Angiogenesis, Vesalius Research Center, VIB, Leuven, Belgium; Laboratory of Molecular Oncology and Angiogenesis, Department of Oncology, Vesalius Research Center, KU Leuven, Leuven, Belgium
Estelle Delamarre: Laboratory of Molecular Oncology and Angiogenesis, Vesalius Research Center, VIB, Leuven, Belgium; Laboratory of Molecular Oncology and Angiogenesis, Department of Oncology, Vesalius Research Center, KU Leuven, Leuven, Belgium
Thomas Schmidt: Department of General, Visceral, and Transplantation Surgery, University of Heidelberg, Heidelberg, Germany
Jürgen Weitz: Department of General, Visceral, and Transplantation Surgery, University of Heidelberg, Heidelberg, Germany; Department of Visceral, Thoracic, and Vascular Surgery, University Hospital Carl Gustav Carus, Technical University Dresden, Dresden, Germany
Roberta Sarmiento: Department of Oncology, San Filippo Neri, Rome, Italy
Angelo Dezi: Department of Oncology, San Filippo Neri, Rome, Italy
Giampietro Gasparini: Department of Oncology, San Filippo Neri, Rome, Italy
Françoise Rothé: Medical Oncology Clinic, Institut Jules Bordet, Brussels, Belgium
Robin Schmitz: Department of General, Visceral, and Transplantation Surgery, University of Heidelberg, Heidelberg, Germany
André D'Hoore: Department of Abdominal Surgery, University Hospitals Leuven, KU Leuven, Leuven, Belgium
Hannes Iserentant: VIB, Zwijnaarde, Belgium
Alain Hendlisz: Medical Oncology Clinic, Institut Jules Bordet, Brussels, Belgium
Massimiliano Mazzone: Laboratory of Molecular Oncology and Angiogenesis, Vesalius Research Center, VIB, Leuven, Belgium; Laboratory of Molecular Oncology and Angiogenesis, Department of Oncology, Vesalius Research Center, KU Leuven, Leuven, Belgium

Huang HH, Liu XY, Liang Y. Feature Selection and Cancer Classification via Sparse Logistic Regression with the Hybrid L1/2+2 Regularization. PLoS One 2016;11:e0149675. [PMID: 27136190 PMCID: PMC4852916 DOI: 10.1371/journal.pone.0149675] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2015] [Accepted: 02/02/2016] [Indexed: 11/18/2022] Open

Jong VL, Novianti PW, Roes KCB, Eijkemans MJC. Selecting a classification function for class prediction with gene expression data. Bioinformatics 2016;32:1814-22. [PMID: 26873933 DOI: 10.1093/bioinformatics/btw034] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2015] [Accepted: 01/15/2016] [Indexed: 11/13/2022] Open

Jong VL, Novianti PW, Roes KCB, Eijkemans MJC. Exploring homogeneity of correlation structures of gene expression datasets within and between etiological disease categories. Stat Appl Genet Mol Biol 2015;13:717-32. [PMID: 25503674 DOI: 10.1515/sagmb-2014-0003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]

Factors affecting the accuracy of a class prediction model in gene expression data. BMC Bioinformatics 2015;16:199. [PMID: 26093633 PMCID: PMC4475623 DOI: 10.1186/s12859-015-0610-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2015] [Accepted: 04/30/2015] [Indexed: 01/12/2023] Open

Abstract

BACKGROUND

Class prediction models have been shown to have varying performances in clinical gene expression datasets. Previous evaluation studies, mostly done in the field of cancer, showed that the accuracy of class prediction models differs from dataset to dataset and depends on the type of classification function. While a substantial amount of information is known about the characteristics of classification functions, little has been done to determine which characteristics of gene expression data have an impact on the performance of a classifier. This study aims to empirically identify data characteristics that affect the predictive accuracy of classification models, outside of the field of cancer.

RESULTS

Datasets from twenty-five studies meeting predefined inclusion and exclusion criteria were downloaded. Nine classification functions were chosen, falling within the categories: discriminant analyses or Bayes classifiers, tree-based methods, regularization and shrinkage methods, and nearest-neighbor methods. Consequently, nine class prediction models were built for each dataset using the same procedure and their performances were evaluated by calculating their accuracies. The characteristics of each experiment were recorded (i.e., observed disease, medical question, tissue/cell types and sample size) together with characteristics of the gene expression data, namely the number of differentially expressed genes, the fold changes and the within-class correlations. Their effects on the accuracy of a class prediction model were statistically assessed by random effects logistic regression. The number of differentially expressed genes and the average fold change had a significant impact on the accuracy of a classification model and gave individual explained-variation in prediction accuracy of up to 72% and 57%, respectively. Multivariable random effects logistic regression with forward selection yielded the two aforementioned study factors and the within-class correlation as factors affecting the accuracy of classification functions, explaining 91.5% of the between-study variation.

CONCLUSIONS

We evaluated study- and data-related factors that might explain the varying performances of classification functions in non-cancerous datasets. Our results showed that the number of differentially expressed genes, the fold change, and the correlation in gene expression data significantly affect the accuracy of class prediction models.

Matamala N, Vargas MT, González-Cámpora R, Miñambres R, Arias JI, Menéndez P, Andrés-León E, Gómez-López G, Yanowsky K, Calvete-Candenas J, Inglada-Pérez L, Martínez-Delgado B, Benítez J. Tumor microRNA expression profiling identifies circulating microRNAs for early breast cancer detection. Clin Chem 2015;61:1098-106. [PMID: 26056355 DOI: 10.1373/clinchem.2015.238691] [Citation(s) in RCA: 148] [Impact Index Per Article: 16.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2015] [Accepted: 05/07/2015] [Indexed: 01/20/2023]

van den Berg BA, Reinders MJT, de Ridder D, de Beer TAP. Insight into neutral and disease-associated human genetic variants through interpretable predictors. PLoS One 2015;10:e0120729. [PMID: 25826299 PMCID: PMC4380319 DOI: 10.1371/journal.pone.0120729] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2014] [Accepted: 01/14/2015] [Indexed: 11/30/2022] Open

Taskesen E, Babaei S, Reinders MMJ, de Ridder J. Integration of gene expression and DNA-methylation profiles improves molecular subtype classification in acute myeloid leukemia. BMC Bioinformatics 2015;16 Suppl 4:S5. [PMID: 25734246 PMCID: PMC4347619 DOI: 10.1186/1471-2105-16-s4-s5] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open

Kempowsky-Hamon T, Valle C, Lacroix-Triki M, Hedjazi L, Trouilh L, Lamarre S, Labourdette D, Roger L, Mhamdi L, Dalenc F, Filleron T, Favre G, François JM, Le Lann MV, Anton-Leberre V. Fuzzy logic selection as a new reliable tool to identify molecular grade signatures in breast cancer--the INNODIAG study. BMC Med Genomics 2015;8:3. [PMID: 25888889 PMCID: PMC4342216 DOI: 10.1186/s12920-015-0077-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2014] [Accepted: 01/12/2015] [Indexed: 01/22/2023] Open

Abstract

BACKGROUND

Personalized medicine has become a priority in breast cancer patient management. In addition to the routinely used clinicopathological characteristics, clinicians will have to face an increasing amount of data derived from tumor molecular profiling. The aims of this study were to develop a new gene selection method based on a fuzzy logic selection and classification algorithm, and to validate the gene signatures obtained on breast cancer patient cohorts.

METHODS

We analyzed data from four published gene expression datasets for breast carcinomas. We identified the best discriminating genes by comparing molecular expression profiles between histologic grade 1 and 3 tumors for each of the training datasets. The most pertinent probes were selected and used to define fuzzy molecular grade 1-like (good prognosis) and fuzzy molecular grade 3-like (poor prognosis) profiles. To evaluate the prognostic performance of the fuzzy grade signatures in breast cancer tumors, a Kaplan-Meier analysis was conducted to compare the relapse-free survival deduced from histologic grade and fuzzy molecular grade classification.

RESULTS

We applied the fuzzy logic selection on breast cancer databases and obtained four new gene signatures. Analysis in the public training sets showed good performance of these gene signatures for grade (sensitivity from 90% to 95%, specificity from 67% to 93%). To validate these gene signatures, we designed probes on custom microarrays and tested them on 150 invasive breast carcinomas. Good performance was obtained with an error rate of less than 10%. For one gene signature, among 74 histologic grade 3 and 18 grade 1 tumors, 88 cases (96%) were correctly assigned. Interestingly, histologic grade 2 tumors (n = 58) were split between these two molecular grade categories.

CONCLUSION

We confirmed the use of fuzzy logic selection as a new tool to identify gene signatures with good reliability and increased classification power. This method, based on artificial intelligence algorithms, was successfully applied to breast cancer molecular grade classification, allowing histologic grade 2 tumors to be classified as grade 1-like or grade 3-like to improve patient prognosis. It opens the way to further development for identification of new biomarker combinations in other applications such as prediction of treatment response.

Affiliation(s)

Tatiana Kempowsky-Hamon: CNRS, LAAS, F-31400, Toulouse, France. Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France.
Carine Valle: Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France. INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400, Toulouse, France. CNRS, UMR5504, F-31400, Toulouse, France.
Magali Lacroix-Triki: Institut Claudius Regaud, Biology and Pathology Department; INSERM UMR1037, Toulouse, France.
Lyamine Hedjazi: CNRS, LAAS, F-31400, Toulouse, France. Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France.
Lidwine Trouilh: Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France. INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400, Toulouse, France. CNRS, UMR5504, F-31400, Toulouse, France.
Sophie Lamarre: Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France. INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400, Toulouse, France. CNRS, UMR5504, F-31400, Toulouse, France.
Delphine Labourdette: Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France. INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400, Toulouse, France. CNRS, UMR5504, F-31400, Toulouse, France.
Laurence Roger: Institut Claudius Regaud, Biology and Pathology Department; INSERM UMR1037, Toulouse, France.
Loubna Mhamdi: Institut Claudius Regaud, Biology and Pathology Department; INSERM UMR1037, Toulouse, France.
Florence Dalenc: Dendris SAS, 8 Rue de Cugnaux, 31300, Toulouse, France.
Thomas Filleron: Institut Claudius Regaud, Oncology Department, Toulouse, France.
Gilles Favre: Institut Claudius Regaud, Biology and Pathology Department; INSERM UMR1037, Toulouse, France.
Jean-Marie François: Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France. INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400, Toulouse, France. CNRS, UMR5504, F-31400, Toulouse, France. Dendris SAS, 8 Rue de Cugnaux, 31300, Toulouse, France.
Marie-Véronique Le Lann: CNRS, LAAS, F-31400, Toulouse, France. Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France.
Véronique Anton-Leberre: Université de Toulouse; INSA, UPS, INP; LISBP, F-31077, Toulouse, France. INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400, Toulouse, France. CNRS, UMR5504, F-31400, Toulouse, France.

Ma C, Zhang HH, Wang X. Machine learning for Big Data analytics in plants. Trends Plant Sci 2014;19:798-808. [PMID: 25223304 DOI: 10.1016/j.tplants.2014.08.004] [Citation(s) in RCA: 93] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/28/2014] [Revised: 07/30/2014] [Accepted: 08/20/2014] [Indexed: 05/19/2023]

Tanić M, Yanowski K, Andrés E, Gómez-López G, Socorro MRP, Pisano DG, Martinez-Delgado B, Benítez J. miRNA expression profiling of formalin-fixed paraffin-embedded (FFPE) hereditary breast tumors. Genom Data 2014;3:75-9. [PMID: 26484152 PMCID: PMC4535901 DOI: 10.1016/j.gdata.2014.11.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/30/2014] [Revised: 11/13/2014] [Accepted: 11/17/2014] [Indexed: 10/28/2022]

Tanic M, Yanowski K, Gómez-López G, Rodriguez-Pinilla MS, Marquez-Rodas I, Osorio A, Pisano DG, Martinez-Delgado B, Benítez J. MicroRNA expression signatures for the prediction of BRCA1/2 mutation-associated hereditary breast cancer in paraffin-embedded formalin-fixed breast tumors. Int J Cancer 2014;136:593-602. [PMID: 24917463 DOI: 10.1002/ijc.29021] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2014] [Accepted: 05/26/2014] [Indexed: 01/07/2023]

Emura T, Chen YH. Gene selection for survival data under dependent censoring: A copula-based approach. Stat Methods Med Res 2014;25:2840-2857. [PMID: 24821000 DOI: 10.1177/0962280214533378] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]

Novianti PW, Roes KCB, Eijkemans MJC. Evaluation of gene expression classification studies: factors associated with classification performance. PLoS One 2014;9:e96063. [PMID: 24770439 PMCID: PMC4000205 DOI: 10.1371/journal.pone.0096063] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2013] [Accepted: 04/03/2014] [Indexed: 12/22/2022] Open

van den Berg BA, Reinders MJT, Roubos JA, de Ridder D. SPiCE: a web-based tool for sequence-based protein classification and exploration. BMC Bioinformatics 2014;15:93. [PMID: 24685258 PMCID: PMC4021553 DOI: 10.1186/1471-2105-15-93] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2013] [Accepted: 03/26/2014] [Indexed: 12/16/2022] Open

Breast cancer subtype specific classifiers of response to neoadjuvant chemotherapy do not outperform classifiers trained on all subtypes. PLoS One 2014;9:e88551. [PMID: 24558399 PMCID: PMC3928239 DOI: 10.1371/journal.pone.0088551] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2013] [Accepted: 01/06/2014] [Indexed: 11/19/2022] Open

Staiger C, Cadot S, Györffy B, Wessels LFA, Klau GW. Current composite-feature classification methods do not outperform simple single-genes classifiers in breast cancer prognosis. Front Genet 2013;4:289. [PMID: 24391662 PMCID: PMC3870302 DOI: 10.3389/fgene.2013.00289] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2013] [Accepted: 11/28/2013] [Indexed: 01/21/2023] Open

Van den broeck A, Vankelecom H, Van Delm W, Gremeaux L, Wouters J, Allemeersch J, Govaere O, Roskams T, Topal B. Human pancreatic cancer contains a side population expressing cancer stem cell-associated and prognostic genes. PLoS One 2013;8:e73968. [PMID: 24069258 PMCID: PMC3775803 DOI: 10.1371/journal.pone.0073968] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2013] [Accepted: 07/23/2013] [Indexed: 12/17/2022] Open

Hedjazi L, Le Lann MV, Kempowsky T, Dalenc F, Aguilar-Martin J, Favre G. Symbolic data analysis to defy low signal-to-noise ratio in microarray data for breast cancer prognosis. J Comput Biol 2013;20:610-20. [PMID: 23899014 DOI: 10.1089/cmb.2012.0249] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open

García-Closas M, Gail MH, Kelsey KT, Ziegler RG. Searching for blood DNA methylation markers of breast cancer risk and early detection. J Natl Cancer Inst 2013;105:678-80. [PMID: 23578855 DOI: 10.1093/jnci/djt090] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open

Xu Z, Bolick SCE, DeRoo LA, Weinberg CR, Sandler DP, Taylor JA. Epigenome-wide association study of breast cancer using prospectively collected sister study samples. J Natl Cancer Inst 2013;105:694-700. [PMID: 23578854 DOI: 10.1093/jnci/djt045] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open

de Ridder D, de Ridder J, Reinders MJT. Pattern recognition in bioinformatics. Brief Bioinform 2013;14:633-47. [DOI: 10.1093/bib/bbt020] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

van Vliet MH, Burgmer P, de Quartel L, Brand JPL, de Best LCM, Viëtor H, Löwenberg B, Valk PJM, van Beers EH. Detection of CEBPA double mutants in acute myeloid leukemia using a custom gene expression array. Genet Test Mol Biomarkers 2013;17:395-400. [PMID: 23485358 DOI: 10.1089/gtmb.2012.0437] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Cranford SW, de Boer J, van Blitterswijk C, Buehler MJ. Materiomics: an -omics approach to biomaterials research. Adv Mater 2013;25:802-24. [PMID: 23297023 DOI: 10.1002/adma.201202553] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/23/2012] [Revised: 10/13/2012] [Indexed: 05/20/2023]

Urquidi V, Goodison S, Cai Y, Sun Y, Rosser CJ. A candidate molecular biomarker panel for the detection of bladder cancer. Cancer Epidemiol Biomarkers Prev 2012;21:2149-58. [PMID: 23097579 DOI: 10.1158/1055-9965.epi-12-0428] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Leung YY, Chang CQ, Hung YS. An integrated approach for identifying wrongly labelled samples when performing classification in microarray data. PLoS One 2012;7:e46700. [PMID: 23082127 PMCID: PMC3474777 DOI: 10.1371/journal.pone.0046700] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2012] [Accepted: 09/03/2012] [Indexed: 01/05/2023] Open

Abstract

Background

Using a hybrid approach for gene selection and classification is common, as the results obtained are generally better than performing the two tasks independently. Yet, for some microarray datasets, both the classification accuracy and the stability of the gene sets obtained still have room for improvement. This may be due to the presence of samples with wrong class labels (i.e. outliers). Outlier detection algorithms proposed so far are either not suitable for microarray data, or only solve the outlier detection problem on their own.

Results

We tackle the outlier detection problem based on a previously proposed Multiple-Filter-Multiple-Wrapper (MFMW) model, which was demonstrated to yield promising results when compared to other hybrid approaches (Leung and Hung, 2010). To incorporate outlier detection and overcome limitations of the existing MFMW model, three new features are introduced in our proposed MFMW-outlier approach: 1) an unbiased external Leave-One-Out Cross-Validation framework is developed to replace internal cross-validation in the previous MFMW model; 2) wrongly labeled samples are identified within the MFMW-outlier model; and 3) a stable set of genes is selected using an L1-norm SVM that removes any redundant genes present. Six binary-class microarray datasets were tested. Compared with outlier detection studies on the same datasets, MFMW-outlier could detect all the outliers found in the original paper (for which the data was provided for analysis), and the genes selected after outlier removal were proven to have biological relevance. We also compared MFMW-outlier with PRAPIV (Zhang et al., 2006) on the same synthetic datasets. MFMW-outlier gave better average precision and recall values in three different settings. Lastly, artificially flipped microarray datasets were created by removing our detected outliers and flipping some of the remaining samples' labels. Almost all the ‘wrong’ (artificially flipped) samples were detected, suggesting that MFMW-outlier was sufficiently powerful to detect outliers in high-dimensional microarray datasets.
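
The "external" leave-one-out cross-validation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the MFMW-outlier method: the mean-difference gene filter, the nearest-centroid classifier, the function names, and the toy data are all assumptions chosen for brevity. The only point it shows is that gene selection is repeated inside every fold, so the held-out sample never influences which genes are chosen.

```python
from statistics import mean


def select_features(X, y, k):
    """Rank features by absolute difference of the two class means; keep the top k.

    This simple filter is a stand-in for whatever gene selection a study uses.
    """
    scores = []
    for j in range(len(X[0])):
        m0 = mean(row[j] for row, lab in zip(X, y) if lab == 0)
        m1 = mean(row[j] for row, lab in zip(X, y) if lab == 1)
        scores.append(abs(m1 - m0))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid_predict(X, y, feats, x_new):
    """Assign x_new to the class whose centroid (on the selected features) is closest."""
    dist = {}
    for lab in (0, 1):
        rows = [row for row, l in zip(X, y) if l == lab]
        centroid = [mean(r[j] for r in rows) for j in feats]
        dist[lab] = sum((x_new[j] - c) ** 2 for j, c in zip(feats, centroid))
    return min(dist, key=dist.get)


def external_loocv_accuracy(X, y, k):
    """Leave-one-out accuracy with feature selection redone inside every fold."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        # Selection sees only the training samples of this fold.
        feats = select_features(X_train, y_train, k)
        if nearest_centroid_predict(X_train, y_train, feats, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: feature 0 separates the classes, features 1 and 2 do not.
X_toy = [[0.0, 1.0, 3.0], [0.2, 3.0, 1.0], [0.1, 2.0, 2.0],
         [5.0, 1.0, 3.0], [5.2, 3.0, 1.0], [5.1, 2.0, 2.0]]
y_toy = [0, 0, 0, 1, 1, 1]
accuracy = external_loocv_accuracy(X_toy, y_toy, k=1)
```

Selecting genes once on all samples and then cross-validating only the classifier would let the held-out sample leak into the selection step, which is the bias an external loop is meant to avoid.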

Glaab E, Bacardit J, Garibaldi JM, Krasnogor N. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data. PLoS One 2012;7:e39932. [PMID: 22808075 PMCID: PMC3394775 DOI: 10.1371/journal.pone.0039932] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2012] [Accepted: 05/29/2012] [Indexed: 12/19/2022] Open

van Vliet MH, Horlings HM, van de Vijver MJ, Reinders MJT, Wessels LFA. Integration of clinical and gene expression data has a synergetic effect on predicting breast cancer outcome. PLoS One 2012;7:e40358. [PMID: 22808140 PMCID: PMC3394805 DOI: 10.1371/journal.pone.0040358] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2012] [Accepted: 06/06/2012] [Indexed: 12/12/2022] Open

Staiger C, Cadot S, Kooter R, Dittrich M, Müller T, Klau GW, Wessels LFA. A critical evaluation of network and pathway-based classifiers for outcome prediction in breast cancer. PLoS One 2012;7:e34796. [PMID: 22558100 PMCID: PMC3338754 DOI: 10.1371/journal.pone.0034796] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2011] [Accepted: 03/09/2012] [Indexed: 12/19/2022] Open

Abstract

Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single genes classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as a benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single genes classifiers. We investigate the effect of (1) the number of selected features; (2) the specific gene set from which features are selected; (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single genes classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single genes sets is similar to the stability of composite feature sets. Based on these results, there is currently no reason to prefer prognostic classifiers based on composite features over single genes classifiers for predicting outcome in breast cancer.

van Iterson M, van Haagen HHHBM, Goeman JJ. Resolving confusion of tongues in statistics and machine learning: A primer for biologists and bioinformaticians. Proteomics 2012;12:543-9. [DOI: 10.1002/pmic.201100395] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2011] [Revised: 11/09/2011] [Accepted: 11/14/2011] [Indexed: 11/06/2022]

Robust two-gene classifiers for cancer prediction. Genomics 2011;99:90-5. [PMID: 22138042 DOI: 10.1016/j.ygeno.2011.11.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2011] [Revised: 11/04/2011] [Accepted: 11/09/2011] [Indexed: 11/23/2022]