Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Statnikov A, Aliferis CF, Tsamardinos I, Hardin D, Levy S. A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis. Bioinformatics 2004;21:631-43. [PMID: 15374862 DOI: 10.1093/bioinformatics/bti033] [Citation(s) in RCA: 597] [Impact Index Per Article: 28.4] [Indexed: 11/14/2022] Open

Cited by Other Article(s)

Piccolo SR, Lee TJ, Suh E, Hill K. ShinyLearner: A containerized benchmarking tool for machine-learning classification of tabular data. Gigascience 2020;9:giaa026. [PMID: 32249316 PMCID: PMC7131989 DOI: 10.1093/gigascience/giaa026] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/18/2019] [Revised: 12/05/2019] [Accepted: 02/28/2020] [Indexed: 11/27/2022] Open

Abstract

BACKGROUND

Classification algorithms assign observations to groups based on patterns in data. The machine-learning community has developed myriad classification algorithms, which are used in diverse life science research domains. Algorithm choice can affect classification accuracy dramatically, so it is crucial that researchers choose which algorithm(s) to apply in a given research domain on the basis of empirical evidence. In benchmark studies, multiple algorithms are applied to multiple datasets, and the researcher examines overall trends. In addition, the researcher may evaluate multiple hyperparameter combinations for each algorithm and use feature selection to reduce data dimensionality. Although software implementations of classification algorithms are widely available, robust benchmark comparisons are difficult to perform when researchers wish to compare algorithms that span multiple software packages. Programming interfaces, data formats, and evaluation procedures differ across software packages, and dependency conflicts may arise during installation.

FINDINGS

To address these challenges, we created ShinyLearner, an open-source project for integrating machine-learning packages into software containers. ShinyLearner provides a uniform interface for performing classification, irrespective of the library that implements each algorithm, thus facilitating benchmark comparisons. In addition, ShinyLearner enables researchers to optimize hyperparameters and select features via nested cross-validation; it tracks all nested operations and generates output files that make these steps transparent. ShinyLearner includes a Web interface to help users more easily construct the commands necessary to perform benchmark comparisons. ShinyLearner is freely available at https://github.com/srp33/ShinyLearner.
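The nested cross-validation mentioned above can be illustrated with a minimal sketch. This is not ShinyLearner's code or interface; the toy one-dimensional data, the small k-nearest-neighbours classifier, and every function name here are assumptions made purely for illustration. The inner loop tunes a hyperparameter (the number of neighbours) using only the outer-training data, and the outer loop scores the tuned model on data that played no part in tuning.

```python
from statistics import mean

def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds interleaved (train, test) index pairs."""
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points nearest to x (1-D features)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(sorted(set(votes)), key=votes.count)

def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test points that the k-NN rule labels correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_x)

def nested_cv(xs, ys, candidate_ks, outer_folds=3, inner_folds=2):
    """Return one (chosen_k, outer_test_accuracy) pair per outer fold."""
    results = []
    for outer_train, outer_test in k_fold_indices(len(xs), outer_folds):
        tr_x = [xs[i] for i in outer_train]
        tr_y = [ys[i] for i in outer_train]

        def inner_score(k):
            # Inner loop: estimate how well k performs using outer-training data only.
            scores = []
            for in_tr, in_te in k_fold_indices(len(tr_x), inner_folds):
                scores.append(accuracy([tr_x[i] for i in in_tr],
                                       [tr_y[i] for i in in_tr],
                                       [tr_x[i] for i in in_te],
                                       [tr_y[i] for i in in_te], k))
            return mean(scores)

        best_k = max(candidate_ks, key=inner_score)
        # Outer loop: evaluate the tuned model on the held-out outer fold.
        te_x = [xs[i] for i in outer_test]
        te_y = [ys[i] for i in outer_test]
        results.append((best_k, accuracy(tr_x, tr_y, te_x, te_y, best_k)))
    return results

# Hypothetical, well-separated toy data: class "a" near 0, class "b" near 5.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5]
ys = ["a"] * 6 + ["b"] * 6
results = nested_cv(xs, ys, candidate_ks=[1, 3])
```

Recording the chosen hyperparameter alongside each outer-fold score, as `results` does here, is one simple way to keep the nested steps inspectable.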

CONCLUSIONS

This software is a resource for researchers who wish to benchmark multiple classification or feature-selection algorithms on a given dataset. We hope it will serve as an example of combining the benefits of software containerization with a user-friendly approach.

Hammami M, Bechikh S, Louati A, Makhlouf M, Said LB. Feature construction as a bi-level optimization problem. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-04784-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022]

Yoo TK, Ryu IH, Choi H, Kim JK, Lee IS, Kim JS, Lee G, Rim TH. Explainable Machine Learning Approach as a Tool to Understand Factors Used to Select the Refractive Surgery Technique on the Expert Level. Transl Vis Sci Technol 2020;9:8. [PMID: 32704414 PMCID: PMC7346876 DOI: 10.1167/tvst.9.2.8] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 07/25/2019] [Accepted: 11/18/2019] [Indexed: 12/23/2022] Open

Abstract

Purpose

Recently, laser refractive surgery options, including laser epithelial keratomileusis, laser in situ keratomileusis, and small incision lenticule extraction, have successfully improved patients' quality of life. Evidence-based recommendation of an optimal surgery technique is valuable in increasing patient satisfaction. We developed an interpretable multiclass machine learning model that selects the laser surgery option at the expert level.

Methods

A multiclass XGBoost model was constructed to classify patients into four categories: laser epithelial keratomileusis, laser in situ keratomileusis, small incision lenticule extraction, and contraindication. The analysis included 18,480 subjects who intended to undergo refractive surgery at the B&VIIT Eye Center. Training (n = 10,561) and internal validation (n = 2640) were performed using subjects who visited between 2016 and 2017. The model was trained on the clinical decisions of highly experienced experts and on ophthalmic measurements. External validation (n = 5279) was conducted using subjects who visited in 2018. The SHapley Additive exPlanations technique was adopted to explain the output of the XGBoost model.

Results

The multiclass XGBoost model exhibited an accuracy of 81.0% and 78.9% when tested on the internal and external validation datasets, respectively. The SHapley Additive exPlanations results were consistent with prior knowledge from ophthalmologists. The explanations from one-versus-one and one-versus-rest XGBoost classifiers were effective in helping users understand the multicategorical classification problem.

Conclusions

This study suggests an expert-level multiclass machine learning model for selecting the refractive surgery for patients. It also provided a clinical understanding in a multiclass problem based on an explainable artificial intelligence technique.

Translational Relevance

Explainable machine learning exhibits a promising future for increasing the practical use of artificial intelligence in ophthalmic clinics.

Xu Z, Chou J, Zhang XS, Luo Y, Isakova T, Adekkanattu P, Ancker JS, Jiang G, Kiefer RC, Pacheco JA, Rasmussen LV, Pathak J, Wang F. Identifying sub-phenotypes of acute kidney injury using structured and unstructured electronic health record data with memory networks. J Biomed Inform 2020;102:103361. [PMID: 31911172 DOI: 10.1016/j.jbi.2019.103361] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 04/02/2019] [Revised: 11/18/2019] [Accepted: 12/16/2019] [Indexed: 01/08/2023]

Dragonfly Algorithm: Theory, Literature Review, and Application in Feature Selection. Nature-Inspired Optimizers 2020. [DOI: 10.1007/978-3-030-12127-3_4] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 01/23/2023]

Wnt/β-Catenin, Carbohydrate Metabolism, and PI3K-Akt Signaling Pathway-Related Genes as Potential Cancer Predictors. Journal of Healthcare Engineering 2019;2019:9724589. [PMID: 31781361 PMCID: PMC6855054 DOI: 10.1155/2019/9724589] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/17/2019] [Accepted: 09/17/2019] [Indexed: 01/07/2023]

Bir-Jmel A, Douiri SM, Elbernoussi S. Gene Selection via a New Hybrid Ant Colony Optimization Algorithm for Cancer Classification in High-Dimensional Data. Computational and Mathematical Methods in Medicine 2019;2019:7828590. [PMID: 31737086 PMCID: PMC6815598 DOI: 10.1155/2019/7828590] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/04/2019] [Revised: 08/14/2019] [Accepted: 09/09/2019] [Indexed: 11/18/2022]

Heitz F, Kommoss S, Tourani R, Grandelis A, Uppendahl L, Aliferis C, Burges A, Wang C, Canzler U, Wang J, Belau A, Prader S, Hanker L, Ma S, Ataseven B, Hilpert F, Schneider S, Sehouli J, Kimmig R, Kurzeder C, Schmalfeldt B, Braicu EI, Harter P, Dowdy SC, Winterhoff BJ, Pfisterer J, du Bois A. Dilution of Molecular-Pathologic Gene Signatures by Medically Associated Factors Might Prevent Prediction of Resection Status After Debulking Surgery in Patients With Advanced Ovarian Cancer. Clin Cancer Res 2019;26:213-219. [PMID: 31527166 DOI: 10.1158/1078-0432.ccr-19-1741] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 05/30/2019] [Revised: 08/08/2019] [Accepted: 09/11/2019] [Indexed: 11/16/2022]

Affiliation(s)

Florian Heitz: Department of Gynecology and Gynecologic Oncology, Kliniken-Essen-Mitte, Germany; Charité - Universitätsmedizin Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Department of Gynecology, Berlin, Germany; AGO Study Group
Stefan Kommoss: AGO Study Group; Department of Women's Health, Tuebingen University Hospital, Tuebingen, Germany
Roshan Tourani: Institute for Health Informatics (IHI), Academic Health Center, University of Minnesota, Minneapolis, Minnesota
Anthony Grandelis: Department of Gynecology, Obstetrics and Women's Health, Division of Gynecologic Oncology, University of Minnesota, Minneapolis, Minnesota
Locke Uppendahl: Department of Gynecology, Obstetrics and Women's Health, Division of Gynecologic Oncology, University of Minnesota, Minneapolis, Minnesota
Constantin Aliferis: Institute for Health Informatics (IHI), Academic Health Center, University of Minnesota, Minneapolis, Minnesota
Alexander Burges: AGO Study Group; Department of Obstetrics and Gynecology, University Hospital, LMU Munich, Germany
Chen Wang: Division of Gynecologic Surgery, Department of Obstetrics and Gynecology, Mayo Clinic, Rochester, Minnesota
Ulrich Canzler: AGO Study Group; Department of Gynecology and Obstetrics, Technische Universität Dresden, Dresden, Germany
Jinhua Wang: Institute for Health Informatics (IHI), Academic Health Center, University of Minnesota, Minneapolis, Minnesota
Antje Belau: AGO Study Group; Ernst Moritz Arndt Universität Greifswald - Klinik und Poliklinik für Frauenheilkunde und Geburtshilfe, Greifswald, Germany
Sonia Prader: Department of Gynecology and Gynecologic Oncology, Kliniken-Essen-Mitte, Germany
Lars Hanker: AGO Study Group; Klinik für Frauenheilkunde und Geburtshilfe, University of Schleswig-Holstein, Lübeck, Germany
Sisi Ma: Institute for Health Informatics (IHI), Academic Health Center, University of Minnesota, Minneapolis, Minnesota
Beyhan Ataseven: Department of Gynecology and Gynecologic Oncology, Kliniken-Essen-Mitte, Germany; Department of Obstetrics and Gynecology, University Hospital, LMU Munich, Germany
Felix Hilpert: AGO Study Group; Krankenhaus Jerusalem Hamburg, Hamburg, Germany
Stephanie Schneider: Department of Gynecology and Gynecologic Oncology, Kliniken-Essen-Mitte, Germany
Jalid Sehouli: Charité - Universitätsmedizin Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Department of Gynecology, Berlin, Germany
Rainer Kimmig: AGO Study Group; Department of Gynecology and Obstetrics, University of Duisburg-Essen, Essen, Germany
Christian Kurzeder: AGO Study Group; Universitätsspital Basel, Basel, Switzerland; Department of Obstetrics and Gynecology, University of Ulm, Ulm, Germany
Barbara Schmalfeldt: AGO Study Group; Technical University of Munich - Klinikum rechts der Isar, Munich, Germany; Department of Gynecology and Obstetrics, Technical University of Munich, Munich, Germany
Elena I Braicu: Charité - Universitätsmedizin Berlin, Humboldt-Universität zu Berlin, Berlin Institute of Health, Department of Gynecology, Berlin, Germany
Philipp Harter: Department of Gynecology and Gynecologic Oncology, Kliniken-Essen-Mitte, Germany; AGO Study Group
Sean C Dowdy: Division of Gynecologic Surgery, Department of Obstetrics and Gynecology, Mayo Clinic, Rochester, Minnesota
Boris J Winterhoff: Department of Gynecology, Obstetrics and Women's Health, Division of Gynecologic Oncology, University of Minnesota, Minneapolis, Minnesota
Jacobus Pfisterer: AGO Study Group; Gynecologic Oncology Center, Kiel, Germany
Andreas du Bois: Department of Gynecology and Gynecologic Oncology, Kliniken-Essen-Mitte, Germany; AGO Study Group

Qayyum A, Saeed Malik A, Saad NM, Iqbal M, Abdullah MF, Rasheed W, Abdullah TABR, Bin Jafaar MY. Image classification based on sparse-coded features using sparse coding technique for aerial imagery: a hybrid dictionary approach. Neural Comput Appl 2019. [DOI: 10.1007/s00521-017-3300-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]

Bang S, Yoo D, Kim SJ, Jhang S, Cho S, Kim H. Establishment and evaluation of prediction model for multiple disease classification based on gut microbial data. Sci Rep 2019;9:10189. [PMID: 31308384 PMCID: PMC6629854 DOI: 10.1038/s41598-019-46249-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 10/05/2018] [Accepted: 04/12/2019] [Indexed: 12/17/2022] Open

Lin X, Huang X, Zhou L, Ren W, Zeng J, Yao W, Wang X. The Robust Classification Model Based on Combinatorial Features. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019;16:650-657. [PMID: 29990202 DOI: 10.1109/tcbb.2017.2779512] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 06/08/2023]

Sumsion GR, Bradshaw MS, Hill KT, Pinto LD, Piccolo SR. Remote sensing tree classification with a multilayer perceptron. PeerJ 2019;7:e6101. [PMID: 30842894 PMCID: PMC6397751 DOI: 10.7717/peerj.6101] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/01/2018] [Accepted: 11/12/2018] [Indexed: 11/20/2022] Open

Kang C, Huo Y, Xin L, Tian B, Yu B. Feature selection and tumor classification for microarray data using relaxed Lasso and generalized multi-class support vector machine. J Theor Biol 2019;463:77-91. [DOI: 10.1016/j.jtbi.2018.12.010] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 06/02/2018] [Revised: 11/03/2018] [Accepted: 12/06/2018] [Indexed: 02/08/2023]

Hartono P. A transparent cancer classifier. Health Informatics J 2018;26:190-204. [DOI: 10.1177/1460458218817800] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]

Mafarja M, Aljarah I, Heidari AA, Faris H, Fournier-Viger P, Li X, Mirjalili S. Binary dragonfly optimization for feature selection using time-varying transfer functions. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2018.08.003] [Citation(s) in RCA: 246] [Impact Index Per Article: 35.1] [Indexed: 11/25/2022]

Caglar MU, Hockenberry AJ, Wilke CO. Predicting bacterial growth conditions from mRNA and protein abundances. PLoS One 2018;13:e0206634. [PMID: 30388153 PMCID: PMC6214550 DOI: 10.1371/journal.pone.0206634] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/23/2018] [Accepted: 10/16/2018] [Indexed: 01/30/2023] Open

Yoo TK, Choi JY, Seo JG, Ramasubramanian B, Selvaperumal S, Kim DW. The possibility of the combination of OCT and fundus images for improving the diagnostic accuracy of deep learning for age-related macular degeneration: a preliminary experiment. Med Biol Eng Comput 2018;57:677-687. [DOI: 10.1007/s11517-018-1915-z] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 12/11/2017] [Accepted: 10/09/2018] [Indexed: 12/23/2022]

Armañanzas R. Revealing post-transcriptional microRNA-mRNA regulations in Alzheimer's disease through ensemble graphs. BMC Genomics 2018;19:668. [PMID: 30255799 PMCID: PMC6157163 DOI: 10.1186/s12864-018-5025-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/16/2023] Open

Jadhav S, He H, Jenkins K. Information gain directed genetic algorithm wrapper feature selection for credit rating. Appl Soft Comput 2018. [DOI: 10.1016/j.asoc.2018.04.033] [Citation(s) in RCA: 140] [Impact Index Per Article: 20.0] [Indexed: 11/28/2022]

Tsamardinos I, Greasidou E, Borboudakis G. Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation. Mach Learn 2018;107:1895-1922. [PMID: 30393425 PMCID: PMC6191021 DOI: 10.1007/s10994-018-5714-4] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 08/03/2017] [Accepted: 04/21/2018] [Indexed: 12/26/2022]

Pan M, Zhang J. Quantile normalization for combining gene-expression datasets. Biotechnol Biotec Eq 2018. [DOI: 10.1080/13102818.2017.1419376] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022] Open

Wang A, An N, Chen G, Liu L, Alterovitz G. Subtype dependent biomarker identification and tumor classification from gene expression profiles. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2018.01.025] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/18/2022]

High-dimensional hybrid feature selection using interaction information-guided search. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2018.01.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Indexed: 11/21/2022]

Peeken JC, Goldberg T, Knie C, Komboz B, Bernhofer M, Pasa F, Kessel KA, Tafti PD, Rost B, Nüsslin F, Braun AE, Combs SE. Treatment-related features improve machine learning prediction of prognosis in soft tissue sarcoma patients. Strahlenther Onkol 2018;194:824-834. [PMID: 29557486 DOI: 10.1007/s00066-018-1294-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/05/2017] [Accepted: 03/05/2018] [Indexed: 12/01/2022]

Affiliation(s)

Jan C Peeken: Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Ismaninger Straße 22, 81675, Munich, Germany; Partner Site Munich, Deutsches Konsortium für Translationale Krebsforschung (DKTK), Munich, Germany
Tatyana Goldberg: Allianz SE, Königinstraße 28, 80802, Munich, Germany
Christoph Knie: Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Ismaninger Straße 22, 81675, Munich, Germany
Basil Komboz: Allianz SE, Königinstraße 28, 80802, Munich, Germany
Michael Bernhofer: Department for Bioinformatics and Computational Biology, Informatik 12, Technical University of Munich (TUM), Boltzmannstraße 3, 85748, Garching, Germany
Francesco Pasa: Department of Computer Science, Informatik 9, Technical University of Munich (TUM), Boltzmannstraße 3, 85748, Garching, Germany; Chair of Biomedical Physics, Department of Physics, Technical University of Munich (TUM), James-Franck-Straße 1, 85748, Garching, Germany
Kerstin A Kessel: Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Ismaninger Straße 22, 81675, Munich, Germany; Institute of Innovative Radiotherapy (iRT), Department of Radiation Sciences (DRS), Helmholtz Zentrum München, Ingolstaedter Landstraße 1, 85764, Neuherberg, Germany; Partner Site Munich, Deutsches Konsortium für Translationale Krebsforschung (DKTK), Munich, Germany
Pouya D Tafti: Allianz SE, Königinstraße 28, 80802, Munich, Germany
Burkhard Rost: Department for Bioinformatics and Computational Biology, Informatik 12, Technical University of Munich (TUM), Boltzmannstraße 3, 85748, Garching, Germany
Fridtjof Nüsslin: Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Ismaninger Straße 22, 81675, Munich, Germany
Andreas E Braun: Allianz SE, Königinstraße 28, 80802, Munich, Germany
Stephanie E Combs: Department of Radiation Oncology, Klinikum rechts der Isar, Technical University of Munich (TUM), Ismaninger Straße 22, 81675, Munich, Germany; Institute of Innovative Radiotherapy (iRT), Department of Radiation Sciences (DRS), Helmholtz Zentrum München, Ingolstaedter Landstraße 1, 85764, Neuherberg, Germany; Partner Site Munich, Deutsches Konsortium für Translationale Krebsforschung (DKTK), Munich, Germany

Gabryś HS, Buettner F, Sterzing F, Hauswald H, Bangert M. Design and Selection of Machine Learning Methods Using Radiomics and Dosiomics for Normal Tissue Complication Probability Modeling of Xerostomia. Front Oncol 2018;8:35. [PMID: 29556480 PMCID: PMC5844945 DOI: 10.3389/fonc.2018.00035] [Citation(s) in RCA: 112] [Impact Index Per Article: 16.0] [Received: 11/21/2017] [Accepted: 02/01/2018] [Indexed: 01/13/2023] Open

Abstract

Purpose

The purpose of this study is to investigate whether machine learning with dosiomic, radiomic, and demographic features allows for more precise xerostomia risk assessment than normal tissue complication probability (NTCP) models based on the mean radiation dose to the parotid glands.

Material and methods

A cohort of 153 head-and-neck cancer patients was used to model xerostomia at 0–6 months (early), 6–15 months (late), 15–24 months (long-term), and at any time (a longitudinal model) after radiotherapy. Predictive power of the features was evaluated by the area under the receiver operating characteristic curve (AUC) of univariate logistic regression models. The multivariate NTCP models were tuned and tested with single and nested cross-validation, respectively. We compared predictive performance of seven classification algorithms, six feature selection methods, and ten data cleaning/class balancing techniques using the Friedman test and the Nemenyi post hoc analysis.
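As a rough illustration of the Friedman test named above, the sketch below ranks classifiers within each dataset and computes the Friedman chi-square statistic from the mean ranks. The AUC values are invented for the example and are not taken from the study, the function names are hypothetical, and the formula omits the correction that is usually applied when ties are present.

```python
def average_ranks(scores):
    """Rank scores so the highest gets rank 1; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks

def friedman_statistic(score_table):
    """score_table[d][a] is the score of algorithm a on dataset d.

    Returns (mean rank per algorithm, Friedman chi-square without tie correction).
    """
    n = len(score_table)        # number of datasets (blocks)
    k = len(score_table[0])     # number of algorithms (treatments)
    rank_rows = [average_ranks(row) for row in score_table]
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2

# Hypothetical AUCs: rows are datasets, columns are three classifiers.
aucs = [
    [0.88, 0.80, 0.74],
    [0.85, 0.79, 0.70],
    [0.90, 0.82, 0.76],
    [0.86, 0.81, 0.72],
]
mean_ranks, chi2 = friedman_statistic(aucs)
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom; a post hoc procedure such as the Nemenyi test then compares the mean ranks pairwise.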

Results

NTCP models based on the parotid mean dose failed to predict xerostomia (AUCs < 0.60). The most informative predictors were found for late and long-term xerostomia. Late xerostomia correlated with the contralateral dose gradient in the anterior–posterior (AUC = 0.72) and the right–left (AUC = 0.68) direction, whereas long-term xerostomia was associated with parotid volumes (AUCs > 0.85), dose gradients in the right–left (AUCs > 0.78), and the anterior–posterior (AUCs > 0.72) direction. Multivariate models of long-term xerostomia were typically based on the parotid volume, the parotid eccentricity, and the dose–volume histogram (DVH) spread with the generalization AUCs ranging from 0.74 to 0.88. On average, support vector machines and extra-trees were the top performing classifiers, whereas the algorithms based on logistic regression were the best choice for feature selection. We found no advantage in using data cleaning or class balancing methods.

Conclusion

We demonstrated that incorporation of organ- and dose-shape descriptors is beneficial for xerostomia prediction in highly conformal radiotherapy treatments. Due to strong reliance on patient-specific, dose-independent factors, our results underscore the need for development of personalized data-driven risk profiles for NTCP models of xerostomia. The facilitated machine learning pipeline is described in detail and can serve as a valuable reference for future work in radiomic and dosiomic NTCP modeling.

Mazumdar H, Kim TH, Lee JM, Ha JH, Ahrberg CD, Chung BG. Prediction analysis and quality assessment of microwell array images. Electrophoresis 2018;39:948-956. [PMID: 29323408 DOI: 10.1002/elps.201700460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/08/2017] [Revised: 12/29/2017] [Accepted: 12/29/2017] [Indexed: 11/11/2022]

Bi-stage hierarchical selection of pathway genes for cancer progression using a swarm based computational approach. Appl Soft Comput 2018. [DOI: 10.1016/j.asoc.2017.10.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 02/05/2023]

Gene selection from large-scale gene expression data based on fuzzy interactive multi-objective binary optimization for medical diagnosis. Biocybern Biomed Eng 2018. [DOI: 10.1016/j.bbe.2018.02.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/20/2022]

Mohammed A, Biegert G, Adamec J, Helikar T. CancerDiscover: an integrative pipeline for cancer biomarker and cancer class prediction from high-throughput sequencing data. Oncotarget 2017;9:2565-2573. [PMID: 29416792 PMCID: PMC5788660 DOI: 10.18632/oncotarget.23511] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/28/2017] [Accepted: 12/09/2017] [Indexed: 11/25/2022] Open

Collaborative representation-based classification of microarray gene expression data. PLoS One 2017;12:e0189533. [PMID: 29236759 PMCID: PMC5728509 DOI: 10.1371/journal.pone.0189533] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/23/2017] [Accepted: 11/27/2017] [Indexed: 11/19/2022] Open

Sutton EJ, Huang EP, Drukker K, Burnside ES, Li H, Net JM, Rao A, Whitman GJ, Zuley M, Ganott M, Bonaccio E, Giger ML, Morris EA. Breast MRI radiomics: comparison of computer- and human-extracted imaging phenotypes. Eur Radiol Exp 2017;1:22. [PMID: 29708200 PMCID: PMC5909355 DOI: 10.1186/s41747-017-0025-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 07/06/2017] [Accepted: 09/19/2017] [Indexed: 01/18/2023] Open

Abstract

Background

In this study, we sought to investigate if computer-extracted magnetic resonance imaging (MRI) phenotypes of breast cancer could replicate human-extracted size and Breast Imaging-Reporting and Data System (BI-RADS) imaging phenotypes using MRI data from The Cancer Genome Atlas (TCGA) project of the National Cancer Institute.

Methods

Our retrospective interpretation study involved analysis of Health Insurance Portability and Accountability Act-compliant breast MRI data from The Cancer Imaging Archive, an open-source database from the TCGA project. This study was exempt from institutional review board approval at Memorial Sloan Kettering Cancer Center and the need for informed consent was waived. Ninety-one pre-operative breast MRIs with verified invasive breast cancers were analysed. Three fellowship-trained breast radiologists evaluated the index cancer in each case according to size and the BI-RADS lexicon for shape, margin, and enhancement (human-extracted image phenotypes [HEIP]). Human inter-observer agreement was analysed by the intra-class correlation coefficient (ICC) for size and Krippendorff’s α for other measurements. Quantitative MRI radiomics of computerised three-dimensional segmentations of each cancer generated computer-extracted image phenotypes (CEIP). Spearman’s rank correlation coefficients were used to compare HEIP and CEIP.

Results

Inter-observer agreement for HEIP varied, with the highest agreement seen for size (ICC 0.679) and shape (ICC 0.527). The computer-extracted maximum linear size replicated the human measurement with p < 10⁻¹². CEIP of shape, specifically sphericity and irregularity, replicated HEIP with both p values < 0.001. CEIP did not demonstrate agreement with HEIP of tumour margin or internal enhancement.

Conclusions

Quantitative radiomics of breast cancer may replicate human-extracted tumour size and BI-RADS imaging phenotypes, thus enabling precision medicine.

Choi JY, Yoo TK, Seo JG, Kwak J, Um TT, Rim TH. Multi-categorical deep learning neural network to classify retinal images: A pilot study employing small database. PLoS One 2017;12:e0187336. [PMID: 29095872 PMCID: PMC5667846 DOI: 10.1371/journal.pone.0187336] [Citation(s) in RCA: 117] [Impact Index Per Article: 14.6] [Received: 06/10/2017] [Accepted: 10/18/2017] [Indexed: 01/03/2023] Open

Abstract

Deep learning has emerged as a powerful tool for analyzing medical images, and computer-aided detection of retinal disease from fundus images has emerged as a new method. We applied a deep learning convolutional neural network, using MatConvNet, to the automated detection of multiple retinal diseases in fundus photographs from the STructured Analysis of the REtina (STARE) database. The dataset was built by expanding data on 10 categories, including normal retina and nine retinal diseases. The optimal outcomes were acquired by using random forest transfer learning based on the VGG-19 architecture. The classification results depended greatly on the number of categories: as the number of categories increased, the performance of the deep learning models diminished. When all 10 categories were included, we obtained an accuracy of 30.5%, a relative classifier information (RCI) of 0.052, and a Cohen's kappa of 0.224. When only three categories were considered (normal, background diabetic retinopathy, and dry age-related macular degeneration), the multi-categorical classifier showed an accuracy of 72.8%, an RCI of 0.283, and a kappa of 0.577. In addition, several ensemble classifiers enhanced the multi-categorical classification performance. Transfer learning combined with an ensemble classifier using a clustering and voting approach presented the best performance, with an accuracy of 36.7%, an RCI of 0.053, and a kappa of 0.225 in the 10-category retinal disease classification problem. First, owing to the small size of the datasets, the deep learning techniques in this study were not effective enough to be applied in clinics, where numerous patients with various types of retinal disorders present for diagnosis and treatment. Second, we found that transfer learning combined with ensemble classifiers can improve classification performance in detecting multi-categorical retinal diseases. Further studies should confirm the effectiveness of the algorithms with large datasets obtained from hospitals.
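For readers unfamiliar with the metrics quoted above, the following sketch shows how accuracy and Cohen's kappa can be computed from a multiclass confusion matrix. The matrix is made up for illustration and does not come from the study, and the function name is hypothetical. Kappa discounts the agreement that would be expected by chance from the row and column totals, which is why it can be much lower than accuracy when there are many classes.

```python
def accuracy_and_kappa(confusion):
    """confusion[i][j] counts samples of true class i predicted as class j."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal (plain accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement implied by the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical three-class confusion matrix with 30 samples per true class.
confusion = [
    [20, 5, 5],
    [5, 20, 5],
    [5, 5, 20],
]
acc, kappa = accuracy_and_kappa(confusion)
```

In this balanced toy case two thirds of the samples are classified correctly, chance agreement is one third, and kappa is therefore 0.5.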

Yu K, Wu X, Ding W, Mu Y, Wang H. Markov Blanket Feature Selection Using Representative Sets. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2017;28:2775-2788. [PMID: 28113384 DOI: 10.1109/tnnls.2016.2602365] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Abstract

It has received much attention in recent years to use Markov blankets in a Bayesian network for feature selection. The Markov blanket of a class attribute in a Bayesian network is a unique yet minimal feature subset for optimal feature selection if the probability distribution of a data set can be faithfully represented by this Bayesian network. However, if a data set violates the faithful condition, Markov blankets of a class attribute may not be unique. To tackle this issue, in this paper, we propose a new concept of representative sets and then design the selection via group alpha-investing (SGAI) algorithm to perform Markov blanket feature selection with representative sets for classification. Using a comprehensive set of real data, our empirical studies have demonstrated that SGAI outperforms the state-of-the-art Markov blanket feature selectors and other well-established feature selection methods.

Aliferis CF, Statnikov A, Tsamardinos I. Challenges in the Analysis of Mass-Throughput Data: A Technical Commentary from the Statistical Machine Learning Perspective. Cancer Inform 2017. [DOI: 10.1177/117693510600200004] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Deng X, Geng H, Ali HH. Cross-platform Analysis of Cancer Biomarkers: A Bayesian Network Approach to Incorporating Mass Spectrometry and Microarray Data. Cancer Inform 2017. [DOI: 10.1177/117693510700300001] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Mohammed A, Biegert G, Adamec J, Helikar T. Identification of potential tissue-specific cancer biomarkers and development of cancer versus normal genomic classifiers. Oncotarget 2017;8:85692-85715. [PMID: 29156751 PMCID: PMC5689641 DOI: 10.18632/oncotarget.21127] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2017] [Accepted: 09/05/2017] [Indexed: 01/15/2023] Open

Boulesteix AL, Wilson R, Hapfelmeier A. Towards evidence-based computational statistics: lessons from clinical research on the role and design of real-data benchmark studies. BMC Med Res Methodol 2017;17:138. [PMID: 28888225 PMCID: PMC5591542 DOI: 10.1186/s12874-017-0417-2] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2017] [Accepted: 08/31/2017] [Indexed: 11/10/2022] Open

Duan F, Xu Y. Applying Multivariate Adaptive Splines to Identify Genes With Expressions Varying After Diagnosis in Microarray Experiments. Cancer Inform 2017;16:1176935117705381. [PMID: 28579740 PMCID: PMC5422340 DOI: 10.1177/1176935117705381] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2014] [Accepted: 02/20/2017] [Indexed: 12/17/2022] Open

Urda D, Luque-Baena RM, Franco L, Jerez JM, Sanchez-Marono N. Machine learning models to search relevant genetic signatures in clinical context. 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) 2017:1649-1656. [DOI: 10.1109/ijcnn.2017.7966049] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/04/2025]

Robust Microbiota-Based Diagnostics for Inflammatory Bowel Disease. J Clin Microbiol 2017;55:1720-1732. [PMID: 28330889 DOI: 10.1128/jcm.00162-17] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2017] [Accepted: 03/15/2017] [Indexed: 01/11/2023] Open

Fei Y, Hu J, Li WQ, Wang W, Zong GQ. Artificial neural networks predict the incidence of portosplenomesenteric venous thrombosis in patients with acute pancreatitis. J Thromb Haemost 2017;15:439-445. [PMID: 27960048 DOI: 10.1111/jth.13588] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2016] [Indexed: 12/18/2022]

Abstract

Essentials

Predicting the occurrence of portosplenomesenteric vein thrombosis (PSMVT) is difficult. We studied 72 patients with acute pancreatitis. Artificial neural networks modeling was more accurate than logistic regression in predicting PSMVT. Additional predictive factors may be incorporated into artificial neural networks.

SUMMARY

Objective: To construct and validate artificial neural networks (ANNs) for predicting the occurrence of portosplenomesenteric venous thrombosis (PSMVT) and to compare the predictive ability of the ANNs with that of logistic regression.

Methods: The ANN and logistic regression models were constructed using simple clinical and laboratory data from 72 acute pancreatitis (AP) patients. Both models were first trained on 48 randomly chosen patients and validated on the remaining 24 patients. The accuracy and performance characteristics of the two approaches were compared using SPSS 17.0 software.

Results: The training set and validation set did not differ on any of the 11 variables. After training, the back propagation network training error converged to 1 × 10^-20, and it retained excellent pattern recognition ability. When the ANN model was applied to the validation set, it showed a sensitivity of 80%, a specificity of 85.7%, a positive predictive value of 77.6% and a negative predictive value of 90.7%. The accuracy was 83.3%. Differences could be found between ANN modeling and logistic regression modeling in these parameters (10.0% [95% CI, -14.3 to 34.3%], 14.3% [95% CI, -8.6 to 37.2%], 15.7% [95% CI, -9.9 to 41.3%], 11.8% [95% CI, -8.2 to 31.8%], 22.6% [95% CI, -1.9 to 47.1%], respectively). When ANN modeling was used to identify PSMVT, the area under the receiver operating characteristic curve was 0.849 (95% CI, 0.807-0.901), which demonstrated better overall properties than logistic regression modeling (AUC = 0.716; 95% CI, 0.679-0.761).

Conclusions: ANN modeling was a more accurate tool than logistic regression in predicting the occurrence of PSMVT following AP. More clinical factors or biomarkers may be incorporated into ANN modeling to improve its predictive ability.

FRBPSO: A Fuzzy Rule Based Binary PSO for Feature Selection. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES INDIA SECTION A-PHYSICAL SCIENCES 2017. [DOI: 10.1007/s40010-017-0347-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

Bari MG, Salekin S, Zhang JM. A Robust and Efficient Feature Selection Algorithm for Microarray Data. Mol Inform 2016;36. [PMID: 28000384 DOI: 10.1002/minf.201600099] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2016] [Accepted: 11/21/2016] [Indexed: 12/20/2022]

Lai CM, Yeh WC, Chang CY. Gene selection using information gain and improved simplified swarm optimization. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.08.089] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

Devi Arockia Vanitha C, Devaraj D, Venkatesulu M. Multiclass cancer diagnosis in microarray gene expression profile using mutual information and Support Vector Machine. INTELL DATA ANAL 2016. [DOI: 10.3233/ida-150203] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Prieto A, Prieto B, Ortigosa EM, Ros E, Pelayo F, Ortega J, Rojas I. Neural networks: An overview of early research, current frameworks and new challenges. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.06.014] [Citation(s) in RCA: 161] [Impact Index Per Article: 17.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]

Yang H, Seoighe C. Impact of the Choice of Normalization Method on Molecular Cancer Class Discovery Using Nonnegative Matrix Factorization. PLoS One 2016;11:e0164880. [PMID: 27741311 PMCID: PMC5065197 DOI: 10.1371/journal.pone.0164880] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2016] [Accepted: 10/03/2016] [Indexed: 11/18/2022] Open

Khondoker M, Dobson R, Skirrow C, Simmons A, Stahl D. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies. Stat Methods Med Res 2016;25:1804-1823. [PMID: 24047600 PMCID: PMC5081132 DOI: 10.1177/0962280213502437] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Abstract

BACKGROUND

Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply better performance on average or at the population level, and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms.

METHODS

We compare the classification performance of a number of important and widely used machine learning algorithms, namely the Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features.

RESULTS

For a smaller number of correlated features (with the number of features not exceeding approximately half the sample size), LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger, provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows, and it outperforms LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provides more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study.

Moosa JM, Shakur R, Kaykobad M, Rahman MS. Gene selection for cancer classification with the help of bees. BMC Med Genomics 2016;9 Suppl 2:47. [PMID: 27510562 PMCID: PMC4980787 DOI: 10.1186/s12920-016-0204-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2023] Open

Lovato P, Bicego M, Kesa M, Jojic N, Murino V, Perina A. Traveling on discrete embeddings of gene expression. Artif Intell Med 2016;70:1-11. [PMID: 27431033 DOI: 10.1016/j.artmed.2016.05.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2015] [Revised: 05/20/2016] [Accepted: 05/21/2016] [Indexed: 12/24/2022]