1
Eckardt JN, Bornhäuser M, Wendt K, Middeke JM. Application of machine learning in the management of acute myeloid leukemia: current practice and future prospects. Blood Adv 2020; 4:6077-6085. [PMID: 33290546 PMCID: PMC7724910 DOI: 10.1182/bloodadvances.2020002997] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 07/20/2020] [Accepted: 10/26/2020] [Indexed: 12/19/2022]
Abstract
Machine learning (ML) is rapidly emerging in several fields of cancer research. ML algorithms can deal with vast amounts of medical data and provide a better understanding of malignant disease. Its ability to process information from different diagnostic modalities and functions to predict prognosis and suggest therapeutic strategies indicates that ML is a promising tool for the future management of hematologic malignancies; acute myeloid leukemia (AML) is a model disease of various recent studies. An integration of these ML techniques into various applications in AML management can assure fast and accurate diagnosis as well as precise risk stratification and optimal therapy. Nevertheless, these techniques come with various pitfalls and need a strict regulatory framework to ensure safe use of ML. This comprehensive review highlights and discusses recent advances in ML techniques in the management of AML as a model disease of hematologic neoplasms, enabling researchers and clinicians alike to critically evaluate this upcoming, potentially practice-changing technology.
Affiliation(s)
- Jan-Niklas Eckardt
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
- Martin Bornhäuser
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
- National Center for Tumor Diseases, Dresden (NCT/UCC), Dresden, Germany
- German Consortium for Translational Cancer Research, DKFZ, Heidelberg, Germany
- Karsten Wendt
- Institute of Circuits and Systems, Technical University Dresden, Dresden, Germany
- Jan Moritz Middeke
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
2
Single Cell and Population Level Analysis of HCA Data. Methods Mol Biol 2017. [PMID: 29082497 DOI: 10.1007/978-1-4939-7357-6_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
High Content Analysis instrumentation has undergone tremendous hardware advances in recent years. It is now possible to obtain images of hundreds of thousands to millions of individual objects, across multiple wells, channels, and plates, in a reasonable amount of time. In addition, it is possible to extract dozens, or hundreds, of features per object using commonly available software tools. Analyzing these data poses new challenges for scientists. The magnitude of these numbers is reminiscent of flow cytometry, where practitioners have for decades been taking what effectively amount to very low resolution, multiparametric measurements from individual cells. Flow cytometrists have developed a wide range of tools to effectively analyze and interpret these types of data. This chapter will review the techniques used in flow cytometry and show how they can easily and effectively be applied to High Content Analysis.
3
Robinson JP, Patsekin V, Holdman C, Ragheb K, Sturgis J, Fatig R, Avramova LV, Rajwa B, Davisson VJ, Lewis N, Narayanan P, Li N, Qualls CW. High-throughput secondary screening at the single-cell level. J Lab Autom 2012; 18:85-98. [PMID: 22968419 DOI: 10.1177/2211068212456978] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 12/18/2022]
Abstract
We have developed an automated system for drug screening using a single-cell-multiple functional response technology. The approach uses a semiautomated preparatory system, high-speed sample collection, and a unique analytical tool that provides instantaneous results for compound dilutions using 384-well plates. The combination of automation and rapid robotic sampling increases quality control and robustness. High-speed flow cytometry is used to collect single-cell results together with a newly defined analytical tool for extraction of IC50 curves for multiple assays per cell. The principal advantage is the extreme speed of sample collection, with results from a 384-well plate being completed for both collection and data processing in less than 10 min. Using this approach, it is possible to extract detailed drug response information in a highly controlled fashion. The data are based on single-cell results, not populations. With simultaneous assays for different functions, it is possible to gain a more detailed understanding of each drug/compound interaction. Combined with integrated advanced data processing directly from raw data files, the process from sampling to analytical results is highly intuitive. Direct PubMed links allow review of drug structure and comparisons with similar compounds.
Affiliation(s)
- J Paul Robinson
- Purdue University Cytometry Laboratories, Department of Basic Medical Sciences, Purdue University, West Lafayette, IN 47907, USA.
4
Sugár IP, Sealfon SC. Misty Mountain clustering: application to fast unsupervised flow cytometry gating. BMC Bioinformatics 2010; 11:502. [PMID: 20932336 PMCID: PMC2967560 DOI: 10.1186/1471-2105-11-502] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 08/26/2009] [Accepted: 10/09/2010] [Indexed: 11/26/2022]
Abstract
Background There are many important clustering questions in computational biology for which no satisfactory method exists. Automated clustering algorithms, when applied to large, multidimensional datasets, such as flow cytometry data, prove unsatisfactory in terms of speed, problems with local minima or cluster shape bias. Model-based approaches are restricted by the assumptions of the fitting functions. Furthermore, model-based clustering requires serial clustering for all cluster numbers within a user-defined interval. The final cluster number is then selected by various criteria. These supervised serial clustering methods are time consuming, and different criteria frequently result in different optimal cluster numbers. Various unsupervised heuristic approaches that have been developed, such as affinity propagation, are too expensive to be applied to datasets on the order of 10^6 points that are often generated by high-throughput experiments. Results To circumvent these limitations, we developed a new, unsupervised density contour clustering algorithm, called Misty Mountain, that is based on percolation theory and that efficiently analyzes large data sets. The approach can be envisioned as a progressive top-down removal of clouds covering a data histogram relief map to identify clusters by the appearance of statistically distinct peaks and ridges. This is a parallel clustering method that finds every cluster after analyzing the cross sections of the histogram only once. The overall run time for the composite steps of the algorithm increases linearly with the number of data points. The clustering of 10^6 data points in 2D data space takes place within about 15 seconds on a standard laptop PC. Comparison of the performance of this algorithm with other state-of-the-art automated flow cytometry gating methods indicates that Misty Mountain provides substantial improvements in both run time and in the accuracy of cluster assignment.
Conclusions Misty Mountain is fast, unbiased for cluster shape, identifies stable clusters and is robust to noise. It provides a useful, general solution for multidimensional clustering problems. We demonstrate its suitability for automated gating of flow cytometry data.
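The "top-down removal of clouds" idea in the abstract above can be illustrated with a deliberately simplified one-dimensional sketch. This is not the authors' algorithm: it omits the percolation analysis and the statistical test for distinct peaks, and the function name `peaks_by_level_sweep` and the sample histogram are hypothetical. It only shows how lowering a level over a histogram exposes separate "islands" that can seed clusters.

```python
def peaks_by_level_sweep(hist):
    """Return indices of bins that first appear as isolated islands
    while a level is lowered from the tallest bin toward zero.

    Simplified 1D illustration only; real data would need smoothing
    or a significance test, since every noisy local maximum counts here.
    """
    exposed = [False] * len(hist)
    peaks = []
    # Visiting bins from tallest to shortest mimics lowering the level.
    for i in sorted(range(len(hist)), key=lambda k: -hist[k]):
        if hist[i] == 0:
            break  # empty bins never rise above the level
        left = i > 0 and exposed[i - 1]
        right = i + 1 < len(hist) and exposed[i + 1]
        if not left and not right:
            peaks.append(i)  # no exposed neighbour: a new island appears
        exposed[i] = True
    return sorted(peaks)


# Hypothetical histogram with two separated modes.
example_hist = [0, 2, 5, 2, 1, 3, 7, 3, 0]
example_peaks = peaks_by_level_sweep(example_hist)
```

Each bin is visited once after a single sort, which loosely echoes the abstract's point that every cluster is found from one pass over the histogram cross sections.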
Affiliation(s)
- István P Sugár
- Department of Neurology and Center for Translational Systems Biology, Mount Sinai School of Medicine, New York, NY, USA.
5
Cleophas TJ, Cleophas TF. Artificial intelligence for diagnostic purposes: principles, procedures and limitations. Clin Chem Lab Med 2010; 48:159-65. [PMID: 20001439 DOI: 10.1515/cclm.2010.045] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 01/01/2023]
Abstract
BACKGROUND Back propagation (BP) artificial neural networks are a distribution-free method for data analysis based on layers of artificial neurons that transduce imputed information. It has been recognized as having a number of advantages compared to traditional methods including the possibility to process imperfect data, and complex non-linear data. The objective of this study was to review the principles, procedures, and limitations of BP artificial neural networks for a non-mathematical readership. METHODS A real data sample of weight, height and measured body surface area from 90 individuals was used as an example. SPSS 17.0 with neural network add-on was used for the analysis. The predicted body surface from a two hidden layer BP neural network was compared to the body surface calculated by the Haycock equation. RESULTS Both the predicted values from the neural network and from the Haycock equation were close to the measured values. A linear regression analysis with neural network as predictor produced an r^2-value of 0.983, while the Haycock equation produced an r^2-value of 0.995 (r^2 > 0.95 is a criterion for accurate diagnostic testing). CONCLUSIONS BP neural networks may, sometimes, predict clinical diagnoses with accuracies similar to those of other methods. However, traditional statistical procedures, such as regression analyses need to be added for testing their accuracies against alternative methods. Nonetheless, BP neural networks have great potential through their ability to learn by example instead of learning by theory.
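As a rough illustration of the comparison described in this abstract, the sketch below computes body surface area with the Haycock equation in its commonly published form (BSA in m^2 = 0.024265 x weight_kg^0.5378 x height_cm^0.3964) and the r^2 of a simple linear regression between two series. The function names are hypothetical, and nothing here reproduces the study's data or its SPSS analysis.

```python
def haycock_bsa(weight_kg, height_cm):
    """Body surface area in m^2 by the Haycock equation (commonly published form)."""
    return 0.024265 * weight_kg ** 0.5378 * height_cm ** 0.3964


def r_squared(x, y):
    """Squared Pearson correlation, equal to r^2 of a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Illustrative adult (assumed values, not from the study).
example_bsa = haycock_bsa(70.0, 170.0)
```

In a comparison like the one reported, `r_squared(predicted, measured)` would be evaluated once for the network's predictions and once for the equation's, and the two values set side by side.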
6
Bashashati A, Brinkman RR. A survey of flow cytometry data analysis methods. Adv Bioinformatics 2009; 2009:584603. [PMID: 20049163 PMCID: PMC2798157 DOI: 10.1155/2009/584603] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.8] [Received: 05/01/2009] [Revised: 07/20/2009] [Accepted: 08/22/2009] [Indexed: 02/04/2023]
Abstract
Flow cytometry (FCM) is widely used in health research and in treatment for a variety of tasks, such as in the diagnosis and monitoring of leukemia and lymphoma patients, providing the counts of helper-T lymphocytes needed to monitor the course and treatment of HIV infection, the evaluation of peripheral blood hematopoietic stem cell grafts, and many other diseases. In practice, FCM data analysis is performed manually, a process that requires an inordinate amount of time and is error-prone, nonreproducible, nonstandardized, and not open for re-evaluation, making it the most limiting aspect of this technology. This paper reviews state-of-the-art FCM data analysis approaches using a framework introduced to report each of the components in a data analysis pipeline. Current challenges and possible future directions in developing fully automated FCM data analysis tools are also outlined.
Affiliation(s)
- Ali Bashashati
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada V5Z 1L3
- Ryan R. Brinkman
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada V5Z 1L3
7
Lo K, Brinkman RR, Gottardo R. Automated gating of flow cytometry data via robust model-based clustering. Cytometry A 2008; 73:321-32. [PMID: 18307272 DOI: 10.1002/cyto.a.20531] [Citation(s) in RCA: 158] [Impact Index Per Article: 9.3] [Indexed: 11/07/2022]
Abstract
The capability of flow cytometry to offer rapid quantification of multidimensional characteristics for millions of cells has made this technology indispensable for health research, medical diagnosis, and treatment. However, the lack of statistical and bioinformatics tools to parallel recent high-throughput technological advancements has hindered this technology from reaching its full potential. We propose a flexible statistical model-based clustering approach for identifying cell populations in flow cytometry data based on t-mixture models with a Box-Cox transformation. This approach generalizes the popular Gaussian mixture models to account for outliers and allow for nonelliptical clusters. We describe an Expectation-Maximization (EM) algorithm to simultaneously handle parameter estimation and transformation selection. Using two publicly available datasets, we demonstrate that our proposed methodology provides enough flexibility and robustness to mimic manual gating results performed by an expert researcher. In addition, we present results from a simulation study, which show that this new clustering framework gives better results in terms of robustness to model misspecification and estimation of the number of clusters, compared to the popular mixture models. The proposed clustering methodology is well adapted to automated analysis of flow cytometry data. It tends to give more reproducible results, and helps reduce the significant subjectivity and human time cost encountered in manual gating analysis.
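Two ingredients named in this abstract can be sketched in a few lines, with caveats. The first function is the textbook Box-Cox transformation for positive values; the paper may use a modified variant, so treat this as an assumption. The second is the standard E-step weight for a multivariate t component, (nu + p) / (nu + d^2), which is what makes t-mixtures down-weight outliers. Function and parameter names are hypothetical.

```python
import math


def box_cox(x, lam):
    """Textbook Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("the textbook Box-Cox transform needs x > 0")
    if lam == 0:
        return math.log(x)  # limiting case as lam approaches 0
    return (x ** lam - 1.0) / lam


def t_weight(mahalanobis_sq, nu, p):
    """EM weight of one point for a p-dimensional t component with nu degrees of freedom.

    Points far from the component centre (large squared Mahalanobis
    distance) receive small weights and so influence the fit less.
    """
    return (nu + p) / (nu + mahalanobis_sq)


near = t_weight(0.0, nu=4.0, p=2)    # point at the component centre
far = t_weight(100.0, nu=4.0, p=2)   # distant outlier
```

With a Gaussian component every point would carry weight 1; here the distant point contributes far less to the mean and covariance updates than the central one.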
Affiliation(s)
- Kenneth Lo
- Department of Statistics, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada.
8
Lisboa PJ, Taktak AFG. The use of artificial neural networks in decision support in cancer: a systematic review. Neural Netw 2006; 19:408-15. [PMID: 16483741 DOI: 10.1016/j.neunet.2005.10.007] [Citation(s) in RCA: 176] [Impact Index Per Article: 9.3] [Received: 01/10/2005] [Accepted: 10/31/2005] [Indexed: 02/08/2023]
Abstract
Artificial neural networks have featured in a wide range of medical journals, often with promising results. This paper reports on a systematic review that was conducted to assess the benefit of artificial neural networks (ANNs) as decision making tools in the field of cancer. The number of clinical trials (CTs) and randomised controlled trials (RCTs) involving the use of ANNs in diagnosis and prognosis increased from 1 to 38 in the last decade. However, out of 396 studies involving the use of ANNs in cancer, only 27 were either CTs or RCTs. Out of these trials, 21 showed an increase in benefit to healthcare provision and 6 did not. None of these studies however showed a decrease in benefit. This paper reviews the clinical fields where neural network methods figure most prominently, the main algorithms featured, methodologies for model selection and the need for rigorous evaluation of results.
Affiliation(s)
- Paulo J Lisboa
- School of Computing and Mathematical Science, Liverpool John Moores University, Liverpool, UK
9
Cualing HD, Zhong E, Moscinski L. “Virtual flow cytometry” of immunostained lymphocytes on microscopic tissue slides: iHCFlow™ tissue cytometry. Cytometry B Clin Cytom 2006; 72:63-76. [PMID: 17133379 DOI: 10.1002/cyto.b.20148] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Abstract
BACKGROUND A method and approach is developed for fully automated measurements of immunostained lymphocytes in tissue sections by means of digital color microscopy and patent pending advanced cell analysis. The validation data for population statistic measurements of immunostained lymphocytes in tissue sections using tissue cytometry (TC) is presented. The report is the first to describe the conversion of immunohistochemistry (IHC) data to a flow cytometry-like two parameter dot-plot display, hence the technique is also a virtual flow cytometry. We believe this approach is a paradigm shift, as well as novel, and called the system iHCFlow TC. Seven issues related to technical obstacles to virtual flow cytometry (FC) are identified. DESIGN Segmentation of a 512 x 474 RGB image and tabular display of statistical results table took 12-15 s using proprietary developed algorithms. We used a panel of seven antibodies for validation on 14 cases of mantle cell lymphoma giving percentage positive, total lymphocytes, and staining density. A total of 2,027 image frames with 810,800 cell objects (COBs) were evaluated. Antibodies to CD3, CD4, CD8, Bcl-1, Ki-67, CD20, CD5 were subjected to virtual FC on tissue. The results of TC were compared with manual counts of expert observers and with the results of flow cytometric immunophenotyping of the same specimen. RESULTS The correlation coefficient and 95% confidence interval by linear regression analysis yielded a high concordance between manual human results (M), FC results, and TC results per antibody, (r = 0.9365 M vs. TC, r= 0.9537 FC vs. TC). The technical issues were resolved and the solutions and results were evaluated and presented. CONCLUSION These results suggest the new technology of TC by iHCFlow could be a clinically valid surrogate for both M and FC analysis when only tissue IHC is available for diagnosis and prognosis. 
The application for cancer diagnosis, monitoring, and prognosis is for objective, rapid, automated counting of immunostained cells in tissues with percentage results. We report a new paradigm in TC that converts IHC staining of lymphocytes to automated results and a flow cytometry-like report. The dot plot histogram display is familiar, intuitive, informative, and provides the pathologists with an automated tool to rapidly characterize the staining and size distribution of the immunoreactive as well as the negative cell population in the tissue. This systems tool is a major improvement over existing ones and satisfies fully the criteria to perform Cytomics (Ecker and Tarnok, Cytometry A 2005;65:1; Ecker and Steiner, Cytometry A 2004;59:182-190; Ecker et al., Cytometry A 2004;59:172-181).
Affiliation(s)
- Hernani D Cualing
- H. Lee Moffitt Cancer Center and Research Institute, University of South Florida, Tampa, Florida, USA.
10
Chuang L, Hwang JY, Chang CH, Yu CH, Chang FM. Ultrasound estimation of fetal weight with the use of computerized artificial neural network model. Ultrasound Med Biol 2002; 28:991-996. [PMID: 12217434 DOI: 10.1016/s0301-5629(02)00554-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 05/23/2023]
Abstract
The aim of this study was to test if the computerized artificial neural network (ANN) model could improve ultrasound (US) estimation of fetal weight over estimation with the other commonly used formulas generated from regression analysis. First, as the training group, we performed US examinations on 991 singleton fetuses within 3 days of delivery. Six input variables were used to construct the ANN model: biparietal diameter (BPD), occipitofrontal diameter (OFD), abdominal circumference (AC), femur length (FL), gestational age and fetal presentation. Second, a total of 362 fetuses were assessed subsequently as the validation group. In this training group, the ANN model was better than the other compared formulas in fetal weight estimation (n = 991, mean absolute error 183.83 g, mean absolute percent error 6.02%, all p < 0.0001). In addition, the validation group further proved the results (n = 362, mean absolute error 179.91 g, mean absolute percent error 6.15%, all p < 0.005). In conclusion, the computerized artificial neural network (ANN) model could provide better US estimation of fetal weight than estimations by means of commonly used formulas generated from regression analysis.
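The two error measures reported in this abstract, mean absolute error and mean absolute percent error, can be computed as in the minimal sketch below. The function names and the sample weights are hypothetical and are not taken from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|, in the units of the measurement (here grams)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percent_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed as a percentage."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical birth weights (g) and model estimates (g).
birth_weights = [3000.0, 3500.0]
estimates = [3150.0, 3325.0]
mae = mean_absolute_error(birth_weights, estimates)
mape = mean_absolute_percent_error(birth_weights, estimates)
```

Reporting both is useful because the same absolute error in grams is a larger relative error for a small fetus than for a large one.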
Affiliation(s)
- Louise Chuang
- Department of Obstetrics and Gynecology, National Cheng Kung University Medical College and Hospital, Tainan, Taiwan
11
Lisboa PJG. A review of evidence of health benefit from artificial neural networks in medical intervention. Neural Netw 2002; 15:11-39. [PMID: 11958484 DOI: 10.1016/s0893-6080(01)00111-3] [Citation(s) in RCA: 319] [Impact Index Per Article: 13.9] [Indexed: 02/09/2023]
Abstract
The purpose of this review is to assess the evidence of healthcare benefits involving the application of artificial neural networks to the clinical functions of diagnosis, prognosis and survival analysis, in the medical domains of oncology, critical care and cardiovascular medicine. The primary source of publications is PUBMED listings under Randomised Controlled Trials and Clinical Trials. The role of neural networks is introduced within the context of advances in medical decision support arising from parallel developments in statistics and artificial intelligence. This is followed by a survey of published Randomised Controlled Trials and Clinical Trials, leading to recommendations for good practice in the design and evaluation of neural networks for use in medical intervention.
Affiliation(s)
- P J G Lisboa
- School of Computing and Mathematical Sciences, Liverpool John Moores University, UK.
12
Abstract
BACKGROUND Analytical flow cytometry (AFC), by quantifying sometimes more than 10 optical parameters on cells at rates of approximately 10^3 cells/s, rapidly generates vast quantities of multidimensional data, which provides a considerable challenge for data analysis. We review the application of multivariate data analysis and pattern recognition techniques to flow cytometry. METHODS Approaches were divided into two broad types depending on whether the aim was identification or clustering. Multivariate statistical approaches, supervised artificial neural networks (ANNs), problems of overlapping character distributions, unbounded data sets, missing parameters, scaling up, and estimating proportions of different types of cells comprised the first category. Classic clustering methods, fuzzy clustering, and unsupervised ANNs comprised the second category. We demonstrate the state of the art by using AFC data on marine phytoplankton populations. RESULTS AND CONCLUSIONS Information held within the large quantities of data generated by AFC was tractable using ANNs, but for field studies the problem of obtaining suitable training data needs to be resolved, and coping with an almost infinite number of cell categories needs further research.
Affiliation(s)
- L Boddy
- Cardiff School of Biosciences, Cardiff University, Cardiff, United Kingdom.
13
Affiliation(s)
- H D Cualing
- Flow Cytometry and Diagnostic Immunology Division, Department of Pathology, University of Cincinnati Medical Center, OH 45267-0529, USA.