1. Nambisan AK, Maurya A, Lama N, Phan T, Patel G, Miller K, Lama B, Hagerty J, Stanley R, Stoecker WV. Improving Automatic Melanoma Diagnosis Using Deep Learning-Based Segmentation of Irregular Networks. Cancers (Basel) 2023; 15:1259. [PMID: 36831599; PMCID: PMC9953766; DOI: 10.3390/cancers15041259]
Abstract
Deep learning has achieved significant success in malignant melanoma diagnosis, and these diagnostic models are undergoing a transition into clinical use. However, with melanoma diagnostic accuracy in the range of ninety percent, a significant minority of melanomas are missed by deep learning. Many of the missed melanomas have irregular pigment networks visible under dermoscopy. This research presents an annotated irregular network database and develops a classification pipeline that fuses deep learning image-level results with conventional hand-crafted features of irregular pigment networks. We identified and annotated 487 unique dermoscopic melanoma lesions from images in the International Skin Imaging Collaboration (ISIC) 2019 dermoscopic dataset to create a ground-truth irregular pigment network dataset, and trained multiple transfer-learned segmentation models to detect irregular networks in this training set. A separate, mutually exclusive subset of the ISIC 2019 dataset with 500 melanomas and 500 benign lesions was used for training and testing deep learning models for the binary classification of melanoma versus benign. The best segmentation model, U-Net++, generated irregular network masks for the 1000-image dataset, and classical color, texture, and shape features were calculated for the irregular network areas. Using conventional classifiers in a sequential pipeline based on the cascade generalization framework, we achieved an 11% increase in recall and a 2% increase in accuracy for melanoma versus benign over DL-only models, with the highest increase in recall obtained with the random forest algorithm. The proposed approach leverages the strengths of both deep learning and conventional image processing techniques to improve the accuracy of melanoma diagnosis. Further research combining deep learning with conventional image processing of automatically detected dermoscopic features is warranted.
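The fusion step the abstract describes can be sketched as follows. This is a minimal, hypothetical illustration of the cascade idea only: the deep model's image-level melanoma probability is treated as one more feature alongside hand-crafted irregular-network features, and a second-stage classifier makes the final call. The feature names, weights, and the logistic second stage here are assumptions for illustration; the paper's second stage was a random forest over many color, texture, and shape features.

```python
import math

def fused_score(dl_prob, network_area, border_irregularity,
                w=(2.5, 1.2, 0.8), b=-2.0):
    """Combine a DL melanoma probability with hand-crafted features
    into a single fused score in [0, 1] (illustrative weights)."""
    z = w[0] * dl_prob + w[1] * network_area + w[2] * border_irregularity + b
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing

def classify(dl_prob, network_area, border_irregularity, threshold=0.5):
    """Final melanoma-vs-benign decision from the fused score."""
    return int(fused_score(dl_prob, network_area, border_irregularity) >= threshold)
```

For example, a lesion with a high DL probability and pronounced network features (`classify(0.9, 0.8, 0.7)`) is called melanoma, while one with uniformly low scores (`classify(0.1, 0.1, 0.1)`) is called benign.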
Affiliation(s)
- Anand K. Nambisan
- Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Akanksha Maurya
- Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Norsang Lama
- Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Thanh Phan
- Department of Biological Sciences, College of Arts, Sciences, and Education, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Gehana Patel
- College of Health Sciences, University of Missouri—Columbia, Columbia, MO 65211, USA
- Keith Miller
- Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Binita Lama
- Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Jason Hagerty
- S&A Technologies, 10101 Stoltz Drive, Rolla, MO 65401, USA
- Ronald Stanley
- Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Correspondence:
2. Alabi RO, Youssef O, Pirinen M, Elmusrati M, Mäkitie AA, Leivo I, Almangush A. Machine learning in oral squamous cell carcinoma: Current status, clinical concerns and prospects for future-A systematic review. Artif Intell Med 2021; 115:102060. [PMID: 34001326; DOI: 10.1016/j.artmed.2021.102060]
Abstract
BACKGROUND Oral cancer can show heterogeneous patterns of behavior. For proper and effective management of oral cancer, early diagnosis and accurate prediction of prognosis are important. To achieve this, artificial intelligence (AI), or its subfield machine learning, has been touted for its potential to revolutionize cancer management through improved diagnostic precision and prediction of outcomes. Yet, to date, it has made only a few contributions to actual medical practice or patient care. OBJECTIVES This study provides a systematic review of diagnostic and prognostic applications of machine learning in oral squamous cell carcinoma (OSCC) and also highlights some of the limitations and concerns of clinicians regarding the implementation of machine learning-based models in daily clinical practice. DATA SOURCES We searched OvidMedline, PubMed, Scopus, Web of Science, and Institute of Electrical and Electronics Engineers (IEEE) databases from inception until February 2020 for articles that used machine learning for diagnostic or prognostic purposes in OSCC. ELIGIBILITY CRITERIA Only original studies that examined the application of machine learning models for prognostic and/or diagnostic purposes were considered. DATA EXTRACTION Independent extraction of articles was done by two researchers (A.R. & O.Y.) using predefined study selection criteria. We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines in the searching and screening processes. We also used the Prediction model Risk of Bias Assessment Tool (PROBAST) to assess the risk of bias (ROB) and quality of the included studies. RESULTS A total of 41 published studies were identified that used machine learning to aid in the diagnosis and/or prognosis of OSCC. The majority of these studies used support vector machine (SVM) and artificial neural network (ANN) algorithms as machine learning techniques. In these studies, specificity ranged from 0.57 to 1.00, sensitivity from 0.70 to 1.00, and accuracy from 63.4% to 100.0%. The main limitations and concerns can be grouped as either challenges inherent to the science of machine learning or issues relating to clinical implementation. CONCLUSION Machine learning models have been reported to show promising performance for diagnostic and prognostic analyses in studies of oral cancer. These models should be further developed to enhance explainability and interpretability, and externally validated for generalizability, in order to be safely integrated into daily clinical practice. Regulatory frameworks for the adoption of these models in clinical practice are also necessary.
Affiliation(s)
- Rasheed Omobolaji Alabi
- Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
- Omar Youssef
- Department of Pathology, University of Helsinki, Helsinki, Finland; Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Matti Pirinen
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland; Department of Public Health, University of Helsinki, Helsinki, Finland; Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
- Mohammed Elmusrati
- Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
- Antti A Mäkitie
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Department of Otorhinolaryngology - Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland; Division of Ear, Nose and Throat Diseases, Department of Clinical Sciences, Intervention and Technology, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
- Ilmo Leivo
- University of Turku, Institute of Biomedicine, Pathology, Turku, Finland
- Alhadi Almangush
- Department of Pathology, University of Helsinki, Helsinki, Finland; Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; University of Turku, Institute of Biomedicine, Pathology, Turku, Finland; Faculty of Dentistry, Misurata University, Misurata, Libya
3. Fujihara K, Matsubayashi Y, Harada Yamada M, Yamamoto M, Iizuka T, Miyamura K, Hasegawa Y, Maegawa H, Kodama S, Yamazaki T, Sone H. Machine Learning Approach to Decision Making for Insulin Initiation in Japanese Patients With Type 2 Diabetes (JDDM 58): Model Development and Validation Study. JMIR Med Inform 2021; 9:e22148. [PMID: 33502325; PMCID: PMC7875702; DOI: 10.2196/22148]
Abstract
Background Applications of machine learning to the early detection of diseases for which a clear-cut diagnostic gold standard exists have been evaluated. However, little is known about the usefulness of machine learning approaches for decisions, such as insulin initiation by diabetes specialists, for which no absolute standards exist in clinical settings. Objective The objectives of this study were to examine the ability of machine learning models to predict insulin initiation by specialists and whether the machine learning approach could support decision making by general physicians regarding insulin initiation in patients with type 2 diabetes. Methods Data from patients prescribed hypoglycemic agents from December 2009 to March 2015 were extracted from diabetes specialists' registries, yielding a sample of 4860 patients who had received initial monotherapy with either insulin (n=293) or a noninsulin agent (n=4567). The neural network output was a probability of insulin initiation ranging from 0 to 1, with a cutoff of >0.5 for dichotomous classification. Accuracy, recall, and area under the receiver operating characteristic curve (AUC) were calculated to compare the decision-making ability of the machine learning models with that of logistic regression and of general physicians. For the comparison with general physicians, 7 cases were chosen as the gold standard based on patient information and the agreement of 8 of the 9 specialists. Results The AUC, accuracy, and recall of logistic regression were higher than those of machine learning (AUCs of 0.89-0.90 for logistic regression versus 0.67-0.74 for machine learning). When the examination was limited to cases receiving insulin, discrimination by machine learning was similar to that of logistic regression (recall of 0.05-0.68 for logistic regression versus 0.11-0.52 for machine learning). The accuracies of logistic regression, a machine learning model (downsampling ratio of 1:8), and general physicians were 0.80, 0.70, and 0.66, respectively, for 43 randomly selected cases. For the 7 gold standard cases, the accuracies of logistic regression and the machine learning model (downsampling ratio of 1:8) were 1.00 and 0.86, respectively, higher than the accuracy of general physicians (0.43). Conclusions Although machine learning showed no superior performance over logistic regression, it predicted insulin initiation more accurately than general physicians, as judged against the gold standard defined by diabetes specialists' choices. Further study is needed before machine learning-based decision support systems for insulin initiation can be incorporated into clinical practice.
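The 1:8 downsampling ratio mentioned above can be sketched as follows. This is a hypothetical illustration, not the authors' code: for the minority class (insulin initiation) all records are kept, and at most eight majority-class (noninsulin) records are sampled per minority record to reduce class imbalance before training.

```python
import random

def downsample(minority, majority, ratio=8, seed=0):
    """Keep all minority-class records and sample at most
    ratio * len(minority) majority-class records (without replacement)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    k = min(len(majority), ratio * len(minority))
    return minority + rng.sample(majority, k)
```

With 5 minority records and 100 majority records, the resulting training set would contain 5 + 40 = 45 records at a 1:8 ratio.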
Affiliation(s)
- Kazuya Fujihara
- Department of Internal Medicine, Faculty of Medicine, Niigata University, Niigata, Japan
- Yasuhiro Matsubayashi
- Department of Internal Medicine, Faculty of Medicine, Niigata University, Niigata, Japan
- Mayuko Harada Yamada
- Department of Internal Medicine, Faculty of Medicine, Niigata University, Niigata, Japan
- Masahiko Yamamoto
- Department of Internal Medicine, Faculty of Medicine, Niigata University, Niigata, Japan
- Hiroshi Maegawa
- Department of Internal Medicine, Shiga University of Medical Science, Shiga, Japan
- Satoru Kodama
- Department of Internal Medicine, Faculty of Medicine, Niigata University, Niigata, Japan
- Hirohito Sone
- Department of Internal Medicine, Faculty of Medicine, Niigata University, Niigata, Japan
4. Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med 2019; 17:195. [PMID: 31665002; PMCID: PMC6821018; DOI: 10.1186/s12916-019-1426-2]
Abstract
BACKGROUND Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed in clinical practice. This article explores the main challenges and limitations of AI in healthcare and considers the steps required to translate these potentially transformative technologies from research to clinical practice. MAIN BODY Key challenges for the translation of AI systems in healthcare include those intrinsic to the science of machine learning, logistical difficulties in implementation, and the barriers to adoption, as well as the necessary sociocultural or pathway changes. Robust peer-reviewed clinical evaluation as part of randomised controlled trials should be viewed as the gold standard for evidence generation, but conducting these in practice may not always be appropriate or feasible. Performance metrics should aim to capture real clinical applicability and be understandable to intended users. Regulation that balances the pace of innovation with the potential for harm, alongside thoughtful post-market surveillance, is required to ensure that patients are neither exposed to dangerous interventions nor deprived of access to beneficial innovations. Mechanisms to enable direct comparisons of AI systems must be developed, including the use of independent, local, and representative test sets. Developers of AI algorithms must be vigilant to potential dangers, including dataset shift, accidental fitting of confounders, unintended discriminatory bias, the challenges of generalisation to new populations, and the unintended negative consequences of new algorithms on health outcomes. CONCLUSION The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging. Robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy to include quality of care and patient outcomes, is essential. Further work is required (1) to identify themes of algorithmic bias and unfairness and develop mitigations to address them, (2) to reduce brittleness and improve generalisability, and (3) to develop methods for improved interpretability of machine learning predictions. If these goals can be achieved, the benefits for patients are likely to be transformational.