1
Tsagiopoulou M, Gut IG. Machine learning and multi-omics data in chronic lymphocytic leukemia: the future of precision medicine? Front Genet 2024; 14:1304661. [PMID: 38283149 PMCID: PMC10811210 DOI: 10.3389/fgene.2023.1304661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 12/27/2023] [Indexed: 01/30/2024] Open
Abstract
Chronic lymphocytic leukemia is a complex and heterogeneous hematological malignancy. The advance of high-throughput multi-omics technologies has significantly influenced chronic lymphocytic leukemia research and paved the way for precision medicine approaches. In this review, we explore the role of machine learning in the analysis of multi-omics data in this hematological malignancy. We discuss recent literature on different machine learning models applied to single omic studies in chronic lymphocytic leukemia, with a special focus on the potential contributions to precision medicine. Finally, we highlight the recently published machine learning applications in multi-omics data in this area of research as well as their potential and limitations.
Affiliation(s)
- Ivo G. Gut
  - Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
  - Universitat de Barcelona (UB), Barcelona, Spain
2
Wu X, Zhai F, Chang A, Wei J, Guo Y, Zhang J. Application of machine learning algorithms to predict osteoporosis in postmenopausal women with type 2 diabetes mellitus. J Endocrinol Invest 2023; 46:2535-2546. [PMID: 37171784 DOI: 10.1007/s40618-023-02109-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 05/03/2023] [Indexed: 05/13/2023]
Abstract
PURPOSE: The screening and diagnosis of osteoporosis in patients with type 2 diabetes mellitus (T2DM) based on bone mineral density remain challenging because of the limited availability and accessibility of dual-energy X-ray absorptiometry. We aimed to develop and validate models to predict the risk of osteoporosis in postmenopausal women with T2DM based on machine learning (ML) algorithms. METHODS: This retrospective study included 303 postmenopausal women with T2DM. To develop prediction models for osteoporosis, we applied nine ML algorithms to demographic, clinical, and laboratory data. The least absolute shrinkage and selection operator (LASSO) was used to perform feature selection. We used the bootstrap resampling technique for model training and validation. To test the performance of the models, we calculated the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score, and performed calibration curve and decision curve analyses. Furthermore, we conducted fivefold cross-validation for parameter optimization and model validation. Feature importance was assessed using SHapley Additive exPlanations (SHAP). RESULTS: We identified 10 independent predictors as the most valuable features. AUROCs of 0.616-1.000 were observed for the nine ML algorithms. The extreme gradient boosting (XGBoost) model exhibited the best performance, outperforming conventional risk assessment tools and registering AUROCs of 0.993 in the training set, 0.798 in the validation set, and 0.786 in the test set under fivefold cross-validation. Using SHAP, we determined how the explanatory variables contributed to the model and how they related to osteoporosis occurrence. Furthermore, we developed a user-friendly tool for calculating the risk of osteoporosis. CONCLUSIONS: With the integration of demographic and clinical risk factors, ML algorithms can accurately predict osteoporosis. The XGBoost model showed ideal performance. With the incorporation of these models in the clinic, patients may benefit from early osteoporosis diagnosis and treatment.
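The threshold-based indices listed in the abstract above (accuracy, sensitivity, specificity, positive and negative predictive value, F1 score) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `binary_metrics` and the counts are hypothetical and chosen only for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return common diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for illustration only; not taken from the study.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUROC, calibration curves, and decision curve analysis are not covered by this sketch because they depend on the full set of predicted probabilities rather than on a single cutoff.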
Affiliation(s)
- X Wu
  - Department of Endocrinology, Cangzhou Central Hospital, 16 Xinhua West Road, Cangzhou, 061000, Hebei, People's Republic of China
- F Zhai
  - Gynecological Clinic, Cangzhou Central Hospital, 16 Xinhua West Road, Cangzhou, 061000, Hebei, People's Republic of China
- A Chang
  - Department of Endocrinology, Cangzhou Central Hospital, 16 Xinhua West Road, Cangzhou, 061000, Hebei, People's Republic of China
- J Wei
  - Department of Endocrinology, Cangzhou Central Hospital, 16 Xinhua West Road, Cangzhou, 061000, Hebei, People's Republic of China
- Y Guo
  - Department of Endocrinology, Cangzhou Central Hospital, 16 Xinhua West Road, Cangzhou, 061000, Hebei, People's Republic of China
- J Zhang
  - Department of Endocrinology, Cangzhou Central Hospital, 16 Xinhua West Road, Cangzhou, 061000, Hebei, People's Republic of China
3
Prabhakar SK, Ryu S, Jeong IC, Won DO. A Dual Level Analysis with Evolutionary Computing and Swarm Models for Classification of Leukemia. Biomed Res Int 2022; 2022:2052061. [PMID: 35663047 PMCID: PMC9162867 DOI: 10.1155/2022/2052061] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/01/2021] [Revised: 03/17/2022] [Accepted: 03/28/2022] [Indexed: 11/17/2022]
Abstract
Cancer is one of the major causes of human mortality, and it is essential for doctors to identify and treat people suffering from it. Leukemia is a group of blood cancers that usually originates in the bone marrow and results in a very high number of abnormal cells. Microarray data are an important clinical resource for the diagnosis of cancer and a great aid to the medical community. Because the dimensionality of microarray data is very high, selecting suitable genes is an important step for improving classification, and selecting the most informative genes is essential for the prediction and diagnosis of cancer. In this work, Minimum Redundancy Maximum Relevance (MRMR), Signal to Noise Ratio (SNR), Multivariate Error Weight Uncorrelated Shrunken Centroid (EWUSC), and multivariate correlation-based feature selection (CFS) are chosen as initial feature selection techniques. Then, to select the most informative genes, five evolutionary optimization techniques are incorporated: African Buffalo Optimization (ABO), Artificial Bee Colony Optimization (ABCO), Cockroach Swarm Optimization (CSO), Imperialist Competitive Optimization (ICO), and Social Spider Optimization (SSO). Finally, the optimized values are passed to a classification stage. The best results are obtained when multivariate CFS with SSO is used and classified with a Probabilistic Neural Network (PNN), giving a classification accuracy of 95.70%.
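As an illustration of one of the initial filter methods named above, the sketch below ranks genes by a signal-to-noise ratio. The abstract does not give the formula used; this example assumes the commonly used two-class definition |mean_0 - mean_1| / (sd_0 + sd_1). The function name `snr_scores`, the gene names, and the expression values are hypothetical.

```python
from statistics import mean, stdev


def snr_scores(expression, labels):
    """Score each gene by |mean_0 - mean_1| / (sd_0 + sd_1) across two classes.

    Assumes every gene has nonzero spread in at least one class;
    otherwise the denominator would be zero.
    """
    scores = {}
    for gene, values in expression.items():
        class0 = [v for v, y in zip(values, labels) if y == 0]
        class1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(mean(class0) - mean(class1)) / (stdev(class0) + stdev(class1))
    return scores


# Hypothetical toy data: six samples, three per class.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],  # well separated between classes
    "gene_b": [2.0, 2.5, 1.5, 2.1, 2.4, 1.6],  # nearly identical between classes
}

scores = snr_scores(expression, labels)
ranked = sorted(scores, key=scores.get, reverse=True)  # most informative gene first
print(ranked)
```

In a two-stage pipeline like the one described, a filter score of this kind would only shortlist candidate genes; the swarm-based optimizers named in the abstract would then search within that shortlist.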
Affiliation(s)
- Sunil Kumar Prabhakar
  - Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, 24252 Gangwon, Republic of Korea
- Semin Ryu
  - Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, 24252 Gangwon, Republic of Korea
- In cheol Jeong
  - Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, 24252 Gangwon, Republic of Korea
- Dong-Ok Won
  - Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, 24252 Gangwon, Republic of Korea
4
Wang Y, Wang L, Sun Y, Wu M, Ma Y, Yang L, Meng C, Zhong L, Hossain MA, Peng B. Prediction model for the risk of osteoporosis incorporating factors of disease history and living habits in physical examination of population in Chongqing, Southwest China: based on artificial neural network. BMC Public Health 2021; 21:991. [PMID: 34039329 PMCID: PMC8157412 DOI: 10.1186/s12889-021-11002-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/31/2020] [Accepted: 05/06/2021] [Indexed: 01/07/2023] Open
Abstract
Background: Osteoporosis is an increasingly recognized health problem with risks related to disease history and living habits. This study aims to establish the optimal prediction model by comparing the performance of four prediction models that incorporate disease history and living habits in predicting the risk of osteoporosis in Chongqing adults. Methods: We conducted a cross-sectional survey with convenience sampling. From January 2019 to December 2019, we used a questionnaire to collect data on the disease history and living habits of adults who underwent dual-energy X-ray absorptiometry. We established the prediction models of osteoporosis in three steps. First, we performed feature selection to identify risk factors related to osteoporosis. Second, the qualified participants were randomly divided into a training set and a test set in the ratio of 7:3, and prediction models of osteoporosis were established based on Artificial Neural Network (ANN), Deep Belief Network (DBN), Support Vector Machine (SVM), and a combinatorial heuristic method (Genetic Algorithm - Decision Tree (GA-DT)). Finally, we compared the models' performance through accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC) to select the optimal prediction model. Results: The univariate logistic model found that taking calcium tablets (odds ratio [OR] = 0.431), systolic blood pressure (SBP; OR = 1.010), fracture (OR = 1.796), coronary heart disease (OR = 4.299), drinking alcohol (OR = 1.835), physical exercise (OR = 0.747), and other factors were related to the risk of osteoporosis. The training-set and test-set AUCs were 0.901 and 0.762 for ANN, 0.622 and 0.618 for DBN, 0.698 and 0.627 for SVM, and 0.744 and 0.724 for GA-DT. After evaluating the four prediction models' performance, we selected a three-layer back propagation neural network (BPNN) with 18, 4, and 1 neurons in the input, hidden, and output layers, respectively, as the optimal prediction model. Osteoporosis was predicted when the output probability was greater than 0.330. Conclusions: Compared with DBN, SVM, and GA-DT, the established ANN model had the best prediction ability and can be used to predict the risk of osteoporosis in physical examination of the Chongqing population. The model needs to be further improved through large-sample research. Supplementary Information: The online version contains supplementary material available at 10.1186/s12889-021-11002-5.
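To make the selected architecture concrete, here is a minimal sketch of a forward pass through a three-layer network with 18 input, 4 hidden, and 1 output neurons, followed by the 0.330 probability cutoff reported above. The sigmoid activations, the random weights, and the random input vector are assumptions for illustration; the abstract does not report the trained weights or the activation functions, so this is not the authors' model.

```python
import math
import random


def sigmoid(z):
    """Logistic activation (assumed; the abstract does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input vector -> hidden layer -> single output probability."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


N_INPUT, N_HIDDEN = 18, 4   # layer sizes reported in the abstract (output layer has 1 neuron)
THRESHOLD = 0.330           # probability cutoff reported in the abstract

# Hypothetical untrained weights and input, seeded for reproducibility.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-1, 1)
x = [rng.random() for _ in range(N_INPUT)]  # one participant's 18 scaled features

probability = forward(x, w_hidden, b_hidden, w_out, b_out)
predicted_osteoporosis = probability > THRESHOLD
print(f"probability={probability:.3f}, predicted={predicted_osteoporosis}")
```

In a trained model the weights would come from back propagation on the training set; only the layer sizes and the cutoff here are taken from the abstract.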
Affiliation(s)
- Yuqi Wang
  - Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing, 400016, China
- Liangxu Wang
  - School of Basic Medicine, Kunming Medical University, Kunming, 650031, China
- Yanli Sun
  - The First Affiliated Hospital of Chongqing Medical University Health Management Center, Chongqing, 400016, China
- Miao Wu
  - Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing, 400016, China
- Yingjie Ma
  - Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing, 400016, China
- Lingping Yang
  - Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing, 400016, China
- Chun Meng
  - Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing, 400016, China
- Li Zhong
  - The First Affiliated Hospital of Chongqing Medical University Health Management Center, Chongqing, 400016, China
- Mohammad Arman Hossain
  - The First Affiliated Hospital of Chongqing Medical University, Department of Urology, Chongqing, 400016, China
- Bin Peng
  - Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing, 400016, China
5
Moran-Sanchez J, Santisteban-Espejo A, Martin-Piedra MA, Perez-Requena J, Garcia-Rojo M. Translational Applications of Artificial Intelligence and Machine Learning for Diagnostic Pathology in Lymphoid Neoplasms: A Comprehensive and Evolutive Analysis. Biomolecules 2021; 11:793. [PMID: 34070632 PMCID: PMC8227233 DOI: 10.3390/biom11060793] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2021] [Revised: 05/13/2021] [Accepted: 05/24/2021] [Indexed: 12/12/2022] Open
Abstract
Genomic analysis and digitalization of medical records have led to a big data scenario within hematopathology. Artificial intelligence and machine learning tools are increasingly used to integrate clinical, histopathological, and genomic data in lymphoid neoplasms. In this study, we identified the global trends and the cognitive and social framework of this field from 1990 to 2020. Metadata were obtained from the Clarivate Analytics Web of Science database in January 2021. A total of 525 documents were assessed by document type, research areas, source titles, organizations, and countries. The SciMAT and VOSviewer packages were used to perform scientific mapping analysis. Geographical distribution showed the USA and the People's Republic of China as the most productive countries, reporting up to 190 (36.19%) of all documents. A third-degree polynomial equation predicts that global production in this area will reach three times the current number around 2031. Thematically, current research is focused on the integration of digital image analysis and genomic sequencing in non-Hodgkin lymphomas, prediction of chemotherapy response, and validation of new prognostic models. These findings can help pathology departments map future clinical and research avenues, and can help public institutions and administrations promote synergies and optimize funding allocation.
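The production forecast above rests on fitting a third-degree polynomial to yearly document counts and extrapolating it. The sketch below shows one way such a fit can be done with ordinary least squares in plain Python; it is an illustration under stated assumptions, not the authors' procedure. The helper names `fit_cubic` and `predict` are hypothetical, and the counts are synthetic values generated from a known cubic, not the bibliometric data.

```python
def fit_cubic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 + c3*x^3 via the normal equations."""
    n = 4
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Synthetic yearly counts from a known cubic (illustration only).
years_since_start = [0, 1, 2, 3, 4, 5, 6]
documents = [1 + 2 * x + 0.5 * x ** 2 + 0.1 * x ** 3 for x in years_since_start]

coeffs = fit_cubic(years_since_start, documents)
forecast = predict(coeffs, 8)  # extrapolate two steps beyond the observed range
print([round(c, 4) for c in coeffs], round(forecast, 2))
```

Years are expressed as offsets from the first year to keep the normal equations well conditioned. Extrapolating a cubic far beyond the fitted range is sensitive to small changes in the data, so a forecast of this kind is best read as indicative.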
Affiliation(s)
- Julia Moran-Sanchez
  - Division of Hematology and Hemotherapy, Puerta del Mar Hospital, 11009 Cadiz, Spain
  - Ph.D Program of Clinical Medicine and Surgery, University of Cadiz, 11009 Cadiz, Spain
- Antonio Santisteban-Espejo
  - Pathology Department, Puerta del Mar Hospital, 11009 Cadiz, Spain
  - Institute of Research and Innovation in Biomedical Sciences of the Province of Cadiz (INiBICA), University of Cadiz, 11009 Cadiz, Spain
- Jose Perez-Requena
  - Pathology Department, Puerta del Mar Hospital, 11009 Cadiz, Spain
- Marcial Garcia-Rojo
  - Pathology Department, Puerta del Mar Hospital, 11009 Cadiz, Spain
  - Institute of Research and Innovation in Biomedical Sciences of the Province of Cadiz (INiBICA), University of Cadiz, 11009 Cadiz, Spain
6
Lu D, Jiang J, Liu X, Wang H, Feng S, Shi X, Wang Z, Chen Z, Yan X, Wu H, Cai K. Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary. Front Genet 2020; 11:614823. [PMID: 33408743 PMCID: PMC7779672 DOI: 10.3389/fgene.2020.614823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2020] [Accepted: 11/30/2020] [Indexed: 11/29/2022] Open
Abstract
Metastatic cervical carcinoma from unknown primary (MCCUP) accounts for 1–4% of all head and neck tumors, and identifying the primary site in MCCUP is challenging. The most common histopathological type of MCCUP is squamous cell carcinoma (SCC), and it remains difficult to identify the primary site pathologically. Novel and effective methods to determine the primary site in MCCUP are therefore urgently needed. In the present study, RNA sequencing data of four types of SCC and Pan-Cancer were obtained from The Cancer Genome Atlas (TCGA). After data pre-processing, the differentially expressed genes (DEGs) of each type were identified. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that the significantly changed genes of the four types of SCC share many similar molecular functions and histological features. Three machine learning models, Random Forest (RF), support vector machine (SVM), and neural network (NN), each based on ten genes, were then developed to distinguish these four types of SCC. Among the three models, the RF model worked best in the external validation set, with an overall predictive accuracy of 88.2%, sensitivity of 88.71%, and specificity of 95.42%. The NN model ranked second, with an overall accuracy of 82.02%, sensitivity of 81.23%, and specificity of 93.04%. The SVM model ranked last, with an overall accuracy of 76.69%, sensitivity of 74.81%, and specificity of 90.84%. The present analysis of similarities and differences among the four types of SCC, and the development of novel informatics-based models for distinguishing them, sheds light on precision MCCUP diagnosis in the future.
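For a four-class problem like the one above, sensitivity and specificity are usually computed per class in a one-vs-rest fashion and then summarized. The sketch below shows that calculation from a confusion matrix. The abstract does not say how the reported figures were aggregated, so the macro-averaging used here is an assumption; the function name `per_class_metrics` and the matrix values are hypothetical.

```python
def per_class_metrics(confusion):
    """One-vs-rest (sensitivity, specificity) for each class.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # class k predicted as something else
        fp = sum(confusion[i][k] for i in range(n)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp
        results.append((tp / (tp + fn), tn / (tn + fp)))
    return results


# Hypothetical 4x4 confusion matrix (10 samples per true class), for illustration only.
confusion = [
    [8, 1, 1, 0],
    [0, 9, 0, 1],
    [1, 0, 9, 0],
    [0, 0, 1, 9],
]

per_class = per_class_metrics(confusion)
accuracy = sum(confusion[k][k] for k in range(len(confusion))) / sum(map(sum, confusion))
macro_sensitivity = sum(s for s, _ in per_class) / len(per_class)
macro_specificity = sum(sp for _, sp in per_class) / len(per_class)
print(accuracy, macro_sensitivity, macro_specificity)
```

Specificity tends to exceed sensitivity in multi-class settings because each one-vs-rest comparison has many more true negatives than true positives, which matches the pattern in the figures reported above.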
Affiliation(s)
- Di Lu
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Jianjun Jiang
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Xiguang Liu
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- He Wang
  - Department of Thoracic Surgery, Peking University Shenzhen Hospital, Shenzhen, China
- Siyang Feng
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Xiaoshun Shi
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Zhizhi Wang
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Zhiming Chen
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Xuebin Yan
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Hua Wu
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Kaican Cai
  - Department of Thoracic Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China
7
Zeng J, Zhang J, Li Z, Li T, Li G. Prediction model of artificial neural network for the risk of hyperuricemia incorporating dietary risk factors in a Chinese adult study. Food Nutr Res 2020; 64:3712. [PMID: 32047420 PMCID: PMC6983978 DOI: 10.29219/fnr.v64.3712] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/18/2019] [Revised: 12/04/2019] [Accepted: 12/09/2019] [Indexed: 01/06/2023] Open
Abstract
Background: Risk of hyperuricemia (HU) has been shown to be strongly associated with dietary factors. However, there is scarce evidence on prediction models incorporating dietary factors to estimate the risk of HU. Objective: The aim of this study was to develop a prediction model to predict the risk of HU in Chinese adults based on dietary information. Design: Our study was based on a cross-sectional survey, which recruited 1,488 community residents aged 18 to 60 years in Beijing from October 2010 to January 2011. The eligible participants were randomly divided into a training set (n1 = 992) and a validation set (n2 = 496) in the ratio of 2:1. We developed the prediction model in three stages. We first used a logistic regression model (LRM) based on the training set to select a set of dietary risk factors related to the risk of HU. An artificial neural network (ANN) was then used to construct the prediction model using the training set. Finally, we used receiver operating characteristic (ROC) curve analysis to assess the accuracy of the prediction model using the training and validation sets. Results: In the training set, the mean age of participants with and without HU was 39.3 (standard deviation [SD]: 9.65) and 38.2 (SD: 9.38) years, respectively. Patients with HU consisted of 101 males (77.7%) and 29 females (22.3%). The LRM found that food frequency (vegetables [odds ratio (OR) = 0.73], meat [0.72], eggs [0.80], plant oil [0.78], tea [0.51]), eating habits (breakfast [OR = 1.28]), and the salty cooking style (OR = 1.33) were associated with risk of HU. In the ANN analysis, we selected a three-layer back propagation neural network (BPNN) model with 14, 3, and 1 neurons in the input, hidden, and output layers, respectively, as the best prediction model. The areas under the ROC curve of the training and validation sets were 0.827 and 0.814, respectively. HU was predicted when the incidence probability was greater than 0.128. The indicators of accuracy, sensitivity, specificity, and Youden index suggested that the ANN model in our study is successful and valuable. Conclusions: This study suggests that the ANN model could be used to predict the risk of HU in Chinese adults. Further prospective studies are needed to improve the accuracy and to generalize the use of the model.
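The ROC-based evaluation described above can be sketched in a few lines: the area under the ROC curve equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and the Youden index at a given cutoff is sensitivity + specificity - 1. The example below uses these standard definitions with hypothetical scores and labels. It applies the 0.128 cutoff from the abstract to that toy data and makes no claim about how the authors chose the cutoff.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_index(scores, labels, threshold):
    """J = sensitivity + specificity - 1, predicting positive when score > threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    return tp / (tp + fn) + tn / (tn + fp) - 1


# Hypothetical predicted probabilities and true labels (1 = HU), for illustration only.
scores = [0.05, 0.10, 0.12, 0.20, 0.15, 0.30, 0.40, 0.08]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

auc = roc_auc(scores, labels)
j_at_cutoff = youden_index(scores, labels, threshold=0.128)
print(auc, j_at_cutoff)
```

A low cutoff such as 0.128 is common when the outcome is relatively rare in the sample, because predicted probabilities then cluster near the outcome's prevalence rather than around 0.5.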
Affiliation(s)
- Jie Zeng
  - Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China
- Junguo Zhang
  - Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China
- Ziyi Li
  - Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China
- Tianwang Li
  - Department of Rheumatology and Immunology, Guangdong Second Provincial General Hospital, Guangzhou, China
- Guowei Li
  - Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China