1
Rugină AI, Ungureanu A, Giuglea C, Marinescu SA. Artificial Intelligence in Breast Reconstruction: A Narrative Review. Medicina (Kaunas, Lithuania) 2025; 61:440. [PMID: 40142251 PMCID: PMC11944005 DOI: 10.3390/medicina61030440] [Received: 02/01/2025] [Revised: 02/20/2025] [Accepted: 02/27/2025] [Indexed: 03/28/2025]
Abstract
Breast reconstruction following mastectomy or sectorectomy significantly impacts the quality of life and psychological well-being of breast cancer patients. Since its inception in the 1950s, artificial intelligence (AI) has gradually entered the medical field, promising to transform surgical planning, intraoperative guidance, postoperative care, and medical research. This article examines AI applications in breast reconstruction, supported by recent studies. AI shows promise in enhancing imaging for tumor detection and surgical planning, improving microsurgical precision, predicting complications such as flap failure, and optimizing postoperative monitoring. However, challenges remain, including data quality, safety, algorithm transparency, and clinical integration. Despite these shortcomings, AI has the potential to revolutionize breast reconstruction by improving preoperative planning, surgical precision, operative efficiency, and patient outcomes. This review provides a foundation for further research as AI continues to evolve and clinical trials expand its applications, offering greater benefits to patients and healthcare providers.
Affiliation(s)
- Andrei Iulian Rugină
  - Department of Plastic and Reconstructive Surgery, “Bagdasar-Arseni” Emergency Hospital, University of Medicine and Pharmacy “Carol Davila”, Blvd. Eroii Sanitari Nr. 8, Sector 5, 050474 Bucharest, Romania
- Andreea Ungureanu
  - Department of Plastic and Reconstructive Surgery, “Bagdasar-Arseni” Emergency Hospital, University of Medicine and Pharmacy “Carol Davila”, Blvd. Eroii Sanitari Nr. 8, Sector 5, 050474 Bucharest, Romania
- Carmen Giuglea
  - Department of Plastic and Reconstructive Surgery, University of Medicine and Pharmacy “Carol Davila”, Blvd. Eroii Sanitari Nr. 8, Sector 5, 050474 Bucharest, Romania
- Silviu Adrian Marinescu
  - Department of Plastic and Reconstructive Surgery, “Bagdasar-Arseni” Emergency Hospital, University of Medicine and Pharmacy “Carol Davila”, Blvd. Eroii Sanitari Nr. 8, Sector 5, 050474 Bucharest, Romania
  - Department of Plastic and Reconstructive Surgery, University of Medicine and Pharmacy “Carol Davila”, Blvd. Eroii Sanitari Nr. 8, Sector 5, 050474 Bucharest, Romania
2
Luo X, Zhao J, Zou D, Luo X, Fan M, Hu H, Zheng P, Li Y, Xia R, Mo L. Construction and evaluation of glucocorticoid dose prediction model based on genetic and clinical characteristics of patients with systemic lupus erythematosus. Int J Immunopathol Pharmacol 2025; 39:3946320251331791. [PMID: 40186486 PMCID: PMC12032459 DOI: 10.1177/03946320251331791] [Received: 10/07/2024] [Accepted: 03/16/2025] [Indexed: 04/07/2025]
Abstract
Currently, no glucocorticoid dose prediction model is available for clinical practice. This study aimed to utilise machine learning techniques to develop and validate personalised dosage models. Participants were patients with systemic lupus erythematosus (SLE) who were registered at Nanfang Hospital and received prednisone. Univariate analysis was used to select the feature variables. Subsequently, the random forest (RF) algorithm was used to impute the missing values of the feature variables. We then assessed the prediction capabilities of 11 machine learning and deep-learning algorithms (Logistic, SVM, RF, Adaboost, Bagging, XGBoost, LightGBM, CatBoost, MLP, and TabNet). Finally, a confusion matrix was used to validate the three regimens. In total, 129 patients met the inclusion criteria. The XGBoost algorithm was selected as the preferred method because of its superior performance, achieving an accuracy of 0.81. The factors exhibiting the highest correlation with the prednisone dose were CYP3A4 (rs4646437), albumin (ALB), haemoglobin (HGB), anti-double-stranded DNA antibodies (Anti-dsDNA), erythrocyte sedimentation rate (ESR), age, and HLA-DQA1 (rs2187668). Based on validation, the precision and recall rates for low-dose prednisone (⩾5 mg but <7.5 mg/d) were 100% and 40% respectively. Similarly, for medium-dose prednisone (⩾7.5 mg but <30 mg/d), the accuracy and recall rates were 88% and 88%, and for high-dose prednisone (⩾30 mg but ⩽100 mg/d), the accuracy and recall rates were 62% and 100% respectively. A robust machine learning model was developed to accurately predict prednisone dosage by integrating the identified genetic and clinical factors.
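The per-regimen figures quoted in this abstract are the kind of numbers read off a three-class confusion matrix. The sketch below shows how per-class precision and recall could be derived from such a matrix. The function name, the label names, and the counts in `TOY_MATRIX` are all assumptions made for illustration; they are not the study's data or code.

```python
from typing import Dict, List


def per_class_precision_recall(
    matrix: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Compute precision and recall per class.

    Rows of `matrix` are true classes, columns are predicted classes.
    """
    results: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        true_pos = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)  # column total
        actual_k = sum(matrix[k])                    # row total
        precision = true_pos / predicted_k if predicted_k else float("nan")
        recall = true_pos / actual_k if actual_k else float("nan")
        results[label] = {"precision": precision, "recall": recall}
    return results


# Invented counts for illustration only (NOT taken from the study).
LABELS = ["low", "medium", "high"]
TOY_MATRIX = [
    [4, 6, 0],   # true low: 4 predicted low, 6 predicted medium
    [0, 18, 2],  # true medium
    [0, 0, 5],   # true high
]

if __name__ == "__main__":
    for label, scores in per_class_precision_recall(TOY_MATRIX, LABELS).items():
        print(label, scores)
```

In this toy matrix the low-dose class has perfect precision but low recall, the same qualitative pattern the abstract reports: every low-dose prediction is right, but most true low-dose patients are assigned to another regimen.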
Affiliation(s)
- Xin Luo
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Jinjun Zhao
  - Department of Rheumatology and Immunology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Danfeng Zou
  - Overseas Patient Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Xiaoning Luo
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Meida Fan
  - Department of Rheumatology and Immunology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Hongling Hu
  - Department of Trauma and Joint Surgery, Shunde Hospital, Southern Medical University, Foshan, China
- Ping Zheng
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Yilei Li
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Renfei Xia
  - Department of Transplantation, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Liqian Mo
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
3
Yu Z, Kou F, Gao Y, Gao F, Lyu CM, Wei H. A machine learning model for predicting abnormal liver function induced by a Chinese herbal medicine preparation (Zhengqing Fengtongning) in patients with rheumatoid arthritis based on real-world study. Journal of Integrative Medicine 2025; 23:25-35. [PMID: 39721810 DOI: 10.1016/j.joim.2024.12.001] [Received: 02/04/2024] [Accepted: 06/20/2024] [Indexed: 12/28/2024]
Abstract
OBJECTIVE Rheumatoid arthritis (RA) is a systemic autoimmune disease that affects the small joints of the whole body and degrades the patients' quality of life. Zhengqing Fengtongning (ZF) is a traditional Chinese medicine preparation used to treat RA. ZF may cause liver injury. In this study, we aimed to develop a prediction model for abnormal liver function caused by ZF. METHODS This retrospective study collected data from multiple centers from January 2018 to April 2023. Abnormal liver function was set as the target variable according to the alanine transaminase (ALT) level. Features were screened through univariate analysis and sequential forward selection for modeling. Ten machine learning and deep learning models were compared to find the model that most effectively predicted liver function from the available data. RESULTS This study included 1,913 eligible patients. The LightGBM model exhibited the best performance (accuracy = 0.96) out of the 10 learning models. The predictive metrics of the LightGBM model were as follows: precision = 0.99, recall rate = 0.97, F1_score = 0.98, area under the curve (AUC) = 0.98, sensitivity = 0.97 and specificity = 0.85 for predicting ALT < 40 U/L; precision = 0.60, recall rate = 0.83, F1_score = 0.70, AUC = 0.98, sensitivity = 0.83 and specificity = 0.97 for predicting 40 ≤ ALT < 80 U/L; and precision = 0.83, recall rate = 0.63, F1_score = 0.71, AUC = 0.97, sensitivity = 0.63 and specificity = 1.00 for predicting ALT ≥ 80 U/L. ZF-induced abnormal liver function was found to be associated with high total cholesterol and triglyceride levels, the combination of TNF-α inhibitors, JAK inhibitors, methotrexate + nonsteroidal anti-inflammatory drugs, leflunomide, smoking, older age, and females in middle-age (45-65 years old). CONCLUSION This study developed a model for predicting ZF-induced abnormal liver function, which may help improve the safety of integrated administration of ZF and Western medicine. 
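The abstract reports metrics for three ALT bands, each scored one-vs-rest. As a rough illustration, the sketch below bins ALT values using the thresholds stated in the abstract (40 and 80 U/L), computes F1 as the harmonic mean of precision and recall, and computes one-vs-rest specificity. The function names and the toy label lists are assumptions made for this example, not the authors' pipeline.

```python
from typing import Sequence


def alt_class(alt_u_per_l: float) -> str:
    """Bin an ALT value (U/L) into the three bands used in the abstract."""
    if alt_u_per_l < 40:
        return "ALT < 40"
    if alt_u_per_l < 80:
        return "40 <= ALT < 80"
    return "ALT >= 80"


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def one_vs_rest_specificity(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> float:
    """Specificity for one class: TN / (TN + FP), all other classes pooled."""
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    denom = true_neg + false_pos
    return true_neg / denom if denom else float("nan")


if __name__ == "__main__":
    # Recompute F1 from the rounded precision/recall pairs quoted in the abstract.
    for precision, recall in [(0.99, 0.97), (0.60, 0.83), (0.83, 0.63)]:
        print(precision, recall, round(f1_score(precision, recall), 2))
```

Because the inputs are already rounded to two decimals, a recomputed F1 can differ from the published value in the last digit.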
Affiliation(s)
- Ze Yu
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Fang Kou
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Ya Gao
  - Department of Pharmacy, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing 100037, China
- Fei Gao
  - Beijing Medicinovo Technology Co. Ltd., Beijing 100071, China
- Chun-Ming Lyu
  - Experiment Center for Science and Technology, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
- Hai Wei
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
4
Sharma SD, Leung SH, Viatte S. Genetics of rheumatoid arthritis. Best Pract Res Clin Rheumatol 2024; 38:101968. [PMID: 38955657 DOI: 10.1016/j.berh.2024.101968] [Received: 04/29/2024] [Revised: 06/17/2024] [Accepted: 06/24/2024] [Indexed: 07/04/2024]
Abstract
In the past four decades, a plethora of genetic association studies have been carried out in cohorts of patients with rheumatoid arthritis. These studies have highlighted key aspects of disease pathogenesis and suggested causal mechanisms. In this review, we discuss major advances in our understanding of the genetic architecture of rheumatoid arthritis susceptibility, severity and treatment response and explain how genetics supports current models of disease pathogenesis and outcome. We outline future research directions, like Mendelian randomisation, and present a number of potential avenues for clinical translation, including risk and outcome prediction, patient stratification into treatment response groups and pharmacological applications.
Affiliation(s)
- Seema D Sharma
  - Versus Arthritis Centre for Genetics and Genomics, Centre for Musculoskeletal Research, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK
  - NIHR Manchester Musculoskeletal Biomedical Research Centre, Central Manchester NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK
- Shek H Leung
  - Versus Arthritis Centre for Genetics and Genomics, Centre for Musculoskeletal Research, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK
- Sebastien Viatte
  - Versus Arthritis Centre for Genetics and Genomics, Centre for Musculoskeletal Research, The University of Manchester, Oxford Road, Manchester, M13 9PT, UK
  - NIHR Manchester Musculoskeletal Biomedical Research Centre, Central Manchester NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK
  - Lydia Becker Institute of Immunology and Inflammation, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
5
Gorgy A, Xu HH, Hawary HE, Nepon H, Lee J, Vorstenbosch J. Integrating AI into Breast Reconstruction Surgery: Exploring Opportunities, Applications, and Challenges. Plast Surg (Oakv) 2024:22925503241292349. [PMID: 39545210 PMCID: PMC11559540 DOI: 10.1177/22925503241292349] [Received: 05/10/2024] [Revised: 08/25/2024] [Accepted: 09/08/2024] [Indexed: 11/17/2024]
Abstract
Background: Artificial intelligence (AI) has significantly influenced various sectors, including healthcare, by enhancing machine capabilities in assisting with human tasks. In surgical fields, where precision and timely decision-making are crucial, AI's integration could revolutionize clinical quality and health resource optimization. This study explores the current and future applications of AI technologies in reconstructive breast surgery, aiming for broader implementation. Methods: We conducted systematic reviews through PubMed, Web of Science, and Google Scholar using relevant keywords and MeSH terms. The focus was on the main AI subdisciplines: machine learning, computer vision, natural language processing, and robotics. This review includes studies discussing AI applications across preoperative, intraoperative, postoperative, and academic settings in breast plastic surgery. Results: AI is currently utilized preoperatively to predict surgical risks and outcomes, enhancing patient counseling and informed consent processes. During surgery, AI supports the identification of anatomical landmarks and dissection strategies and provides 3-dimensional visualizations. Robotic applications are promising for procedures like microsurgical anastomoses, flap harvesting, and dermal matrix anchoring. Postoperatively, AI predicts discharge times and customizes follow-up schedules, which improves resource allocation and patient management at home. Academically, AI offers personalized training feedback to surgical trainees and aids research in breast reconstruction. Despite these advancements, concerns regarding privacy, costs, and operational efficacy persist and are critically examined in this review. Conclusions: The application of AI in breast plastic and reconstructive surgery presents substantial benefits and diverse potentials. However, much remains to be explored and developed. 
This study aims to consolidate knowledge and encourage ongoing research and development within the field, thereby empowering the plastic surgery community to leverage AI technologies effectively and responsibly for advancing breast reconstruction surgery.
Affiliation(s)
- Andrew Gorgy
  - Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- Hong Hao Xu
  - Faculty of Medicine, Laval University, Quebec City, Quebec, Canada
- Hassan El Hawary
  - Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- Hillary Nepon
  - Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- James Lee
  - Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
- Joshua Vorstenbosch
  - Department of Plastic and Reconstructive Surgery, McGill University Health Center, Montreal, Quebec, Canada
6
Liao Z, Lu Y, Wei D, Ding R, Wu Y, Gao H, Liao A, Tang Y, Xu H, Chen Z, Hu HY. Tailor-made ammonia nitrogen risk management with machine learning models for aquatic environments in the Mainland of China. Journal of Hazardous Materials 2024; 479:135726. [PMID: 39241361 DOI: 10.1016/j.jhazmat.2024.135726] [Received: 06/28/2024] [Revised: 08/24/2024] [Accepted: 08/31/2024] [Indexed: 09/09/2024]
Abstract
Efficient management of pollutant risks in water bodies is crucial for public health and aquatic ecosystem sustainability. However, the toxicities of pollutants, such as ammonia nitrogen (NH3-N), are often affected by multiple water quality factors, including the pH and water temperature. Extensive spatial and temporal variability in these factors hinders tailor-made management of risk. This study used high-frequency monitoring data collected over 1 year to evaluate the long-term NH3-N risk in China's aquatic ecosystems. High accuracy and interpretability were achieved by decomposing NH3-N risk into the contributions of key influencing factors using random forest models and Shapley Additive Explanations. Two distinct types of NH3-N risk hotspots were identified across 18 cities: 15 cities with high NH3-N concentrations and 3 cities with low environmental carrying capacity due to high pH levels or elevated water temperatures. For the former, rapid NH3-N abatement measures are necessary to bring NH3-N concentrations back below the environmental capacity. For the latter, it is recommended that NH3-N related industries are relocated to regions with high environmental capacities because fragile environments are not suitable for such industries. Importantly, this study investigated methods for attributing pollutant risks in the context of non-linear influencing factors, and the risk of NH3-N was predicted to increase by 6.1 % by the end of 2100 in the context of increasing temperatures under the SSP 2-4.5 scenario. The methodology is also adaptable and suitable for integration into global ecosystem risk management efforts to balance development and aquatic ecological sustainability.
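Decomposing a risk score into per-factor contributions, as the abstract describes with Shapley Additive Explanations, rests on the Shapley value. The sketch below computes exact Shapley values for a three-feature toy function by enumerating all feature coalitions, with absent features replaced by a baseline value. Everything here is an assumption for illustration: `toy_risk` is an invented non-linear stand-in (not the study's random forest or a real toxicity formula), the feature names and numbers are hypothetical, and the enumeration is the textbook definition, not the SHAP library's tree-specific algorithm.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict


def shapley_values(
    value_fn: Callable[[Dict[str, float]], float],
    instance: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def evaluate(coalition: set) -> float:
        mixed = {
            name: (instance[name] if name in coalition else baseline[name])
            for name in names
        }
        return value_fn(mixed)

    contributions: Dict[str, float] = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = evaluate(set(subset) | {name}) - evaluate(set(subset))
                total += weight * gain
        contributions[name] = total
    return contributions


def toy_risk(x: Dict[str, float]) -> float:
    """Hypothetical non-linear risk: concentration scaled by pH and temperature."""
    return x["nh3_n"] * (1.0 + 0.5 * (x["ph"] - 7.0)) * (1.0 + 0.02 * (x["temp_c"] - 15.0))


# Hypothetical site and reference conditions (illustrative values only).
INSTANCE = {"nh3_n": 1.2, "ph": 8.5, "temp_c": 25.0}
BASELINE = {"nh3_n": 0.5, "ph": 7.0, "temp_c": 15.0}

if __name__ == "__main__":
    print(shapley_values(toy_risk, INSTANCE, BASELINE))
```

The contributions sum to the difference between the risk at the site and the risk at the baseline, which is what makes an attribution of this kind additive.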
Affiliation(s)
- Zitong Liao
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
- Yun Lu
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
  - Beijing Laboratory for Environmental Frontier Technologies, School of Environment, Tsinghua University, Beijing 100084, PR China
- Dongbin Wei
  - Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
- Ren Ding
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
- Yinhu Wu
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
  - Beijing Laboratory for Environmental Frontier Technologies, School of Environment, Tsinghua University, Beijing 100084, PR China
- Huanan Gao
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
- Anran Liao
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
- Yingcai Tang
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
- Hongwei Xu
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
- Zhuo Chen
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
  - Beijing Laboratory for Environmental Frontier Technologies, School of Environment, Tsinghua University, Beijing 100084, PR China
- Hong-Ying Hu
  - Environmental Simulation and Pollution Control State Key Joint Laboratory, Key Laboratory of Microorganism Application and Risk Control (SMARC) of Ministry of Ecology and Environment, School of Environment, Tsinghua University, Beijing 100084, PR China
  - Beijing Laboratory for Environmental Frontier Technologies, School of Environment, Tsinghua University, Beijing 100084, PR China
7
Maita KC, Avila FR, Torres-Guzman RA, Garcia JP, De Sario Velasquez GD, Borna S, Brown SA, Haider CR, Ho OS, Forte AJ. The usefulness of artificial intelligence in breast reconstruction: a systematic review. Breast Cancer 2024; 31:562-571. [PMID: 38619786 DOI: 10.1007/s12282-024-01582-6] [Received: 03/29/2023] [Accepted: 03/30/2024] [Indexed: 04/16/2024]
Abstract
BACKGROUND Artificial Intelligence (AI) offers an approach to predictive modeling. The model learns to determine specific patterns of undesirable outcomes in a dataset. Therefore, a decision-making algorithm can be built based on these patterns to prevent negative results. This systematic review aimed to evaluate the usefulness of AI in breast reconstruction. METHODS A systematic review was conducted in August 2022 following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. MEDLINE, EMBASE, SCOPUS, and Google Scholar online databases were queried to capture all publications studying the use of artificial intelligence in breast reconstruction. RESULTS A total of 23 studies were full text-screened after removing duplicates, and twelve articles fulfilled our inclusion criteria. The Machine Learning algorithms applied to neuropathic pain, lymphedema diagnosis, microvascular abdominal flap failure, donor site complications associated with muscle-sparing Transverse Rectus Abdominis flap, surgical complications, financial toxicity, and patient-reported outcomes after breast surgery demonstrated that AI is a helpful tool to accurately predict patient results. In addition, one study used Computer Vision technology to assist in Deep Inferior Epigastric Perforator Artery detection for flap design, considerably reducing the preoperative time compared to manual identification. CONCLUSIONS In breast reconstruction, AI can help the surgeon by optimizing the perioperative patients' counseling to predict negative outcomes, allowing execution of timely interventions and reducing the postoperative burden, which leads to obtaining the most successful results and improving patient satisfaction.
Affiliation(s)
- Karla C Maita
  - Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Francisco R Avila
  - Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- John P Garcia
  - Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Sahar Borna
  - Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Sally A Brown
  - Department of Administration, Mayo Clinic, Jacksonville, FL, USA
- Clifton R Haider
  - Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, USA
- Olivia S Ho
  - Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
- Antonio Jorge Forte
  - Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd, Jacksonville, FL, 32224, USA
8
Feng Y, Feng Y, Hu M, Xu H, Wang Z, Xu S, Yan Y, Feng C, Li Z, Feng G, Shang W. Early prediction of growth patterns after pediatric kidney transplantation based on height-related single-nucleotide polymorphisms. Chin Med J (Engl) 2024; 137:1199-1206. [PMID: 37672508 PMCID: PMC11101222 DOI: 10.1097/cm9.0000000000002828] [Received: 03/01/2023] [Indexed: 09/08/2023]
Abstract
BACKGROUND Growth retardation is a common complication of chronic kidney disease in children, which can be partially relieved after renal transplantation. This study aimed to develop and validate a predictive model for growth patterns of children with end-stage renal disease (ESRD) after kidney transplantation using machine learning algorithms based on genomic and clinical variables. METHODS A retrospective cohort of 110 children who received kidney transplants between May 2013 and September 2021 at the First Affiliated Hospital of Zhengzhou University was recruited for whole-exome sequencing (WES), and another 39 children who underwent transplant from October 2021 to March 2022 were enrolled for external validation. Based on previous studies, we comprehensively collected 729 height-related single-nucleotide polymorphisms (SNPs) in exon regions. Seven machine learning algorithms and 10-fold cross-validation analysis were employed for model construction. RESULTS The 110 children were divided into two groups according to change in height-for-age Z-score. After univariate analysis, age and 19 SNPs were incorporated into the model and validated. The random forest model showed the best prediction efficacy with an accuracy of 0.8125 and an area under curve (AUC) of 0.924, and also performed well in the external validation cohort (accuracy, 0.7949; AUC, 0.796). CONCLUSIONS A model with good performance for predicting post-transplant growth patterns in children based on SNPs and clinical variables was constructed and validated using machine learning algorithms. The model is expected to guide clinicians in the management of children after renal transplantation, including the use of growth hormone, glucocorticoid withdrawal, and nutritional supplementation, to alleviate growth retardation in children with ESRD.
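The 10-fold cross-validation mentioned in the methods can be sketched as follows. This is a generic illustration of how a cohort of 110 samples might be partitioned, assuming a simple shuffled split; the function name, the seed, and the absence of stratification are assumptions, and the study's actual splitting procedure is not described in the abstract.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds.

    Every sample appears in exactly one test fold; fold sizes differ by at
    most one when n_samples is not divisible by k.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


if __name__ == "__main__":
    # 110 matches the size of the training cohort quoted in the abstract.
    for fold_no, (train_idx, test_idx) in enumerate(k_fold_indices(110, k=10)):
        print(fold_no, len(train_idx), len(test_idx))
```

With 110 samples and 10 folds, each model is fitted on 99 children and evaluated on the remaining 11, and the reported performance would typically be an average over the folds.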
Affiliation(s)
- Yi Feng
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Yonghua Feng
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Mingyao Hu
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Hongen Xu
  - Precision Medicine Center, Academy of Medical Science, Zhengzhou University, Zhengzhou, Henan 450052, China
- Zhigang Wang
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Shicheng Xu
  - Precision Medicine Center, Academy of Medical Science, Zhengzhou University, Zhengzhou, Henan 450052, China
- Yongchuang Yan
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Chenghao Feng
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Zhou Li
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Guiwen Feng
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Wenjun Shang
  - Department of Renal Transplantation, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
9
Vornholt E, Liharska LE, Cheng E, Hashemi A, Park YJ, Ziafat K, Wilkins L, Silk H, Linares LM, Thompson RC, Sullivan B, Moya E, Nadkarni GN, Sebra R, Schadt EE, Kopell BH, Charney AW, Beckmann ND. Characterizing cell type specific transcriptional differences between the living and postmortem human brain. medRxiv: the preprint server for health sciences 2024:2024.05.01.24306590. [PMID: 38746297 PMCID: PMC11092720 DOI: 10.1101/2024.05.01.24306590] [Indexed: 05/16/2024]
Abstract
Single-nucleus RNA sequencing (snRNA-seq) is often used to define gene expression patterns characteristic of brain cell types as well as to identify cell type specific gene expression signatures of neurological and mental illnesses in postmortem human brains. As methods to obtain brain tissue from living individuals emerge, it is essential to characterize gene expression differences associated with tissue originating from either living or postmortem subjects using snRNA-seq, and to assess whether and how such differences may impact snRNA-seq studies of brain tissue. To address this, human prefrontal cortex single nuclei gene expression was generated and compared between 31 samples from living individuals and 21 postmortem samples. The same cell types were consistently identified in living and postmortem nuclei, though for each cell type, a large proportion of genes were differentially expressed between samples from postmortem and living individuals. Notably, estimation of cell type proportions by cell type deconvolution of pseudo-bulk data was found to be more accurate in samples from living individuals. To allow for future integration of living and postmortem brain gene expression, a model was developed that quantifies from gene expression data the probability a human brain tissue sample was obtained postmortem. These probabilities are established as a means to statistically account for the gene expression differences between samples from living and postmortem individuals. Together, the results presented here provide a deep characterization of both differences between snRNA-seq derived from samples from living and postmortem individuals, as well as qualify and account for their effect on common analyses performed on this type of data.
10
Alfayyadh MM, Maksemous N, Sutherland HG, Lea RA, Griffiths LR. Unravelling the Genetic Landscape of Hemiplegic Migraine: Exploring Innovative Strategies and Emerging Approaches. Genes (Basel) 2024; 15:443. [PMID: 38674378 PMCID: PMC11049430 DOI: 10.3390/genes15040443] [Received: 03/12/2024] [Accepted: 03/25/2024] [Indexed: 04/28/2024]
Abstract
Migraine is a severe, debilitating neurovascular disorder. Hemiplegic migraine (HM) is a rare and debilitating neurological condition with a strong genetic basis. Sequencing technologies have improved the diagnosis and our understanding of the molecular pathophysiology of HM. Linkage analysis and sequencing studies in HM families have identified pathogenic variants in ion channels and related genes, including CACNA1A, ATP1A2, and SCN1A, that cause HM. However, approximately 75% of HM patients are negative for these mutations, indicating there are other genes involved in disease causation. In this review, we explored our current understanding of the genetics of HM. The evidence presented herein summarises the current knowledge of the genetics of HM, which can be expanded further to explain the remaining heritability of this debilitating condition. Innovative bioinformatics and computational strategies to cover the entire genetic spectrum of HM are also discussed in this review.
Affiliation(s)
- Lyn R. Griffiths
- Centre for Genomics and Personalised Health, Genomics Research Centre, School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD 4059, Australia; (M.M.A.); (N.M.); (H.G.S.); (R.A.L.)
|
11
|
Zhang D, Fan B, Lv L, Li D, Yang H, Jiang P, Jin F. Research hotspots and trends of artificial intelligence in rheumatoid arthritis: A bibliometric and visualized study. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:20405-20421. [PMID: 38124558 DOI: 10.3934/mbe.2023902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Artificial intelligence (AI) applications in rheumatoid arthritis (RA) are becoming increasingly popular. In this bibliometric study, we aimed to analyze the characteristics of publications relevant to the research of AI in RA, thereby developing a thorough overview of this research topic. Web of Science was used to retrieve publications on the application of AI in RA from 2003 to 2022. Bibliometric analysis and visualization were performed using Microsoft Excel (2019), R software (4.2.2) and VOSviewer (1.6.18). The overall distribution of yearly outputs, leading countries, top institutions and authors, active journals, co-cited references and keywords were analyzed. A total of 859 relevant articles were identified in the Web of Science, with an increasing yearly trend. The USA and China were the leading countries in this field, accounting for 71.59% of publications in total. Harvard University was the most influential institution. Arthritis Research & Therapy was the most active journal. Primary topics in this field focused on estimating the risk of developing RA, diagnosing RA using sensor, clinical, imaging and omics data, identifying the phenotype of RA patients using electronic health records, predicting treatment response, tracking the progression of the disease, predicting prognosis and developing new drugs. Machine learning and deep learning algorithms were the recent research hotspots and trends in this field. AI has potential applications in various fields of RA, including risk assessment, screening, early diagnosis, monitoring, prognosis determination, achieving optimal therapeutic outcomes and new drug development for RA patients. Incorporating machine learning and deep learning algorithms into real-world clinical practice will be a future research hotspot and trend for AI in RA. Extensive collaboration to improve model maturity and robustness will be a critical step in the advancement of AI in healthcare.
Affiliation(s)
- Di Zhang
- Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan 250011, China
- Bing Fan
- Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan 250011, China
- Liu Lv
- Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing 100700, China
- Da Li
- Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan 250011, China
- Huijun Yang
- Gansu Provincial Hospital of TCM, Lanzhou 730050, China
- Ping Jiang
- Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan 250011, China
- Fangmei Jin
- Gansu Provincial Hospital of TCM, Lanzhou 730050, China
|
12
|
Ma J, Yu Z, Chen T, Li P, Liu Y, Chen J, Lyu C, Hao X, Zhang J, Wang S, Gao F, Zhang J, Bu S. The effect of Shengmai injection in patients with coronary heart disease in real world and its personalized medicine research using machine learning techniques. Front Pharmacol 2023; 14:1208621. [PMID: 37781710 PMCID: PMC10537936 DOI: 10.3389/fphar.2023.1208621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 08/30/2023] [Indexed: 10/03/2023] Open
Abstract
Objective: Shengmai injection is a common treatment for coronary heart disease. The accurate dose regimen is important to maximize effectiveness and minimize adverse reactions. We aim to explore the effect of Shengmai injection in patients with coronary heart disease based on real-world data and establish a personalized medicine model using machine learning and deep learning techniques. Methods: 211 patients were enrolled. The length of hospital stay was used to explore the effect of Shengmai injection in a case-control study. We applied propensity score matching to reduce bias and Wilcoxon rank sum test to compare results between the experimental group and the control group. Important variables influencing the dose regimen of Shengmai injection were screened by XGBoost. A personalized medicine model of Shengmai injection was established by XGBoost selected from nine algorithm models. SHapley Additive exPlanations and confusion matrix were used to interpret the results clinically. Results: Patients using Shengmai injection had shorter length of hospital stay than those not using Shengmai injection (median 10.00 days vs. 11.00 days, p = 0.006). The personalized medicine model established via XGBoost shows accuracy = 0.81 and AUC = 0.87 in test cohort and accuracy = 0.84 and AUC = 0.84 in external verification. The important variables influencing the dose regimen of Shengmai injection include lipid-lowering drugs, platelet-lowering drugs, levels of GGT, hemoglobin, prealbumin, and cholesterol at admission. Finally, the personalized model shows precision = 75%, recall rate = 83% and F1-score = 79% for predicting 40 mg of Shengmai injection; and precision = 86%, recall rate = 79% and F1-score = 83% for predicting 60 mg of Shengmai injection. Conclusion: This study provides evidence supporting the clinical effectiveness of Shengmai injection, and established its personalized medicine model, which may help clinicians make better decisions.
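The F1-scores quoted in the abstract above follow from the reported precision and recall through the standard harmonic-mean formula. The sketch below is only an illustration of that arithmetic, not the authors' code; the function name `f1_score` is an assumption made here, and the inputs are the rounded percentages from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention for the degenerate case with no true positives
    return 2 * precision * recall / (precision + recall)

# Rounded values reported in the abstract for the 40 mg class.
f1_40mg = f1_score(0.75, 0.83)
print(f"F1 (40 mg): {f1_40mg:.2f}")  # prints "F1 (40 mg): 0.79"
```

Because the inputs are already rounded, a recomputed value can differ from a published one in the last digit: the 60 mg figures (0.86, 0.79) give roughly 0.82 against the reported 83%.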
Affiliation(s)
- Jing Ma
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Ze Yu
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Ting Chen
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Ping Li
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Yan Liu
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Jihui Chen
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Chunming Lyu
- Experiment Center for Science and Technology, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Xin Hao
- Dalian Medicinovo Technology Co., Ltd., Dalian, China
- Jinyuan Zhang
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
- Shuang Wang
- Dalian Medicinovo Technology Co., Ltd., Dalian, China
- Fei Gao
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
- Jian Zhang
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Shuhong Bu
- Department of Pharmacy, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
|
13
|
Wang G, Xu J, Lin X, Lai W, Lv L, Peng S, Li K, Luo M, Chen J, Zhu D, Chen X, Yao C, Wu S, Huang K. Machine learning-based models for predicting mortality and acute kidney injury in critical pulmonary embolism. BMC Cardiovasc Disord 2023; 23:385. [PMID: 37533004 PMCID: PMC10399014 DOI: 10.1186/s12872-023-03363-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 06/22/2023] [Indexed: 08/04/2023] Open
Abstract
OBJECTIVES We aimed to use machine learning (ML) algorithms to risk-stratify the prognosis of critical pulmonary embolism (PE). MATERIAL AND METHODS In total, 1229 patients were obtained from the MIMIC-IV database. The main outcome was all-cause mortality within 30 days. Logistic regression (LR) and simplified eXtreme gradient boosting (XGBoost) were applied for model construction. We chose the final models based on how well they matched the data. To simplify the models and increase their usefulness, simplified models were then built from the 8 most important variables. Discrimination and calibration were used to evaluate predictive ability. We stratified the risk groups based on risk estimate deciles. RESULTS The simplified XGB model showed better discrimination, with an AUC of 0.82 (95% CI: 0.78-0.87) in the validation cohort, compared with an AUC of 0.75 (95% CI: 0.69-0.80) for the simplified LR model. XGB also performed better than sPESI in the validation cohort. A new risk classification based on XGB could accurately predict low risk of mortality and was highly consistent with acknowledged risk scores. CONCLUSIONS ML models can accurately predict the 30-day mortality of critical PE patients, which could further be used to reduce the burden of ICU stay, decrease mortality and improve the quality of life for critical PE patients.
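The abstract above mentions stratifying patients by risk estimate deciles without giving details. One common way to do this is sketched below under loud assumptions: the predicted risks are simulated random numbers, not study data, and `decile_groups` is a hypothetical helper name, not something taken from the paper.

```python
import bisect
import random
import statistics

def decile_groups(risks):
    """Map each predicted risk to a decile group from 1 (lowest) to 10 (highest)."""
    # statistics.quantiles with n=10 returns the nine interior cut points.
    cut_points = statistics.quantiles(risks, n=10)
    # bisect_right counts how many cut points are <= the risk value.
    return [bisect.bisect_right(cut_points, r) + 1 for r in risks]

# Hypothetical predicted 30-day mortality risks; NOT data from the study.
rng = random.Random(0)
predicted_risk = [rng.random() for _ in range(1000)]
groups = decile_groups(predicted_risk)

# Summarize each decile by its size and mean predicted risk.
for g in range(1, 11):
    members = [r for r, d in zip(predicted_risk, groups) if d == g]
    print(g, len(members), round(sum(members) / len(members), 3))
```

In a real analysis the observed mortality in each decile would be compared with the mean predicted risk, which is one way to inspect calibration.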
Affiliation(s)
- Geng Wang
- Department of Vascular Interventional Radiology, Zhongshan Hospital of Traditional Chinese Medicine, Zhongshan, China
- Jiatang Xu
- Department of Cardiovascular Surgery, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, No.33, Yingfeng Road, Haizhu District, Guangdong Province, 510000, Guangzhou, China
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Xixia Lin
- Department of Medicine, Sun Yat-Sen Memorial Hospital South Campus Clinic, Guangzhou, China
- Weijie Lai
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Lin Lv
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Senyi Peng
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Kechen Li
- Hospital of Stomatology, Guanghua School of Stomatology, Sun Yat-Sen University, Guangzhou, China
- Mingli Luo
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Department of Urology, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Jiale Chen
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Dongxi Zhu
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Xiong Chen
- Department of Urology, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Chen Yao
- Department of Vascular Surgery, First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Shaoxu Wu
- Department of Urology, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangzhou, China
- Kai Huang
- Department of Cardiovascular Surgery, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, No.33, Yingfeng Road, Haizhu District, Guangdong Province, 510000, Guangzhou, China
|
14
|
Mao J, Chao K, Jiang FL, Ye XP, Yang T, Li P, Zhu X, Hu PJ, Zhou BJ, Huang M, Gao X, Wang XD. Comparison and development of machine learning for thalidomide-induced peripheral neuropathy prediction of refractory Crohn’s disease in Chinese population. World J Gastroenterol 2023; 29:3855-3870. [PMID: 37426324 PMCID: PMC10324537 DOI: 10.3748/wjg.v29.i24.3855] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/05/2023] [Revised: 05/07/2023] [Accepted: 05/23/2023] [Indexed: 06/28/2023] Open
Abstract
BACKGROUND Thalidomide is an effective treatment for refractory Crohn’s disease (CD). However, thalidomide-induced peripheral neuropathy (TiPN), which has a large individual variation, is a major cause of treatment failure. TiPN is rarely predictable and recognized, especially in CD. It is necessary to develop a risk model to predict TiPN occurrence.
AIM To develop and compare a predictive model of TiPN using machine learning based on comprehensive clinical and genetic variables.
METHODS A retrospective cohort of 164 CD patients from January 2016 to June 2022 was used to establish the model. The National Cancer Institute Common Toxicity Criteria Sensory Scale (version 4.0) was used to assess TiPN. With 18 clinical features and 150 genetic variables, five predictive models were established and evaluated by the confusion matrix, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), specificity, sensitivity (recall rate), precision, accuracy, and F1 score.
RESULTS The top-ranking five risk variables associated with TiPN were interleukin-12 rs1353248 [P = 0.0004, odds ratio (OR): 8.983, 95% confidence interval (CI): 2.497-30.90], dose (mg/d, P = 0.002), brain-derived neurotrophic factor (BDNF) rs2030324 (P = 0.001, OR: 3.164, 95%CI: 1.561-6.434), BDNF rs6265 (P = 0.001, OR: 3.150, 95%CI: 1.546-6.073) and BDNF rs11030104 (P = 0.001, OR: 3.091, 95%CI: 1.525-5.960). In the training set, gradient boosting decision tree (GBDT), extremely random trees (ET), random forest, logistic regression and extreme gradient boosting (XGBoost) obtained AUROC values > 0.90 and AUPRC > 0.87. Among these models, XGBoost and GBDT obtained the first two highest AUROC (0.90 and 1), AUPRC (0.98 and 1), accuracy (0.96 and 0.98), precision (0.90 and 0.95), F1 score (0.95 and 0.98), specificity (0.94 and 0.97), and sensitivity (1). In the validation set, XGBoost algorithm exhibited the best predictive performance with the highest specificity (0.857), accuracy (0.818), AUPRC (0.86) and AUROC (0.89). ET and GBDT obtained the highest sensitivity (1) and F1 score (0.8). Overall, compared with other state-of-the-art classifiers such as ET, GBDT and RF, XGBoost algorithm not only showed a more stable performance, but also yielded higher ROC-AUC and PRC-AUC scores, demonstrating its high accuracy in prediction of TiPN occurrence.
CONCLUSION The powerful XGBoost algorithm accurately predicts TiPN using 18 clinical features and 14 genetic variables. With the ability to identify high-risk patients using single nucleotide polymorphisms, it offers a feasible option for improving thalidomide efficacy in CD patients.
Affiliation(s)
- Jing Mao
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Kang Chao
- Department of Gastroenterology, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Fu-Lin Jiang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Xiao-Ping Ye
- Department of Pharmacy, Guangdong Women and Children Hospital, Guangzhou 510000, Guangdong Province, China
- Ting Yang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Pan Li
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Xia Zhu
- Department of Gastroenterology, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Pin-Jin Hu
- Department of Gastroenterology, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Bai-Jun Zhou
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Min Huang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Xiang Gao
- Department of Gastroenterology, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Xue-Ding Wang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
- Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
|
15
|
Bourquard T, Lee K, Al-Ramahi I, Pham M, Shapiro D, Lagisetty Y, Soleimani S, Mota S, Wilhelm K, Samieinasab M, Kim YW, Huh E, Asmussen J, Katsonis P, Botas J, Lichtarge O. Functional variants identify sex-specific genes and pathways in Alzheimer's Disease. Nat Commun 2023; 14:2765. [PMID: 37179358 PMCID: PMC10183026 DOI: 10.1038/s41467-023-38374-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/27/2022] [Accepted: 04/28/2023] [Indexed: 05/15/2023] Open
Abstract
The incidence of Alzheimer's Disease in females is almost double that of males. To search for sex-specific gene associations, we build a machine learning approach focused on functionally impactful coding variants. This method can detect differences between sequenced cases and controls in small cohorts. In the Alzheimer's Disease Sequencing Project with mixed sexes, this approach identified genes enriched for immune response pathways. After sex-separation, genes become specifically enriched for stress-response pathways in males and cell-cycle pathways in females. These genes improve disease risk prediction in silico and modulate Drosophila neurodegeneration in vivo. Thus, a general approach for machine learning on functionally impactful variants can uncover sex-specific candidates for diagnostic biomarkers and therapeutic targets.
Affiliation(s)
- Thomas Bourquard
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Kwanghyuk Lee
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Ismael Al-Ramahi
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, 77030, USA
- Center for Alzheimer's and Neurodegenerative Diseases, Baylor College of Medicine, Houston, TX, 77030, USA
- Minh Pham
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Dillon Shapiro
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Yashwanth Lagisetty
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Biology and Pharmacology, UTHealth McGovern Medical School, Houston, TX, 77030, USA
- Shirin Soleimani
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Samantha Mota
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Kevin Wilhelm
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Maryam Samieinasab
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Young Won Kim
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Eunna Huh
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jennifer Asmussen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Juan Botas
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, 77030, USA
- Center for Alzheimer's and Neurodegenerative Diseases, Baylor College of Medicine, Houston, TX, 77030, USA
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Center for Alzheimer's and Neurodegenerative Diseases, Baylor College of Medicine, Houston, TX, 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX, 77030, USA
|
16
|
Alzoubi H, Alzubi R, Ramzan N. Deep Learning Framework for Complex Disease Risk Prediction Using Genomic Variations. SENSORS (BASEL, SWITZERLAND) 2023; 23:s23094439. [PMID: 37177642 PMCID: PMC10181706 DOI: 10.3390/s23094439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 04/05/2023] [Accepted: 04/26/2023] [Indexed: 05/15/2023]
Abstract
Genome-wide association studies have proven their ability to improve human health outcomes by identifying genotypes associated with phenotypes. Various works have attempted to predict the risk of diseases for individuals based on genotype data. This prediction can either be considered as an analysis model that can lead to a better understanding of gene functions that underlie human disease or as a black box in order to be used in decision support systems and in early disease detection. Deep learning techniques have gained more popularity recently. In this work, we propose a deep-learning framework for disease risk prediction. The proposed framework employs a multilayer perceptron (MLP) in order to predict individuals' disease status. The proposed framework was applied to the Wellcome Trust Case-Control Consortium (WTCCC), the UK National Blood Service (NBS) Control Group, and the 1958 British Birth Cohort (58C) datasets. The performance comparison of the proposed framework showed that the proposed approach outperformed the other methods in predicting disease risk, achieving an area under the curve (AUC) up to 0.94.
Affiliation(s)
- Hadeel Alzoubi
- Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
- Raid Alzubi
- Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
- Naeem Ramzan
- School of Computing, Engineering and Physical Sciences, University of the West of Scotland, High Street, Paisley PA1 2BE, UK
|
17
|
Ezugwu AE, Oyelade ON, Ikotun AM, Agushaka JO, Ho YS. Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review. ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING : STATE OF THE ART REVIEWS 2023; 30:1-31. [PMID: 37359741 PMCID: PMC10148585 DOI: 10.1007/s11831-023-09930-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 04/19/2023] [Indexed: 06/28/2023]
Abstract
The machine learning (ML) paradigm has gained much popularity today. Its algorithmic models are employed in every field, such as natural language processing, pattern recognition, object detection, image recognition, earth observation and many other research areas. In fact, machine learning technologies and their inevitable impact suffice in many technological transformation agendas currently being propagated by many nations, for which the already yielded benefits are outstanding. From a regional perspective, several studies have shown that machine learning technology can help address some of Africa's most pervasive problems, such as poverty alleviation, improving education, delivering quality healthcare services, and addressing sustainability challenges like food security and climate change. In this state-of-the-art paper, a critical bibliometric analysis study is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The presented bibliometric analysis study consists of 2761 machine learning-related documents, of which 89% were articles with at least 482 citations published in 903 journals during the past three decades. Furthermore, the collated documents were retrieved from the Science Citation Index EXPANDED, comprising research publications from 54 African countries between 1993 and 2021. The bibliometric study shows the visualization of the current landscape and future trends in machine learning research and its application to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent.
Affiliation(s)
- Absalom E. Ezugwu
- Unit for Data Science and Computing, North-West University, 11 Hoffman Street, Potchefstroom, 2520 South Africa
- Olaide N. Oyelade
- Department of Computer Science, Faculty of Physical Sciences, Ahmadu Bello University, Zaria, Nigeria
- Abiodun M. Ikotun
- Unit for Data Science and Computing, North-West University, 11 Hoffman Street, Potchefstroom, 2520 South Africa
- Jeffery O. Agushaka
- Unit for Data Science and Computing, North-West University, 11 Hoffman Street, Potchefstroom, 2520 South Africa
- Yuh-Shan Ho
- Trend Research Centre, Asia University, No. 500, Lioufeng Road, Wufeng, Taichung, 41354 Taiwan
|
18
|
Mo X, Chen X, Zeng H, Zheng W, Ieong C, Li H, Huang Q, Xu Z, Yang J, Liang Q, Liang H, Gao X, Huang M, Li J. Tacrolimus in the treatment of childhood nephrotic syndrome: Machine learning detects novel biomarkers and predicts efficacy. Pharmacotherapy 2023; 43:43-52. [PMID: 36521865 DOI: 10.1002/phar.2749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 11/10/2022] [Accepted: 11/10/2022] [Indexed: 12/23/2022]
Abstract
STUDY OBJECTIVE The pharmacokinetics and pharmacodynamics of tacrolimus (TAC) vary greatly among individuals, hindering its precise utilization. Moreover, effective models for the early prediction of TAC efficacy in patients with nephrotic syndrome (NS) are lacking. We aimed to identify key factors affecting TAC efficacy and develop efficacy prediction models for childhood NS using machine learning algorithms. DESIGN This was an observational cohort study of patients with pediatric refractory NS. SETTING Guangzhou Women and Children's Medical Center between June 2013 and December 2018. PATIENTS 203 patients with pediatric refractory NS were used for model generation and 35 patients were used for model validation. INTERVENTION All patients regularly received double immunosuppressive therapy comprising TAC and low-dose prednisone or methylprednisolone. In this observational cohort study of 203 pediatric patients with refractory NS, clinical and genetic variables, including single-nucleotide polymorphisms (SNPs), were identified. TAC efficacy was evaluated 3 months after administration according to two different evaluation criteria: response or non-response (Group 1) and complete remission, partial remission, or non-remission (Group 2). MEASUREMENTS Logistic regression, extremely random trees, gradient boosting decision trees, random forest, and extreme gradient boosting algorithms were used to develop and validate the models. Prediction models were validated among a cohort of 35 patients with NS. MAIN RESULTS The random forest models performed best in both groups, and the area under the receiver operating characteristics curve of these two models was 80.7% (Group 1) and 80.3% (Group 2). These prediction models included urine erythrocyte count before administration, steroid types, and eight SNPs (ITGB4 rs2290460, TRPC6 rs3824934, CTGF rs9399005, IL13 rs20541, NFKBIA rs8904, NFKBIA rs8016947, MAP3K11 rs7946115, and SMARCAL1 rs11886806).
CONCLUSIONS Two pre-administration models with good predictive performance for TAC response of patients with NS were developed and validated using machine learning algorithms. These accurate models could assist clinicians in predicting TAC efficacy in pediatric patients with NS before utilization to avoid treatment failure or adverse effects.
Collapse
Affiliation(s)
- Xiaolan Mo
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China.,Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Xiujuan Chen
- Department of Medical Big Data Center, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Huasong Zeng
- Division of Nephrology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Wei Zheng
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Chifong Ieong
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Huixian Li
- Department of Medical Big Data Center, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Qiongbo Huang
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Zichuan Xu
- Division of Nephrology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Jinlian Yang
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Qianying Liang
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huiying Liang
- Department of Medical Big Data Center, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
| | - Xia Gao
- Division of Nephrology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Min Huang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Jiali Li
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
|
19
|
Zheng P, Yu Z, Mo L, Zhang Y, Lyu C, Yu Y, Zhang J, Hao X, Wei H, Gao F, Li Y. An individualized medication model of sodium valproate for patients with bipolar disorder based on machine learning and deep learning techniques. Front Pharmacol 2022; 13:890221. [PMID: 36339624 PMCID: PMC9627622 DOI: 10.3389/fphar.2022.890221] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/07/2022] [Accepted: 09/29/2022] [Indexed: 07/20/2023] Open
Abstract
Valproic acid/sodium valproate (VPA) is a widely used anticonvulsant drug for maintenance treatment of bipolar disorders. In order to balance the efficacy and adverse events of VPA treatment, an individualized dose regimen is necessary. This study aimed to establish an individualized medication model of VPA for patients with bipolar disorder based on machine learning and deep learning techniques. The sequential forward selection (SFS) algorithm was applied for selecting a feature subset, and random forest was used for interpolating missing values. Then, we compared nine models using XGBoost, LightGBM, CatBoost, random forest, GBDT, SVM, logistic regression, ANN, and TabNet, and CatBoost was chosen to establish the individualized medication model with the best performance (accuracy = 0.85, AUC = 0.91, sensitivity = 0.85, and specificity = 0.83). Three important variables that correlated with VPA daily dose included VPA TDM value, antipsychotics, and indirect bilirubin. SHapley Additive exPlanations was applied to visually interpret their impacts on VPA daily dose. Last, the confusion matrix presented that predicting a daily dose of 0.5 g VPA had a precision of 55.56% and recall rate of 83.33%, and predicting a daily dose of 1 g VPA had a precision of 95.83% and a recall rate of 85.19%. In conclusion, the individualized medication model of VPA for patients with bipolar disorder based on CatBoost had a good prediction ability, which provides guidance for clinicians to propose the optimal medication regimen.
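The per-dose precision and recall quoted above follow directly from a confusion matrix. The sketch below shows the arithmetic in plain Python. The abstract does not give the matrix itself, so the counts used here are illustrative assumptions, chosen only because they reproduce the reported percentages under a two-class (0.5 g vs. 1 g) reading; the study's actual matrix may differ, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(matrix, labels):
    """Return {label: (precision, recall)} for a square confusion matrix.

    matrix[i][j] is the number of cases whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    results = {}
    for k, label in enumerate(labels):
        true_positives = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column total
        actually_k = sum(matrix[k])                     # row total
        results[label] = (true_positives / predicted_as_k,
                          true_positives / actually_k)
    return results


# Assumed, illustrative counts (NOT taken from the paper).
labels = ["0.5 g", "1 g"]
counts = [
    [5, 1],    # true 0.5 g: 5 predicted 0.5 g, 1 predicted 1 g
    [4, 23],   # true 1 g:   4 predicted 0.5 g, 23 predicted 1 g
]

metrics = precision_recall(counts, labels)
total = sum(sum(row) for row in counts)
accuracy = sum(counts[k][k] for k in range(len(labels))) / total

for label, (precision, recall) in metrics.items():
    print(f"{label}: precision={precision:.2%}, recall={recall:.2%}")
print(f"accuracy={accuracy:.2f}")
```

Precision divides the diagonal cell by its column total (everything predicted as that dose), while recall divides it by its row total (everything that truly was that dose), which is why the two can diverge sharply for the less frequent dose.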
Affiliation(s)
- Ping Zheng
- Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
| | - Ze Yu
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Liqian Mo
- Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
| | - Yuqing Zhang
- Zhongshan School of Medicine, SYSU, Guangzhou, China
| | - Chunming Lyu
- Experiment Center for Science and Technology, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Yongsheng Yu
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
| | - Jinyuan Zhang
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
| | - Xin Hao
- Dalian Medicinovo Technology Co., Ltd., Dalian, Liaoning, China
| | - Hai Wei
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Fei Gao
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
| | - Yilei Li
- Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
|
20
|
Momtazmanesh S, Nowroozi A, Rezaei N. Artificial Intelligence in Rheumatoid Arthritis: Current Status and Future Perspectives: A State-of-the-Art Review. Rheumatol Ther 2022; 9:1249-1304. [PMID: 35849321 PMCID: PMC9510088 DOI: 10.1007/s40744-022-00475-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/01/2022] [Accepted: 06/24/2022] [Indexed: 11/23/2022] Open
Abstract
Investigation of the potential applications of artificial intelligence (AI), including machine learning (ML) and deep learning (DL) techniques, is an exponentially growing field in medicine and healthcare. These methods can be critical in providing high-quality care to patients with chronic rheumatological diseases lacking an optimal treatment, like rheumatoid arthritis (RA), which is the second most prevalent autoimmune disease. Herein, after reviewing the basic concepts of AI, we summarize the advances in its applications in RA clinical practice and research. We provide directions for future investigations in this field after reviewing the current knowledge gaps and the technical and ethical challenges in applying AI. Automated models have been widely used to improve RA diagnosis since the early 2000s, and they have used a wide variety of techniques, e.g., support vector machines, random forests, and artificial neural networks. AI algorithms can facilitate screening and identification of susceptible groups; diagnosis using omics, imaging, clinical, and sensor data; patient detection within electronic health records (EHRs), i.e., phenotyping; treatment response assessment; monitoring of disease course; determination of prognosis; novel drug discovery; and enhancement of basic science research. They can also aid in risk assessment for incidence of comorbidities, e.g., cardiovascular diseases, in patients with RA. However, the proposed models may vary significantly in their performance and reliability. Despite the promising results achieved by AI models in enhancing early diagnosis and management of patients with RA, they are not fully ready to be incorporated into clinical practice. Future investigations are required to ensure the development of reliable and generalizable algorithms while carefully looking for any potential source of bias or misconduct.
We showed that a growing body of evidence supports the potential role of AI in revolutionizing screening, diagnosis, and management of patients with RA. However, multiple obstacles hinder clinical applications of AI models. Incorporating the machine and/or deep learning algorithms into real-world settings would be a key step in the progress of AI in medicine.
Affiliation(s)
- Sara Momtazmanesh
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
- Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran
| | - Ali Nowroozi
- School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
| | - Nima Rezaei
- Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran.
- Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran.
- Department of Immunology, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran.
|
21
|
Baralou V, Kalpourtzi N, Touloumi G. Individual risk prediction: Comparing random forests with Cox proportional-hazards model by a simulation study. Biom J 2022. [PMID: 36169048 DOI: 10.1002/bimj.202100380] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/30/2021] [Revised: 06/08/2022] [Accepted: 07/04/2022] [Indexed: 12/26/2022]
Abstract
With big data becoming widely available in healthcare, machine learning algorithms such as random forest (RF) that ignores time-to-event information and random survival forest (RSF) that handles right-censored data are used for individual risk prediction alternatively to the Cox proportional hazards (Cox-PH) model. We aimed to systematically compare RF and RSF with Cox-PH. RSF with three split criteria [log-rank (RSF-LR), log-rank score (RSF-LRS), maximally selected rank statistics (RSF-MSR)]; RF, Cox-PH, and Cox-PH with splines (Cox-S) were evaluated through a simulation study based on real data. One hundred eighty scenarios were investigated assuming different associations between the predictors and the outcome (linear/linear and interactions/nonlinear/nonlinear and interactions), training sample sizes (500/1000/5000), censoring rates (50%/75%/93%), hazard functions (increasing/decreasing/constant), and number of predictors (seven, 15 including noise variables). Methods' performance was evaluated with time-dependent area under curve and integrated Brier score. In all scenarios, RF had the worst performance. In scenarios with a low number of events (⩽70), Cox-PH was at least noninferior to RSF, whereas under linearity assumption it outperformed RSF. Under the presence of interactions, RSF performed better than Cox-PH as the number of events increased whereas Cox-S reached at least similar performance with RSF under nonlinear effects. RSF-LRS performed slightly worse than RSF-LR and RSF-MSR when including noise variables and interaction effects. When applied to real data, models incorporating survival time performed better. Although RSF algorithms are a promising alternative to conventional Cox-PH as data complexity increases, they require a higher number of events for training. In time-to-event analysis, algorithms that consider survival time should be used.
Affiliation(s)
- Valia Baralou
- Department of Hygiene, Epidemiology & Medical Statistics, Medical School, National & Kapodistrian University of Athens, Athens, Greece
| | - Natasa Kalpourtzi
- Department of Hygiene, Epidemiology & Medical Statistics, Medical School, National & Kapodistrian University of Athens, Athens, Greece
| | - Giota Touloumi
- Department of Hygiene, Epidemiology & Medical Statistics, Medical School, National & Kapodistrian University of Athens, Athens, Greece
|
22
|
Pudjihartono N, Fadason T, Kempa-Liehr AW, O'Sullivan JM. A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. FRONTIERS IN BIOINFORMATICS 2022; 2:927312. [PMID: 36304293 PMCID: PMC9580915 DOI: 10.3389/fbinf.2022.927312] [Citation(s) in RCA: 168] [Impact Index Per Article: 56.0] [Received: 04/24/2022] [Accepted: 06/03/2022] [Indexed: 01/14/2023] Open
Abstract
Machine learning has shown utility in detecting patterns within large, unstructured, and complex datasets. One of the promising applications of machine learning is in precision medicine, where disease risk is predicted using patient genetic data. However, creating an accurate prediction model based on genotype data remains challenging due to the so-called “curse of dimensionality” (i.e., extensively larger number of features compared to the number of samples). Therefore, the generalizability of machine learning models benefits from feature selection, which aims to extract only the most “informative” features and remove noisy “non-informative,” irrelevant and redundant features. In this article, we provide a general overview of the different feature selection methods, their advantages, disadvantages, and use cases, focusing on the detection of relevant features (i.e., SNPs) for disease risk prediction.
Affiliation(s)
| | - Tayaza Fadason
- Liggins Institute, University of Auckland, Auckland, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
| | - Andreas W. Kempa-Liehr
- Department of Engineering Science, The University of Auckland, Auckland, New Zealand
- *Correspondence: Andreas W. Kempa-Liehr; Justin M. O'Sullivan
| | - Justin M. O'Sullivan
- Liggins Institute, University of Auckland, Auckland, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
- MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, United Kingdom
- Singapore Institute for Clinical Sciences, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Australian Parkinson’s Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
- *Correspondence: Andreas W. Kempa-Liehr; Justin M. O'Sullivan
|
23
|
Yu Z, Ye X, Liu H, Li H, Hao X, Zhang J, Kou F, Wang Z, Wei H, Gao F, Zhai Q. Predicting Lapatinib Dose Regimen Using Machine Learning and Deep Learning Techniques Based on a Real-World Study. Front Oncol 2022; 12:893966. [PMID: 35719963 PMCID: PMC9203846 DOI: 10.3389/fonc.2022.893966] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/11/2022] [Accepted: 05/05/2022] [Indexed: 11/26/2022] Open
Abstract
Lapatinib is used for the treatment of metastatic HER2(+) breast cancer. We aim to establish a prediction model for lapatinib dose using machine learning and deep learning techniques based on a real-world study. There were 149 breast cancer patients enrolled from July 2016 to June 2017 at Fudan University Shanghai Cancer Center. The sequential forward selection algorithm based on random forest was applied for variable selection. Twelve machine learning and deep learning algorithms were compared in terms of their predictive abilities (logistic regression, SVM, random forest, Adaboost, XGBoost, GBDT, LightGBM, CatBoost, TabNet, ANN, Super TML, and Wide&Deep). As a result, TabNet was chosen to construct the prediction model with the best performance (accuracy = 0.82 and AUC = 0.83). Afterward, four variables that strongly correlated with lapatinib dose were ranked via importance score as follows: treatment protocols, weight, number of chemotherapy treatments, and number of metastases. Finally, the confusion matrix was used to validate the model for a dose regimen of 1,250 mg lapatinib (precision = 81% and recall = 95%), and for a dose regimen of 1,000 mg lapatinib (precision = 87% and recall = 64%). To conclude, we established a deep learning model to predict lapatinib dose based on important influencing variables selected from real-world evidence, to achieve an optimal individualized dose regimen with good predictive performance.
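Sequential forward selection, the variable-selection step named above, is a greedy search: start from an empty set and repeatedly add whichever remaining variable improves a score the most, stopping when nothing helps. The sketch below illustrates only that loop. In the study the score came from a random forest; here `toy_score` and its `GAIN` table are stand-in assumptions with made-up values, and the variable names are hypothetical labels loosely echoing the abstract.

```python
def sequential_forward_selection(features, score, max_features):
    """Greedily grow a feature subset while the score keeps improving."""
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining and len(selected) < max_features:
        # Score every one-feature extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates)
        if top_score <= best_score:
            break  # no candidate improves the score, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Assumed per-feature usefulness (illustrative numbers, not study results).
GAIN = {
    "treatment_protocol": 0.30,
    "weight": 0.20,
    "n_chemo": 0.10,
    "noise_a": 0.02,
    "noise_b": 0.01,
}


def toy_score(subset):
    """Stand-in for cross-validated model performance.

    Each feature adds its gain; a flat 0.05 penalty per feature mimics
    the cost of carrying uninformative variables.
    """
    return sum(GAIN[f] for f in subset) - 0.05 * len(subset)


selected, best = sequential_forward_selection(list(GAIN), toy_score, max_features=5)
print(selected, round(best, 2))
```

With a real model, `score` would typically refit and cross-validate the estimator for each candidate subset, which makes the search expensive but keeps the selection tied to predictive performance rather than to univariate statistics.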
Affiliation(s)
- Ze Yu
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Xuan Ye
- Department of Pharmacy, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College of Fudan University, Shanghai, China
| | - Hongyue Liu
- Department of Pharmacy, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College of Fudan University, Shanghai, China
| | - Huan Li
- Department of Pharmacy, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College of Fudan University, Shanghai, China
| | - Xin Hao
- Dalian Medicinovo Technology Co., Ltd., Dalian, China
| | - Jinyuan Zhang
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
| | - Fang Kou
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Zeyuan Wang
- Faculty of Engineering, School of Computer Science, The University of Sydney, Sydney, NSW, Australia
| | - Hai Wei
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Fei Gao
- Beijing Medicinovo Technology Co., Ltd., Beijing, China
| | - Qing Zhai
- Department of Pharmacy, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College of Fudan University, Shanghai, China
|
24
|
Zhang Q, Tian X, Chen G, Yu Z, Zhang X, Lu J, Zhang J, Wang P, Hao X, Huang Y, Wang Z, Gao F, Yang J. A Prediction Model for Tacrolimus Daily Dose in Kidney Transplant Recipients With Machine Learning and Deep Learning Techniques. Front Med (Lausanne) 2022; 9:813117. [PMID: 35712101 PMCID: PMC9197124 DOI: 10.3389/fmed.2022.813117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Accepted: 04/22/2022] [Indexed: 11/13/2022] Open
Abstract
Tacrolimus is a major immunosuppressor against post-transplant rejection in kidney transplant recipients. However, the narrow therapeutic index of tacrolimus and considerable variability among individuals are challenges for therapeutic outcomes. The aim of this study was to compare different machine learning and deep learning algorithms and establish individualized dose prediction models by using the best performing algorithm. Therefore, among the 10 commonly used algorithms we compared, the TabNet algorithm outperformed other algorithms with the highest R2 (0.824), the lowest prediction error [mean absolute error (MAE) 0.468, mean square error (MSE) 0.558, and root mean square error (RMSE) 0.745], and good performance of overestimated (5.29%) or underestimated dose percentage (8.52%). In the final prediction model, the last tacrolimus daily dose, the last tacrolimus therapeutic drug monitoring value, time after transplantation, hematocrit, serum creatinine, aspartate aminotransferase, weight, CYP3A5, body mass index, and uric acid were the most influential variables on tacrolimus daily dose. Our study provides a reference for the application of deep learning technique in tacrolimus dose estimation, and the TabNet model with desirable predictive performance is expected to be expanded and applied in future clinical practice.
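The error measures quoted above (R2, MAE, MSE, RMSE) are related by simple formulas, which the following sketch spells out in plain Python. The observed and predicted doses are made-up toy values used only to exercise the arithmetic, not data from the study, and `regression_metrics` is a hypothetical helper name.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R2 for paired observations and predictions."""
    n = len(y_true)
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n                             # mean squared error
    rmse = math.sqrt(mse)                        # square root of the MSE
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot                     # share of variance explained
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Toy daily doses (assumed values for illustration only).
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.5, 5.5]

m = regression_metrics(observed, predicted)
print({name: round(value, 4) for name, value in m.items()})
```

Because RMSE is the square root of MSE when both are computed on the same predictions, the two always move together; MAE is less sensitive to a few large misses, so reporting both gives a sense of how heavy the error tail is.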
Affiliation(s)
- Qiwen Zhang
- Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Precision Clinical Pharmacy, Zhengzhou University, Zhengzhou, China
| | - Xueke Tian
- Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Precision Clinical Pharmacy, Zhengzhou University, Zhengzhou, China
| | - Guang Chen
- Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Precision Clinical Pharmacy, Zhengzhou University, Zhengzhou, China
| | - Ze Yu
- Beijing Medicinovo Technology Co. Ltd, Beijing, China
| | - Xiaojian Zhang
- Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Precision Clinical Pharmacy, Zhengzhou University, Zhengzhou, China
| | - Jingli Lu
- Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Precision Clinical Pharmacy, Zhengzhou University, Zhengzhou, China
| | - Jinyuan Zhang
- Beijing Medicinovo Technology Co. Ltd, Beijing, China
| | - Peile Wang
- Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Precision Clinical Pharmacy, Zhengzhou University, Zhengzhou, China
| | - Xin Hao
- Dalian Medicinovo Technology Co. Ltd, Dalian, China
| | - Yining Huang
- McCormick School of Engineering, Northwestern University, Evanston, IL, United States
| | - Zeyuan Wang
- Beijing Medicinovo Technology Co. Ltd, Beijing, China
| | - Fei Gao
- Beijing Medicinovo Technology Co. Ltd, Beijing, China
| | - Jing Yang
- Department of Pharmacy, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Henan Key Laboratory of Precision Clinical Pharmacy, Zhengzhou University, Zhengzhou, China
|
25
|
Lau M, Wigmann C, Kress S, Schikowski T, Schwender H. Evaluation of tree-based statistical learning methods for constructing genetic risk scores. BMC Bioinformatics 2022; 23:97. [PMID: 35313824 PMCID: PMC8935722 DOI: 10.1186/s12859-022-04634-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/29/2021] [Accepted: 03/14/2022] [Indexed: 04/11/2024] Open
Abstract
Background Genetic risk scores (GRS) summarize genetic features such as single nucleotide polymorphisms (SNPs) in a single statistic with respect to a given trait. So far, GRS are typically built using generalized linear models or regularized extensions. However, these linear methods are usually not able to incorporate gene-gene interactions or non-linear SNP-response relationships. Tree-based statistical learning methods such as random forests and logic regression may be an alternative to such regularized-regression-based methods and are investigated in this article. Moreover, we consider modifications of random forests and logic regression for the construction of GRS. Results In an extensive simulation study and an application to a real data set from a German cohort study, we show that both tree-based approaches can outperform elastic net when constructing GRS for binary traits. In particular, a modification of logic regression called logic bagging could induce comparatively high predictive power as measured by the area under the curve and the statistical power. Even when considering no epistatic interaction effects but only marginal genetic effects, the regularized regression method led in most cases to inferior results. Conclusions When constructing GRS, we recommend taking random forests and logic bagging into account, in particular if it can be assumed that possibly unknown epistasis between SNPs is present. To develop the best possible prediction models, extensive joint hyperparameter optimizations should be conducted. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04634-w.
Affiliation(s)
- Michael Lau
- Mathematical Institute, Heinrich Heine University, Düsseldorf, Germany; IUF - Leibniz Research Institute for Environmental Medicine, Düsseldorf, Germany
| | - Claudia Wigmann
- IUF - Leibniz Research Institute for Environmental Medicine, Düsseldorf, Germany
| | - Sara Kress
- IUF - Leibniz Research Institute for Environmental Medicine, Düsseldorf, Germany
| | - Tamara Schikowski
- IUF - Leibniz Research Institute for Environmental Medicine, Düsseldorf, Germany
| | - Holger Schwender
- Mathematical Institute, Heinrich Heine University, Düsseldorf, Germany
|
26
|
Yoo HY, Lee KC, Woo JE, Park SH, Lee S, Joo J, Bae JS, Kwon HJ, Park BJ. A Genome-Wide Association Study and Machine-Learning Algorithm Analysis on the Prediction of Facial Phenotypes by Genotypes in Korean Women. Clin Cosmet Investig Dermatol 2022; 15:433-445. [PMID: 35313536 PMCID: PMC8933694 DOI: 10.2147/ccid.s339547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Accepted: 01/12/2022] [Indexed: 12/03/2022]
Abstract
Purpose Changes in facial appearance are affected by various intrinsic and extrinsic factors, which vary from person to person. Therefore, each person needs to determine their skin condition accurately to care for their skin accordingly. Recently, genetic identification by skin-related phenotypes has become possible using genome-wide association studies (GWAS) and machine-learning algorithms. However, because most GWAS have focused on populations with American or European skin pigmentation, large-scale GWAS are needed for Asian populations. This study aimed to evaluate the correlation of facial phenotypes with candidate single-nucleotide polymorphisms (SNPs) to predict phenotype from genotype using machine learning. Materials and Methods A total of 749 Korean women aged 30-50 years were enrolled in this study and evaluated for five facial phenotypes (melanin, gloss, hydration, wrinkle, and elasticity). To find highly related SNPs with each phenotype, GWAS analysis was used. In addition, phenotype prediction was performed using three machine-learning algorithms (linear, ridge, and linear support vector regressions) using five-fold cross-validation. Results Using GWAS analysis, we found 46 novel highly associated SNPs (p < 1×10⁻⁵): 3, 20, 12, 6, and 5 SNPs for melanin, gloss, hydration, wrinkle, and elasticity, respectively. On comparing the performance of each model based on phenotypes using five-fold cross-validation, the ridge regression model showed the highest accuracy (r² = 0.6422-0.7266) in all skin traits. Therefore, the optimal solution for personal skin diagnosis using GWAS was with the ridge regression model. Conclusion The proposed facial phenotype prediction model in this study provided the optimal solution for accurately predicting the skin condition of an individual by identifying genotype information of target characteristics and machine-learning methods. This model has potential utility for the development of customized cosmetics.
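Ridge regression scored by five-fold cross-validation, as described above, can be illustrated with a deliberately small sketch. It assumes a single hypothetical SNP coded as 0, 1, or 2 copies of an allele and a synthetic phenotype, so it shows the mechanics (penalised slope, held-out R2 per fold) rather than the study's multi-SNP models; all function names and data here are invented for illustration.

```python
def fit_ridge_1d(xs, ys, alpha):
    """One-feature ridge fit: slope shrunk by alpha, unpenalised intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha = 0 gives ordinary least squares
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def five_fold_r2(xs, ys, alpha, k=5):
    """Hold out every k-th sample in turn and score on the held-out part."""
    scores = []
    for fold in range(k):
        held_out = set(range(fold, len(xs), k))
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_ridge_1d(train_x, train_y, alpha)
        test_y = [ys[i] for i in sorted(held_out)]
        preds = [slope * xs[i] + intercept for i in sorted(held_out)]
        scores.append(r_squared(test_y, preds))
    return scores


# Synthetic data: genotype dosage and a phenotype with small deterministic noise.
xs = [i % 3 for i in range(30)]
ys = [10 + 2 * x + ((i * 7) % 11 - 5) * 0.05 for i, x in enumerate(xs)]

scores = five_fold_r2(xs, ys, alpha=1.0)
unpenalised_slope, _ = fit_ridge_1d(xs, ys, alpha=0.0)
shrunk_slope, _ = fit_ridge_1d(xs, ys, alpha=10.0)
print([round(s, 3) for s in scores])
```

The penalty `alpha` pulls the slope toward zero, which trades a little bias for lower variance; with many correlated SNPs that trade-off is one plausible reason a ridge model can outscore unpenalised linear regression on held-out folds.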
Affiliation(s)
- Hye-Young Yoo
- Skin & Natural Products Lab, Kolmar Korea Co., Ltd., Seoul, 06800, Republic of Korea
| | - Ki-Chan Lee
- R&D Department, Eone Diagnomics Genome Center Co., Ltd, Songdo Incheon, 22014, Republic of Korea
| | - Ji-Eun Woo
- Skin & Natural Products Lab, Kolmar Korea Co., Ltd., Seoul, 06800, Republic of Korea
| | - Sung-Ha Park
- Skin & Natural Products Lab, Kolmar Korea Co., Ltd., Seoul, 06800, Republic of Korea
| | - Sunghoon Lee
- R&D Department, Eone Diagnomics Genome Center Co., Ltd, Songdo Incheon, 22014, Republic of Korea
| | - Joungsu Joo
- R&D Department, Eone Diagnomics Genome Center Co., Ltd, Songdo Incheon, 22014, Republic of Korea
| | - Jin-Sik Bae
- R&D Department, Eone Diagnomics Genome Center Co., Ltd, Songdo Incheon, 22014, Republic of Korea
| | - Hyuk-Jung Kwon
- R&D Department, Eone Diagnomics Genome Center Co., Ltd, Songdo Incheon, 22014, Republic of Korea
| | - Byoung-Jun Park
- Skin & Natural Products Lab, Kolmar Korea Co., Ltd., Seoul, 06800, Republic of Korea
|
27
|
Mo X, Chen X, Wang X, Zhong X, Liang H, Wei Y, Deng H, Hu R, Zhang T, Chen Y, Gao X, Huang M, Li J. Prediction of Tacrolimus Dose/Weight-Adjusted Trough Concentration in Pediatric Refractory Nephrotic Syndrome: A Machine Learning Approach. Pharmgenomics Pers Med 2022; 15:143-155. [PMID: 35228813 PMCID: PMC8881964 DOI: 10.2147/pgpm.s339318] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/14/2021] [Accepted: 01/20/2022] [Indexed: 12/13/2022] Open
Abstract
Purpose Tacrolimus (TAC) is a first-line immunosuppressant for patients with refractory nephrotic syndrome (NS). However, there is a high inter-patient variability of TAC pharmacokinetics, thus therapeutic drug monitoring (TDM) is required. In this study, we aimed to employ machine learning algorithms to investigate the impact of clinical and genetic variables on the TAC dose/weight-adjusted trough concentration (C0/D) in Chinese children with refractory NS, and then develop and validate the TAC C0/D prediction models. Patients and Methods The association of 82 clinical variables and 244 single nucleotide polymorphisms (SNPs) with TAC C0/D in the third month since TAC treatment was examined in 171 children with refractory NS. Extremely randomized trees (ET), gradient boosting decision tree (GBDT), random forest (RF), extreme gradient boosting (XGBoost), and Lasso regression were carried out to establish and validate prediction models, respectively. The best prediction models were validated on a cohort of 30 refractory NS patients. Results GBDT algorithm performed best in the whole group (R2=0.444, MSE=591.032, MAE=20.782, MedAE=18.980) and CYP3A5 nonexpresser group (R2=0.264, MSE=477.948, MAE=18.119, MedAE=18.771), while ET algorithm performed best in the CYP3A5 expresser group (R2=0.380, MSE=1839.459, MAE=31.257, MedAE=19.399). These prediction models included 3 clinical variables (ALB0, AGE0, and gender) and 10 SNPs (ACTN4 rs3745859, ACTN4 rs56113315, ACTN4 rs62121818, CTLA4 rs4553808, CYP3A5 rs776746, IL2RA rs12722489, INF2 rs1128880, MAP3K11 rs7946115, MYH9 rs2239781, and MYH9 rs4821478). Conclusion The association between the clinical and genetic variables and TAC C0/D was described, and three TAC C0/D prediction models integrating clinical and genetic variables were developed and validated using machine learning, which may support individualized TAC dosing.
Affiliation(s)
- Xiaolan Mo
- Department of Pharmacy, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, 510623, People’s Republic of China
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510080, People’s Republic of China
| | - Xiujuan Chen
- Department of clinical Data Center, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, 510080, People’s Republic of China
| | - Xianggui Wang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510080, People’s Republic of China
| | - Xiaoli Zhong
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510080, People’s Republic of China
| | - Huiying Liang
- Department of clinical Data Center, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, 510080, People’s Republic of China
| | - Yuanyi Wei
- Department of Pharmacy, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, 510623, People’s Republic of China
| | - Houliang Deng
- Department of Pharmacy, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, 510623, People’s Republic of China
| | - Rong Hu
- Department of Pharmacy, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, 510623, People’s Republic of China
| | - Tao Zhang
- Department of Pharmacy, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, 510623, People’s Republic of China
| | - Yilu Chen
- Department of Pharmacy, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, 510623, People’s Republic of China
| | - Xia Gao
- Division of Nephrology, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, 510623, People’s Republic of China
| | - Min Huang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510080, People’s Republic of China
| | - Jiali Li
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510080, People’s Republic of China
- Correspondence: Jiali Li; Min Huang, Tel +86-20-39943034; +86-20-39943011, Fax +86-20-39943004; +86-20-39943000
| |
|
28
|
Wang S, Hou Y, Li X, Meng X, Zhang Y, Wang X. Practical Implementation of Artificial Intelligence-Based Deep Learning and Cloud Computing on the Application of Traditional Medicine and Western Medicine in the Diagnosis and Treatment of Rheumatoid Arthritis. Front Pharmacol 2022; 12:765435. [PMID: 35002704 PMCID: PMC8733656 DOI: 10.3389/fphar.2021.765435] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/27/2021] [Accepted: 12/09/2021] [Indexed: 12/23/2022] Open
Abstract
Rheumatoid arthritis (RA), an autoimmune disease of unknown etiology, is a serious threat to the health of middle-aged and elderly people. Although Western medicine and traditional medicine, such as traditional Chinese medicine, Tibetan medicine and other ethnic medicines, have shown certain advantages in the diagnosis and treatment of RA, practical shortcomings remain, such as delayed diagnosis, improper treatment schemes and unclear drug mechanisms. At present, the application of artificial intelligence (AI)-based deep learning and cloud computing has attracted wide attention in the medical and health field, especially in screening potential active ingredients, targets and action pathways of single drugs or prescriptions in traditional medicine and in optimizing disease diagnosis and treatment models. Integrated information and analysis of RA patients based on AI and medical big data will unquestionably benefit more RA patients worldwide. In this review, we mainly elaborate the application status and prospects of AI-assisted deep learning and cloud computing in Western and traditional medicine for the diagnosis and treatment of RA at different stages. It can be predicted that, with the help of AI, more pharmacological mechanisms of effective ethnic drugs against RA will be elucidated and more accurate solutions will be provided for the diagnosis and treatment of RA in the future.
Affiliation(s)
- Shaohui Wang
- School of Ethnic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Ya Hou
- School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Xuanhao Li
- Chengdu Second People's Hospital, Chengdu, China
| | - Xianli Meng
- State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Yi Zhang
- School of Ethnic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Xiaobo Wang
- State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
| |
|
29
|
Evaluating borrowers' default risk with a spatial probit model reflecting the distance in their relational network. PLoS One 2022; 16:e0261737. [PMID: 34972129 PMCID: PMC8719753 DOI: 10.1371/journal.pone.0261737] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/11/2021] [Accepted: 12/08/2021] [Indexed: 11/28/2022] Open
Abstract
Potential relationships among loan applicants can provide valuable information for evaluating default risk. However, most existing credit scoring models either ignore these relationships or consider only simple connection information. This study assesses the applicants’ relations in terms of the distance between them, estimated from their characteristics. This information is then used in a proposed spatial probit model so that the differing degrees of relation among borrowers are reflected in the default prediction for a loan applicant. We apply this method to peer-to-peer Lending Club loan data. Empirical results show that accounting for spatial autocorrelation among loan applicants can provide high predictive power for defaults.
|
30
|
Chung CW, Hsiao TH, Huang CJ, Chen YJ, Chen HH, Lin CH, Chou SC, Chen TS, Chung YF, Yang HI, Chen YM. Machine learning approaches for the genomic prediction of rheumatoid arthritis and systemic lupus erythematosus. BioData Min 2021; 14:52. [PMID: 34895289 PMCID: PMC8666017 DOI: 10.1186/s13040-021-00284-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/13/2021] [Accepted: 11/21/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE) are autoimmune rheumatic diseases that share a complex genetic background and common clinical features. This study's purpose was to construct machine learning (ML) models for the genomic prediction of RA and SLE. METHODS A total of 2,094 patients with RA and 2,190 patients with SLE were enrolled from the Taichung Veterans General Hospital cohort of the Taiwan Precision Medicine Initiative. Genome-wide single nucleotide polymorphism (SNP) data were obtained using the Taiwan Biobank version 2 array. The ML methods used were logistic regression (LR), random forest (RF), support vector machine (SVM), gradient tree boosting (GTB), and extreme gradient boosting (XGB). SHapley Additive exPlanation (SHAP) values were calculated to clarify the contribution of each SNP. Human leukocyte antigen (HLA) imputation was performed using the HLA Genotype Imputation with Attribute Bagging package. RESULTS Compared with LR (area under the curve [AUC] = 0.8247), the RF approach (AUC = 0.9844), SVM (AUC = 0.9828), GTB (AUC = 0.9932), and XGB (AUC = 0.9919) exhibited significantly better prediction performance. The top 20 genes by feature importance and SHAP values included HLA class II alleles. We found that imputed HLA-DQA1*05:01, DQB1*0201 and DRB1*0301 were associated with SLE; HLA-DQA1*03:03, DQB1*0401 and DRB1*0405 were more frequently observed in patients with RA. CONCLUSIONS We established ML methods for genomic prediction of RA and SLE. Genetic variations at HLA-DQA1, HLA-DQB1, and HLA-DRB1 were crucial for differentiating RA from SLE. Future studies are required to verify our results and explore their mechanistic explanation.
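The models in this abstract are compared by AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = case of interest, 0 = other class.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in the number of samples, so for cohorts of thousands of patients a rank-based implementation from an established library would normally be preferred.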
Affiliation(s)
- Chih-Wei Chung
- Department of Information Management, National Taiwan University, Taipei, Taiwan
| | - Tzu-Hung Hsiao
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
| | - Chih-Jen Huang
- Genomics Research Center, Academia Sinica, Taipei, Taiwan
| | - Yen-Ju Chen
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- Division of Allergy, Immunology and Rheumatology, Taichung Veterans General Hospital, Taichung, Taiwan
| | - Hsin-Hua Chen
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
- Division of Allergy, Immunology and Rheumatology, Taichung Veterans General Hospital, Taichung, Taiwan
- Rong Hsing Research Center for Translational Medicine & Ph.D. Program in Translational Medicine, National Chung Hsing University, Taichung, Taiwan
- School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Ching-Heng Lin
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
| | - Seng-Cho Chou
- Department of Information Management, National Taiwan University, Taipei, Taiwan
| | - Tzer-Shyong Chen
- Department of Information Management, Tunghai University, Taichung, Taiwan
| | - Yu-Fang Chung
- Department of Electrical Engineering, Tunghai University, Taichung, Taiwan
| | - Hwai-I Yang
- Genomics Research Center, Academia Sinica, Taipei, Taiwan
| | - Yi-Ming Chen
- Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan.
- Division of Allergy, Immunology and Rheumatology, Taichung Veterans General Hospital, Taichung, Taiwan.
- Rong Hsing Research Center for Translational Medicine & Ph.D. Program in Translational Medicine, National Chung Hsing University, Taichung, Taiwan.
- School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.
- College of Medicine, National Chung Hsing University, 40227, Taichung City, Taiwan.
| |
|
31
|
Mo X, Chen X, Ieong C, Gao X, Li Y, Liao X, Yang H, Li H, He F, He Y, Chen Y, Liang H, Huang M, Li J. Early Prediction of Tacrolimus-Induced Tubular Toxicity in Pediatric Refractory Nephrotic Syndrome Using Machine Learning. Front Pharmacol 2021; 12:638724. [PMID: 34512318 PMCID: PMC8430214 DOI: 10.3389/fphar.2021.638724] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/15/2021] [Accepted: 08/10/2021] [Indexed: 01/10/2023] Open
Abstract
Background and Aims: Tacrolimus (TAC)-induced nephrotoxicity, which varies widely between individuals, may lead to treatment failure or even end-stage renal disease. However, there is still a lack of effective models for the early prediction of TAC-induced nephrotoxicity, especially in nephrotic syndrome (NS). We aimed to develop and validate a predictive model of TAC-induced tubular toxicity in children with NS using machine learning based on comprehensive clinical and genetic variables. Materials and Methods: A retrospective cohort of 218 children with NS admitted between June 2013 and December 2018 was used to establish the models, and 11 children were prospectively enrolled for external validation. We screened 47 clinical features and 244 genetic variables. The change in urine N-acetyl-β-D-glucosaminidase (NAG) levels before and after administration was used as an indicator of renal tubular toxicity. Results: Five machine learning algorithms, including extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), extremely random trees (ET), random forest (RF), and logistic regression (LR), were used for model generation and validation. Four genetic variables, including TRPC6 rs3824934_GG, HSD11B1 rs846910_AG, MAP2K6 rs17823202_GG, and SCARB2 rs6823680_CC, were incorporated into the final model. The XGBoost model had the best performance: sensitivity 75%, specificity 77.8%, accuracy 77.3%, and AUC 78.9%. Conclusion: A pre-administration model with good performance for predicting TAC-induced nephrotoxicity in NS was developed and validated using machine learning based on genetic factors. Physicians can estimate the possibility of nephrotoxicity in NS patients using this simple and accurate model to optimize the treatment regimen before administration or to intervene in time after administration to avoid kidney damage.
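Sensitivity, specificity and accuracy all derive from the same four confusion-matrix counts. The sketch below shows the arithmetic. The counts passed in are hypothetical: they were chosen only because they happen to reproduce percentages of the same size as those quoted above, and the abstract does not report the underlying confusion matrix.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positives among all actual positives
    specificity = tn / (tn + fp)            # true negatives among all actual negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, not taken from the study.
sens, spec, acc = classification_metrics(tp=3, fn=1, tn=14, fp=4)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # 75.0% 77.8% 77.3%
```

With so few positives in this toy example, one additional missed case would move sensitivity from 75% to 50%, which is why small validation sets give wide uncertainty around such figures.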
Affiliation(s)
- Xiaolan Mo
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China; Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Xiujuan Chen
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Chifong Ieong
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Xia Gao
- Division of Nephrology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Yingjie Li
- Division of Nephrology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Xin Liao
- Division of Nephrology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huabin Yang
- Division of Nephrology, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huiyi Li
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China; Department of Pharmacy, Guangzhou Institute of Dermatology, Guangzhou, China
| | - Fan He
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Yanling He
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Yilu Chen
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huiying Liang
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Min Huang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Jiali Li
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| |
|
32
|
Mahmood I, Abdullah H. WisdomModel: convert data into wisdom. Applied Computing and Informatics 2021. [DOI: 10.1108/aci-06-2021-0155] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Purpose
Traditional classification algorithms always produce some incorrect predictions. As the misclassification rate increases, the usefulness of the learning model decreases. This paper presents the development of a wisdom framework that reduces the error rate to less than 3% without human intervention.
Design/methodology/approach
The proposed WisdomModel consists of four stages: build a classifier, isolate the misclassified instances, construct an automated knowledge base for the misclassified instances and rectify incorrect prediction. This approach will identify misclassified instances by comparing them against the knowledge base. If an instance is close to a rule in the knowledge base by a certain threshold, then this instance is considered misclassified.
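One possible reading of the four stages can be sketched as follows. This is a loose illustration and not the authors' implementation: the abstract does not specify the form of the knowledge-base rules, the distance measure, or how a flagged prediction is rectified. Here the "rules" are assumed to be the misclassified training points themselves, closeness is assumed to be Euclidean distance, and rectification is assumed to mean adopting the stored true label. All function names are hypothetical.

```python
import math


def build_knowledge_base(classifier, X, y):
    """Stages 2-3 (assumed form): keep each misclassified training point with its true label."""
    return [(x, true_label) for x, true_label in zip(X, y) if classifier(x) != true_label]


def predict_with_rectification(classifier, knowledge_base, x, threshold):
    """Stage 4 (assumed form): override the base prediction when x lies near a stored mistake."""
    base_prediction = classifier(x)
    if not knowledge_base:
        return base_prediction
    nearest_point, nearest_label = min(knowledge_base, key=lambda entry: math.dist(entry[0], x))
    if math.dist(nearest_point, x) <= threshold:
        return nearest_label  # treated as misclassified, so use the stored label
    return base_prediction


# Stage 1 stand-in: a fixed threshold rule instead of a trained classifier.
def toy_classifier(x):
    return int(x[0] > 0.5)


X_train = [(0.10,), (0.90,), (0.45,)]
y_train = [0, 1, 1]  # the last point is misclassified by toy_classifier
kb = build_knowledge_base(toy_classifier, X_train, y_train)

print(predict_with_rectification(toy_classifier, kb, (0.46,), threshold=0.05))  # 1
print(predict_with_rectification(toy_classifier, kb, (0.10,), threshold=0.05))  # 0
```

The threshold controls the trade-off: a larger value corrects more errors near known mistakes but risks overriding predictions that were already correct.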
Findings
The authors evaluated the WisdomModel on various data sets using measures such as accuracy, recall, precision, F-measure, the receiver operating characteristic (ROC) curve, area under the curve (AUC) and error rate, to demonstrate its ability to generalize without human involvement. The proposed model reduces the number of misclassified instances by at least 70% and increases the accuracy of the model by at least 7%.
Originality/value
This research focuses on defining wisdom in practical applications. Despite developments in information systems, there is still no framework or algorithm that can be used to extract wisdom from data. This research builds a general wisdom framework that can be used in any domain to reach wisdom.
|
33
|
Ponsonby AL. Reflection on modern methods: building causal evidence within high-dimensional molecular epidemiological studies of moderate size. Int J Epidemiol 2021; 50:1016-1029. [PMID: 33594409 DOI: 10.1093/ije/dyaa174] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 08/17/2020] [Indexed: 12/29/2022] Open
Abstract
This commentary provides a practical perspective on epidemiological analysis within a single high-dimensional study of moderate size to consider a causal question. In this setting, non-causal confounding is important. This occurs when a factor is a determinant of outcome and the underlying association between exposure and the factor is non-causal. That is, the association arises due to chance, confounding or other bias rather than reflecting that exposure and the factor are causally related. In particular, the influence of technical processing factors must be accounted for by pre-processing measures to remove artefact or to control for these factors such as batch run. Work steps include the evaluation of alternative non-causal explanations for observed exposure-disease associations and strategies to obtain the highest level of causal inference possible within the study. A systematic approach is required to work through a question set and obtain insights on not only the exposure-disease association but also the multifactorial causal structure of the underlying data where possible. The appropriate inclusion of molecular findings will enhance the quest to better understand multifactorial disease causation in modern observational epidemiological studies.
|
34
|
Zhao Z, Cheng X, Sun X, Ma S, Feng H, Zhao L. Prediction Model of Anastomotic Leakage Among Esophageal Cancer Patients After Receiving an Esophagectomy: Machine Learning Approach. JMIR Med Inform 2021; 9:e27110. [PMID: 34313597 PMCID: PMC8367102 DOI: 10.2196/27110] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/14/2021] [Revised: 04/10/2021] [Accepted: 06/16/2021] [Indexed: 12/13/2022] Open
Abstract
Background Anastomotic leakage (AL) is one of the severe postoperative adverse events (5%-30%), and it is related to increased medical costs in cancer patients who undergo esophagectomies. Machine learning (ML) methods show good performance at predicting risk for AL. However, AL risk prediction based on ML models among the Chinese population is unavailable. Objective This study uses ML techniques to develop and validate a risk prediction model to screen patients with emerging AL risk factors. Methods Analyses were performed using medical records from 710 patients who underwent esophagectomies at the National Clinical Research Center for Cancer between January 2010 and May 2015. We randomly split (9:1) the data set into a training data set of 639 patients and a testing data set of 71 patients using a computer algorithm. We assessed multiple classification tools to create a multivariate risk prediction model. Our ML algorithms included decision tree, random forest, naive Bayes, and logistic regression with least absolute shrinkage and selection operator. The optimal AL prediction model was selected based on model evaluation metrics. Results The final risk panel included 36 independent risk features. Of those, 10 features were significantly identified by the logistic model, including aortic calcification (OR 2.77, 95% CI 1.32-5.81), celiac trunk calcification (OR 2.79, 95% CI 1.20-6.48), forced expiratory volume 1% (OR 0.51, 95% CI 0.30-0.89), TLco (OR 0.56, 95% CI 0.27-1.18), peripheral vascular disease (OR 4.97, 95% CI 1.44-17.07), laparoscope (OR 3.92, 95% CI 1.23-12.51), postoperative length of hospital stay (OR 1.17, 95% CI 1.13-1.21), vascular permeability activity (OR 0.46, 95% CI 0.14-1.48), and fat liquefaction of incisions (OR 4.36, 95% CI 1.86-10.21). Logistic regression with least absolute shrinkage and selection operator offered the highest prediction quality, with an area under the receiver operating characteristic curve of 72% in the training data set. The testing model also achieved similarly high performance. Conclusions Our model offered a prediction of AL with high accuracy, assisting in AL prevention and treatment. A personalized ML prediction model with a purely data-driven selection of features is feasible and effective in predicting AL in patients who underwent esophagectomy.
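The random 9:1 split described in the Methods (710 records into 639 for training and 71 for testing) can be sketched as below. This is a minimal illustration assuming a simple shuffle-and-cut procedure; the abstract only says "a computer algorithm" was used, so the function name, the fixed seed and the placeholder patient IDs are all assumptions.

```python
import random


def split_train_test(records, train_fraction=0.9, seed=0):
    """Shuffle a copy of `records` and cut it into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


patient_ids = list(range(710))  # placeholder IDs, not real patient data
train_ids, test_ids = split_train_test(patient_ids)
print(len(train_ids), len(test_ids))  # 639 71
```

A single split like this leaves only 71 records for testing, so performance estimates depend noticeably on which records land in the test set; repeating the split with different seeds is one common way to gauge that variability.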
Affiliation(s)
- Ziran Zhao
- Thoracic Surgery Department, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Xi Cheng
- Department of Global Health Management, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, United States
| | - Xiao Sun
- Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, United States
| | - Shanrui Ma
- Thoracic Surgery Department, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Hao Feng
- Thoracic Surgery Department, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Liang Zhao
- Thoracic Surgery Department, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| |
|
35
|
Katsaouni N, Tashkandi A, Wiese L, Schulz MH. Machine learning based disease prediction from genotype data. Biol Chem 2021; 402:871-885. [PMID: 34218544 DOI: 10.1515/hsz-2021-0109] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/10/2021] [Accepted: 06/15/2021] [Indexed: 12/16/2022]
Abstract
Using results from genome-wide association studies for understanding complex traits is a current challenge. Here we review how genotype data can be used with different machine learning (ML) methods to predict phenotype occurrence and severity from genotype data. We discuss common feature encoding schemes and how studies handle the often small number of samples compared to the huge number of variants. We compare which ML methods are being applied, including recent results using deep neural networks. Further, we review the application of methods for feature explanation and interpretation.
Affiliation(s)
- Nikoletta Katsaouni
- Institute for Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany
| | - Araek Tashkandi
- Institute of Computer Sciences and Engineering, University of Jeddah, 21959 Jeddah, Saudi Arabia
| | - Lena Wiese
- Institute of Computer Science, Goethe University, 60629 Frankfurt am Main, Germany
| | - Marcel H Schulz
- Institute for Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany
- German Center for Cardiovascular Research (DZHK), Partner Site RheinMain, 60590 Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| |
|
36
|
Aivaliotis G, Palczewski J, Atkinson R, Cade JE, Morris MA. A comparison of time to event analysis methods, using weight status and breast cancer as a case study. Sci Rep 2021; 11:14058. [PMID: 34234154 PMCID: PMC8263588 DOI: 10.1038/s41598-021-92944-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2020] [Accepted: 06/15/2021] [Indexed: 11/12/2022] Open
Abstract
Survival analysis with cohort study data has been traditionally performed using Cox proportional hazards models. Random survival forests (RSFs), a machine learning method, now present an alternative method. Using the UK Women's Cohort Study (n = 34,493) we evaluate two methods, a Cox model and an RSF, to investigate the association between Body Mass Index and time to breast cancer incidence. Robustness of the models was assessed by cross-validation and bootstrapping. Histograms of bootstrap coefficients are reported. C-Indices and Integrated Brier Scores are reported for all models. In post-menopausal women, the Cox model Hazard Ratios (HR) for Overweight (OW) and Obese (O) were 1.25 (1.04, 1.51) and 1.28 (0.98, 1.68) respectively and the RSF Odds Ratios (OR) with partial dependence on menopause for OW and O were 1.34 (1.31, 1.70) and 1.45 (1.42, 1.48). HR are non-significant results. Only the RSF appears confident about the effect of weight status on time to event. Bootstrapping demonstrated Cox model coefficients can vary significantly, weakening interpretation potential. An RSF was used to produce partial dependence plots (PDPs) showing OW and O weight status increase the probability of breast cancer incidence in post-menopausal women. All models have relatively low C-Index and high Integrated Brier Score. The RSF overfits the data. In our study, RSF can identify complex non-proportional hazard type patterns in the data, and allow more complicated relationships to be investigated using PDPs, but it overfits, limiting extrapolation of results to new instances. Moreover, it is less easily interpreted than Cox models. The value of survival analysis remains paramount and therefore machine learning techniques like RSF should be considered as another method for analysis.
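The C-Index used to compare the models measures how often the subject who experiences the event earlier was also assigned the higher predicted risk. The sketch below is a simplified version of Harrell's concordance index that ignores tied event times; the function name and the toy follow-up data are assumptions for illustration and are unrelated to the cohort in the study.

```python
def concordance_index(times, events, risks):
    """Simplified Harrell's C: share of comparable pairs ordered correctly by risk."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot be the earlier event of a pair
        for j in range(len(times)):
            if times[i] < times[j]:        # subject i had the event before time j
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0      # higher risk for the earlier event
                elif risks[i] == risks[j]:
                    concordant += 0.5      # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up times, event flags (0 = censored), predicted risks.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 1.0
```

A value of 0.5 corresponds to random ordering, so a "relatively low C-Index" means the predicted risks rank subjects only slightly better than chance.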
Affiliation(s)
- Georgios Aivaliotis
- School of Mathematics, University of Leeds, Leeds, LS2 9JT, UK
- Leeds Institute for Data Analytics, University of Leeds, Leeds, LS2 9JT, UK
- Alan Turing Institute, British Library, London, NW1 2DB, UK
| | - Jan Palczewski
- School of Mathematics, University of Leeds, Leeds, LS2 9JT, UK
- Leeds Institute for Data Analytics, University of Leeds, Leeds, LS2 9JT, UK
| | - Rebecca Atkinson
- Leeds Institute for Data Analytics, University of Leeds, Leeds, LS2 9JT, UK
| | - Janet E Cade
- Nutritional Epidemiology Group, School of Food Sciences and Nutrition, University of Leeds, Leeds, LS2 9JT, UK
| | - Michelle A Morris
- Leeds Institute for Data Analytics, University of Leeds, Leeds, LS2 9JT, UK.
- Alan Turing Institute, British Library, London, NW1 2DB, UK.
- School of Medicine, University of Leeds, Leeds, UK.
| |
|
37
|
Prediction of atherosclerosis diseases using biosensor-assisted deep learning artificial neuron model. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05317-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/25/2023]
|
38
|
Twelve years of GWAS discoveries for osteoporosis and related traits: advances, challenges and applications. Bone Res 2021; 9:23. [PMID: 33927194 PMCID: PMC8085014 DOI: 10.1038/s41413-021-00143-3] [Citation(s) in RCA: 110] [Impact Index Per Article: 27.5] [Received: 06/29/2020] [Accepted: 12/21/2020] [Indexed: 02/03/2023] Open
Abstract
Osteoporosis is a common skeletal disease, affecting ~200 million people around the world. As a complex disease, osteoporosis is influenced by many factors, including diet (e.g. calcium and protein intake), physical activity, endocrine status, coexisting diseases and genetic factors. In this review, we first summarize the discoveries from genome-wide association studies (GWASs) in the bone field in the last 12 years. To date, GWASs and meta-analyses have discovered hundreds of loci that are associated with bone mineral density (BMD), osteoporosis, and osteoporotic fractures. However, the GWAS approach has sometimes been criticized because of the small effect size of the discovered variants and the mystery of missing heritability; these two questions could be partially explained by newly raised conceptual models, such as the omnigenic model and natural selection. Finally, we introduce the clinical use of GWAS findings in the bone field, such as the identification of causal clinical risk factors, the development of drug targets and disease prediction. Despite the fruitful GWAS discoveries in the bone field, most of these GWAS participants were of European descent, and more genetic studies should be carried out in other ethnic populations to benefit disease prediction in the corresponding population.
|
39
|
Gola D, König IR. Empowering individual trait prediction using interactions for precision medicine. BMC Bioinformatics 2021; 22:74. [PMID: 33602124 PMCID: PMC7890638 DOI: 10.1186/s12859-021-04011-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/10/2019] [Accepted: 02/08/2021] [Indexed: 11/11/2022] Open
Abstract
Background One component of precision medicine is to construct prediction models with predictive ability as high as possible, e.g. to enable individual risk prediction. In genetic epidemiology, complex diseases like coronary artery disease, rheumatoid arthritis, and type 2 diabetes have a polygenic basis, and a common assumption is that biological and genetic features affect the outcome under consideration via interactions. In the case of omics data, the use of standard approaches such as generalized linear models may be suboptimal and machine learning methods are appealing to make individual predictions. However, most of these algorithms focus mostly on main or marginal effects of the single features in a dataset. On the other hand, the detection of interacting features is an active area of research in the realm of genetic epidemiology. One big class of algorithms to detect interacting features is based on multifactor dimensionality reduction (MDR). Here, we further develop the model-based MDR (MB-MDR), a powerful extension of the original MDR algorithm, to enable interaction-empowered individual prediction. Results Using a comprehensive simulation study we show that our new algorithm (median AUC: 0.66) can use information hidden in interactions and outperforms two other state-of-the-art algorithms, namely Random Forest (median AUC: 0.54) and Elastic Net (median AUC: 0.50), if interactions are present in a scenario of two pairs of two features having small effects. The performance of these algorithms is comparable if no interactions are present. Further, we show that our new algorithm is applicable to real data by comparing the performance of the three algorithms on a dataset of rheumatoid arthritis cases and healthy controls. As our new algorithm is applicable not only to biological/genetic data but to all datasets with discrete features, it may have practical implications in other research fields where interactions between features have to be considered as well, and we made our method available as an R package (https://github.com/imbs-hl/MBMDRClassifieR). Conclusions The explicit use of interactions between features can improve the prediction performance and thus should be included in further attempts to move precision medicine forward.
Affiliation(s)
- Damian Gola
- Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Lübeck, Germany
| | - Inke R König
- Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Lübeck, Germany.
| |
|
40
|
Chen TH, Chatterjee N, Landi MT, Shi J. A penalized regression framework for building polygenic risk models based on summary statistics from genome-wide association studies and incorporating external information. J Am Stat Assoc 2020; 116:133-143. [PMID: 34483403 DOI: 10.1080/01621459.2020.1764849] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/07/2023]
Abstract
Large-scale genome-wide association studies (GWAS) provide opportunities for developing genetic risk prediction models that have the potential to improve disease prevention, intervention or treatment. The key step is to develop polygenic risk score (PRS) models with high predictive performance for a given disease, which typically requires a large training data set for selecting truly associated single nucleotide polymorphisms (SNPs) and estimating effect sizes accurately. Here, we develop a comprehensive penalized regression for fitting ℓ1-regularized regression models to GWAS summary statistics. We propose incorporating Pleiotropy and ANnotation information into PRS (PANPRS) development through suitable formulation of penalty functions and associated tuning parameters. Extensive simulations show that PANPRS performs equally well or better than existing PRS methods when no functional annotation or pleiotropy is incorporated. When functional annotation data and pleiotropy are informative, PANPRS substantially outperforms existing PRS methods in simulations. Finally, we applied our methods to build PRS for type 2 diabetes and melanoma and found that incorporating relevant functional annotations and GWAS of genetically related traits improved prediction of these two complex diseases.
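To give a feel for what an ℓ1 penalty does to summary-statistic effect sizes, the sketch below applies soft-thresholding to marginal estimates. It is a deliberately simplified special case and not the PANPRS method: it assumes standardized, mutually uncorrelated SNPs (no linkage disequilibrium), in which case the ℓ1-penalized solution reduces to shrinking each marginal estimate toward zero by the penalty and zeroing out those smaller than it. The function names, SNP labels and effect values are hypothetical.

```python
def soft_threshold(beta, lam):
    """Shrink beta toward zero by lam; return 0.0 if |beta| <= lam."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


def l1_shrink(marginal_effects, lam):
    """Apply soft-thresholding to every SNP (valid only under the no-LD assumption)."""
    return {snp: soft_threshold(beta, lam) for snp, beta in marginal_effects.items()}


# Made-up marginal effect estimates for three hypothetical SNPs.
effects = {"snp_a": 0.30, "snp_b": -0.05, "snp_c": -0.12}
shrunk = l1_shrink(effects, lam=0.10)
selected = [snp for snp, beta in shrunk.items() if beta != 0.0]
print(selected)  # ['snp_a', 'snp_c']
```

Allowing the penalty to differ between SNPs, for example a smaller value for variants with a relevant annotation, is one simple way external information could enter such a scheme; how the paper formulates its penalty functions is not described in the abstract.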
Affiliation(s)
- Ting-Huei Chen
- Department of Mathematics and Statistics, Regular member, Cervo Brain Research Centre, University of Laval, 1045, av. of Medicine, Suite 1056, Quebec G1V 0A6, Canada
| | - Nilanjan Chatterjee
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University Baltimore, Maryland, United States of America, 615 N Wolfe Street Baltimore, MD 21205
| | - Maria Teresa Landi
- Integrative Tumor Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Maryland, United States of America, 9609 Medical Center Drive, RM 7E106, Bethesda, MD, 20892
| | - Jianxin Shi
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Maryland, United States of America, 9609 Medical Center Drive, RM 7E122, Bethesda, MD, 20892
|
41
|
Moss E, Metcalf J. High Tech, High Risk: Tech Ethics Lessons for the COVID-19 Pandemic Response. Patterns (New York, N.Y.) 2020; 1:100102. [PMID: 33073256 PMCID: PMC7546204 DOI: 10.1016/j.patter.2020.100102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
The COVID-19 pandemic has, in a matter of a few short months, drastically reshaped society around the world. Because of the growing perception of machine learning as a technology capable of addressing large problems at scale, machine learning applications have been seen as desirable interventions in mitigating the risks of the pandemic disease. However, machine learning, like many tools of technocratic governance, is deeply implicated in the social production and distribution of risk, and its role in the production of risk must be considered as engineers and other technologists develop tools for the current crisis. This paper describes the coupling of machine learning and the social production of risk, generally, and in pandemic responses specifically. It goes on to describe the role of risk management in the effort to institutionalize ethics in the technology industry and how such efforts can benefit from a deeper understanding of the social production of risk through machine learning.
Affiliation(s)
- Emanuel Moss
- Data & Society Research Institute, New York, NY 10010, USA
- CUNY Graduate Center, New York, NY 10016, USA
| | - Jacob Metcalf
- Data & Society Research Institute, New York, NY 10010, USA
|
42
|
Machado RA, de Oliveira Silva C, Martelli-Junior H, das Neves LT, Coletta RD. Machine learning in prediction of genetic risk of nonsyndromic oral clefts in the Brazilian population. Clin Oral Investig 2020; 25:1273-1280. [PMID: 32617779 DOI: 10.1007/s00784-020-03433-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/04/2019] [Accepted: 06/24/2020] [Indexed: 01/07/2023]
Abstract
OBJECTIVES Genetic variants in multiple genes and loci have been associated with the risk of nonsyndromic cleft lip with or without cleft palate (NSCL ± P). However, the estimation of risk remains a challenge, because most of these variants are population-specific, rendering the identification of the underlying genetic risk difficult. Herein we examined the use of machine learning on previously reported single nucleotide polymorphisms (SNPs) to predict risk of NSCL ± P in the Brazilian population. MATERIALS AND METHODS Random forest and neural network methods were applied to 72 SNPs in a case-control sample composed of 722 NSCL ± P cases and 866 controls for discrimination of NSCL ± P risk. SNP-SNP interactions and functional annotation of biological processes associated with the identified NSCL ± P risk genes were verified. RESULTS Supervised random forest decision trees revealed high importance scores for the SNPs rs11717284 and rs1875735 in FGF12, rs41268753 in GRHL3, rs2236225 in MTHFD1, rs2274976 in MTHFR, rs2235371 and rs642961 in IRF6, rs17085106 in RHPN2, rs28372960 in TCOF1, rs7078160 in VAX1, rs10762573 and rs2131960 in VCL, and rs227731 in 17q22, with an accuracy of 99% and an error rate of approximately 3% to predict the risk of NSCL ± P. Those same 13 SNPs were considered the most important for the neural network to effectively predict NSCL ± P risk, with an overall accuracy of 94%. A multivariate regression model revealed significant interactions among all SNPs, with the exception of those in FGF12 and MTHFD1. The most significant biological processes for the selected genes were those involved in tissue and epithelium development; neural tube closure; and metabolism of methionine, folate, and homocysteine. CONCLUSIONS Our results provide novel clues for genetic mechanism studies of NSCL ± P and point to a machine learning model composed of 13 SNPs that is capable of predicting NSCL ± P risk.
CLINICAL RELEVANCE Although validation is necessary, this genetic panel can be useful in the near future to assist in NSCL ± P genetic counseling.
Affiliation(s)
- Renato Assis Machado
- Department of Oral Diagnosis, School of Dentistry, University of Campinas, Piracicaba, São Paulo, CEP 13414-018, Brazil
- Post-Graduation Program in Rehabilitation Sciences, Hospital for Rehabilitation of Craniofacial Anomalies, University of São Paulo, Bauru, São Paulo, Brazil
| | - Carolina de Oliveira Silva
- Department of Oral Diagnosis, School of Dentistry, University of Campinas, Piracicaba, São Paulo, CEP 13414-018, Brazil
| | - Hercílio Martelli-Junior
- Stomatology Clinic, Dental School, State University of Montes Claros, Montes Claros, Minas Gerais, Brazil
- Center for Rehabilitation of Craniofacial Anomalies, Dental School, University of José Rosario Vellano, Alfenas, Minas Gerais, Brazil
| | - Lucimara Teixeira das Neves
- Post-Graduation Program in Rehabilitation Sciences, Hospital for Rehabilitation of Craniofacial Anomalies, University of São Paulo, Bauru, São Paulo, Brazil
- Department of Biological Sciences, Bauru School of Dentistry, University of São Paulo, Bauru, São Paulo, Brazil
| | - Ricardo D Coletta
- Department of Oral Diagnosis, School of Dentistry, University of Campinas, Piracicaba, São Paulo, CEP 13414-018, Brazil.
|
43
|
Arakawa T. Possibility of Autonomous Estimation of Shiba Goat’s Estrus and Non-Estrus Behavior by Machine Learning Methods. Animals (Basel) 2020; 10:ani10050771. [PMID: 32365596 PMCID: PMC7278493 DOI: 10.3390/ani10050771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2020] [Revised: 04/16/2020] [Accepted: 04/27/2020] [Indexed: 11/16/2022] Open
Abstract
Mammalian behavior is typically monitored by direct observation, which requires substantial effort and time when many animals are observed or when observation continues for a prolonged period. In this study, machine learning methods, namely hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks, were applied to estimate whether a goat is in estrus based on its behavior, and the adequacy of each method was assessed. Tracking data for the goats were obtained using a video tracking system and used to estimate whether each goat, in “estrus” or “non-estrus”, was in one of two states: “approaching the male” or “standing near the male”. Overall, the percentage concordance (PC) of random forest appears to be the highest. However, its PC for goats other than those whose data were used for training is relatively low, which suggests that random forest tends to overfit the training data. Apart from random forest, the PC of HMMs and SVMs is high. Considering calculation time and the HMM’s advantage as a time-series model, the HMM is the better method. The PC of the neural network is low overall; however, if data from more goats were acquired, a neural network could become an adequate method for estimation.
Affiliation(s)
- Toshiya Arakawa
- Department of Mechanical Systems Engineering, Aichi University of Technology, Gamagori-shi, Aichi 443-0047, Japan
|
44
|
Malten J, König IR. Modified entropy-based procedure detects gene-gene-interactions in unconventional genetic models. BMC Med Genomics 2020; 13:65. [PMID: 32326960 PMCID: PMC7181579 DOI: 10.1186/s12920-020-0703-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/19/2019] [Accepted: 03/13/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Since genetic interactions are assumed to play an important role in the mechanisms of complex diseases, different statistical approaches for detecting them have been suggested in recent years. One interesting approach is the entropy-based IGENT method by Kwon et al., which promises efficient simultaneous detection of main effects and interaction effects. However, a modification is required if the aim is to detect only interaction effects. METHODS Based on the IGENT method, we present a modification that leads to a conditional mutual information based approach under the condition of linkage equilibrium. The modified estimator is investigated in a comprehensive simulation based on five genetic interaction models and applied to real data from the genome-wide association study by the North American Rheumatoid Arthritis Consortium (NARAC). RESULTS The presented modification of IGENT controls the type I error in all simulated constellations. Furthermore, it provides high power for detecting pure interactions, specifically in unconventional genetic models, both in simulation and in real data. CONCLUSIONS The proposed method uses the IGENT software, which is freely available, simple and fast, and detects pure interactions in unconventional genetic models. Our results demonstrate that this modification is an attractive complement to established analysis methods.
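For readers unfamiliar with the quantity named in the Methods, the following is a minimal sketch of a generic plug-in estimate of conditional mutual information from observed counts. It is not the authors' modified IGENT estimator, whose exact form the abstract does not give. The function name, the toy genotype coding, and the choice to condition on case-control status are assumptions made only for illustration.

```python
from collections import Counter
from math import log


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats from equal-length sequences.

    Illustrative only: x and y could be genotype codes at two loci and z a
    case-control label (an assumption, not taken from the abstract).
    """
    if not (len(x) == len(y) == len(z)) or len(x) == 0:
        raise ValueError("x, y and z must be non-empty and of equal length")
    n = len(x)
    n_xyz = Counter(zip(x, y, z))  # joint counts
    n_xz = Counter(zip(x, z))      # counts of (x, z) pairs
    n_yz = Counter(zip(y, z))      # counts of (y, z) pairs
    n_z = Counter(z)               # counts of the conditioning variable
    cmi = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        # p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written in counts
        cmi += (c / n) * log(c * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)]))
    return cmi


# Toy XOR-like pattern: the label equals locus A XOR locus B.
locus_a = [0, 0, 1, 1]
locus_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]
xor_cmi = conditional_mutual_information(locus_a, locus_b, status)

# Toy pattern with no dependence between the loci within the single stratum.
null_cmi = conditional_mutual_information(locus_a, locus_b, [0, 0, 0, 0])
```

In the XOR-like toy data neither locus is informative about the label on its own, yet the two loci are fully dependent once the label is fixed, which is the kind of pure interaction such a measure can register.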
Affiliation(s)
- Jörg Malten
- Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
| | - Inke R König
- Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany.
|
45
|
O'Neill AC. ASO Author Reflections: Machine Learning Strategies Can Aid Patient Selection in Microvascular Breast Reconstruction. Ann Surg Oncol 2020; 27:3476-3477. [PMID: 32206950 DOI: 10.1245/s10434-020-08352-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2020] [Indexed: 11/18/2022]
Affiliation(s)
- Anne C O'Neill
- Division of Plastic Surgery, Department of Surgery and Surgical Oncology, University Health Network, University of Toronto, Toronto, Canada. Anne.O'
|
46
|
Stafford IS, Kellermann M, Mossotto E, Beattie RM, MacArthur BD, Ennis S. A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases. NPJ Digit Med 2020; 3:30. [PMID: 32195365 PMCID: PMC7062883 DOI: 10.1038/s41746-020-0229-3] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Received: 08/12/2019] [Accepted: 01/17/2020] [Indexed: 02/07/2023] Open
Abstract
Autoimmune diseases are chronic, multifactorial conditions. Through machine learning (ML), a branch of the wider field of artificial intelligence, it is possible to extract patterns within patient data, and exploit these patterns to predict patient outcomes for improved clinical management. Here, we surveyed the use of ML methods to address clinical problems in autoimmune disease. A systematic review was conducted using the MEDLINE, Embase, and Computers and Applied Sciences Complete databases. Relevant papers included "machine learning" or "artificial intelligence" and the autoimmune diseases search term(s) in their title, abstract or key words. Exclusion criteria: studies not written in English, no real human patient data included, publication prior to 2001, studies that were not peer reviewed, non-autoimmune disease comorbidity research and review papers. 169 (of 702) studies met the criteria for inclusion. Support vector machines and random forests were the most popular ML methods used. ML models using data on multiple sclerosis, rheumatoid arthritis and inflammatory bowel disease were most common. A small proportion of studies (7.7% or 13/169) combined different data types in the modelling process. Cross-validation combined with a separate testing set, for more robust model evaluation, occurred in 8.3% of papers (14/169). The field may benefit from adopting a best practice of validation, cross-validation and independent testing of ML models. Many models achieved good predictive results in simple scenarios (e.g. classification of cases and controls). Progression to more complex predictive models may be achievable in future through integration of multiple data types.
Affiliation(s)
- I. S. Stafford
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - M. Kellermann
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
| | - E. Mossotto
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - R. M. Beattie
- Department of Paediatric Gastroenterology, Southampton Children’s Hospital, Southampton, UK
| | - B. D. MacArthur
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - S. Ennis
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
|
47
|
O'Neill AC, Yang D, Roy M, Sebastiampillai S, Hofer SOP, Xu W. Development and Evaluation of a Machine Learning Prediction Model for Flap Failure in Microvascular Breast Reconstruction. Ann Surg Oncol 2020; 27:3466-3475. [PMID: 32152777 DOI: 10.1245/s10434-020-08307-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 09/26/2019] [Indexed: 12/18/2022]
Abstract
BACKGROUND Despite high success rates, flap failure remains an inherent risk in microvascular breast reconstruction. Identifying patients who are at high risk for flap failure would enable us to recommend alternative reconstructive techniques. However, as flap failure is a rare event, identification of risk factors is statistically challenging. Machine learning is a form of artificial intelligence that automates analytical model building. It has been proposed that machine learning can build superior prediction models when the outcome of interest is rare. METHODS In this study we evaluate machine learning resampling and decision-tree classification models for the prediction of flap failure in a large retrospective cohort of microvascular breast reconstructions. RESULTS A total of 1012 patients were included in the study. Twelve patients (1.1%) experienced flap failure. The ROSE informed oversampling technique and decision-tree classification resulted in a strong prediction model (AUC 0.95) with high sensitivity and specificity. In the testing cohort, the model maintained acceptable specificity and predictive power (AUC 0.67), but sensitivity was reduced. The model identified four high-risk patient groups. Obesity, comorbidities and smoking were found to contribute to flap loss. The flap failure rate in high-risk patients was 7.8% compared with 0.44% in the low-risk cohort (p = 0.001). CONCLUSIONS This machine-learning risk prediction model suggests that flap failure may not be a random event. The algorithm indicates that flap failure is multifactorial and identifies a number of potential contributing factors that warrant further investigation.
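To illustrate why resampling is used when the outcome is rare, the sketch below balances a data set by plain random oversampling of the minority class. This is a deliberately simpler technique than the ROSE method named in the abstract, which generates synthetic examples rather than duplicating observed ones. The function name and the placeholder rows are hypothetical; only the counts (1012 patients, 12 flap failures) come from the abstract.

```python
import random


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate minority-class rows at random until both classes are equal in size.

    Plain random oversampling, shown as a stand-in for illustration; it is
    not the ROSE procedure used in the study.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, lab in enumerate(labels) if lab == minority]
    majority_idx = [i for i, lab in enumerate(labels) if lab != minority]
    if not minority_idx:
        raise ValueError("no minority-class rows to resample")
    # Number of extra minority rows needed to match the majority class.
    deficit = max(0, len(majority_idx) - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(deficit)]
    idx = majority_idx + minority_idx + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Class sizes from the abstract: 12 flap failures among 1012 patients.
labels = [1] * 12 + [0] * 1000
rows = [{"patient": i} for i in range(len(labels))]  # placeholder features
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Resampling of this kind is normally applied to the training partition only, so that performance on held-out patients reflects the true event rate.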
Affiliation(s)
- Anne C O'Neill
- Division of Plastic Surgery, Department of Surgery and Surgical Oncology, University Health Network, University of Toronto, Toronto, Canada. Anne.O'
| | - Donyang Yang
- Department of Biostatistics, Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
| | - Melissa Roy
- Division of Plastic Surgery, Department of Surgery and Surgical Oncology, University Health Network, University of Toronto, Toronto, Canada
| | - Stephanie Sebastiampillai
- Division of Plastic Surgery, Department of Surgery and Surgical Oncology, University Health Network, University of Toronto, Toronto, Canada
| | - Stefan O P Hofer
- Division of Plastic Surgery, Department of Surgery and Surgical Oncology, University Health Network, University of Toronto, Toronto, Canada
| | - Wei Xu
- Department of Biostatistics, Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
|
48
|
Wongvibulsin S, Wu KC, Zeger SL. Clinical risk prediction with random forests for survival, longitudinal, and multivariate (RF-SLAM) data analysis. BMC Med Res Methodol 2019; 20:1. [PMID: 31888507 PMCID: PMC6937754 DOI: 10.1186/s12874-019-0863-0] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 05/08/2019] [Accepted: 11/08/2019] [Indexed: 12/23/2022] Open
Abstract
Background Clinical research and medical practice can be advanced through the prediction of an individual’s health state, trajectory, and responses to treatments. However, the majority of current clinical risk prediction models are based on regression approaches or machine learning algorithms that are static, rather than dynamic. To benefit from the increasing emergence of large, heterogeneous data sets, such as electronic health records (EHRs), novel tools to support improved clinical decision making through methods for individual-level risk prediction that can handle multiple variables, their interactions, and time-varying values are necessary. Methods We introduce a novel dynamic approach to clinical risk prediction for survival, longitudinal, and multivariate (SLAM) outcomes, called random forest for SLAM data analysis (RF-SLAM). RF-SLAM is a continuous-time, random forest method for survival analysis that combines the strengths of existing statistical and machine learning methods to produce individualized Bayes estimates of piecewise-constant hazard rates. We also present a method-agnostic approach for time-varying evaluation of model performance. Results We derive and illustrate the method by predicting sudden cardiac arrest (SCA) in the Left Ventricular Structural (LV) Predictors of Sudden Cardiac Death (SCD) Registry. We demonstrate superior performance relative to standard random forest methods for survival data. We illustrate the importance of the number of preceding heart failure hospitalizations as a time-dependent predictor in SCA risk assessment. Conclusions RF-SLAM is a novel statistical and machine learning method that improves risk prediction by incorporating time-varying information and accommodating a large number of predictors, their interactions, and missing values. 
RF-SLAM is designed to easily extend to simultaneous predictions of multiple, possibly competing, events and/or repeated measurements of discrete or continuous variables over time. Trial registration: LV Structural Predictors of SCD Registry (clinicaltrials.gov, NCT01076660), retrospectively registered 25 February 2010.
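The abstract describes estimates of piecewise-constant hazard rates. As a minimal sketch of what such estimates imply, the code below converts a set of interval hazards into a survival probability using the standard relation S(t) = exp(-cumulative hazard). It does not reproduce RF-SLAM itself; the function name, interval boundaries and hazard values are hypothetical.

```python
from math import exp


def survival_probability(t, cut_points, hazards):
    """Survival probability at time t under piecewise-constant hazards.

    cut_points: increasing interval start times, beginning at 0.0.
    hazards: one hazard rate per interval; the last interval is open-ended.
    """
    if t < 0:
        raise ValueError("t must be non-negative")
    if len(cut_points) != len(hazards):
        raise ValueError("need exactly one hazard rate per interval")
    cumulative = 0.0
    for k, rate in enumerate(hazards):
        start = cut_points[k]
        end = cut_points[k + 1] if k + 1 < len(cut_points) else float("inf")
        if t <= start:
            break
        # Add the hazard accumulated over the part of this interval before t.
        cumulative += rate * (min(t, end) - start)
    return exp(-cumulative)


# Hypothetical yearly intervals with increasing hazard.
s_at_2_5 = survival_probability(2.5, [0.0, 1.0, 2.0], [0.1, 0.2, 0.3])
s_at_0 = survival_probability(0.0, [0.0, 1.0, 2.0], [0.1, 0.2, 0.3])
```

With these made-up rates the cumulative hazard at 2.5 years is 0.1 + 0.2 + 0.15 = 0.45, so the survival probability is exp(-0.45).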
Affiliation(s)
- Shannon Wongvibulsin
- Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, USA.
| | - Katherine C Wu
- Department of Medicine, Division of Cardiology, Johns Hopkins School of Medicine, Baltimore, USA
| | - Scott L Zeger
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA
|
49
|
Chen D, Goyal G, Go RS, Parikh SA, Ngufor CG. Improved Interpretability of Machine Learning Model Using Unsupervised Clustering: Predicting Time to First Treatment in Chronic Lymphocytic Leukemia. JCO Clin Cancer Inform 2019; 3:1-11. [DOI: 10.1200/cci.18.00137] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022] Open
Abstract
PURPOSE Time to event is an important aspect of clinical decision making. This is particularly true when diseases have highly heterogeneous presentations and prognoses, as in chronic lymphocytic leukemia (CLL). Although machine learning methods can readily learn complex nonlinear relationships, many methods are criticized as inadequate because of limited interpretability. We propose using unsupervised clustering of the continuous output of machine learning models to provide discrete risk stratification for predicting time to first treatment in a cohort of patients with CLL. PATIENTS AND METHODS A total of 737 treatment-naïve patients with CLL diagnosed at Mayo Clinic were included in this study. We compared predictive abilities for two survival models (Cox proportional hazards and random survival forest) and four classification methods (logistic regression, support vector machines, random forest, and gradient boosting machine). Probability of treatment was then stratified. RESULTS Machine learning methods did not yield significantly more accurate predictions of time to first treatment. However, automated risk stratification provided by clustering was able to better differentiate patients who were at risk for treatment within 1 year than models developed using standard survival analysis techniques. CONCLUSION Clustering the posterior probabilities of machine learning models provides a way to better interpret machine learning models.
|
50
|
Mo X, Chen X, Li H, Li J, Zeng F, Chen Y, He F, Zhang S, Li H, Pan L, Zeng P, Xie Y, Li H, Huang M, He Y, Liang H, Zeng H. Early and Accurate Prediction of Clinical Response to Methotrexate Treatment in Juvenile Idiopathic Arthritis Using Machine Learning. Front Pharmacol 2019; 10:1155. [PMID: 31649533 PMCID: PMC6791251 DOI: 10.3389/fphar.2019.01155] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 06/28/2019] [Accepted: 09/09/2019] [Indexed: 11/29/2022] Open
Abstract
Background and Aims: Accurately predicting the response to methotrexate (MTX) in juvenile idiopathic arthritis (JIA) patients before administration is key to improving treatment outcomes. However, no simple and reliable prediction model has been identified. Here, we aimed to develop and validate predictive models for the MTX response in JIA using machine learning based on electronic medical record (EMR) data collected before and after administering MTX. Materials and Methods: Data of 362 JIA patients on MTX monotherapy were retrospectively collected from the EMR between January 2008 and October 2018. The DAS44/ESR-3 simplified standard was used to evaluate the MTX response. Extreme gradient boosting (XGBoost), support vector machine (SVM), random forest (RF), and logistic regression (LR) algorithms were applied to develop and validate models with 5-fold cross-validation on the randomly split training and test sets. Data of 13 additionally collected patients were used for external validation. Results: XGBoost screened out the optimal 10 pre-administration features and 6 mix-variables. XGBoost established the best model based on the 10 pre-administration variables, with accuracy 91.78%, sensitivity 90.70%, specificity 93.33%, and AUC 97.00%. Similarly, XGBoost developed a better model based on the 6 mix-variables, with accuracy 94.52%, sensitivity 95.35%, specificity 93.33%, and AUC 99.00%. Conclusion: Based on common EMR data, we developed two MTX response predictive models with excellent performance in JIA using machine learning. These models can predict MTX efficacy early and accurately, which provides powerful decision support for doctors to make or adjust therapeutic schemes before or after treatment.
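As a reminder of how the reported accuracy, sensitivity and specificity relate to one another, the sketch below computes all three from a binary confusion matrix. The counts are hypothetical and are not taken from the paper, whose confusion matrices the abstract does not give; they were chosen because they are numerically consistent with the percentages reported for the first model. The function name is likewise an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity (as percentages) from binary counts."""
    total = tp + fn + tn + fp
    if total == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),  # true-positive rate
        "specificity": 100.0 * tn / (tn + fp),  # true-negative rate
    }


# Hypothetical test set: 43 responders and 30 non-responders (assumed counts).
metrics = classification_metrics(tp=39, fn=4, tn=28, fp=2)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

AUC is not derivable from a single confusion matrix, because it summarizes performance across all classification thresholds.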
Affiliation(s)
- Xiaolan Mo
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China; Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Xiujuan Chen
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Hongwei Li
- Pediatric Allergy Immunology & Rheumatology Department, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Jiali Li
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Fangling Zeng
- Department of Medical, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Yilu Chen
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Fan He
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Song Zhang
- Pediatric Allergy Immunology & Rheumatology Department, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huixian Li
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Liyan Pan
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Ping Zeng
- Pediatric Allergy Immunology & Rheumatology Department, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Ying Xie
- Pediatric Allergy Immunology & Rheumatology Department, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huiyi Li
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Min Huang
- Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Yanling He
- Department of Pharmacy, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huiying Liang
- Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
| | - Huasong Zeng
- Pediatric Allergy Immunology & Rheumatology Department, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, China
|