1
Zainal NH, Tan HH, Hong RYS, Newman MG. Prescriptive Predictors of Mindfulness Ecological Momentary Intervention for Social Anxiety Disorder: Machine Learning Analysis of Randomized Controlled Trial Data. JMIR Ment Health 2025; 12:e67210. [PMID: 40359509 DOI: 10.2196/67210] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/05/2024] [Revised: 01/09/2025] [Accepted: 01/15/2025] [Indexed: 05/15/2025]
Abstract
BACKGROUND: Shame and stigma often prevent individuals with social anxiety disorder (SAD) from seeking and attending costly and time-intensive psychotherapies, highlighting the importance of brief, low-cost, and scalable treatments. Creating prescriptive outcome prediction models is thus crucial for identifying which clients with SAD might gain the most from a unique scalable treatment option. Nevertheless, widely used classical regression methods might not optimally capture complex nonlinear associations and interactions.
OBJECTIVE: Precision medicine approaches were thus harnessed to examine prescriptive predictors of optimization to a 14-day fully self-guided mindfulness ecological momentary intervention (MEMI) over a self-monitoring app (SM).
METHODS: This study involved 191 participants who had probable SAD. Participants were randomly assigned to MEMI (n=96) or SM (n=95). They completed self-reports of symptoms, risk factors, treatment, and sociodemographics at baseline, posttreatment, and 1-month follow-up (1MFU). Machine learning (ML) models with 17 predictors of optimization to MEMI over SM, defined as a higher probability of SAD remission from MEMI at posttreatment and 1MFU, were evaluated. The Social Phobia Diagnostic Questionnaire, structurally equivalent to the Diagnostic and Statistical Manual SAD criteria, was used to define remission. These ML models included random forest and support vector machines (radial basis function kernel) and 10-fold nested cross-validation that separated model training, minimal tuning in inner folds, and model testing in outer folds.
RESULTS: ML models outperformed logistic regression. The multivariable ML models using the 10 most important predictors achieved good performance, with the area under the receiver operating characteristic curve (AU-ROC) values ranging from .71 to .72 at posttreatment and 1MFU. These prerandomization and early-stage prescriptive predictors consistently identified which participants had the highest probability of optimization of MEMI over SM after 14 days and 6 weeks from baseline. Significant predictors included 4 strengths (higher trait mindfulness, lower SAD severity, presence of university education, no current psychotropic medication use), 2 weaknesses (higher generalized anxiety severity and clinician-diagnosed depression or anxiety disorder), and 1 sociodemographic variable (Chinese ethnicity). Emotion dysregulation and current psychotherapy predicted remission with inconsistent signs across time points.
CONCLUSIONS: The AU-ROC values indicated moderately meaningful effect sizes in identifying prescriptive predictors within multivariable models for clients with SAD. Focusing on the identified notable client strengths, weaknesses, and Chinese ethnicity may enhance our ability to predict future responses to scalable treatments. Estimating the likelihood of SAD remission with a "prescriptive predictor calculator" for each client may help clinicians and policymakers allocate scarce treatment resources effectively. Clients with high remission probability may benefit from receiving the MEMI as a vigilant waitlist strategy before intensive therapist-led psychotherapy. These efforts may aid in creating actionable treatment selection tools to optimize care for clients with SAD in routine health care settings that use stratified care principles.
TRIAL REGISTRATION: OSF Registries 10.17605/OSF.IO/M3KXZ; https://osf.io/m3kxz.
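The nested cross-validation described in the methods can be illustrated with a minimal sketch of how the index splits might be organized. This is not the authors' code: the inner fold count of 10, the random seed, and the function names `k_fold_indices` and `nested_cv_splits` are assumptions made for illustration. Only the sample size (191) and the 10-fold outer structure come from the abstract.

```python
import random

def k_fold_indices(indices, k, rng):
    """Shuffle `indices` and split them into k nearly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def nested_cv_splits(n_samples, k_outer=10, k_inner=10, seed=0):
    """Yield (inner_train, inner_val, outer_test) index lists.

    Tuning sees only inner_train and inner_val; outer_test stays
    held out until the tuned model is evaluated.
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    outer_folds = k_fold_indices(range(n_samples), k_outer, rng)
    for i, outer_test in enumerate(outer_folds):
        # Everything outside the current outer fold is available for tuning
        outer_train = [idx for j, fold in enumerate(outer_folds) if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, k_inner, rng)
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for j, fold in enumerate(inner_folds) if j != m for idx in fold]
            yield inner_train, inner_val, outer_test

# 191 participants, as reported in the abstract
splits = list(nested_cv_splits(191))
print(len(splits), "inner train/validation splits across 10 outer folds")
```

The point of the design is that no outer test index ever appears in an inner training or validation set, so hyperparameter tuning cannot leak information into the performance estimate.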
Affiliation(s)
- Nur Hani Zainal
  - Department of Psychology, National University of Singapore, Singapore, Singapore
- Hui Han Tan
  - Department of Psychology, National University of Singapore, Singapore, Singapore
- Ryan Yee Shiun Hong
  - Department of Psychology, National University of Singapore, Singapore, Singapore
- Michelle Gayle Newman
  - Department of Psychology, The Pennsylvania State University, University Park, PA, United States
2
Krzyzanowski B, Mullan AF, Dorsey ER, Chirag SS, Turcano P, Camerucci E, Bower JH, Savica R. Proximity to Golf Courses and Risk of Parkinson Disease. JAMA Netw Open 2025; 8:e259198. [PMID: 40338549 PMCID: PMC12062912 DOI: 10.1001/jamanetworkopen.2025.9198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2024] [Accepted: 02/27/2025] [Indexed: 05/09/2025]
Abstract
Importance: The role of pesticide exposure from golf courses in Parkinson disease (PD) risk remains unclear.
Objective: To assess whether proximity to golf courses is associated with increased PD risk and to use information on groundwater vulnerability and municipal well locations to investigate drinking water contamination as a potential route of exposure.
Design, Setting, and Participants: This case-control study included patients with incident PD and matched controls from the Rochester Epidemiology Project from 1991 to 2015. Data were analyzed between June and August 2024.
Exposures: Distance to golf courses, living in water service areas with a golf course, living in water service areas in vulnerable groundwater regions, living in water service areas with shallow municipal wells, and living in water service areas with a municipal well on a golf course.
Main Outcome and Measures: Risk of incident PD. All models adjusted for age, sex, race and ethnicity, year of index, median household income, and urban or rural category.
Results: A total of 419 incident PD cases were identified (median [IQR] age, 73 [65-80] years; 257 male [61.3%]) with 5113 matched controls (median [IQR] age, 72 [65-79] years; 3043 male [59.5%]; 4504 White [88.1%]). After adjusting for patient demographics and neighborhood characteristics, living within 1 mile of a golf course was associated with 126% increased odds of developing PD compared with individuals living more than 6 miles away from a golf course (adjusted odds ratio [aOR], 2.26; 95% CI, 1.09-4.70). Individuals living within water service areas with a golf course had nearly double the odds of PD compared with individuals in water service areas without golf courses (aOR, 1.96; 95% CI, 1.20-3.23) and 49% greater odds compared with individuals with private wells (aOR, 1.49; 95% CI, 1.05-2.13). Additionally, individuals living in water service areas with a golf course in vulnerable groundwater regions had 82% greater odds of developing PD compared with those in nonvulnerable groundwater regions (aOR, 1.82; 95% CI, 1.09-3.03).
Conclusions and Relevance: In this population-based case-control study, the greatest risk of PD was found within 1 to 3 miles of a golf course and risk generally decreased with distance. Associations with the largest effect sizes were in water service areas with a golf course and in vulnerable groundwater regions.
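The percentages quoted in the results follow directly from the adjusted odds ratios. The short sketch below shows the arithmetic: the aOR values are taken from the abstract, while the helper name `percent_change_in_odds` and the dictionary labels are illustrative assumptions.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to a percent change in odds versus the reference group."""
    return (odds_ratio - 1.0) * 100.0

# Adjusted odds ratios as reported in the abstract
reported = {
    "within 1 mile vs more than 6 miles": 2.26,
    "water service area with vs without a golf course": 1.96,
    "water service area with a golf course vs private wells": 1.49,
    "vulnerable vs nonvulnerable groundwater region": 1.82,
}

for label, aor in reported.items():
    print(f"{label}: aOR {aor:.2f} -> {percent_change_in_odds(aor):.0f}% greater odds")
```

These figures describe odds, not risk. An odds ratio approximates a relative risk only when the outcome is rare in the population studied.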
Affiliation(s)
- Aidan F. Mullan
  - Department of Neurology, Mayo Clinic, Rochester, Minnesota
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota
- E. Ray Dorsey
  - Department of Neurology, Center for Health + Technology, University of Rochester Medical Center, Rochester, New York
- Sai Shivani Chirag
  - Department of Neurology, Barrow Neurological Institute, Phoenix, Arizona
- Emanuele Camerucci
  - Department of Neurology, University of Kansas Medical Center, Kansas City, Kansas
- James H. Bower
  - Department of Neurology, Mayo Clinic, Rochester, Minnesota
- Rodolfo Savica
  - Department of Neurology, Mayo Clinic, Rochester, Minnesota
3
Ellrott K, Wong CK, Yau C, Castro MAA, Lee JA, Karlberg BJ, Grewal JK, Lagani V, Tercan B, Friedl V, Hinoue T, Uzunangelov V, Westlake L, Loinaz X, Felau I, Wang PI, Kemal A, Caesar-Johnson SJ, Shmulevich I, Lazar AJ, Tsamardinos I, Hoadley KA, Robertson AG, Knijnenburg TA, Benz CC, Stuart JM, Zenklusen JC, Cherniack AD, Laird PW. Classification of non-TCGA cancer samples to TCGA molecular subtypes using compact feature sets. Cancer Cell 2025; 43:195-212.e11. [PMID: 39753139 PMCID: PMC11949768 DOI: 10.1016/j.ccell.2024.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Revised: 08/26/2024] [Accepted: 12/05/2024] [Indexed: 02/12/2025]
Abstract
Molecular subtypes, such as those defined by The Cancer Genome Atlas (TCGA), delineate a cancer's underlying biology, bringing hope to inform a patient's prognosis and treatment plan. However, most approaches used in the discovery of subtypes are not suitable for assigning subtype labels to new cancer specimens from other studies or clinical trials. Here, we address this barrier by applying five different machine learning approaches to multi-omic data from 8,791 TCGA tumor samples comprising 106 subtypes from 26 different cancer cohorts to build models based upon small numbers of features that can classify new samples into previously defined TCGA molecular subtypes, a step toward molecular subtype application in the clinic. We validate select classifiers using external datasets. Predictive performance and classifier-selected features yield insight into the different machine-learning approaches and genomic data platforms. For each cancer and data type we provide containerized versions of the top-performing models as a public resource.
Affiliation(s)
- Kyle Ellrott
  - Oregon Health and Science University, Portland, OR 97239, USA
- Christopher K Wong
  - Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Christina Yau
  - University of California, San Francisco, Department of Surgery, San Francisco, CA 94158, USA; Buck Institute for Research on Aging, Novato, CA 94945, USA
- Mauro A A Castro
  - Bioinformatics and Systems Biology Laboratory, Federal University of Paraná, Curitiba, PR 81520-260, Brazil
- Jordan A Lee
  - Oregon Health and Science University, Portland, OR 97239, USA
- Jasleen K Grewal
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Vincenzo Lagani
  - JADBio Gnosis DA, GR-700 13 Heraklion, Crete, Greece; Institute of Chemical Biology, Ilia State University, Tbilisi 0162, Georgia
- Bahar Tercan
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109, USA
- Verena Friedl
  - Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Toshinori Hinoue
  - Department of Epigenetics, Van Andel Institute, Grand Rapids, MI 49503, USA
- Vladislav Uzunangelov
  - Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Lindsay Westlake
  - The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Xavier Loinaz
  - The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Ina Felau
  - Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
- Peggy I Wang
  - Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
- Anab Kemal
  - Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
- Ilya Shmulevich
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109, USA
- Alexander J Lazar
  - Departments of Pathology & Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Ioannis Tsamardinos
  - JADBio Gnosis DA, GR-700 13 Heraklion, Crete, Greece; Department of Computer Science, University of Crete, GR-700 13 Heraklion, Crete, Greece; Institute of Applied and Computational Mathematics, Foundation for Research and Technology Hellas (FORTH), GR-700 13 Heraklion, Crete, Greece
- Katherine A Hoadley
  - Department of Genetics, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27519, USA
- A Gordon Robertson
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Theo A Knijnenburg
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA 98109, USA
- Joshua M Stuart
  - Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Jean C Zenklusen
  - Center for Cancer Genomics, National Cancer Institute, Bethesda, MD 20892, USA
- Andrew D Cherniack
  - The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Harvard Medical School, Boston, MA 02115, USA
- Peter W Laird
  - Department of Epigenetics, Van Andel Institute, Grand Rapids, MI 49503, USA
4
Bhattacharyay S, van Leeuwen FD, Beqiri E, Åkerlund CAI, Wilson L, Steyerberg EW, Nelson DW, Maas AIR, Menon DK, Ercole A. TILTomorrow today: dynamic factors predicting changes in intracranial pressure treatment intensity after traumatic brain injury. Sci Rep 2025; 15:95. [PMID: 39747195 PMCID: PMC11696189 DOI: 10.1038/s41598-024-83862-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Accepted: 12/18/2024] [Indexed: 01/04/2025]
Abstract
Practices for controlling intracranial pressure (ICP) in traumatic brain injury (TBI) patients admitted to the intensive care unit (ICU) vary considerably between centres. To help understand the rational basis for such variance in care, this study aims to identify the patient-level predictors of changes in ICP management. We extracted all heterogeneous data (2008 pre-ICU and ICU variables) collected from a prospective cohort (n = 844, 51 ICUs) of ICP-monitored TBI patients in the Collaborative European NeuroTrauma Effectiveness Research in TBI study. We developed the TILTomorrow modelling strategy, which leverages recurrent neural networks to map a token-embedded time series representation of all variables (including missing values) to an ordinal, dynamic prediction of the following day's five-category therapy intensity level (TIL(Basic)) score. With 20 repeats of fivefold cross-validation, we trained TILTomorrow on different variable sets and applied the TimeSHAP (temporal extension of SHapley Additive exPlanations) algorithm to estimate variable contributions towards predictions of next-day changes in TIL(Basic). Based on Somers' Dxy, the full range of variables explained 68% (95% CI 65-72%) of the ordinal variation in next-day changes in TIL(Basic) on day one and up to 51% (95% CI 45-56%) thereafter, when changes in TIL(Basic) became less frequent. Up to 81% (95% CI 78-85%) of this explanation could be derived from non-treatment variables (i.e., markers of pathophysiology and injury severity), but the prior trajectory of ICU management significantly improved prediction of future de-escalations in ICP-targeted treatment. Whilst there was no significant difference in the predictive discriminability (i.e., area under receiver operating characteristic curve) between next-day escalations (0.80 [95% CI 0.77-0.84]) and de-escalations (0.79 [95% CI 0.76-0.82]) in TIL(Basic) after day two, we found specific predictor effects to be more robust with de-escalations. 
The most important predictors of day-to-day changes in ICP management included preceding treatments, age, space-occupying lesions, ICP, metabolic derangements, and neurological function. Serial protein biomarkers were also important and may serve a useful role in the clinical armamentarium for assessing therapeutic needs. Approximately half of the ordinal variation in day-to-day changes in TIL(Basic) after day two remained unexplained, underscoring the significant contribution of unmeasured factors or clinicians' personal preferences in ICP treatment. At the same time, specific dynamic markers of pathophysiology associated strongly with changes in treatment intensity and, upon mechanistic investigation, may improve the timing and personalised targeting of future care.
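The "ordinal variation explained" figures above are based on Somers' Dxy. As a rough illustration of what that statistic measures, the sketch below computes it as the difference between concordant and discordant pairs, divided by the number of pairs that differ in the observed outcome. The toy outcome codes (-1 for de-escalation, 0 for no change, 1 for escalation), the predicted scores, and the function name `somers_dxy` are hypothetical and are not drawn from the study's data or code.

```python
from itertools import combinations

def somers_dxy(predicted, observed):
    """Somers' Dxy: (concordant - discordant) / pairs that differ in the observed outcome."""
    concordant = discordant = comparable = 0
    for (p1, o1), (p2, o2) in combinations(zip(predicted, observed), 2):
        if o1 == o2:
            continue  # pairs tied on the outcome carry no ordering information
        comparable += 1
        direction = (p1 - p2) * (o1 - o2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # pairs tied on the prediction count as neither concordant nor discordant
    return (concordant - discordant) / comparable

# Hypothetical next-day changes: -1 de-escalation, 0 no change, 1 escalation
observed = [-1, -1, 0, 0, 1, 1]
predicted = [0.1, 0.4, 0.3, 0.5, 0.45, 0.9]  # made-up model scores
dxy = somers_dxy(predicted, observed)
print(f"Somers' Dxy = {dxy:.3f}")
```

For a binary outcome this quantity equals 2 × AUC − 1, which is why it sits naturally alongside the area under the receiver operating characteristic curve reported for escalations and de-escalations.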
Affiliation(s)
- Shubhayu Bhattacharyay
  - Division of Anaesthesia, University of Cambridge, Cambridge, UK
  - Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
  - Harvard Medical School, Boston, MA, USA
- Florian D van Leeuwen
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Erta Beqiri
  - Brain Physics Laboratory, Division of Neurosurgery, University of Cambridge, Cambridge, UK
- Cecilia A I Åkerlund
  - Department of Physiology and Pharmacology, Section for Perioperative Medicine and Intensive Care, Karolinska Institutet, Stockholm, Sweden
- Lindsay Wilson
  - Division of Psychology, University of Stirling, Stirling, UK
- Ewout W Steyerberg
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- David W Nelson
  - Department of Physiology and Pharmacology, Section for Perioperative Medicine and Intensive Care, Karolinska Institutet, Stockholm, Sweden
- Andrew I R Maas
  - Department of Neurosurgery, Antwerp University Hospital, Edegem, Belgium
  - Department of Translational Neuroscience, Faculty of Medicine and Health Science, University of Antwerp, Antwerp, Belgium
- David K Menon
  - Division of Anaesthesia, University of Cambridge, Cambridge, UK
- Ari Ercole
  - Division of Anaesthesia, University of Cambridge, Cambridge, UK
  - Cambridge Centre for Artificial Intelligence in Medicine, Cambridge, UK
5
Omigbodun FT, Osa-Uwagboe N, Udu AG, Oladapo BI. Leveraging Machine Learning for Optimized Mechanical Properties and 3D Printing of PLA/cHAP for Bone Implant. Biomimetics (Basel) 2024; 9:587. [PMID: 39451792 PMCID: PMC11504968 DOI: 10.3390/biomimetics9100587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Revised: 09/13/2024] [Accepted: 09/23/2024] [Indexed: 10/26/2024]
Abstract
This study explores the fabrication and characterisation of 3D-printed polylactic acid (PLA) scaffolds reinforced with calcium hydroxyapatite (cHAP) for bone tissue engineering applications. By varying the cHAP content, we aimed to enhance PLA scaffolds' mechanical and thermal properties, making them suitable for load-bearing biomedical applications. The results indicate that increasing cHAP content improves the tensile and compressive strength of the scaffolds, although it also increases brittleness. Notably, incorporating cHAP at 7.5% and 10% significantly enhances thermal stability and mechanical performance, with properties comparable to or exceeding those of human cancellous bone. Furthermore, this study integrates machine learning techniques to predict the mechanical properties of these composites, employing algorithms such as XGBoost and AdaBoost. The models demonstrated high predictive accuracy, with R² scores of 0.9173 and 0.8772 for compressive and tensile strength, respectively. These findings highlight the potential of using data-driven approaches to optimise material properties autonomously, offering significant implications for developing custom-tailored scaffolds in bone tissue engineering and regenerative medicine. The study underscores the promise of PLA/cHAP composites as viable candidates for advanced biomedical applications, particularly in creating patient-specific implants with improved mechanical and thermal characteristics.
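For readers less familiar with the R² scores quoted above, the following sketch shows how a coefficient of determination is typically computed from observed and predicted values. The strength values are invented for illustration and the function name `r_squared` is an assumption; neither comes from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total

# Hypothetical compressive strengths in MPa (made-up numbers, not study data)
observed_strength = [10.0, 20.0, 30.0, 40.0]
predicted_strength = [12.0, 18.0, 33.0, 39.0]

r2 = r_squared(observed_strength, predicted_strength)
print(f"R^2 = {r2:.3f}")
```

An R² near 1 means the model's predictions account for most of the variance in the measured property. A score computed on the training data alone can overstate how well the model would predict new specimens.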
Affiliation(s)
- Francis T. Omigbodun
  - Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK
  - The Manufacturing Technology Centre, Coventry CV7 9JU, UK
- Norman Osa-Uwagboe
  - Wolfson School of Mechanical, Electrical and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK
  - Air Force Research and Development Centre, Nigerian Air Force Base, Kaduna PMB 2104, Nigeria
- Amadi Gabriel Udu
  - Air Force Research and Development Centre, Nigerian Air Force Base, Kaduna PMB 2104, Nigeria
  - School of Engineering, University of Leicester, Leicester LE1 7RH, UK
- Bankole I. Oladapo
  - School of Science and Engineering, University of Dundee, Dundee DD1 4HN, UK
6
Chang MEK, Lange J, Cartier JM, Moore TW, Soriano SM, Albracht B, Krawitzky M, Guturu H, Alavi A, Stukalov A, Zhou X, Elgierari EM, Chu J, Benz R, Cuevas JC, Ferdosi S, Hornburg D, Farokhzad O, Siddiqui A, Batzoglou S, Leach RJ, Liss MA, Kopp RP, Flory MR. A Scaled Proteomic Discovery Study for Prostate Cancer Diagnostic Markers Using Proteograph™ and Trapped Ion Mobility Mass Spectrometry. Int J Mol Sci 2024; 25:8010. [PMID: 39125581 PMCID: PMC11311733 DOI: 10.3390/ijms25158010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 07/03/2024] [Accepted: 07/09/2024] [Indexed: 08/12/2024]
Abstract
There is a significant unmet need for clinical reflex tests that increase the specificity of prostate-specific antigen blood testing, the longstanding but imperfect tool for prostate cancer diagnosis. Towards this endpoint, we present the results from a discovery study that identifies new prostate-specific antigen reflex markers in a large-scale patient serum cohort using differentiating technologies for deep proteomic interrogation. We detect known prostate cancer blood markers as well as novel candidates. Through bioinformatic pathway enrichment and network analysis, we reveal associations of differentially abundant proteins with cytoskeletal, metabolic, and ribosomal activities, all of which have been previously associated with prostate cancer progression. Additionally, optimized machine learning classifier analysis reveals proteomic signatures capable of detecting the disease prior to biopsy, performing on par with an accepted clinical risk calculator benchmark.
Affiliation(s)
- Matthew E. K. Chang
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97201, USA
- Jane Lange
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97201, USA
- Jessie May Cartier
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97201, USA
- Travis W. Moore
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97201, USA
- Sophia M. Soriano
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97201, USA
- Brenna Albracht
  - Department of Cell Systems and Anatomy, University of Texas Health San Antonio, San Antonio, TX 78229, USA
- Ryan Benz
  - Seer Inc., Redwood City, CA 94065, USA
- Robin J. Leach
  - Department of Cell Systems and Anatomy, University of Texas Health San Antonio, San Antonio, TX 78229, USA
- Michael A. Liss
  - Roger L. & Laura D. Zeller Charitable Foundation in Urologic Oncology, University of Texas Health San Antonio, San Antonio, TX 78229, USA
- Ryan P. Kopp
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97201, USA
- Mark R. Flory
  - Cancer Early Detection Advanced Research Center, Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97201, USA
7
Karaglani M, Agorastos A, Panagopoulou M, Parlapani E, Athanasis P, Bitsios P, Tzitzikou K, Theodosiou T, Iliopoulos I, Bozikas VP, Chatzaki E. A novel blood-based epigenetic biosignature in first-episode schizophrenia patients through automated machine learning. Transl Psychiatry 2024; 14:257. [PMID: 38886359 PMCID: PMC11183091 DOI: 10.1038/s41398-024-02946-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 05/15/2024] [Accepted: 05/17/2024] [Indexed: 06/20/2024]
Abstract
Schizophrenia (SCZ) is a chronic, severe, and complex psychiatric disorder that affects all aspects of personal functioning. While SCZ has a very strong biological component, there are still no objective diagnostic tests. Lately, special attention has been given to epigenetic biomarkers in SCZ. In this study, we introduce a three-step, automated machine learning (AutoML)-based, data-driven, biomarker discovery pipeline approach, using genome-wide DNA methylation datasets and laboratory validation, to deliver a highly performing, blood-based epigenetic biosignature of diagnostic clinical value in SCZ. Publicly available blood methylomes from SCZ patients and healthy individuals were analyzed via AutoML, to identify SCZ-specific biomarkers. The methylation of the identified genes was then analyzed by targeted qMSP assays in blood gDNA of 30 first-episode drug-naïve SCZ patients and 30 healthy controls (CTRL). Finally, AutoML was used to produce an optimized disease-specific biosignature based on patient methylation data combined with demographics. AutoML identified a SCZ-specific set of novel gene methylation biomarkers including IGF2BP1, CENPI, and PSME4. Functional analysis investigated correlations with SCZ pathology. Methylation levels of IGF2BP1 and PSME4, but not CENPI were found to differ, IGF2BP1 being higher and PSME4 lower in the SCZ group as compared to the CTRL group. Additional AutoML classification analysis of our experimental patient data led to a five-feature biosignature including all three genes, as well as age and sex, that discriminated SCZ patients from healthy individuals [AUC 0.755 (0.636, 0.862) and average precision 0.758 (0.690, 0.825)]. In conclusion, this three-step pipeline enabled the discovery of three novel genes and an epigenetic biosignature bearing potential value as promising SCZ blood-based diagnostics.
Affiliation(s)
- Makrina Karaglani
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, GR-68132, Alexandroupolis, Greece
  - Institute of Agri-food and Life Sciences, University Research & Innovation Center, H.M.U.R.I.C., Hellenic Mediterranean University, GR-71003, Crete, Greece
- Agorastos Agorastos
  - Institute of Agri-food and Life Sciences, University Research & Innovation Center, H.M.U.R.I.C., Hellenic Mediterranean University, GR-71003, Crete, Greece
  - II. Department of Psychiatry, Faculty of Health Sciences, School of Medicine, Aristotle University of Thessaloniki, GR-56430, Thessaloniki, Greece
- Maria Panagopoulou
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, GR-68132, Alexandroupolis, Greece
  - Institute of Agri-food and Life Sciences, University Research & Innovation Center, H.M.U.R.I.C., Hellenic Mediterranean University, GR-71003, Crete, Greece
- Eleni Parlapani
  - I. Department of Psychiatry, Faculty of Health Sciences, School of Medicine, Aristotle University of Thessaloniki, GR-56429, Thessaloniki, Greece
- Panagiotis Athanasis
  - II. Department of Psychiatry, Faculty of Health Sciences, School of Medicine, Aristotle University of Thessaloniki, GR-56430, Thessaloniki, Greece
- Panagiotis Bitsios
  - Department of Psychiatry and Behavioral Sciences, Faculty of Medicine, University of Crete, GR-71500, Heraklion, Greece
- Konstantina Tzitzikou
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, GR-68132, Alexandroupolis, Greece
- Theodosis Theodosiou
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, GR-68132, Alexandroupolis, Greece
  - ABCureD P.C, GR-68131, Alexandroupolis, Greece
- Ioannis Iliopoulos
  - Division of Basic Sciences, School of Medicine, University of Crete, GR-71003, Heraklion, Greece
- Vasilios-Panteleimon Bozikas
  - II. Department of Psychiatry, Faculty of Health Sciences, School of Medicine, Aristotle University of Thessaloniki, GR-56430, Thessaloniki, Greece
- Ekaterini Chatzaki
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, GR-68132, Alexandroupolis, Greece
  - Institute of Agri-food and Life Sciences, University Research & Innovation Center, H.M.U.R.I.C., Hellenic Mediterranean University, GR-71003, Crete, Greece
  - ABCureD P.C, GR-68131, Alexandroupolis, Greece
  - Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, 70013, Heraklion, Greece
8
Han L, Xu Q, Meng P, Xu R, Nan J. Brain identification of IBS patients based on GBDT and multiple imaging techniques. Phys Eng Sci Med 2024; 47:651-662. [PMID: 38416373 DOI: 10.1007/s13246-024-01394-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 01/16/2024] [Indexed: 02/29/2024]
Abstract
Brain biomarkers for irritable bowel syndrome (IBS) are still lacking. This study explores a new method for studying brain alterations in IBS patients based on multi-source brain data. A decision-level fusion method based on the gradient boosting decision tree (GBDT) was proposed, and 100 healthy subjects were used to validate its effectiveness. The fusion method was then applied to resting-state fMRI and DWI data from 46 patients and 46 controls, selected randomly from the 100 healthy subjects, to identify brain alterations and evaluate pain in IBS patients. The results showed that the method achieved good classification between IBS patients and controls (accuracy = 95%) and good pain evaluation in IBS patients (mean absolute error = 0.1977). Moreover, both gain-based and permutation-based evaluation, used in place of statistical analysis, showed that the left cingulum bundle contributed most to the classification and that the right precuneus contributed most to the evaluation of abdominal pain intensity in IBS patients. This difference suggests a probable but unexplored separation between the central regions involved in the identification and the progression of IBS. This finding may offer a new perspective and method for studying brain alterations related to IBS.
Affiliation(s)
- Li Han
  - School of Computer and Communication Engineering, Zhengzhou University of Light Industry, 136 Science Avenue, Zhengzhou, 450000, Henan, China
- Qian Xu
  - School of Computer and Communication Engineering, Zhengzhou University of Light Industry, 136 Science Avenue, Zhengzhou, 450000, Henan, China
- Panting Meng
  - School of Computer and Communication Engineering, Zhengzhou University of Light Industry, 136 Science Avenue, Zhengzhou, 450000, Henan, China
- Ruyun Xu
  - School of Computer and Communication Engineering, Zhengzhou University of Light Industry, 136 Science Avenue, Zhengzhou, 450000, Henan, China
- Jiaofen Nan
  - School of Computer and Communication Engineering, Zhengzhou University of Light Industry, 136 Science Avenue, Zhengzhou, 450000, Henan, China
|
9
|
Montesanto A, Lagani V, Spazzafumo L, Tortato E, Rosati S, Corsonello A, Soraci L, Sabbatinelli J, Cherubini A, Conte M, Capri M, Capalbo M, Lattanzio F, Olivieri F, Bonfigli AR. Physical performance strongly predicts all-cause mortality risk in a real-world population of older diabetic patients: machine learning approach for mortality risk stratification. Front Endocrinol (Lausanne) 2024; 15:1359482. [PMID: 38745954 PMCID: PMC11091327 DOI: 10.3389/fendo.2024.1359482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 04/12/2024] [Indexed: 05/16/2024] Open
Abstract
Background Prognostic risk stratification in older adults with type 2 diabetes (T2D) is important for guiding decisions concerning advance care planning. Materials and methods A retrospective longitudinal study was conducted in a real-world sample of older diabetic patients afferent to the outpatient facilities of the Diabetology Unit of the IRCCS INRCA Hospital of Ancona (Italy). A total of 1,001 T2D patients aged more than 70 years were consecutively evaluated by a multidimensional geriatric assessment, including physical performance evaluated using the Short Physical Performance Battery (SPPB). The mortality was assessed during a 5-year follow-up. We used the automatic machine-learning (AutoML) JADBio platform to identify parsimonious mathematical models for risk stratification. Results Of 977 subjects included in the T2D cohort, the mean age was 76.5 (SD: 4.5) years and 454 (46.5%) were men. The mean follow-up time was 53.3 (SD:15.8) months, and 209 (21.4%) patients died by the end of the follow-up. The JADBio AutoML final model included age, sex, SPPB, chronic kidney disease, myocardial ischemia, peripheral artery disease, neuropathy, and myocardial infarction. The bootstrap-corrected concordance index (c-index) for the final model was 0.726 (95% CI: 0.687-0.763) with SPPB ranked as the most important predictor. Based on the penalized Cox regression model, the risk of death per unit of time for a subject with an SPPB score lower than five points was 3.35 times that for a subject with a score higher than eight points (P-value <0.001). Conclusion Assessment of physical performance needs to be implemented in clinical practice for risk stratification of T2D older patients.
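The concordance index reported in the abstract above can be illustrated with a minimal sketch of Harrell's C for right-censored data. This is not the JADBio implementation and does not include the bootstrap correction the authors describe; the function name `concordance_index` and the toy follow-up data are assumptions made for illustration only.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times       -- follow-up time per subject
    events      -- 1 if death was observed, 0 if the subject was censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier death had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event flags, predicted risks.
months = [5, 10, 15, 20]
died = [1, 0, 1, 0]
risk = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(months, died, risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.726 should be read.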
Affiliation(s)
- Alberto Montesanto
  - Department of Biology, Ecology and Earth Sciences, University of Calabria, Rende, Italy
- Vincenzo Lagani
  - Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, Thuwal, Saudi Arabia
  - Institute of Chemical Biology, Ilia State University, Tbilisi, Georgia
- Andrea Corsonello
  - Unit of Geriatric Medicine, IRCCS INRCA, Cosenza, Italy
  - Department of Pharmacy, Health and Nutritional Sciences, University of Calabria, Rende, Italy
- Luca Soraci
  - Unit of Geriatric Medicine, IRCCS INRCA, Cosenza, Italy
- Jacopo Sabbatinelli
  - Department of Clinical and Molecular Sciences, Università Politecnica delle Marche, Ancona, Italy
  - Laboratory Medicine Unit, Azienda Ospedaliero Universitaria delle Marche, Ancona, Italy
- Antonio Cherubini
  - Geriatria, Accettazione geriatrica e Centro di ricerca per l’invecchiamento, IRCCS INRCA, Ancona, Italy
- Maria Conte
  - Department of Medical and Surgical Science, University of Bologna, Bologna, Italy
- Miriam Capri
  - Department of Medical and Surgical Science, University of Bologna, Bologna, Italy
- Fabiola Olivieri
  - Department of Clinical and Molecular Sciences, Università Politecnica delle Marche, Ancona, Italy
  - Clinic of Laboratory and Precision Medicine, IRCCS INRCA, Ancona, Italy
|
10
|
Panagopoulou M, Karaglani M, Tzitzikou K, Kessari N, Arvanitidis K, Amarantidis K, Drosos GI, Gerou S, Papanas N, Papazoglou D, Baritaki S, Constantinidis TC, Chatzaki E. Mitochondrial Fraction of Circulating Cell-Free DNA as an Indicator of Human Pathology. Int J Mol Sci 2024; 25:4199. [PMID: 38673785 PMCID: PMC11050675 DOI: 10.3390/ijms25084199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 04/01/2024] [Accepted: 04/08/2024] [Indexed: 04/28/2024] Open
Abstract
Circulating cell-free DNA (ccfDNA) of mitochondrial origin (ccf-mtDNA) consists of a minor fraction of total ccfDNA in blood or in other biological fluids. Aberrant levels of ccf-mtDNA have been observed in many pathologies. Here, we introduce a simple and effective standardized Taqman probe-based dual-qPCR assay for the simultaneous detection and relative quantification of nuclear and mitochondrial fragments of ccfDNA. Three pathologies of major burden, one malignancy (Breast Cancer, BrCa), one inflammatory (Osteoarthritis, OA) and one metabolic (Type 2 Diabetes, T2D), were studied. Higher levels of ccf-mtDNA were detected both in BrCa and T2D in relation to health, but not in OA. In BrCa, hormonal receptor status was associated with ccf-mtDNA levels. Machine learning analysis of ccf-mtDNA datasets was used to build biosignatures of clinical relevance. (A) a three-feature biosignature discriminating between health and BrCa (AUC: 0.887) and a five-feature biosignature for predicting the overall survival of BrCa patients (Concordance Index: 0.756). (B) a five-feature biosignature stratifying among T2D, prediabetes and health (AUC: 0.772); a five-feature biosignature discriminating between T2D and health (AUC: 0.797); and a four-feature biosignature identifying prediabetes from health (AUC: 0.795). (C) a biosignature including total plasma ccfDNA with very high performance in discriminating OA from health (AUC: 0.934). Aberrant ccf-mtDNA levels could have diagnostic/prognostic potential in BrCa and Diabetes, while the developed multiparameter biosignatures can add value to their clinical management.
Affiliation(s)
- Maria Panagopoulou
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Institute of Agri-Food and Life Sciences, University Research and Innovation Centre, Hellenic Mediterranean University, 71003 Heraklion, Greece
- Makrina Karaglani
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Institute of Agri-Food and Life Sciences, University Research and Innovation Centre, Hellenic Mediterranean University, 71003 Heraklion, Greece
- Konstantina Tzitzikou
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Nikoleta Kessari
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Konstantinos Arvanitidis
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Institute of Agri-Food and Life Sciences, University Research and Innovation Centre, Hellenic Mediterranean University, 71003 Heraklion, Greece
- Kyriakos Amarantidis
  - Clinic of Medical Oncology, Department of Medicine, Democritus University of Thrace, University General Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece
- George I. Drosos
  - Clinic of Orthopaedic Surgery, Department of Medicine, Democritus University of Thrace, University General Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece
- Spyros Gerou
  - Analysis Biopathological Diagnostic Research Laboratories, 54623 Thessaloniki, Greece
- Nikolaos Papanas
  - Diabetes Centre, 2nd Department of Internal Medicine, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece
- Dimitrios Papazoglou
  - Diabetes Centre, 2nd Department of Internal Medicine, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece
- Stavroula Baritaki
  - Laboratory of Experimental Oncology, Division of Surgery, School of Medicine, University of Crete, 71500 Heraklion, Greece
- Theodoros C. Constantinidis
  - Laboratory of Hygiene and Environmental Protection, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Ekaterini Chatzaki
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Institute of Agri-Food and Life Sciences, University Research and Innovation Centre, Hellenic Mediterranean University, 71003 Heraklion, Greece
|
11
|
Biza K, Tsamardinos I, Triantafillou S. Out-of-Sample Tuning for Causal Discovery. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:4963-4973. [PMID: 35830399 DOI: 10.1109/tnnls.2022.3185842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Causal discovery is continually being enriched with new algorithms for learning causal graphical probabilistic models. Each one of them requires a set of hyperparameters, creating a great number of combinations. Given that the true graph is unknown and the learning task is unsupervised, the challenge to a practitioner is how to tune these choices. We propose out-of-sample causal tuning (OCT) that aims to select an optimal combination. The method treats a causal model as a set of predictive models and uses out-of-sample protocols for supervised methods. This approach can handle general settings like latent confounders and nonlinear relationships. The method uses an information-theoretic approach to be able to generalize to mixed data types and a penalty for dense graphs to penalize for complexity. To evaluate OCT, we introduce a causal-based simulation method to create datasets that mimic the properties of real-world problems. We evaluate OCT against two other tuning approaches, based on stability and in-sample fitting. We show that OCT performs well in many experimental settings and it is an effective tuning method for causal discovery.
|
12
|
Ergün H, Ergün ME. Modeling Xanthan Gum Foam's Material Properties Using Machine Learning Methods. Polymers (Basel) 2024; 16:740. [PMID: 38543346 PMCID: PMC10974626 DOI: 10.3390/polym16060740] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/05/2024] [Revised: 02/28/2024] [Accepted: 03/06/2024] [Indexed: 11/12/2024] Open
Abstract
Xanthan gum is commonly used in the pharmaceutical, cosmetic, and food industries. However, there have been no studies on utilizing this natural biopolymer as a foam material in the insulation and packaging sectors, which are large markets, or modeling it using an artificial neural network. In this study, foam material production was carried out in an oven using different ratios of cellulose fiber and xanthan gum in a 5% citric acid medium. As a result of the physical and mechanical experiments conducted, it was determined that xanthan gum had a greater impact on the properties of the foam material than cellulose. The densities of the produced foam materials ranged from 49.42 kg/m³ to 172.2 kg/m³. In addition, the compressive and flexural moduli were found to vary between 235.25 kPa and 1257.52 kPa and between 1939.76 kPa and 12,736.39 kPa, respectively. Five machine-learning-based methods (multiple linear regression, support vector machines, artificial neural networks, least squares methods, and generalized regression neural networks) were utilized to analyze the effects of the components used in the foam formulation. These models yielded accurate results without time, material, or cost losses, making the process more efficient. The models predicted the best results for density, compression modulus, and flexural modulus achieved in the experimental tests. The generalized regression neural network model yielded impressive results, with R² values above 0.97, enabling the acquisition of more quantitative data with fewer experimental results.
Affiliation(s)
- Halime Ergün
  - Seydisehir Ahmet Cengiz Faculty of Engineering, Necmettin Erbakan University, Konya 42360, Turkey
- Mehmet Emin Ergün
  - Akseki Vocational School, Alanya Alaaddin Keykubat University, Antalya 07630, Turkey
|
13
|
Li L, Yang J, Por LY, Khan MS, Hamdaoui R, Hussain L, Iqbal Z, Rotaru IM, Dobrotă D, Aldrdery M, Omar A. Enhancing lung cancer detection through hybrid features and machine learning hyperparameters optimization techniques. Heliyon 2024; 10:e26192. [PMID: 38404820 PMCID: PMC10884486 DOI: 10.1016/j.heliyon.2024.e26192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 01/30/2024] [Accepted: 02/08/2024] [Indexed: 02/27/2024] Open
Abstract
Machine learning offers significant potential for lung cancer detection, enabling early diagnosis and potentially improving patient outcomes. Feature extraction remains a crucial challenge in this domain. Combining the most relevant features can further enhance detection accuracy. This study employed a hybrid feature extraction approach, which integrates both Gray-level co-occurrence matrix (GLCM) with Haralick and autoencoder features with an autoencoder. These features were subsequently fed into supervised machine learning methods. Support Vector Machine (SVM) Radial Base Function (RBF) and SVM Gaussian achieved perfect performance measures, while SVM polynomial produced an accuracy of 99.89% when utilizing GLCM with an autoencoder, Haralick, and autoencoder features. SVM Gaussian achieved an accuracy of 99.56%, while SVM RBF achieved an accuracy of 99.35% when utilizing GLCM with Haralick features. These results demonstrate the potential of the proposed approach for developing improved diagnostic and prognostic lung cancer treatment planning and decision-making systems.
Affiliation(s)
- Liangyu Li
  - Center for Software Technology and Management, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia
  - Health Informatics Laboratory, Cancer Research Institute, Chifeng Cancer Hospital (Second Affiliated Hospital of Chifeng University), Medical Department, Chifeng University, Chifeng City, Inner Mongolia Autonomous Region, 024000, China
- Jing Yang
  - Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Lip Yee Por
  - Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Mohammad Shahbaz Khan
  - Children's National Hospital, 111 Michigan Ave NW, Washington, DC, 20010, United States
- Rim Hamdaoui
  - Department of Computer Science, College of Science and Human Studies Dawadmi, Shaqra University, Shaqra, Riyadh, Saudi Arabia
- Lal Hussain
  - Department of Computer Science and Information Technology, King Abdullah Campus Chatter Kalas, University of Azad Jammu and Kashmir, Muzaffarabad, 13100, Azad Kashmir, Pakistan
  - Department of Computer Science and Information Technology, Neelum Campus, University of Azad Jammu and Kashmir, Athmuqam, 13230, Azad Kashmir, Pakistan
- Zahoor Iqbal
  - School of Computer Science and Technology, Zhejiang Normal University, Jinhua, 321004, China
- Ionela Magdalena Rotaru
  - Department of Industrial Engineering and Management, Lucian Blaga University of Sibiu, Bulevardul Victoriei 10, Sibiu, 550024, Romania
- Dan Dobrotă
  - Faculty of Engineering, Lucian Blaga University of Sibiu, Bulevardul Victoriei 10, Sibiu, 550024, Romania
- Moutaz Aldrdery
  - Department of Chemical Engineering, College of Engineering, King Khalid University, Abha, 61411, Saudi Arabia
- Abdulfattah Omar
  - Department of English, College of Science & Humanities, Prince Sattam Bin Abdulaziz University, Saudi Arabia
|
14
|
Litwińczuk MC, Muhlert N, Trujillo‐Barreto N, Woollams A. Impact of brain parcellation on prediction performance in models of cognition and demographics. Hum Brain Mapp 2024; 45:e26592. [PMID: 38339892 PMCID: PMC10831203 DOI: 10.1002/hbm.26592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 12/18/2023] [Accepted: 12/31/2023] [Indexed: 02/12/2024] Open
Abstract
Brain connectivity analysis begins with the selection of a parcellation scheme that will define brain regions as nodes of a network whose connections will be studied. Brain connectivity has already been used in predictive modelling of cognition, but it remains unclear if the resolution of the parcellation used can systematically impact the predictive model performance. In this work, structural, functional and combined connectivity were each defined with five different parcellation schemes. The resolution and modality of the parcellation schemes were varied. Each connectivity defined with each parcellation was used to predict individual differences in age, education, sex, executive function, self-regulation, language, encoding and sequence processing. It was found that low-resolution functional parcellation consistently performed above chance at producing generalisable models of both demographics and cognition. However, no single parcellation scheme showed a superior predictive performance across all cognitive domains and demographics. In addition, although parcellation schemes impacted the graph theory measures of each connectivity type (structural, functional and combined), these differences did not account for the out-of-sample predictive performance of the models. Taken together, these findings demonstrate that while high-resolution parcellations may be beneficial for modelling specific individual differences, partial voluming of signals produced by the higher resolution of the parcellation likely disrupts model generalisability.
Affiliation(s)
- Nils Muhlert
  - School of Health Sciences, University of Manchester, Manchester, UK
- Anna Woollams
  - School of Health Sciences, University of Manchester, Manchester, UK
|
15
|
Wilimitis D, Walsh CG. Practical Considerations and Applied Examples of Cross-Validation for Model Development and Evaluation in Health Care: Tutorial. JMIR AI 2023; 2:e49023. [PMID: 38875530 PMCID: PMC11041453 DOI: 10.2196/49023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/15/2023] [Revised: 09/19/2023] [Accepted: 09/28/2023] [Indexed: 06/16/2024]
Abstract
Cross-validation remains a popular means of developing and validating artificial intelligence for health care. Numerous subtypes of cross-validation exist. Although tutorials on this validation strategy have been published and some with applied examples, we present here a practical tutorial comparing multiple forms of cross-validation using a widely accessible, real-world electronic health care data set: Medical Information Mart for Intensive Care-III (MIMIC-III). This tutorial explored methods such as K-fold cross-validation and nested cross-validation, highlighting their advantages and disadvantages across 2 common predictive modeling use cases: classification (mortality) and regression (length of stay). We aimed to provide readers with reproducible notebooks and best practices for modeling with electronic health care data. We also described sets of useful recommendations as we demonstrated that nested cross-validation reduces optimistic bias but comes with additional computational challenges. This tutorial might improve the community's understanding of these important methods while catalyzing the modeling community to apply these guides directly in their work using the published code.
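The nested cross-validation described in the abstract above can be sketched in plain Python. This is a minimal illustration, not the tutorial's published code: the 1-D k-nearest-neighbour classifier, the helper names (`k_folds`, `knn_predict`, `accuracy`, `nested_cv`), the hyperparameter grid, and the synthetic data are all assumptions chosen to keep the example self-contained. The point it shows is that the hyperparameter is selected in inner folds only, so each outer fold is scored by a model that never saw it during tuning.

```python
import random
from statistics import mean


def k_folds(indices, k):
    """Split a list of indices into k roughly equal folds."""
    return [indices[i::k] for i in range(k)]


def knn_predict(train, x, k):
    """Majority vote among the k nearest 1-D training points (value, label)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k):
    """Fraction of test points whose label the k-NN vote reproduces."""
    return mean(1.0 if knn_predict(train, x, k) == y else 0.0 for x, y in test)


def nested_cv(data, grid, outer_k=5, inner_k=3, seed=0):
    """Return (mean outer accuracy, per-fold outer accuracies)."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    outer_scores = []
    for held_out in k_folds(idx, outer_k):
        held = set(held_out)
        dev_idx = [i for i in idx if i not in held]
        inner = k_folds(dev_idx, inner_k)

        def inner_score(k):
            # Inner loop: tune on development data only.
            scores = []
            for val in inner:
                val_set = set(val)
                tr = [data[i] for i in dev_idx if i not in val_set]
                va = [data[i] for i in val]
                scores.append(accuracy(tr, va, k))
            return mean(scores)

        best_k = max(grid, key=inner_score)
        # Outer loop: refit on all development data, score once on the held-out fold.
        dev = [data[i] for i in dev_idx]
        test = [data[i] for i in held_out]
        outer_scores.append(accuracy(dev, test, best_k))
    return mean(outer_scores), outer_scores


# Hypothetical synthetic data: two overlapping 1-D Gaussian classes.
demo_rng = random.Random(42)
demo = [(demo_rng.gauss(0.0, 1.0), 0) for _ in range(30)]
demo += [(demo_rng.gauss(2.0, 1.0), 1) for _ in range(30)]
score, per_fold = nested_cv(demo, grid=[1, 3, 5])
print(score, per_fold)
```

The extra computational cost mentioned in the abstract is visible in the structure: every outer fold repeats the full inner search, so the number of model fits grows with the product of outer folds, inner folds, and grid size.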
Affiliation(s)
- Drew Wilimitis
  - Vanderbilt University Medical Center, Vanderbilt University, Nashville, TN, United States
- Colin G Walsh
  - Vanderbilt University Medical Center, Vanderbilt University, Nashville, TN, United States
|
16
|
Thomaidis GV, Papadimitriou K, Michos S, Chartampilas E, Tsamardinos I. A characteristic cerebellar biosignature for bipolar disorder, identified with fully automatic machine learning. IBRO Neurosci Rep 2023; 15:77-89. [PMID: 38025660 PMCID: PMC10668096 DOI: 10.1016/j.ibneur.2023.06.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 05/19/2023] [Accepted: 06/29/2023] [Indexed: 12/01/2023] Open
Abstract
Background Transcriptomic profile differences between patients with bipolar disorder and healthy controls can be identified using machine learning and can provide information about the potential role of the cerebellum in the pathogenesis of bipolar disorder. With this aim, user-friendly, fully automated machine learning algorithms can achieve extremely high classification scores and disease-related predictive biosignature identification, in short time frames and scaled down to small datasets. Method A fully automated machine learning platform, based on the most suitable algorithm selection and relevant set of hyper-parameter values, was applied on a preprocessed transcriptomics dataset, in order to produce a model for biosignature selection and to classify subjects into groups of patients and controls. The parent GEO datasets were originally produced from the cerebellar and parietal lobe tissue of deceased bipolar patients and healthy controls, using Affymetrix Human Gene 1.0 ST Array. Results Patients and controls were classified into two separate groups, with no close-to-the-boundary cases, and this classification was based on the cerebellar transcriptomic biosignature of 25 features (genes), with Area Under Curve 0.929 and Average Precision 0.955. The biosignature includes both genes previously connected to bipolar disorder, depression, psychosis or epilepsy, as well as genes not previously linked with any psychiatric disease. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed participation of 4 identified features in 6 pathways which have also been associated with bipolar disorder. Conclusion Automated machine learning (AutoML) managed to accurately identify 25 genes that can jointly, in a multivariate fashion, separate bipolar patients from healthy controls with high predictive power. The discovered features lead to new biological insights.
Machine Learning (ML) analysis considers the features in combination (in contrast to standard differential expression analysis), removing both irrelevant and redundant markers, and thus focusing on biological interpretation.
Affiliation(s)
- Georgios V. Thomaidis
  - Greek National Health System, Psychiatric Department, Katerini General Hospital, Katerini, Greece
- Konstantinos Papadimitriou
  - Greek National Health System, G. Papanikolaou General Hospital, Organizational Unit - Psychiatric Hospital of Thessaloniki, Thessaloniki, Greece
- Evangelos Chartampilas
  - Laboratory of Radiology, AHEPA General Hospital, University of Thessaloniki, Thessaloniki, Greece
|
17
|
Erion Barner LA, Gao G, Reddi DM, Lan L, Burke W, Mahmood F, Grady WM, Liu JTC. Artificial Intelligence-Triaged 3-Dimensional Pathology to Improve Detection of Esophageal Neoplasia While Reducing Pathologist Workloads. Mod Pathol 2023; 36:100322. [PMID: 37657711 DOI: 10.1016/j.modpat.2023.100322] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/25/2023] [Revised: 07/25/2023] [Accepted: 08/25/2023] [Indexed: 09/03/2023]
Abstract
Early detection of esophageal neoplasia via evaluation of endoscopic surveillance biopsies is the key to maximizing survival for patients with Barrett's esophagus, but it is hampered by the sampling limitations of conventional slide-based histopathology. Comprehensive evaluation of whole biopsies with 3-dimensional (3D) pathology may improve early detection of malignancies, but large 3D pathology data sets are tedious for pathologists to analyze. Here, we present a deep learning-based method to automatically identify the most critical 2-dimensional (2D) image sections within 3D pathology data sets for pathologists to review. Our method first generates a 3D heatmap of neoplastic risk for each biopsy, then classifies all 2D image sections within the 3D data set in order of neoplastic risk. In a clinical validation study, we diagnose esophageal biopsies with artificial intelligence-triaged 3D pathology (3 images per biopsy) vs standard slide-based histopathology (16 images per biopsy) and show that our method improves detection sensitivity while reducing pathologist workloads.
Affiliation(s)
- Gan Gao
  - Department of Mechanical Engineering, University of Washington, Seattle, Washington
- Deepti M Reddi
  - Department of Laboratory Medicine & Pathology, University of Washington School of Medicine, Seattle, Washington
- Lydia Lan
  - Department of Mechanical Engineering, University of Washington, Seattle, Washington; Department of Biology, University of Washington, Seattle, Washington
- Wynn Burke
  - Department of Laboratory Medicine & Pathology, University of Washington School of Medicine, Seattle, Washington; Department of Medicine (Gastroenterology Division), University of Washington School of Medicine, Seattle, Washington
- Faisal Mahmood
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, Massachusetts; Harvard Data Science Initiative, Harvard University, Cambridge, Massachusetts
- William M Grady
  - Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Jonathan T C Liu
  - Department of Mechanical Engineering, University of Washington, Seattle, Washington; Department of Laboratory Medicine & Pathology, University of Washington School of Medicine, Seattle, Washington; Department of Bioengineering, University of Washington, Seattle, Washington
|
18
|
Shahini E, Chaulagain N, Shankar K, Tang T. Predicting Free Energies of Exfoliation and Solvation for Graphitic Carbon Nitrides Using Machine Learning. ACS Applied Materials & Interfaces 2023; 15:53786-53801. [PMID: 37938813 DOI: 10.1021/acsami.3c09347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023]
Abstract
As a metal-free and visible-light-responsive photocatalyst, graphitic carbon nitride (g-C3N4) has emerged as a new research hotspot and has attracted broad attention in the field of solar energy conversion and thin-film transistors. Liquid-phase exfoliation (LPE) is the best-known method for the synthesis of 2D g-C3N4 nanosheets. In LPE, bulk g-C3N4 is exfoliated in a solvent via high-shear mixing or sonication in order to produce a stable suspension of individual nanosheets. Two parameters of importance in gauging the performance of a solvent in LPE are the free energy required to exfoliate a unit area of layered materials into individual sheets in the solvent (ΔGexf) and the solvation free energy per unit area of a nanosheet (ΔGsol). While approximations for the free energies exist, they are shown in our previous work to be inaccurate and incapable of capturing the experimentally observed efficacy of LPE. Molecular dynamics (MD) simulations can provide accurate free-energy calculations, but doing so for every single solvent is time- and resource-consuming. Herein, machine learning (ML) algorithms are used to predict ΔGexf and ΔGsol for g-C3N4. First, a database for ΔGexf and ΔGsol is created based on a series of MD simulations involving 49 different solvents with distinct chemical structures and properties. The data set also includes values of critical descriptors for the solvents, including density, surface tension, dielectric constant, etc. Different ML methods are compared, accompanied by descriptor selection, to develop the most accurate model for predicting ΔGexf and ΔGsol. The extra tree regressor is shown to be the best performer among the six ML methods studied. Experimental validation of the model is conducted by performing dispersibility tests in several solvents for which the free energies are predicted. Finally, the influence of the selected descriptors on the free energies is analyzed, and strategies for solvent selection in LPE are proposed.
Affiliation(s)
- Ehsan Shahini
  - Department of Mechanical Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
- Narendra Chaulagain
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
- Karthik Shankar
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
- Tian Tang
  - Department of Mechanical Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada
|
19
|
Su J, Zhang F, Yu C, Zhang Y, Wang J, Wang C, Wang H, Jiang H. Machine learning: Next promising trend for microplastics study. Journal of Environmental Management 2023; 344:118756. [PMID: 37573697 DOI: 10.1016/j.jenvman.2023.118756] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 06/16/2023] [Revised: 07/24/2023] [Accepted: 08/09/2023] [Indexed: 08/15/2023]
Abstract
Microplastics (MPs), as an emerging pollutant, pose a significant threat to humans and ecosystems. However, traditional MPs characterization methods are limited by sample requirements and characterization time. Machine Learning (ML) has emerged as a vital technology for analyzing MPs pollution due to its accuracy, broad application, and powerful feature extraction. Nevertheless, environmental scientists require threshold knowledge before using ML, restricting the ML application in MPs research. Furthermore, imbalanced development of ML in MPs research is a pressing concern. In order to achieve a wide ML application in MPs research, in this review, we comprehensively discussed the size and sources of MPs datasets in relevant literature to help environmental scientists deepen their understanding of the construction of MPs datasets. Commonly used ML algorithms are analyzed from the perspective of interpretability and the need for computer facilities. Additionally, methods for improving and evaluating ML model performance, such as dataset pre-processing, model optimization, and model assessment metrics, are discussed. According to datasets and characterization techniques, MPs identification using ML was divided into three categories in this work: spectral identification, image identification, and spectral imaging identification. Finally, other applications of ML in MPs studies, including toxicity analysis, pollutants adsorption, and microbial colonization, are comprehensively discussed, which reveals the great application potential of ML. Based on the discussion above, this review suggests an algorithm selection strategy to assist researchers in selecting the most suitable ML algorithm in different situations, improving efficiency and decreasing the costs of trial and error. We believe that this work sheds light on the application of ML in MPs study.
Affiliation(s)
- Jiming Su
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, Hunan, PR China
| | - Fupeng Zhang
- Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Tsinghua University, 518055, Shenzhen, PR China
| | - Chuanxiu Yu
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, Hunan, PR China
| | - Yingshuang Zhang
- School of Chemical Engineering and Technology, Xinjiang University, 830017, Urumqi, Xinjiang, PR China
| | - Jianchao Wang
- School of Chemical and Environmental Engineering, China University of Mining and Technology (Beijing), Beijing, 100083, PR China
| | - Chongqing Wang
- School of Chemical Engineering, Zhengzhou University, Zhengzhou, 450001, PR China
| | - Hui Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, Hunan, PR China.
| | - Hongru Jiang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, Hunan, PR China.
|
20
|
Peisen F, Gerken A, Hering A, Dahm I, Nikolaou K, Gatidis S, Eigentler TK, Amaral T, Moltz JH, Othman AE. Can Whole-Body Baseline CT Radiomics Add Information to the Prediction of Best Response, Progression-Free Survival, and Overall Survival of Stage IV Melanoma Patients Receiving First-Line Targeted Therapy: A Retrospective Register Study. Diagnostics (Basel) 2023; 13:3210. [PMID: 37892030 PMCID: PMC10605712 DOI: 10.3390/diagnostics13203210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 10/06/2023] [Accepted: 10/12/2023] [Indexed: 10/29/2023] Open
Abstract
BACKGROUND The aim of this study was to investigate whether the combination of radiomics and clinical parameters in a machine-learning model offers additive information compared with the use of only clinical parameters in predicting the best response, progression-free survival after six months, as well as overall survival after six and twelve months in patients with stage IV malignant melanoma undergoing first-line targeted therapy. METHODS A baseline machine-learning model using clinical variables (demographic parameters and tumor markers) was compared with an extended model using clinical variables and radiomic features of the whole tumor burden, utilizing repeated five-fold cross-validation. Baseline CTs of 91 stage IV malignant melanoma patients, all treated in the same university hospital, were identified in the Central Malignant Melanoma Registry and all metastases were volumetrically segmented (n = 4727). RESULTS Compared with the baseline model, the extended radiomics model did not add significantly more information to the best-response prediction (AUC [95% CI] 0.548 (0.188, 0.808) vs. 0.487 (0.139, 0.743)), the prediction of PFS after six months (AUC [95% CI] 0.699 (0.436, 0.958) vs. 0.604 (0.373, 0.867)), or the overall survival prediction after six and twelve months (AUC [95% CI] 0.685 (0.188, 0.967) vs. 0.766 (0.433, 1.000) and AUC [95% CI] 0.554 (0.163, 0.781) vs. 0.616 (0.271, 1.000), respectively). CONCLUSIONS The results showed no additional value of baseline whole-body CT radiomics for best-response prediction, progression-free survival prediction for six months, or six-month and twelve-month overall survival prediction for stage IV melanoma patients receiving first-line targeted therapy. These results need to be validated in a larger cohort.
Affiliation(s)
- Felix Peisen
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Eberhard Karls University, Hoppe-Seyler-Straße 3, 72076 Tuebingen, Germany; (I.D.); (K.N.); (S.G.); (A.E.O.)
| | - Annika Gerken
- Fraunhofer MEVIS, Max-von-Laue-Straße 2, 28359 Bremen, Germany; (A.G.); (A.H.); (J.H.M.)
| | - Alessa Hering
- Fraunhofer MEVIS, Max-von-Laue-Straße 2, 28359 Bremen, Germany; (A.G.); (A.H.); (J.H.M.)
- Diagnostic Image Analysis Group, Radboud University Medical Center (Radboudumc), Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Isabel Dahm
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Eberhard Karls University, Hoppe-Seyler-Straße 3, 72076 Tuebingen, Germany; (I.D.); (K.N.); (S.G.); (A.E.O.)
| | - Konstantin Nikolaou
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Eberhard Karls University, Hoppe-Seyler-Straße 3, 72076 Tuebingen, Germany; (I.D.); (K.N.); (S.G.); (A.E.O.)
- Image-Guided and Functionally Instructed Tumor Therapies (iFIT), The Cluster of Excellence (EXC 2180), 72076 Tuebingen, Germany
| | - Sergios Gatidis
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Eberhard Karls University, Hoppe-Seyler-Straße 3, 72076 Tuebingen, Germany; (I.D.); (K.N.); (S.G.); (A.E.O.)
- Max Planck Institute for Intelligent Systems, Max-Planck-Ring 4, 72076 Tuebingen, Germany
| | - Thomas K. Eigentler
- Center of Dermato-Oncology, Department of Dermatology, Tuebingen University Hospital, Eberhard Karls University, Liebermeisterstraße 25, 72076 Tuebingen, Germany; (T.K.E.); (T.A.)
- Department of Dermatology, Venereology and Allergology, Charité—Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Luisenstraße 2, 10117 Berlin, Germany
| | - Teresa Amaral
- Center of Dermato-Oncology, Department of Dermatology, Tuebingen University Hospital, Eberhard Karls University, Liebermeisterstraße 25, 72076 Tuebingen, Germany; (T.K.E.); (T.A.)
| | - Jan H. Moltz
- Fraunhofer MEVIS, Max-von-Laue-Straße 2, 28359 Bremen, Germany; (A.G.); (A.H.); (J.H.M.)
| | - Ahmed E. Othman
- Department of Diagnostic and Interventional Radiology, Tuebingen University Hospital, Eberhard Karls University, Hoppe-Seyler-Straße 3, 72076 Tuebingen, Germany; (I.D.); (K.N.); (S.G.); (A.E.O.)
- Institute of Neuroradiology, Johannes Gutenberg University Hospital Mainz, Langenbeckstraße 1, 55131 Mainz, Germany
|
21
|
Papoutsoglou G, Tarazona S, Lopes MB, Klammsteiner T, Ibrahimi E, Eckenberger J, Novielli P, Tonda A, Simeon A, Shigdel R, Béreux S, Vitali G, Tangaro S, Lahti L, Temko A, Claesson MJ, Berland M. Machine learning approaches in microbiome research: challenges and best practices. Front Microbiol 2023; 14:1261889. [PMID: 37808286 PMCID: PMC10556866 DOI: 10.3389/fmicb.2023.1261889] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 07/19/2023] [Accepted: 09/04/2023] [Indexed: 10/10/2023] Open
Abstract
Microbiome data predictive analysis within a machine learning (ML) workflow presents numerous domain-specific challenges involving preprocessing, feature selection, predictive modeling, performance estimation, model interpretation, and the extraction of biological information from the results. To assist decision-making, we offer a set of recommendations on algorithm selection, pipeline creation, and evaluation, stemming from the COST Action ML4Microbiome. We compared the suggested approaches on a multi-cohort shotgun metagenomics dataset of colorectal cancer patients, focusing on their performance in disease diagnosis and biomarker discovery. We demonstrate that the use of compositional transformations and filtering methods as part of data preprocessing does not always improve the predictive performance of a model. In contrast, multivariate feature selection, such as the Statistically Equivalent Signatures algorithm, was effective in reducing the classification error. When validated on a separate test dataset, this algorithm, in combination with random forest modeling, provided the most accurate performance estimates. Lastly, we showed how linear modeling by logistic regression coupled with visualization techniques such as Individual Conditional Expectation (ICE) plots can yield interpretable results and offer biological insights. These findings are significant for clinicians and non-experts alike in translational applications.
Affiliation(s)
- Georgios Papoutsoglou
- Department of Computer Science, University of Crete, Heraklion, Greece
- JADBio Gnosis DA S.A., Science and Technology Park of Crete, Heraklion, Greece
| | - Sonia Tarazona
- Department of Applied Statistics and Operations Research and Quality, Polytechnic University of Valencia, Valencia, Spain
| | - Marta B. Lopes
- Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, Portugal
- Research and Development Unit for Mechanical and Industrial Engineering (UNIDEMI), Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, Portugal
| | - Thomas Klammsteiner
- Department of Ecology, Universität Innsbruck, Innsbruck, Austria
- Department of Microbiology, Universität Innsbruck, Innsbruck, Austria
| | - Eliana Ibrahimi
- Department of Biology, University of Tirana, Tirana, Albania
| | - Julia Eckenberger
- School of Microbiology, University College Cork, Cork, Ireland
- APC Microbiome Ireland, Cork, Ireland
| | - Pierfrancesco Novielli
- Department of Soil, Plant, and Food Sciences, University of Bari Aldo Moro, Bari, Italy
- National Institute for Nuclear Physics, Bari Division, Bari, Italy
| | - Alberto Tonda
- UMR 518 MIA-PS, INRAE, Paris-Saclay University, Palaiseau, France
- Complex Systems Institute of Paris Ile-de-France (ISC-PIF) - UAR 3611 CNRS, Paris, France
| | - Andrea Simeon
- BioSense Institute, University of Novi Sad, Novi Sad, Serbia
| | - Rajesh Shigdel
- Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Stéphane Béreux
- MetaGenoPolis, INRAE, Paris-Saclay University, Jouy-en-Josas, France
- MaIAGE, INRAE, Paris-Saclay University, Jouy-en-Josas, France
| | - Giacomo Vitali
- MetaGenoPolis, INRAE, Paris-Saclay University, Jouy-en-Josas, France
| | - Sabina Tangaro
- Department of Soil, Plant, and Food Sciences, University of Bari Aldo Moro, Bari, Italy
- National Institute for Nuclear Physics, Bari Division, Bari, Italy
| | - Leo Lahti
- Department of Computing, University of Turku, Turku, Finland
| | - Andriy Temko
- Department of Electrical and Electronic Engineering, University College Cork, Cork, Ireland
| | - Marcus J. Claesson
- School of Microbiology, University College Cork, Cork, Ireland
- APC Microbiome Ireland, Cork, Ireland
| | - Magali Berland
- MetaGenoPolis, INRAE, Paris-Saclay University, Jouy-en-Josas, France
|
22
|
Lakiotaki K, Papadovasilakis Z, Lagani V, Fafalios S, Charonyktakis P, Tsagris M, Tsamardinos I. Automated machine learning for genome wide association studies. Bioinformatics 2023; 39:btad545. [PMID: 37672022 PMCID: PMC10562960 DOI: 10.1093/bioinformatics/btad545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 06/29/2023] [Accepted: 09/05/2023] [Indexed: 09/07/2023] Open
Abstract
MOTIVATION Genome-wide association studies (GWAS) present several computational and statistical challenges for their data analysis, including knowledge discovery, interpretability, and translation to clinical practice. RESULTS We develop, apply, and comparatively evaluate an automated machine learning (AutoML) approach, customized for genomic data that delivers reliable predictive and diagnostic models, the set of genetic variants that are important for predictions (called a biosignature), and an estimate of the out-of-sample predictive power. This AutoML approach discovers variants with higher predictive performance compared to standard GWAS methods, computes an individual risk prediction score, generalizes to new, unseen data, is shown to better differentiate causal variants from other highly correlated variants, and enhances knowledge discovery and interpretability by reporting multiple equivalent biosignatures. AVAILABILITY AND IMPLEMENTATION Code for this study is available at: https://github.com/mensxmachina/autoML-GWAS. JADBio offers a free version at: https://jadbio.com/sign-up/. SNP data can be downloaded from the EGA repository (https://ega-archive.org/). PRS data are found at: https://www.aicrowd.com/challenges/opensnp-height-prediction. Simulation data to study population structure can be found at: https://easygwas.ethz.ch/data/public/dataset/view/1/.
Affiliation(s)
| | - Zaharias Papadovasilakis
- Department of Computer Science, University of Crete, Heraklion, Greece
- JADBio Gnosis DA S.A., Science and Technology Park of Crete, GR-70013 Heraklion, Greece
- Laboratory of Immune Regulation and Tolerance, School of Medicine, University of Crete, Heraklion, Greece
| | - Vincenzo Lagani
- Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology KAUST, Thuwal 23952, Saudi Arabia
- SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence, Thuwal 23952, Saudi Arabia
- Institute of Chemical Biology, Ilia State University, Tbilisi, Georgia
| | - Stefanos Fafalios
- Department of Computer Science, University of Crete, Heraklion, Greece
- JADBio Gnosis DA S.A., Science and Technology Park of Crete, GR-70013 Heraklion, Greece
| | - Paulos Charonyktakis
- JADBio Gnosis DA S.A., Science and Technology Park of Crete, GR-70013 Heraklion, Greece
| | - Michail Tsagris
- Department of Computer Science, University of Crete, Heraklion, Greece
- Department of Economics, University of Crete, Heraklion, Greece
| | - Ioannis Tsamardinos
- Department of Computer Science, University of Crete, Heraklion, Greece
- JADBio Gnosis DA S.A., Science and Technology Park of Crete, GR-70013 Heraklion, Greece
|
23
|
Bhattacharyay S, Caruso PF, Åkerlund C, Wilson L, Stevens RD, Menon DK, Steyerberg EW, Nelson DW, Ercole A. Mining the contribution of intensive care clinical course to outcome after traumatic brain injury. NPJ Digit Med 2023; 6:154. [PMID: 37604980 PMCID: PMC10442346 DOI: 10.1038/s41746-023-00895-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Accepted: 08/01/2023] [Indexed: 08/23/2023] Open
Abstract
Existing methods to characterise the evolving condition of traumatic brain injury (TBI) patients in the intensive care unit (ICU) do not capture the context necessary for individualising treatment. Here, we integrate all heterogeneous data stored in medical records (1166 pre-ICU and ICU variables) to model the individualised contribution of clinical course to 6-month functional outcome on the Glasgow Outcome Scale-Extended (GOSE). On a prospective cohort (n = 1550, 65 centres) of TBI patients, we train recurrent neural network models to map a token-embedded time series representation of all variables (including missing values) to an ordinal GOSE prognosis every 2 h. The full range of variables explains up to 52% (95% CI: 50-54%) of the ordinal variance in functional outcome. Up to 91% (95% CI: 90-91%) of this explanation is derived from pre-ICU and admission information (i.e., static variables). Information collected in the ICU (i.e., dynamic variables) increases explanation (by up to 5% [95% CI: 4-6%]), though not enough to counter poorer overall performance in longer-stay (>5.75 days) patients. Highest-contributing variables include physician-based prognoses, CT features, and markers of neurological function. Whilst static information currently accounts for the majority of functional outcome explanation after TBI, data-driven analysis highlights investigative avenues to improve the dynamic characterisation of longer-stay patients. Moreover, our modelling strategy proves useful for converting large patient records into interpretable time series with missing data integration and minimal processing.
Affiliation(s)
- Shubhayu Bhattacharyay
- Division of Anaesthesia, University of Cambridge, Cambridge, UK.
- Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK.
- Laboratory of Computational Intensive Care Medicine, Johns Hopkins University, Baltimore, MD, USA.
| | - Pier Francesco Caruso
- Division of Anaesthesia, University of Cambridge, Cambridge, UK
- Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4, Pieve Emanuele, Milan, 20072, Italy
| | - Cecilia Åkerlund
- Department of Physiology and Pharmacology, Section for Perioperative Medicine and Intensive Care, Karolinska Institutet, Stockholm, Sweden
| | - Lindsay Wilson
- Division of Psychology, University of Stirling, Stirling, UK
| | - Robert D Stevens
- Laboratory of Computational Intensive Care Medicine, Johns Hopkins University, Baltimore, MD, USA
- Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - David K Menon
- Division of Anaesthesia, University of Cambridge, Cambridge, UK
| | - Ewout W Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - David W Nelson
- Department of Physiology and Pharmacology, Section for Perioperative Medicine and Intensive Care, Karolinska Institutet, Stockholm, Sweden
| | - Ari Ercole
- Division of Anaesthesia, University of Cambridge, Cambridge, UK
- Cambridge Centre for Artificial Intelligence in Medicine, Cambridge, UK
|
24
|
Kuipers M, Kappen M, Naber M. How nervous am I? How computer vision succeeds and humans fail in interpreting state anxiety from dynamic facial behaviour. Cogn Emot 2023; 37:1105-1115. [PMID: 37395739 DOI: 10.1080/02699931.2023.2229545] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/23/2022] [Revised: 04/17/2023] [Accepted: 06/16/2023] [Indexed: 07/04/2023]
Abstract
For human interaction, it is important to understand what emotional state others are in. Observing faces in particular helps us put behaviours into context and gives insight into the emotions and mental states of others. Detecting whether someone is nervous, a form of state anxiety, is one such example, as it reveals a person's familiarity and contentment with the circumstances. Using recent developments in computer vision, we developed behavioural nervousness models to show which time-varying facial cues reveal whether someone is nervous in an interview setting. The facial changes, reflecting a state of anxiety, led to more visual exposure and less chemosensory (taste and olfaction) exposure. However, experienced observers had difficulty picking up these changes and failed to detect nervousness levels accurately from them. This study highlights humans' limited capacity in determining complex emotional states but at the same time provides an automated model that can assist us in achieving fair assessments of so far unexplored emotional states.
Affiliation(s)
- Mithras Kuipers
- Experimental Psychology, Helmholtz Institute, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
| | - Mitchel Kappen
- Experimental Psychology, Helmholtz Institute, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
- Department of Head and Skin, Ghent University, University Hospital Ghent (UZ Ghent), Ghent, Belgium
| | - Marnix Naber
- Experimental Psychology, Helmholtz Institute, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
|
25
|
Boutin L, Morisson L, Riché F, Barthélémy R, Mebazaa A, Soyer P, Gallix B, Dohan A, Chousterman BG. Radiomic analysis of abdominal organs during sepsis of digestive origin in a French intensive care unit. Acute Crit Care 2023; 38:343-352. [PMID: 37652864 PMCID: PMC10497895 DOI: 10.4266/acc.2023.00136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 06/12/2023] [Accepted: 06/15/2023] [Indexed: 09/02/2023] Open
Abstract
BACKGROUND Sepsis is a severe and common cause of admission to the intensive care unit (ICU). Radiomic analysis (RA) may predict organ failure and patient outcomes. The objective of this study was to assess a model of RA and to evaluate its performance in predicting in-ICU mortality and acute kidney injury (AKI) during abdominal sepsis. METHODS This single-center, retrospective study included patients admitted to the ICU for abdominal sepsis. To predict in-ICU mortality or AKI, elastic net regularized logistic regression and the random forest algorithm were used in a five-fold cross-validation set repeated 10 times. RESULTS Fifty-five patients were included. In-ICU mortality was 25.5%, and 76.4% of patients developed AKI. To predict in-ICU mortality, elastic net and random forest models, respectively, achieved areas under the curve (AUCs) of 0.48 (95% confidence interval [CI], 0.43-0.54) and 0.51 (95% CI, 0.46-0.57) and did not improve when combined with the Simplified Acute Physiology Score (SAPS) II. To predict AKI with RA, the AUC was 0.71 (95% CI, 0.66-0.77) for elastic net and 0.69 (95% CI, 0.64-0.74) for random forest; these improved when combined with SAPS II, to AUCs of 0.94 (95% CI, 0.91-0.96) and 0.75 (95% CI, 0.70-0.80) for elastic net and random forest, respectively. CONCLUSIONS This study suggests that RA has poor predictive performance for in-ICU mortality but good predictive performance for AKI in patients with abdominal sepsis. A secondary validation cohort is needed to confirm these results and the assessed model.
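The resampling scheme named in the METHODS above (five-fold cross-validation repeated 10 times) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code; the function name `repeated_kfold`, the seed, and the unstratified round-robin fold assignment are assumptions made for the example, with the sample size of 55 taken from the abstract.

```python
import random

def repeated_kfold(n_samples, n_folds=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: the study's own splitting code is not given here.
    """
    rng = random.Random(seed)
    for rep in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        # Deal shuffled indices round-robin so fold sizes differ by at most one.
        folds = [idx[f::n_folds] for f in range(n_folds)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, sorted(train_idx), sorted(test_idx)

# 55 patients, as reported in the abstract: 10 repeats x 5 folds.
splits = list(repeated_kfold(n_samples=55))
```

Each of the resulting train/test pairs would be used to fit and score one model, and the scores averaged across all pairs; with a cohort this small, a stratified variant that balances outcome classes across folds may be preferable.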
Affiliation(s)
- Louis Boutin
- Department of Anesthesiology and Critical Care, Hôpital Lariboisière, AP-HP, Paris, France
- INSERM UMR-S 942, MASCOT, Université Paris Cité, Paris, France
| | - Louis Morisson
- Department of Anesthesiology and Critical Care, Hôpital Lariboisière, AP-HP, Paris, France
| | - Florence Riché
- Department of Anesthesiology and Critical Care, Hôpital Lariboisière, AP-HP, Paris, France
| | - Romain Barthélémy
- Department of Anesthesiology and Critical Care, Hôpital Lariboisière, AP-HP, Paris, France
| | - Alexandre Mebazaa
- Department of Anesthesiology and Critical Care, Hôpital Lariboisière, AP-HP, Paris, France
- INSERM UMR-S 942, MASCOT, Université Paris Cité, Paris, France
| | - Philippe Soyer
- INSERM UMR-S 942, MASCOT, Université Paris Cité, Paris, France
- Department of Radiology, Cochin Hospital, AP-HP, Paris, France
| | - Benoit Gallix
- IHU Strasbourg, Strasbourg, France
- Icube Laboratory and Faculty of Medicine, University of Strasbourg, Strasbourg, France
- Department of Radiology, McGill University, Montreal, QC, Canada
| | - Anthony Dohan
- INSERM UMR-S 942, MASCOT, Université Paris Cité, Paris, France
- Department of Radiology, Cochin Hospital, AP-HP, Paris, France
| | - Benjamin G Chousterman
- Department of Anesthesiology and Critical Care, Hôpital Lariboisière, AP-HP, Paris, France
- INSERM UMR-S 942, MASCOT, Université Paris Cité, Paris, France
|
26
|
Laqua FC, Woznicki P, Bley TA, Schöneck M, Rinneburger M, Weisthoff M, Schmidt M, Persigehl T, Iuga AI, Baeßler B. Transfer-Learning Deep Radiomics and Hand-Crafted Radiomics for Classifying Lymph Nodes from Contrast-Enhanced Computed Tomography in Lung Cancer. Cancers (Basel) 2023; 15:2850. [PMID: 37345187 PMCID: PMC10216416 DOI: 10.3390/cancers15102850] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/28/2023] [Revised: 05/06/2023] [Accepted: 05/19/2023] [Indexed: 06/23/2023] Open
Abstract
OBJECTIVES Positron emission tomography (PET) is currently considered the non-invasive reference standard for lymph node (N-)staging in lung cancer. However, not all patients can undergo this diagnostic procedure due to high costs, limited availability, and additional radiation exposure. The purpose of this study was to predict the PET result from traditional contrast-enhanced computed tomography (CT) and to test different feature extraction strategies. METHODS In this study, 100 lung cancer patients underwent a contrast-enhanced 18F-fluorodeoxyglucose (FDG) PET/CT scan between August 2012 and December 2019. We trained machine learning models to predict FDG uptake in the subsequent PET scan. Model inputs were composed of (i) traditional "hand-crafted" radiomics features from the segmented lymph nodes, (ii) deep features derived from a pretrained EfficientNet-CNN, and (iii) a hybrid approach combining (i) and (ii). RESULTS In total, 2734 lymph nodes [555 (20.3%) PET-positive] from 100 patients [49% female; mean age 65, SD: 14] with lung cancer (60% adenocarcinoma, 21% squamous cell carcinoma, 8% small-cell lung cancer) were included in this study. The area under the receiver operating characteristic curve (AUC) ranged from 0.79 to 0.87, and the scaled Brier score (SBS) ranged from 16 to 36%. The random forest model (iii) yielded the best results [AUC 0.871 (0.865-0.878), SBS 35.8 (34.2-37.2)] and had significantly higher model performance than both approaches alone (AUC: p < 0.001, z = 8.8 and z = 22.4; SBS: p < 0.001, z = 11.4 and z = 26.6, against (i) and (ii), respectively). CONCLUSION Both traditional radiomics features and transfer-learning deep radiomics features provide relevant and complementary information for non-invasive N-staging in lung cancer.
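The two metrics reported above, AUC and the scaled Brier score (SBS), can be illustrated with a short sketch. It assumes the common definition of SBS as 1 minus the ratio of the model's Brier score to that of a reference model that always predicts the outcome prevalence; whether the study used exactly this definition is not stated in the abstract. The function names and toy data are hypothetical.

```python
def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def scaled_brier(y_true, y_prob):
    """Assumed definition: 1 - Brier / Brier_ref, where the reference
    model always predicts the observed prevalence."""
    n = len(y_true)
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n
    prevalence = sum(y_true) / n
    brier_ref = prevalence * (1 - prevalence)  # Brier score of the constant model
    return 1 - brier / brier_ref

# Toy labels and predicted probabilities, invented for illustration only.
y = [1, 0, 0, 1, 0]
p = [0.8, 0.3, 0.1, 0.6, 0.7]
toy_auc = auc(y, p)
toy_sbs = scaled_brier(y, p)
```

Under this definition an SBS of 0 means the model is no better than predicting the prevalence for everyone, and 1 means perfect probabilistic predictions, which is why SBS complements AUC: AUC measures ranking only, while SBS also penalises miscalibrated probabilities.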
Affiliation(s)
- Fabian Christopher Laqua
- Department of Diagnostic and Interventional Radiology, University Hospital Würzburg, University of Würzburg, 97080 Würzburg, Germany
| | - Piotr Woznicki
- Department of Diagnostic and Interventional Radiology, University Hospital Würzburg, University of Würzburg, 97080 Würzburg, Germany
| | - Thorsten A. Bley
- Department of Diagnostic and Interventional Radiology, University Hospital Würzburg, University of Würzburg, 97080 Würzburg, Germany
| | - Mirjam Schöneck
- Institute of Diagnostic and Interventional Radiology, Medical Faculty and University Hospital Cologne, University of Cologne, 50937 Cologne, Germany
| | - Miriam Rinneburger
- Institute of Diagnostic and Interventional Radiology, Medical Faculty and University Hospital Cologne, University of Cologne, 50937 Cologne, Germany
| | - Mathilda Weisthoff
- Institute of Diagnostic and Interventional Radiology, Medical Faculty and University Hospital Cologne, University of Cologne, 50937 Cologne, Germany
| | - Matthias Schmidt
- Department of Nuclear Medicine, Medical Faculty and University Hospital Cologne, University of Cologne, 50937 Cologne, Germany
| | - Thorsten Persigehl
- Institute of Diagnostic and Interventional Radiology, Medical Faculty and University Hospital Cologne, University of Cologne, 50937 Cologne, Germany
| | - Andra-Iza Iuga
- Institute of Diagnostic and Interventional Radiology, Medical Faculty and University Hospital Cologne, University of Cologne, 50937 Cologne, Germany
| | - Bettina Baeßler
- Department of Diagnostic and Interventional Radiology, University Hospital Würzburg, University of Würzburg, 97080 Würzburg, Germany
|
27
|
Pellegrini M. Accurate prognosis for localized prostate cancer through coherent voting networks with multi-omic and clinical data. Sci Rep 2023; 13:7875. [PMID: 37188913 DOI: 10.1038/s41598-023-35023-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Accepted: 05/11/2023] [Indexed: 05/17/2023] Open
Abstract
Localized prostate cancer is a very heterogeneous disease, from both a clinical and a biological/biochemical point of view, which makes the task of stratifying patients into risk classes remarkably challenging. In particular, it is important to detect the indolent forms of the disease early and to discriminate them from the aggressive ones, which require closer post-surgery surveillance and timely treatment decisions. This work extends a recently developed supervised machine learning (ML) technique, called coherent voting networks (CVN), by incorporating a novel model-selection technique to counter the danger of model overfitting. For the challenging problem of discriminating between indolent and aggressive types of localized prostate cancer, accurate prognostic prediction of post-surgery progression-free survival with a granularity within a year is attained, improving accuracy with respect to the current state of the art. The development of novel ML techniques tailored to the problem of combining multi-omics and clinical prognostic biomarkers is a promising new line of attack for sharpening the capability to diversify and personalize cancer patient treatments. The proposed approach allows a finer post-surgery stratification of patients within the clinical high-risk category, with a potential impact on the surveillance regime and the timing of treatment decisions, complementing existing prognostic methods.
Affiliation(s)
- Marco Pellegrini
- Institute of Informatics and Telematics (IIT), CNR, 56124, Pisa, Italy.
|
28
|
Lewis MJ, Spiliopoulou A, Goldmann K, Pitzalis C, McKeigue P, Barnes MR. nestedcv: an R package for fast implementation of nested cross-validation with embedded feature selection designed for transcriptomics and high-dimensional data. Bioinformatics Advances 2023; 3:vbad048. [PMID: 37113250 PMCID: PMC10125905 DOI: 10.1093/bioadv/vbad048] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 12/05/2022] [Revised: 02/21/2023] [Accepted: 04/12/2023] [Indexed: 04/29/2023]
Abstract
Motivation Although machine learning models are commonly used in medical research, many analyses implement a simple partition into training data and hold-out test data, with cross-validation (CV) for tuning of model hyperparameters. Nested CV with embedded feature selection is especially suited to biomedical data where the sample size is frequently limited, but the number of predictors may be significantly larger (P ≫ n). Results The nestedcv R package implements fully nested k × l-fold CV for lasso and elastic-net regularized linear models via the glmnet package and supports a large array of other machine learning models via the caret framework. Inner CV is used to tune models and outer CV is used to determine model performance without bias. Fast filter functions for feature selection are provided and the package ensures that filters are nested within the outer CV loop to avoid information leakage from performance test sets. Measurement of performance by outer CV is also used to implement Bayesian linear and logistic regression models using the horseshoe prior over parameters to encourage a sparse model and determine unbiased model accuracy. Availability and implementation The R package nestedcv is available from CRAN: https://CRAN.R-project.org/package=nestedcv.
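The division of labour described above, with inner CV tuning the model and outer CV estimating performance on data the tuning never saw, can be sketched in a few lines. This is not the nestedcv package or its API (which is in R); it is a stdlib-only Python illustration in which the 1-D k-nearest-neighbour classifier, the hyperparameter grid, the fold counts, and the synthetic data are all assumptions chosen for brevity. Feature filtering is not implemented here; in a fuller version it would sit alongside the tuning step, inside each outer-training set.

```python
import random

def kfold(indices, k, rng):
    """Split a list of indices into k shuffled, near-equal folds."""
    idx = indices[:]
    rng.shuffle(idx)
    return [idx[f::k] for f in range(k)]

def knn_predict(train, x, k):
    """Majority vote of the k nearest training rows (1-D feature, odd k)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def nested_cv(data, grid=(1, 3, 5), outer_k=5, inner_k=3, seed=0):
    """Return (mean outer accuracy, per-fold outer accuracies)."""
    rng = random.Random(seed)
    outer_folds = kfold(list(range(len(data))), outer_k, rng)
    outer_scores = []
    for f, test_idx in enumerate(outer_folds):
        train_idx = [i for g, fold in enumerate(outer_folds) if g != f for i in fold]
        # Inner loop: choose k using the outer-training rows only.
        inner_folds = kfold(train_idx, inner_k, rng)

        def inner_score(k):
            scores = []
            for h, val_idx in enumerate(inner_folds):
                fit_idx = [i for g, fold in enumerate(inner_folds) if g != h for i in fold]
                scores.append(accuracy([data[i] for i in fit_idx],
                                       [data[i] for i in val_idx], k))
            return sum(scores) / len(scores)

        best_k = max(grid, key=inner_score)
        # Outer loop: refit with the chosen k and score on the untouched fold.
        outer_scores.append(accuracy([data[i] for i in train_idx],
                                     [data[i] for i in test_idx], best_k))
    return sum(outer_scores) / len(outer_scores), outer_scores

# Synthetic two-class data (30 rows per class), invented for illustration.
rng = random.Random(1)
data = [(rng.gauss(4.0 * label, 1.0), label) for label in (0, 1) for _ in range(30)]
mean_score, fold_scores = nested_cv(data)
```

The key design point is that `best_k` is selected separately inside every outer fold, so the outer test rows never influence tuning; selecting the hyperparameter once on all the data and then cross-validating would leak information into the performance estimate.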
Affiliation(s)
- Myles J Lewis
- Centre for Experimental Medicine & Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
- Alan Turing Institute, London NW1 2AJ, UK
| | - Athina Spiliopoulou
- Usher Institute, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh EH16 4UX, UK
| | - Katriona Goldmann
- Centre for Experimental Medicine & Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
- Centre for Translational Bioinformatics, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Costantino Pitzalis
- Centre for Experimental Medicine & Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Paul McKeigue
- Usher Institute, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh EH16 4UX, UK
| | - Michael R Barnes
- Alan Turing Institute, London NW1 2AJ, UK
- Centre for Translational Bioinformatics, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| |
|
29
|
Robellada‐Zárate CM, Luna‐Palacios JE, Caballero CAZ, Acuña‐González JP, Lara‐Pereyra I, González‐Azpeitia DI, Acuña‐González RJ, Moreno‐Verduzco ER, Flores‐Herrera H, Osorio‐Caballero M. First‐trimester plasma extracellular heat shock proteins levels and risk of preeclampsia. J Cell Mol Med 2023; 27:1206-1213. [PMID: 37002651 PMCID: PMC10148059 DOI: 10.1111/jcmm.17674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 12/19/2022] [Accepted: 12/28/2022] [Indexed: 04/03/2023] Open
Abstract
Preeclampsia (PE) occurs annually in 8% of pregnancies. Patients without risk factors represent 10% of these. There are currently no first-trimester biochemical markers that accurately predict PE. An increase in serum 60- and 70-kDa extracellular heat shock proteins (eHsp) has been shown in patients who developed PE at 34 weeks. We sought to determine whether there is a relationship between first-trimester eHsp and the development of PE. This was a prospective cohort study performed at a tertiary-level hospital in Mexico City from 2019 to 2020. eHsp levels were measured during the first-trimester ultrasound in singleton pregnancies with no comorbidities. First-trimester eHsp levels and biochemical parameters of organ dysfunction were compared between patients who developed preeclampsia and those who did not. All statistical analyses, including the correlation (r) models between eHsp and clinical parameters, were performed in R with bootstrapping. p-values <0.05 were considered significant. The final analysis included 41 patients. PE occurred in 11 cases. eHsp-60 and eHsp-70 were significantly higher at 12 weeks in patients who developed PE (p = 0.001), while eHsp-27 was significantly lower (p = 0.004). Significant differences in first-trimester eHsp concentration suggest that these are possible early biomarkers useful for the prediction of PE.
Affiliation(s)
- Claudia Melina Robellada‐Zárate
- Departamento de Ginecología y Obstetricia Instituto Nacional de Perinatología “Isidro Espinosa de los Reyes” Ciudad de México Mexico
| | | | - Carlos Agustín Zapata Caballero
- Departamento de Ginecología y Obstetricia Instituto Nacional de Perinatología “Isidro Espinosa de los Reyes” Ciudad de México Mexico
| | - Juan Pablo Acuña‐González
- Departamento de Matemáticas, Facultad de Ciencias Universidad Nacional Autónoma de México Ciudad de México Mexico
| | - Irlando Lara‐Pereyra
- Departamento de Ginecología, Hospital General de Zona 252 Instituto Mexicano del Seguro Social Atlacomulco Mexico
| | | | - Ricardo Josué Acuña‐González
- Departamento de Inmunobioquimica Instituto Nacional de Perinatología “Isidro Espinosa de los Reyes” Ciudad de México Mexico
| | - Elsa Romelia Moreno‐Verduzco
- Subdirección de Servicios Auxiliares de Diagnóstico Instituto Nacional de Perinatología “Isidro Espinosa de los Reyes” Ciudad de México Mexico
| | - Héctor Flores‐Herrera
- Departamento de Inmunobioquimica Instituto Nacional de Perinatología “Isidro Espinosa de los Reyes” Ciudad de México Mexico
| | - Mauricio Osorio‐Caballero
- Departamento de Salud Sexual y Reproductiva Instituto Nacional de Perinatología “Isidro Espinosa de los Reyes” Ciudad de México Mexico
| |
|
30
|
Litwińczuk MC, Muhlert N, Trujillo-Barreto N, Woollams A. Using graph theory as a common language to combine neural structure and function in models of healthy cognitive performance. Hum Brain Mapp 2023; 44:3007-3022. [PMID: 36880608 PMCID: PMC10171528 DOI: 10.1002/hbm.26258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/05/2022] [Accepted: 02/18/2023] [Indexed: 03/08/2023] Open
Abstract
Graph theory has been used in cognitive neuroscience to understand how organisational properties of structural and functional brain networks relate to cognitive function. Graph theory may bridge the gap in integration of structural and functional connectivity by introducing common measures of network characteristics. However, the explanatory and predictive value of combined structural and functional graph theory has not been investigated in modelling the cognitive performance of healthy adults. In this work, a Principal Component Regression approach with embedded Step-Wise Regression was used to fit multiple regression models of Executive Function, Self-regulation, Language, Encoding and Sequence Processing with a collection of 20 different graph theoretic measures of structural and functional network organisation used as regressors. The predictive ability of graph theory-based models was compared to that of connectivity-based models. The present work shows that using combinations of graph theory metrics to predict cognition in healthy populations does not produce a consistent benefit relative to making predictions based on structural and functional connectivity values directly.
Affiliation(s)
- Marta Czime Litwińczuk
- Division of Neuroscience and Experimental Psychology, University of Manchester, Manchester, UK
| | - Nils Muhlert
- Division of Neuroscience and Experimental Psychology, University of Manchester, Manchester, UK
| | - Nelson Trujillo-Barreto
- Division of Neuroscience and Experimental Psychology, University of Manchester, Manchester, UK
| | - Anna Woollams
- Division of Neuroscience and Experimental Psychology, University of Manchester, Manchester, UK
| |
|
31
|
Tsamardinos I. Don't lose samples to estimation. Patterns (New York, N.Y.) 2022; 3:100612. [PMID: 36569551 PMCID: PMC9782254 DOI: 10.1016/j.patter.2022.100612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
In a typical predictive modeling task, we are asked to produce a final predictive model to employ operationally for predictions, as well as an estimate of its out-of-sample predictive performance. Typically, analysts hold out a portion of the available data, called a Test set, to estimate the model predictive performance on unseen (out-of-sample) records, thus "losing these samples to estimation." However, this practice is unacceptable when the total sample size is low. To avoid losing data to estimation, we need a shift in our perspective: we do not estimate the performance of a specific model instance; we estimate the performance of the pipeline that produces the model. This pipeline is applied on all available samples to produce the final model; no samples are lost to estimation. An estimate of its performance is provided by training the same pipeline on subsets of the samples. When multiple pipelines are tried, additional considerations that correct for the "winner's curse" need to be in place.
Affiliation(s)
- Ioannis Tsamardinos
- Computer Science Department, University of Crete, Heraklion, Greece
- JADBio – Gnosis DA S.A, Heraklion, Greece
- Institute of Applied and Computational Mathematics, Foundation for Research and Technology, Hellas, Heraklion, Greece
- Corresponding author
| |
|
32
|
Penaluna BE, Burnett JD, Christiansen K, Arismendi I, Johnson SL, Griswold K, Holycross B, Kolstoe SH. UPRLIMET: UPstream Regional LiDAR Model for Extent of Trout in stream networks. Sci Rep 2022; 12:20266. [PMID: 36456610 PMCID: PMC9715699 DOI: 10.1038/s41598-022-23754-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 11/04/2022] [Indexed: 12/05/2022] Open
Abstract
Predicting the edges of species distributions is fundamental for species conservation, ecosystem services, and management decisions. In North America, the location of the upstream limit of fish in forested streams receives special attention, because fish-bearing portions of streams have more protections during forest management activities than fishless portions. We present a novel model development and evaluation framework, wherein we compare 26 models to predict upper distribution limits of trout in streams. The models used machine learning, logistic regression, and a sophisticated nested spatial cross-validation routine to evaluate predictive performance while accounting for spatial autocorrelation. The model resulting in the best predictive performance, termed UPstream Regional LiDAR Model for Extent of Trout (UPRLIMET), is a two-stage model that uses a logistic regression algorithm calibrated to observations of Coastal Cutthroat Trout (Oncorhynchus clarkii clarkii) occurrence and variables representing hydro-topographic characteristics of the landscape. We predict trout presence along reaches throughout a stream network, and include a stopping rule to identify a discrete upper limit point above which all stream reaches are classified as fishless. Although there is no simple explanation for the upper distribution limit identified in UPRLIMET, four factors, including upstream channel length above the point of uppermost fish, drainage area, slope, and elevation, had highest importance. Across our study region of western Oregon, we found that more of the fish-bearing network is on private lands than on state, US Bureau of Land Management (BLM), or USDA Forest Service (USFS) lands, highlighting the importance of using spatially consistent maps across a region and working across land ownerships. Our research underscores the value of using occurrence data to develop simple, but powerful, prediction tools to capture complex ecological processes that contribute to distribution limits of species.
Affiliation(s)
- Brooke E Penaluna
- U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station, 3200 SW Jefferson Way, Corvallis, OR, 97331, USA.
| | - Jonathan D Burnett
- U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station, 3200 SW Jefferson Way, Corvallis, OR, 97331, USA
| | - Kelly Christiansen
- U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station, 3200 SW Jefferson Way, Corvallis, OR, 97331, USA
| | - Ivan Arismendi
- Department of Fisheries, Wildlife, and Conservation Sciences, Oregon State University, 104 Nash Hall, Corvallis, OR, 97331, USA
| | - Sherri L Johnson
- U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station, 3200 SW Jefferson Way, Corvallis, OR, 97331, USA
| | - Kitty Griswold
- Department of Biological Sciences, Idaho State University, 921 S. 8th Ave Mail, Stop 8007, Pocatello, ID, 83209-8007, USA
| | - Brett Holycross
- Pacific States Marine Fisheries Commission, 205 SE Spokane St., Portland, OR, 97202, USA
| | - Sonja H Kolstoe
- U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station, 1220 SW 3rd Avenue, Suite 1410, Portland, OR, 97204, USA
| |
|
33
|
Yaeger JP, Jones J, Ertefaie A, Caserta MT, Fiscella KA. Derivation of a clinical-based model to detect invasive bacterial infections in febrile infants. J Hosp Med 2022; 17:893-900. [PMID: 36036211 PMCID: PMC9633417 DOI: 10.1002/jhm.12956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 07/28/2022] [Accepted: 08/15/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND Febrile infants are at risk for invasive bacterial infections (IBIs) (i.e., bacteremia and bacterial meningitis), which, when undiagnosed, may have devastating consequences. Current IBI predictive models rely on serum biomarkers, which may not provide timely results and may be difficult to obtain in low-resource settings. OBJECTIVE The aim of this study was to derive a clinical-based IBI predictive model for febrile infants. DESIGNS, SETTING, AND PARTICIPANTS This is a cross-sectional study of infants brought to two pediatric emergency departments from January 2011 to December 2018. Inclusion criteria were age 0-90 days, temperature ≥38°C, and documented gestational age, fever duration, and illness duration. MAIN OUTCOME AND MEASURES To detect IBIs, we used regression and ensemble machine learning models and evidence-based predictors (i.e., sex, age, chronic medical condition, gestational age, appearance, maximum temperature, fever duration, illness duration, cough status, and urinary tract inflammation). We up-weighted infants with IBIs 8-fold and used 10-fold cross-validation to avoid overfitting. We calculated the area under the receiver operating characteristic curve (AUC), prioritizing a high sensitivity to identify the optimal cut-point to estimate sensitivity and specificity. RESULTS Of 2311 febrile infants, 39 had an IBI (1.7%); the median age was 54 days (interquartile range: 35-71). The AUC was 0.819 (95% confidence interval: 0.762, 0.868). The predictive model achieved a sensitivity of 0.974 (0.800, 1.00) and a specificity of 0.530 (0.484, 0.575). Findings suggest that a clinical-based model can detect IBIs in febrile infants, performing similarly to serum biomarker-based models. This model may improve health equity by enabling clinicians to estimate IBI risk in any setting. Future studies should prospectively validate findings across multiple sites and investigate performance by age.
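The abstract reports an AUC and a cut-point chosen to prioritize sensitivity, but does not state the selection rule. The sketch below shows one plausible version under stated assumptions: AUC is computed as the rank-based probability that a positive case outscores a negative one, and the cut-point is the highest threshold that still meets a target sensitivity. The function names and the toy scores are hypothetical and are not the study's data.

```python
def auc(scores, labels):
    """AUC as P(random positive outranks random negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sens_spec(scores, labels, cut):
    """Call score >= cut positive and return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg

def cut_for_sensitivity(scores, labels, target):
    """Highest cut-point meeting the target sensitivity (assumed rule).

    Sensitivity only grows as the cut-point is lowered, so the first
    threshold that reaches the target also keeps specificity as high
    as possible among the qualifying thresholds.
    """
    for cut in sorted(set(scores), reverse=True):
        sens, spec = sens_spec(scores, labels, cut)
        if sens >= target:
            return cut, sens, spec
    raise ValueError("target sensitivity not reachable")

# Hypothetical predicted risks and outcomes (1 = infection present).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

model_auc = auc(scores, labels)
cut, sens, spec = cut_for_sensitivity(scores, labels, target=1.0)
```

On the toy data, demanding full sensitivity pushes the cut-point down to the lowest-scoring positive case, which is the same trade-off the abstract shows between high sensitivity and modest specificity.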
Affiliation(s)
- Jeffrey P Yaeger
- Department of Pediatrics, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA
- Department of Public Health Sciences, University of Rochester Medical Center, Rochester, New York, USA
| | - Jeremiah Jones
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York, USA
| | - Ashkan Ertefaie
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York, USA
| | - Mary T Caserta
- Department of Pediatrics, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA
| | - Kevin A Fiscella
- Department of Family Medicine, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA
| |
|
34
|
Bowler S, Papoutsoglou G, Karanikas A, Tsamardinos I, Corley MJ, Ndhlovu LC. A machine learning approach utilizing DNA methylation as an accurate classifier of COVID-19 disease severity. Sci Rep 2022; 12:17480. [PMID: 36261477 PMCID: PMC9580434 DOI: 10.1038/s41598-022-22201-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/04/2022] [Accepted: 10/11/2022] [Indexed: 01/12/2023] Open
Abstract
Since the onset of the COVID-19 pandemic, increasing cases with variable outcomes continue globally because of variants and despite vaccines and therapies. There is a need to identify early the at-risk individuals who would benefit from timely medical interventions. DNA methylation provides an opportunity to identify an epigenetic signature of individuals at increased risk. We utilized machine learning to identify DNA methylation signatures of COVID-19 disease from data available through NCBI Gene Expression Omnibus. A training cohort of 460 individuals (164 COVID-19-infected and 296 non-infected) and an external validation dataset of 128 individuals (102 COVID-19-infected and 26 non-COVID-associated pneumonia) were reanalyzed. Data were processed using ChAMP and beta values were logit transformed. The JADBio AutoML platform was leveraged to identify a methylation signature associated with severe COVID-19 disease. We identified a random forest classification model from 4 unique methylation sites with the power to discern individuals with severe COVID-19 disease. The average area under the receiver operating characteristic curve (AUC-ROC) of the model was 0.933 and the average area under the precision-recall curve (AUC-PRC) was 0.965. When applied to our external validation, this model produced an AUC-ROC of 0.898 and an AUC-PRC of 0.864. These results further our understanding of the utility of DNA methylation in COVID-19 disease pathology and serve as a platform to inform future COVID-19 related studies.
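The abstract says only that beta values were logit transformed. A common convention for methylation data is the base-2 logit (often called an M-value), and the sketch below assumes that convention. The log base, the clamping constant `eps`, and the function names are assumptions for illustration, not details taken from the study.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Base-2 logit of a methylation beta value in [0, 1] (assumed convention)."""
    # Clamp away from 0 and 1 so the log of the odds stays finite.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def m_to_beta(m):
    """Inverse transform: map a base-2 logit back to the [0, 1] beta scale."""
    return 2.0 ** m / (2.0 ** m + 1.0)

half = beta_to_m(0.5)        # equal methylated and unmethylated signal
high = beta_to_m(0.8)        # odds of 4 to 1
round_trip = m_to_beta(beta_to_m(0.25))
```

The transform stretches values near 0 and 1, which is why it is often applied before fitting models that are sensitive to the bounded, compressed scale of raw beta values.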
Affiliation(s)
- Scott Bowler
- Division of Infectious Diseases, Department of Medicine, Weill Cornell Medicine, 413 E 69th St, New York, NY, 10021, USA
| | - Georgios Papoutsoglou
- JADBio - Gnosis DA S.A, Science and Technology Park of Crete, 70013, Heraklion, Greece
| | - Aristides Karanikas
- JADBio - Gnosis DA S.A, Science and Technology Park of Crete, 70013, Heraklion, Greece
| | - Ioannis Tsamardinos
- JADBio - Gnosis DA S.A, Science and Technology Park of Crete, 70013, Heraklion, Greece
- Department of Computer Science, University of Crete, 70013, Heraklion, Greece
| | - Michael J Corley
- Division of Infectious Diseases, Department of Medicine, Weill Cornell Medicine, 413 E 69th St, New York, NY, 10021, USA
| | - Lishomwa C Ndhlovu
- Division of Infectious Diseases, Department of Medicine, Weill Cornell Medicine, 413 E 69th St, New York, NY, 10021, USA.
| |
|
35
|
Marmolejo-Ramos F, Ospina R, García-Ceja E, Correa JC. Ingredients for Responsible Machine Learning: A Commented Review of The Hitchhiker’s Guide to Responsible Machine Learning. Journal of Statistical Theory and Applications 2022; 21:175-185. [PMID: 36160758 PMCID: PMC9483296 DOI: 10.1007/s44199-022-00048-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 09/02/2022] [Indexed: 11/25/2022] Open
Abstract
In The Hitchhiker’s Guide to Responsible Machine Learning, Biecek, Kozak, and Zawada (here BKZ) provide an illustrated and engaging step-by-step guide on how to perform a machine learning (ML) analysis such that the algorithms, the software, and the entire process are interpretable and transparent for both the data scientist and the end user. This review summarises BKZ’s book and elaborates on three elements key to ML analyses: inductive inference, causality, and interpretability.
Affiliation(s)
- Fernando Marmolejo-Ramos
- Centre for Change and Complexity in Learning, University of South Australia, Adelaide, SA 5001 Australia
| | - Raydonal Ospina
- CASTLab, Department of Statistics, Universidade Federal de Pernambuco, Recife, Pernambuco 51280-000 Brazil
| | - Enrique García-Ceja
- Escuela de Ingeniería y Ciencias, Tecnológico de Monterrey, 64849 Monterrey, Nuevo León Mexico
| | - Juan C. Correa
- CESA Business School, Bogotá, Bogotá, DC, 110231 Colombia
| |
|
36
|
Chen RJ, Lu MY, Williamson DFK, Chen TY, Lipkova J, Noor Z, Shaban M, Shady M, Williams M, Joo B, Mahmood F. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 2022; 40:865-878.e6. [PMID: 35944502 PMCID: PMC10397370 DOI: 10.1016/j.ccell.2022.07.004] [Citation(s) in RCA: 178] [Impact Index Per Article: 59.3] [Received: 06/24/2021] [Revised: 10/08/2021] [Accepted: 07/11/2022] [Indexed: 02/07/2023]
Abstract
The rapidly emerging field of computational pathology has demonstrated promise in developing objective prognostic models from histology images. However, most prognostic models are either based on histology or genomics alone and do not address how these data sources can be integrated to develop joint image-omic prognostic models. Additionally, identifying explainable morphological and molecular descriptors from these models that govern such prognosis is of interest. We use multimodal deep learning to jointly examine pathology whole-slide images and molecular profile data from 14 cancer types. Our weakly supervised, multimodal deep-learning algorithm is able to fuse these heterogeneous modalities to predict outcomes and discover prognostic features that correlate with poor and favorable outcomes. We present all analyses for morphological and molecular correlates of patient prognosis across the 14 cancer types at both a disease and a patient level in an interactive open-access database to allow for further exploration, biomarker discovery, and feature assessment.
Affiliation(s)
- Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
| | - Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA; Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
| | - Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
| | - Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
| | - Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
| | - Zahra Noor
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Muhammad Shaban
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
| | - Maha Shady
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
| | - Mane Williams
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
| | - Bumjin Joo
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA; Harvard Data Sciences Initiative, Harvard University, Cambridge, MA, USA.
| |
|
37
|
Litwińczuk MC, Trujillo-Barreto N, Muhlert N, Cloutman L, Woollams A. Combination of structural and functional connectivity explains unique variation in specific domains of cognitive function. Neuroimage 2022; 262:119531. [PMID: 35931312 DOI: 10.1016/j.neuroimage.2022.119531] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/16/2022] [Revised: 07/20/2022] [Accepted: 08/01/2022] [Indexed: 11/29/2022] Open
Abstract
The relationship between structural and functional brain networks has been characterised as complex: the two networks mirror each other and show mutual influence but they also diverge in their organisation. This work explored whether a combination of structural and functional connectivity can improve the fit of regression models of cognitive performance. Principal Component Analysis (PCA) was first applied to cognitive data from the Human Connectome Project to identify latent cognitive components: Executive Function, Self-regulation, Language, Encoding and Sequence Processing. A Principal Component Regression approach with embedded Step-Wise Regression (SWR-PCR) was then used to fit regression models of each cognitive domain based on structural (SC), functional (FC) or combined structural-functional (CC) connectivity. Executive Function was best explained by the CC model. Self-regulation was equally well explained by SC and FC. Language was equally well explained by CC and FC models. Encoding and Sequence Processing were best explained by SC. Evaluation of the models' out-of-sample skill via cross-validation showed that SC, FC and CC produced generalisable models of Language performance. SC models performed most effectively at predicting Language performance in unseen samples. Executive Function was most effectively predicted by SC models, followed only by CC models. Self-regulation was only effectively predicted by CC models and Sequence Processing was only effectively predicted by FC models. The present study demonstrates that integrating structural and functional connectivity can help explain cognitive performance, but that the added explanatory value (in-sample) may be domain-specific and can come at the expense of reduced generalisation performance (out-of-sample).
Affiliation(s)
| | | | - Nils Muhlert
- Division of Neuroscience and Experimental Psychology, University of Manchester, UK
| | - Lauren Cloutman
- Division of Neuroscience and Experimental Psychology, University of Manchester, UK
| | - Anna Woollams
- Division of Neuroscience and Experimental Psychology, University of Manchester, UK
| |
|
38
|
Bhattacharyay S, Milosevic I, Wilson L, Menon DK, Stevens RD, Steyerberg EW, Nelson DW, Ercole A, the CENTER-TBI investigators participants. The leap to ordinal: Detailed functional prognosis after traumatic brain injury with a flexible modelling approach. PLoS One 2022; 17:e0270973. [PMID: 35788768 PMCID: PMC9255749 DOI: 10.1371/journal.pone.0270973] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/20/2022] [Accepted: 06/21/2022] [Indexed: 11/30/2022] Open
Abstract
When a patient is admitted to the intensive care unit (ICU) after a traumatic brain injury (TBI), an early prognosis is essential for baseline risk adjustment and shared decision making. TBI outcomes are commonly categorised by the Glasgow Outcome Scale–Extended (GOSE) into eight ordered levels of functional recovery at 6 months after injury. Existing ICU prognostic models predict binary outcomes at a certain threshold of GOSE (e.g., prediction of survival [GOSE > 1]). We aimed to develop ordinal prediction models that concurrently predict probabilities of each GOSE score. From a prospective cohort (n = 1,550, 65 centres) in the ICU stratum of the Collaborative European NeuroTrauma Effectiveness Research in TBI (CENTER-TBI) patient dataset, we extracted all clinical information within 24 hours of ICU admission (1,151 predictors) and 6-month GOSE scores. We analysed the effect of two design elements on ordinal model performance: (1) the baseline predictor set, ranging from a concise set of ten validated predictors to a token-embedded representation of all possible predictors, and (2) the modelling strategy, from ordinal logistic regression to multinomial deep learning. With repeated k-fold cross-validation, we found that expanding the baseline predictor set significantly improved ordinal prediction performance while increasing analytical complexity did not. Half of these gains could be achieved with the addition of eight high-impact predictors to the concise set. At best, ordinal models achieved 0.76 (95% CI: 0.74–0.77) ordinal discrimination ability (ordinal c-index) and 57% (95% CI: 54%–60%) explanation of ordinal variation in 6-month GOSE (Somers’ Dxy). Model performance and the effect of expanding the predictor set decreased at higher GOSE thresholds, indicating the difficulty of predicting better functional outcomes shortly after ICU admission. Our results motivate the search for informative predictors that improve confidence in prognosis of higher GOSE and the development of ordinal dynamic prediction models.
Affiliation(s)
- Shubhayu Bhattacharyay
- Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom
- Department of Clinical Neurosciences, University of Cambridge, Cambridge, United Kingdom
- Laboratory of Computational Intensive Care Medicine, Johns Hopkins University, Baltimore, MD, United States of America
| | - Ioan Milosevic
- Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom
| | - Lindsay Wilson
- Division of Psychology, University of Stirling, Stirling, United Kingdom
| | - David K. Menon
- Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom
| | - Robert D. Stevens
- Laboratory of Computational Intensive Care Medicine, Johns Hopkins University, Baltimore, MD, United States of America
- Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University, Baltimore, MD, United States of America
| | - Ewout W. Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - David W. Nelson
- Department of Physiology and Pharmacology, Section for Perioperative Medicine and Intensive Care, Karolinska Institutet, Stockholm, Sweden
| | - Ari Ercole
- Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom
- Cambridge Centre for Artificial Intelligence in Medicine, Cambridge, United Kingdom
| | | |
Collapse
|
39
|
Danilatou V, Nikolakakis S, Antonakaki D, Tzagkarakis C, Mavroidis D, Kostoulas T, Ioannidis S. Outcome Prediction in Critically-Ill Patients with Venous Thromboembolism and/or Cancer Using Machine Learning Algorithms: External Validation and Comparison with Scoring Systems. Int J Mol Sci 2022; 23:7132. [PMID: 35806137 PMCID: PMC9266386 DOI: 10.3390/ijms23137132] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/27/2022] [Revised: 06/17/2022] [Accepted: 06/19/2022] [Indexed: 12/16/2022] Open
Abstract
Intensive care unit (ICU) patients with venous thromboembolism (VTE) and/or cancer suffer from high mortality rates. Mortality prediction in the ICU has been a major medical challenge for which several scoring systems exist but lack specificity. This study focuses on two target groups, namely patients with thrombosis or cancer. The main goal is to develop and validate interpretable machine learning (ML) models to predict early and late mortality, while exploiting all available data stored in the medical record. To this end, retrospective data from two freely accessible databases, MIMIC-III and eICU, were used. Well-established ML algorithms were implemented utilizing automated and purposely built ML frameworks for addressing class imbalance. Prediction of early mortality showed excellent performance in both disease categories, in terms of the area under the receiver operating characteristic curve (AUC–ROC): VTE-MIMIC-III 0.93, eICU 0.87, cancer-MIMIC-III 0.94. On the other hand, late mortality prediction showed lower performance, i.e., AUC–ROC: VTE 0.82, cancer 0.74–0.88. The predictive model of early mortality developed from 1651 VTE patients (MIMIC-III) ended up with a signature of 35 features and was externally validated in 2659 patients from the eICU dataset. Our model outperformed traditional scoring systems in predicting early as well as late mortality. Novel biomarkers, such as red cell distribution width, were identified.
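The performance figures in this abstract are AUC–ROC values. As a reminder of what that number measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores are hypothetical and are not taken from MIMIC-III or eICU.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = event, e.g. death).
    scores: predicted risk for each case, higher meaning more likely.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # pair ranked correctly
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only.
toy_auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because this measure depends only on how cases are ranked, it is unaffected by the class ratio itself, although it says nothing about calibration.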
Affiliation(s)
- Vasiliki Danilatou
  - Sphynx Technology Solutions, 6300 Zug, Switzerland
  - School of Medicine, European University of Cyprus, 2404 Nicosia, Cyprus
- Stylianos Nikolakakis
  - School of Electrical and Computer Engineering, Technical University of Crete, 73100 Chania, Greece
- Despoina Antonakaki
  - Institute of Computer Science (ICS)-Foundation for Research and Technology-Hellas (FORTH), 70013 Heraklion, Greece
- Christos Tzagkarakis
  - Institute of Computer Science (ICS)-Foundation for Research and Technology-Hellas (FORTH), 70013 Heraklion, Greece
- Dimitrios Mavroidis
  - Institute of Computer Science (ICS)-Foundation for Research and Technology-Hellas (FORTH), 70013 Heraklion, Greece
- Theodoros Kostoulas
  - Department of Information and Communication Systems Engineering, School of Engineering, University of the Aegean, 83200 Samos, Greece
- Sotirios Ioannidis
  - School of Electrical and Computer Engineering, Technical University of Crete, 73100 Chania, Greece
  - Institute of Computer Science (ICS)-Foundation for Research and Technology-Hellas (FORTH), 70013 Heraklion, Greece
|
40
|
Combination of Whole-Body Baseline CT Radiomics and Clinical Parameters to Predict Response and Survival in a Stage-IV Melanoma Cohort Undergoing Immunotherapy. Cancers (Basel) 2022; 14:2992. [PMID: 35740659 PMCID: PMC9221470 DOI: 10.3390/cancers14122992] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/28/2022] [Revised: 06/13/2022] [Accepted: 06/15/2022] [Indexed: 11/17/2022] Open
Abstract
Simple Summary: The use of immunotherapeutic agents significantly improved stage-IV melanoma patients’ overall progression-free survival. To identify patients who do not benefit from immunotherapy, both clinical parameters and experimental biomarkers such as radiomics are currently being evaluated. However, no radiomic biomarker is widely accepted for routine clinical use. In a large cohort of 262 stage-IV melanoma patients given first-line immunotherapy treatment, we investigated whether radiomics—based on the segmentation of all baseline metastases in the whole body—in combination with clinical parameters offered added value compared to the usage of clinical parameters alone in a machine-learning prediction model. The primary endpoints were response at three months, and survival rates at six and twelve months. The study indicated a potential, but non-significant, added value of radiomics for six-month and twelve-month survival prediction, thus underlining the relevance of clinical parameters.
Background: This study investigated whether a machine-learning-based combination of radiomics and clinical parameters was superior to the use of clinical parameters alone in predicting therapy response after three months, and overall survival after six and twelve months, in stage-IV malignant melanoma patients undergoing immunotherapy with PD-1 checkpoint inhibitors and CTLA-4 checkpoint inhibitors.
Methods: A random forest model using clinical parameters (demographic variables and tumor markers = baseline model) was compared to a random forest model using clinical parameters and radiomics (extended model) via repeated 5-fold cross-validation. For this purpose, the baseline computed tomographies of 262 stage-IV malignant melanoma patients treated at a tertiary referral center were identified in the Central Malignant Melanoma Registry, and all visible metastases were three-dimensionally segmented (n = 6404).
Results: The extended model was not significantly superior compared to the baseline model for survival prediction after six and twelve months (AUC (95% CI): 0.664 (0.598, 0.729) vs. 0.620 (0.545, 0.692) and AUC (95% CI): 0.600 (0.526, 0.667) vs. 0.588 (0.481, 0.629), respectively). The extended model was not significantly superior compared to the baseline model for response prediction after three months (AUC (95% CI): 0.641 (0.581, 0.700) vs. 0.656 (0.587, 0.719)).
Conclusions: The study indicated a potential, but non-significant, added value of radiomics for six-month and twelve-month survival prediction of stage-IV melanoma patients undergoing immunotherapy.
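The comparison above rests on repeated 5-fold cross-validation. The following sketch shows only the splitting scheme, using the Python standard library: each repeat reshuffles the sample indices and cuts them into five disjoint test folds. The number of repeats and the seed are assumptions (the abstract does not state them), and the function name is hypothetical; model fitting is omitted.

```python
import random


def repeated_kfold(n_samples, n_splits=5, n_repeats=3, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh partition for every repeat
        for fold in range(n_splits):
            test = indices[fold::n_splits]  # every n_splits-th shuffled index
            test_set = set(test)
            train = [i for i in indices if i not in test_set]
            yield train, test


# 262 matches the cohort size in the abstract; n_repeats=2 is an arbitrary choice.
splits = list(repeated_kfold(262, n_splits=5, n_repeats=2, seed=1))
```

Fitting both the baseline and the extended model on the same splits keeps their AUC estimates paired, which is what makes a comparison between the two meaningful.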
|
41
|
Just Add Data: automated predictive modeling for knowledge discovery and feature selection. NPJ Precis Oncol 2022; 6:38. [PMID: 35710826 PMCID: PMC9203777 DOI: 10.1038/s41698-022-00274-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/04/2020] [Accepted: 04/13/2022] [Indexed: 01/20/2023] Open
Abstract
Fully automated machine learning (AutoML) for predictive modeling is becoming a reality, giving rise to a whole new field. We present the basic ideas and principles of Just Add Data Bio (JADBio), an AutoML platform applicable to the low-sample, high-dimensional omics data that arise in translational medicine and bioinformatics applications. In addition to predictive and diagnostic models ready for clinical use, JADBio focuses on knowledge discovery by performing feature selection and identifying the corresponding biosignatures, i.e., minimal-size subsets of biomarkers that are jointly predictive of the outcome or phenotype of interest. It also returns a palette of useful information for interpretation, clinical use of the models, and decision making. JADBio is qualitatively and quantitatively compared against Hyper-Parameter Optimization Machine Learning libraries. Results show that in typical omics dataset analysis, JADBio manages to identify signatures comprising just a handful of features while maintaining competitive predictive performance and accurate out-of-sample performance estimation.
|
42
|
Dyba K, Wąsala R, Piekarczyk J, Gabała E, Gawlak M, Jasiewicz J, Ratajkiewicz H. Reflectance spectroscopy and machine learning as a tool for the categorization of twin species based on the example of the Diachrysia genus. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 2022; 273:121058. [PMID: 35220048 DOI: 10.1016/j.saa.2022.121058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/13/2021] [Revised: 02/11/2022] [Accepted: 02/14/2022] [Indexed: 06/14/2023]
Abstract
In our work we used noninvasive point reflectance spectroscopy in the range from 400 to 2100 nm coupled with machine learning to study scales on the brown and golden iridescent areas on the dorsal side of the forewing of Diachrysia chrysitis and D. stenochrysis. We used our approach to distinguish between these species of moths. The basis for the study was a statistically significant collection of 95 specimens identified based on morphological features and gathered over 23 years in Poland. The numerical part of the experiment included two independent discriminant analyses: stochastic and deterministic. The more sensitive stochastic approach achieved average compliance with the species identification made by entomologists at the level of 99-100%. It demonstrated high stability across different configurations of training and validation sets, indicating strong predictors of the distinctiveness of the Diachrysia sibling species. Both methods resulted in the same small set of relevant features, where minimal fully discriminating subsets of wavelengths were three for glass scales on the golden area and four for the brown. The differences between species in scales primarily concern their major components and ultrastructure. In melanin-absent glass scales, this is mainly chitin configuration, while in melanin-present brown scales, melanin emerges as an additional factor.
Affiliation(s)
- Krzysztof Dyba
  - Institute of Geoecology and Geoinformation, Adam Mickiewicz University in Poznań, Poland
- Roman Wąsala
  - Department of Entomology and Environment Protection, Poznań University of Life Sciences, Poland
- Jan Piekarczyk
  - Institute of Physical Geography and Environmental Planning, Adam Mickiewicz University in Poznań, Poland
- Elżbieta Gabała
  - Research Centre of Quarantine, Invasive and Genetically Modified Organisms, Institute of Plant Protection - National Research Institute, Poland
- Magdalena Gawlak
  - Research Centre of Quarantine, Invasive and Genetically Modified Organisms, Institute of Plant Protection - National Research Institute, Poland
- Jarosław Jasiewicz
  - Institute of Geoecology and Geoinformation, Adam Mickiewicz University in Poznań, Poland
- Henryk Ratajkiewicz
  - Department of Entomology and Environment Protection, Poznań University of Life Sciences, Poland
|
43
|
Karagiannaki I, Gourlia K, Lagani V, Pantazis Y, Tsamardinos I. Learning biologically-interpretable latent representations for gene expression data: Pathway Activity Score Learning Algorithm. Mach Learn 2022; 112:4257-4287. [PMID: 37900054 PMCID: PMC10600308 DOI: 10.1007/s10994-022-06158-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2021] [Revised: 11/12/2021] [Accepted: 02/19/2022] [Indexed: 11/24/2022]
Abstract
Molecular gene-expression datasets consist of samples with tens of thousands of measured quantities (i.e., high dimensional data). However, lower-dimensional representations that retain the useful biological information do exist. We present a novel algorithm for such dimensionality reduction called Pathway Activity Score Learning (PASL). The major novelty of PASL is that the constructed features directly correspond to known molecular pathways (genesets in general) and can be interpreted as pathway activity scores. Hence, unlike PCA and similar methods, PASL's latent space has a fairly straightforward biological interpretation. PASL is shown to outperform the state-of-the-art method (PLIER) in predictive performance on two collections of breast cancer and leukemia gene expression datasets. PASL is also trained on a large corpus of 50,000 gene expression samples to construct a universal dictionary of features across different tissues and pathologies. The dictionary is validated on 35,643 held-out samples for reconstruction error. It is then applied to 165 held-out datasets spanning a diverse range of diseases. The AutoML tool JADBio is employed to show that the predictive information in the PASL-created feature space is retained after the transformation. The code is available at https://github.com/mensxmachina/PASL.
Affiliation(s)
- Ioulia Karagiannaki
  - Institute of Electronic Structure and Laser, Foundation for Research and Technology-Hellas (IESL-FORTH), Heraklion, Greece
- Vincenzo Lagani
  - Institute of Chemical Biology, Ilia State University, Tbilisi, 0162 Georgia
  - JADBio, Gnosis Data Analysis PC, Heraklion, Crete, Greece
- Yannis Pantazis
  - Institute of Applied and Computational Mathematics, Foundation for Research and Technology - Hellas, Heraklion, Greece
- Ioannis Tsamardinos
  - Department of Computer Science, University of Crete, Heraklion, Greece
  - JADBio, Gnosis Data Analysis PC, Heraklion, Crete, Greece
  - Institute of Applied and Computational Mathematics, Foundation for Research and Technology - Hellas, Heraklion, Greece
|
44
|
Karaglani M, Panagopoulou M, Baltsavia I, Apalaki P, Theodosiou T, Iliopoulos I, Tsamardinos I, Chatzaki E. Tissue-Specific Methylation Biosignatures for Monitoring Diseases: An In Silico Approach. Int J Mol Sci 2022; 23:2959. [PMID: 35328380 PMCID: PMC8952417 DOI: 10.3390/ijms23062959] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/09/2022] [Revised: 03/01/2022] [Accepted: 03/03/2022] [Indexed: 02/06/2023] Open
Abstract
Tissue-specific gene methylation events are key to the pathogenesis of several diseases and can be utilized for diagnosis and monitoring. Here, we established an in silico pipeline to analyze high-throughput methylome datasets to identify specific methylation fingerprints in three pathological entities of major burden, i.e., breast cancer (BrCa), osteoarthritis (OA) and diabetes mellitus (DM). Differential methylation analysis was conducted to compare tissues/cells related to the pathology and different types of healthy tissues, revealing Differentially Methylated Genes (DMGs). Highly performing and low feature number biosignatures were built with automated machine learning, including: (1) a five-gene biosignature discriminating BrCa tissue from healthy tissues (AUC 0.987 and precision 0.987), (2) three equivalent OA cartilage-specific biosignatures containing four genes each (AUC 0.978 and precision 0.986) and (3) a four-gene pancreatic β-cell-specific biosignature (AUC 0.984 and precision 0.995). Next, the BrCa biosignature was validated using an independent ccfDNA dataset showing an AUC and precision of 1.000, verifying the biosignature's applicability in liquid biopsy. Functional and protein interaction prediction analysis revealed that most DMGs identified are involved in pathways known to be related to the studied diseases or pointed to new ones. Overall, our data-driven approach contributes to the maximum exploitation of high-throughput methylome readings, helping to establish specific disease profiles to be applied in clinical practice and to understand human pathology.
Affiliation(s)
- Makrina Karaglani
  - Laboratory of Pharmacology, Medical School, Democritus University of Thrace, GR-68100 Alexandroupolis, Greece
- Maria Panagopoulou
  - Laboratory of Pharmacology, Medical School, Democritus University of Thrace, GR-68100 Alexandroupolis, Greece
- Ismini Baltsavia
  - Department of Basic Sciences, School of Medicine, University of Crete, GR-71003 Heraklion, Greece
- Paraskevi Apalaki
  - Laboratory of Pharmacology, Medical School, Democritus University of Thrace, GR-68100 Alexandroupolis, Greece
- Theodosis Theodosiou
  - Laboratory of Pharmacology, Medical School, Democritus University of Thrace, GR-68100 Alexandroupolis, Greece
- Ioannis Iliopoulos
  - Department of Basic Sciences, School of Medicine, University of Crete, GR-71003 Heraklion, Greece
- Ioannis Tsamardinos
  - JADBio Gnosis DA S.A., Science and Technology Park of Crete, GR-70013 Heraklion, Greece
  - Department of Computer Science, University of Crete, GR-70013 Heraklion, Greece
  - Institute of Applied and Computational Mathematics, Foundation for Research and Technology-Hellas, GR-70013 Heraklion, Greece
- Ekaterini Chatzaki
  - Laboratory of Pharmacology, Medical School, Democritus University of Thrace, GR-68100 Alexandroupolis, Greece
  - Institute of Agri-Food and Life Sciences, Hellenic Mediterranean University Research Centre, GR-71410 Heraklion, Greece
|
45
|
Tsagris M, Papadovasilakis Z, Lakiotaki K, Tsamardinos I. The γ-OMP Algorithm for Feature Selection With Application to Gene Expression Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1214-1224. [PMID: 33035156 DOI: 10.1109/tcbb.2020.3029952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Feature selection for predictive analytics is the problem of identifying a minimal-size subset of features that is maximally predictive of an outcome of interest. To apply to molecular data, feature selection algorithms need to be scalable to tens of thousands of features. In this paper, we propose γ-OMP, a generalisation of the highly-scalable Orthogonal Matching Pursuit feature selection algorithm. γ-OMP can handle (a) various types of outcomes, such as continuous, binary, nominal, time-to-event, (b) discrete (categorical) features, (c) different statistical-based stopping criteria, (d) several predictive models (e.g., linear or logistic regression), (e) various types of residuals, and (f) different types of association. We compare γ-OMP against LASSO, a prototypical, widely used algorithm for high-dimensional data. On both simulated data and several real gene expression datasets, γ-OMP is on par with or outperforms LASSO in binary classification (case-control data), regression (quantified outcomes), and time-to-event data (censored survival times). γ-OMP is based on simple statistical ideas, is easy to implement and to extend, and our extensive evaluation shows that it is also effective in bioinformatics analysis settings.
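For orientation, the sketch below implements the classical Orthogonal Matching Pursuit that γ-OMP generalises, not γ-OMP itself: a continuous outcome, least-squares residuals, and a fixed cap on the number of features in place of the statistical stopping criteria mentioned in the abstract. All function names are hypothetical, and the code assumes the candidate columns are not collinear.

```python
import math


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def omp_select(columns, y, max_features):
    """Greedy OMP: add the feature most correlated with the current residual,
    then refit least squares on all selected features and update the residual."""
    selected, residual = [], list(y)
    for _ in range(max_features):
        best, best_score = None, 1e-12  # stop when nothing correlates
        for j, col in enumerate(columns):
            norm = math.sqrt(dot(col, col))
            if j in selected or norm == 0.0:
                continue
            score = abs(dot(col, residual)) / norm
            if score > best_score:
                best, best_score = j, score
        if best is None:
            break
        selected.append(best)
        gram = [[dot(columns[p], columns[q]) for q in selected] for p in selected]
        coef = solve(gram, [dot(columns[p], y) for p in selected])
        residual = [
            yi - sum(c * columns[p][i] for c, p in zip(coef, selected))
            for i, yi in enumerate(y)
        ]
    return selected


# Toy data: y depends on features 0 and 2 only.
features = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
chosen = omp_select(features, [3.0, 0.0, 1.0, 0.0], max_features=3)
```

The refit after every selection is what distinguishes OMP from plain matching pursuit: the residual stays orthogonal to all chosen features, so an already-explained signal cannot trigger a redundant selection.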
|
46
|
Hatmal MM, Al-Hatamleh MAI, Olaimat AN, Mohamud R, Fawaz M, Kateeb ET, Alkhairy OK, Tayyem R, Lounis M, Al-Raeei M, Dana RK, Al-Ameer HJ, Taha MO, Bindayna KM. Reported Adverse Effects and Attitudes among Arab Populations Following COVID-19 Vaccination: A Large-Scale Multinational Study Implementing Machine Learning Tools in Predicting Post-Vaccination Adverse Effects Based on Predisposing Factors. Vaccines (Basel) 2022; 10:366. [PMID: 35334998 PMCID: PMC8955470 DOI: 10.3390/vaccines10030366] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 01/20/2022] [Revised: 02/23/2022] [Accepted: 02/24/2022] [Indexed: 02/04/2023] Open
Abstract
Background: The unprecedented global spread of coronavirus disease 2019 (COVID-19) has imposed huge challenges on healthcare facilities and impacted every aspect of life. This has led to the development of several vaccines against COVID-19 within one year. This study aimed to assess the attitudes and the side effects among Arab communities after receiving a COVID-19 vaccine and to use machine learning (ML) tools to predict post-vaccination side effects based on predisposing factors. Methods: An online-based multinational survey was carried out via social media platforms from 14 June to 31 August 2021, targeting individuals who received at least one dose of a COVID-19 vaccine from 22 Arab countries. Descriptive statistics, correlation, and chi-square tests were used to analyze the data. Moreover, extensive ML tools were utilized to predict 30 post-vaccination adverse effects and their severity based on 15 predisposing factors. The importance of distinct predisposing factors in predicting particular side effects was determined using global feature importance employing gradient boost as AutoML. Results: A total of 10,064 participants from 19 Arab countries were included in this study. Around 56% were female and 59% were aged 20 to 39 years. A high rate of vaccine hesitancy (51%) was reported among participants. Almost 88% of the participants were vaccinated with one of three COVID-19 vaccines, including Pfizer-BioNTech (52.8%), AstraZeneca (20.7%), and Sinopharm (14.2%). About 72% of participants experienced post-vaccination side effects. This study reports statistically significant associations (p < 0.01) between various predisposing factors and post-vaccination side effects. In terms of predicting post-vaccination side effects, gradient boost, random forest, and XGBoost outperformed other ML methods. 
The most important predisposing factors for predicting certain side effects (i.e., tiredness, fever, headache, injection site pain and swelling, myalgia, and sleepiness and laziness) were revealed to be the number of doses, gender, type of vaccine, age, and hesitancy to receive a COVID-19 vaccine. Conclusions: The reported side effects following COVID-19 vaccination among Arab populations are usually non-life-threatening: flu-like symptoms and injection site pain. Certain predisposing factors have greater weight and importance as input data in predicting post-vaccination side effects. Based on the most significant input data, ML can also be used to predict these side effects; people with certain predicted side effects may require additional medical attention, or possibly hospitalization.
Affiliation(s)
- Ma’mon M. Hatmal
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, P.O. Box 330127, Zarqa 13133, Jordan
- Mohammad A. I. Al-Hatamleh
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Amin N. Olaimat
  - Department of Clinical Nutrition and Dietetics, Faculty of Applied Medical Sciences, The Hashemite University, P.O. Box 330127, Zarqa 13133, Jordan
- Rohimah Mohamud
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Malaysia
- Mirna Fawaz
  - Nursing Department, Faculty of Health Sciences, Beirut Arab University, Beirut 1105, Lebanon
- Elham T. Kateeb
  - Oral Health Research and Promotion Unit, Faculty of Dentistry, Al-Quds University, Jerusalem 51000, Palestine
- Omar K. Alkhairy
  - Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Ministry of National Guard Health Affairs, P.O. Box 22490, Riyadh 11426, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, P.O. Box 3660, Riyadh 11481, Saudi Arabia
  - King Abdullah International Medical Research Center (KAIMRC), P.O. Box 3660, Riyadh 11481, Saudi Arabia
- Reema Tayyem
  - Department of Human Nutrition, College of Health Sciences, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
- Mohamed Lounis
  - Department of Agro-Veterinary Science, Faculty of Natural and Life Sciences, University of Ziane Achour, BP 3117, Djelfa 17000, Algeria
- Marwan Al-Raeei
  - Faculty of Sciences, Damascus University, Damascus P.O. Box 30621, Syria
- Rasheed K. Dana
  - Faculty of Medicine, Mansoura University, Mansoura, Dakahlia 35516, Egypt
- Hamzeh J. Al-Ameer
  - Department of Biology and Biotechnology, Faculty of Science, American University of Madaba, P.O. Box 99, Madaba 17110, Jordan
- Mutasem O. Taha
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, The University of Jordan, Amman 11942, Jordan
- Khalid M. Bindayna
  - Department of Microbiology, Immunology and Infectious Diseases, College of Medicine and Medical Sciences, Arabian Gulf University, Manama 329, Bahrain
|
47
|
Karaglani M, Panagopoulou M, Cheimonidi C, Tsamardinos I, Maltezos E, Papanas N, Papazoglou D, Mastorakos G, Chatzaki E. Liquid Biopsy in Type 2 Diabetes Mellitus Management: Building Specific Biosignatures via Machine Learning. J Clin Med 2022; 11:1045. [PMID: 35207316 PMCID: PMC8876363 DOI: 10.3390/jcm11041045] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/18/2022] [Revised: 02/09/2022] [Accepted: 02/15/2022] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND The need for minimally invasive biomarkers for the early diagnosis of type 2 diabetes (T2DM) prior to the clinical onset and monitoring of β-pancreatic cell loss is emerging. Here, we focused on studying circulating cell-free DNA (ccfDNA) as a liquid biopsy biomaterial for accurate diagnosis/monitoring of T2DM. METHODS ccfDNA levels were directly quantified in sera from 96 T2DM patients and 71 healthy individuals via fluorometry, and then fragment DNA size profiling was performed by capillary electrophoresis. Following this, ccfDNA methylation levels of five β-cell-related genes were measured via qPCR. Data were analyzed by automated machine learning to build classifying predictive models. RESULTS ccfDNA levels were found to be similar between groups but indicative of apoptosis in T2DM. INS (Insulin), IAPP (Islet Amyloid Polypeptide-Amylin), GCK (Glucokinase), and KCNJ11 (Potassium Inwardly Rectifying Channel Subfamily J member 11) levels differed significantly between groups. AutoML analysis delivered biosignatures including GCK, IAPP and KCNJ11 methylation, with the highest ever reported discriminating performance of T2DM from healthy individuals (AUC 0.927). CONCLUSIONS Our data unravel the value of ccfDNA as a minimally invasive biomaterial carrying important clinical information for T2DM. Upon prospective clinical evaluation, the built biosignature can be disruptive for T2DM clinical management.
Affiliation(s)
- Makrina Karaglani
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Maria Panagopoulou
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Christina Cheimonidi
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
- Ioannis Tsamardinos
  - JADBio Gnosis DA, Science and Technology Park of Crete, 71500 Heraklion, Greece
- Efstratios Maltezos
  - Diabetes Centre, 2nd Department of Internal Medicine, Democritus University of Thrace, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece
- Nikolaos Papanas
  - Diabetes Centre, 2nd Department of Internal Medicine, Democritus University of Thrace, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece
- Dimitrios Papazoglou
  - Diabetes Centre, 2nd Department of Internal Medicine, Democritus University of Thrace, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece
- George Mastorakos
  - Endocrine Unit, 2nd Department of Obstetrics and Gynecology, National and Kapodistrian University of Athens, “Aretaieion” University Hospital, 11528 Athens, Greece
- Ekaterini Chatzaki
  - Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Institute of Agri-Food and Life Sciences, Hellenic Mediterranean University Research Centre, 71003 Heraklion, Greece
|
48
|
Fanourgakis GS, Gkagkas K, Froudakis G. Introducing artificial MOFs for improved machine learning predictions: Identification of top-performing materials for methane storage. J Chem Phys 2022; 156:054103. [DOI: 10.1063/5.0075994] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Affiliation(s)
- George S. Fanourgakis
  - Department of Chemistry, University of Crete, Voutes Campus, GR-70013 Heraklion, Crete, Greece
- Konstantinos Gkagkas
  - Material Engineering Division, Toyota Motor Europe NV/SA, Technical Center, Hoge Wei 33B, 1930 Zaventem, Belgium
- George Froudakis
  - Department of Chemistry, University of Crete, Voutes Campus, GR-70013 Heraklion, Crete, Greece
|
49
|
Fischer A, Hertwig A, Hahn R, Anwar M, Siebenrock T, Pesta M, Liebau K, Timmermann I, Brugger J, Posch M, Ringl H, Tamandl D, Hiesmayr M, Roth D, Zielinski C, Jäger U, Staudinger T, Schellongowski P, Lang I, Gottsauner-Wolf M, Mascherbauer J, Heinz G, Oberbauer R, Trauner M, Ferlitsch A, Zauner C, Wolf Husslein P, Krepler P, Shariat S, Gnant M, Sahora K, Laufer G, Taghavi S, Huk I, Radtke C, Markstaller K, Rössler B, Schaden E, Bacher A, Faybik P, Ullrich R, Plöchl W, Ihra G, Schäfer B, Mouhieddine M, Neugebauer T, Mares P, Steinlechner B, Schiferer A, Tschernko E. Validation of bedside ultrasound to predict lumbar muscle area in the computed tomography in 200 non-critically ill patients: The USVALID prospective study. Clin Nutr 2022; 41:829-837. [PMID: 35263692 DOI: 10.1016/j.clnu.2022.01.034] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/12/2021] [Revised: 01/19/2022] [Accepted: 01/31/2022] [Indexed: 12/25/2022]
Abstract
BACKGROUND & AIMS Skeletal muscle area (SMA) in the computed tomography (CT) at the third lumbar vertebra (L3) level is a proxy for whole-body muscle mass, but CT is only performed for clinical reasons. Ultrasound is a promising tool to determine muscle mass at the bedside. It remains unclear how well ultrasound can predict CT L3 SMA and which ultrasound measuring points predict it best. METHODS This prospective observational trial included 200 non-critically ill patients who underwent an abdominal CT scan for any clinical reason within 48 h before the ultrasound examination. Ultrasound muscle thickness was evaluated with minimal compression at 3 measuring points on the thigh and 2 measuring points on the upper arm. On the CT scan, the entire L3 SMA was measured based on Hounsfield units. Using a model selection algorithm based on the Bayesian information criterion (BIC) and clinical considerations, a linear prediction model for CT L3 SMA based on the ultrasound muscle thickness and other independent variables was fitted and assessed with cross-validation. RESULTS Of the patients, 67.5% and 32.5% were from surgical and medical wards, respectively. Mean ultrasound muscle thickness values were between 2.2 and 3.6 cm on the thigh and between 1.4 and 2.8 cm on the upper arm. All ultrasound muscle thickness values were higher in men than in women (P < 0.05). CT L3 SMA was 40 cm² higher in men than in women (P < 0.001). The final prediction model for CT L3 SMA included the following 4 independent variables: ultrasound muscle thickness at the ventral measuring point of the thigh in the short-axis plane, sex, weight, and height. It had a similar BIC (1515) compared to larger models with 6-8 independent variables including multiple ultrasound measuring points (BIC of 1506-1519). Additional clinical considerations in choosing the final model were the shorter time needed to measure a single ultrasound measuring point and the better anatomical overview in the short-axis plane. The final model predicted CT L3 SMA with an R² of 0.74 (P < 0.001) and a cross-validated R² of 0.65. CONCLUSIONS A single ultrasound measuring point at the thigh, together with sex, height, and weight, predicts CT L3 SMA very well across different clinical populations. Ultrasound is a safe bedside method to measure muscle thickness longitudinally to monitor the effects of nutrition and physical therapy.
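The model comparison described in this abstract (BIC for selection, then in-sample versus cross-validated R²) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the data are synthetic, a single hypothetical predictor (`thickness`) stands in for the four variables of the final model, all function names are invented for the example, and the BIC is written in its common Gaussian least-squares form, which is correct only up to an additive constant shared by all models fitted to the same data.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rss(y, pred):
    """Residual sum of squares."""
    return sum((a - b) ** 2 for a, b in zip(y, pred))


def bic_gaussian(residual_ss, n, k):
    """BIC of a least-squares fit with k coefficients (additive constant dropped)."""
    return n * math.log(residual_ss / n) + k * math.log(n)


def r_squared(y, pred):
    y_bar = sum(y) / len(y)
    return 1.0 - rss(y, pred) / sum((a - y_bar) ** 2 for a in y)


def cross_validated_r2(x, y, k_folds=10, seed=0):
    """Pool out-of-fold predictions, then score them against the full sample."""
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    pred = [0.0] * len(x)
    for f in range(k_folds):
        held_out = set(order[f::k_folds])
        train = [i for i in order if i not in held_out]
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            pred[i] = intercept + slope * x[i]
    return r_squared(y, pred)


# Synthetic stand-in data (assumption): thickness in cm, muscle area in cm².
rng = random.Random(42)
thickness = [rng.uniform(1.5, 4.5) for _ in range(200)]
area = [60.0 + 25.0 * t + rng.gauss(0.0, 12.0) for t in thickness]
n = len(area)

intercept, slope = fit_line(thickness, area)
fitted = [intercept + slope * t for t in thickness]
mean_area = sum(area) / n

bic_null = bic_gaussian(rss(area, [mean_area] * n), n, k=1)  # intercept only
bic_model = bic_gaussian(rss(area, fitted), n, k=2)          # intercept + slope
in_sample_r2 = r_squared(area, fitted)
cv_r2 = cross_validated_r2(thickness, area)

print(f"BIC null={bic_null:.1f}, BIC model={bic_model:.1f}")
print(f"R2 in-sample={in_sample_r2:.2f}, cross-validated={cv_r2:.2f}")
```

A lower BIC favours a model, and each extra coefficient costs ln(n). For least-squares fits scored this way, the cross-validated R² cannot exceed the in-sample R², which is consistent with the gap between 0.74 and 0.65 reported in the abstract.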
Affiliation(s)
- Arabella Fischer
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Anatol Hertwig
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Ricarda Hahn
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Martin Anwar
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Timo Siebenrock
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Maximilian Pesta
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Konstantin Liebau
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Isabel Timmermann
- Division of Cardiothoracic and Vascular Anaesthesia and Intensive Care Medicine, Medical University of Vienna, Austria
- Jonas Brugger
- Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Austria
- Martin Posch
- Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Austria
- Helmut Ringl
- Department of Biomedical Imaging and Image-guided Therapy, Medical University of Vienna, Austria
- Dietmar Tamandl
- Department of Biomedical Imaging and Image-guided Therapy, Medical University of Vienna, Austria
- Michael Hiesmayr
- Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Austria
|
50
|
Radiomics Features of the Spleen as Surrogates for CT-Based Lymphoma Diagnosis and Subtype Differentiation. Cancers (Basel) 2022; 14:cancers14030713. [PMID: 35158980 PMCID: PMC8833623 DOI: 10.3390/cancers14030713] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/16/2021] [Revised: 01/26/2022] [Accepted: 01/27/2022] [Indexed: 02/05/2023] Open
Abstract
Simple Summary In malignant lymphoma, an early and accurate diagnosis is essential for therapy initiation and patient outcome. Within the diagnostic process, imaging plays a crucial role in disease staging. However, an invasive biopsy is required for subtype classification. Involvement of the spleen, a major lymphoid organ, is frequent in malignant lymphoma; this may be reactive or due to infiltration by malignant cells. Using radiomics features of the spleen in a machine learning approach, we investigated the possibility of distinguishing malignant lymphoma patients from other cancer patients and of classifying lymphoma subtypes in the case of disease presence. Recent studies have proven the value of radiomics analysis in differentiating lymphoma from non-lymphoma groups on involved sites. Supported by machine learning, imaging could gain importance as a noninvasive diagnostic tool for future lymphoma classification, offering more precise radiological information for an interdisciplinary approach to treatment planning.
Abstract The spleen is often involved in malignant lymphoma, which manifests on CT as either splenomegaly or focal, hypodense lymphoma lesions. This study aimed to investigate the diagnostic value of radiomics features of the spleen in classifying malignant lymphoma against non-lymphoma, as well as in determining malignant lymphoma subtypes in the case of disease presence, in particular Hodgkin lymphoma (HL), diffuse large B-cell lymphoma (DLBCL), mantle-cell lymphoma (MCL), and follicular lymphoma (FL). Spleen segmentations of 326 patients (139 female, median age 54.1 ± 18.7 years) were generated and 1317 radiomics features per patient were extracted. For subtype classification, we created four different binary differentiation tasks and addressed them with a Random Forest classifier using 10-fold cross-validation. To detect the most relevant features, permutation importance was analyzed. Classifier results using all features were: malignant lymphoma vs. non-lymphoma AUC = 0.86 (p < 0.01); HL vs. NHL AUC = 0.75 (p < 0.01); DLBCL vs. other NHL AUC = 0.65 (p < 0.01); MCL vs. FL AUC = 0.67 (p < 0.01). Classifying malignant lymphoma vs. non-lymphoma was also possible using only shape features (AUC = 0.77, p < 0.01), with the most important feature being sphericity. Based on shape features alone, a significant AUC could be achieved for all tasks; however, the best results were achieved by combining shape and textural features. This study demonstrates the value of splenic imaging and radiomic analysis in the diagnostic process of malignant lymphoma detection and subtype classification.
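Permutation importance, as named in this abstract, measures how much a performance score falls when one feature's values are shuffled across patients. The sketch below illustrates the idea with AUC as the score. It is an assumption-laden toy, not the study's pipeline: a fixed linear scoring function is swapped in for the Random Forest, there is no cross-validation, the two features and all data are synthetic, and every name is hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(score_fn, rows, labels, n_repeats=20, seed=0):
    """Mean drop in AUC after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = auc(labels, [score_fn(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        drop = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drop += baseline - auc(labels, [score_fn(r) for r in shuffled])
        importances.append(drop / n_repeats)
    return baseline, importances


# Synthetic data (assumption): feature 0 separates the classes, feature 1 is noise.
rng = random.Random(7)
labels = [1] * 100 + [0] * 100
rows = [(rng.gauss(1.5 if label else 0.0, 1.0), rng.gauss(0.0, 1.0)) for label in labels]


def score(row):
    """Stand-in for a trained classifier's predicted probability."""
    return 1.0 * row[0] + 0.1 * row[1]


baseline_auc, importances = permutation_importance(score, rows, labels)
print(f"baseline AUC={baseline_auc:.2f}, importances={importances}")
```

Shuffling the informative feature pushes the AUC toward 0.5, so its importance is large, while shuffling the noise feature leaves the AUC almost unchanged. In practice the drop is usually measured on held-out data rather than on the data used for fitting.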
|