1
Faust O, Salvi M, Barua PD, Chakraborty S, Molinari F, Acharya UR. Issues and Limitations on the Road to Fair and Inclusive AI Solutions for Biomedical Challenges. Sensors (Basel, Switzerland) 2025; 25:205. [PMID: 39796996 PMCID: PMC11723364 DOI: 10.3390/s25010205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2024] [Revised: 12/14/2024] [Accepted: 12/20/2024] [Indexed: 01/13/2025]
Abstract
OBJECTIVE In this paper, we explore the correlation between performance reporting and the development of inclusive AI solutions for biomedical problems. Our study examines the critical aspects of bias and noise in the context of medical decision support, aiming to provide actionable solutions. CONTRIBUTIONS A key contribution of our work is the recognition that measurement processes introduce noise and bias arising from human data interpretation and selection. We introduce the concept of "noise-bias cascade" to explain their interconnected nature. While current AI models handle noise well, bias remains a significant obstacle in achieving practical performance in these models. Our analysis spans the entire AI development lifecycle, from data collection to model deployment. RECOMMENDATIONS To effectively mitigate bias, we assert the need to implement additional measures such as rigorous study design, appropriate statistical analysis, transparent reporting, and diverse research representation. Furthermore, we strongly recommend the integration of uncertainty measures during model deployment to ensure the utmost fairness and inclusivity. These comprehensive recommendations aim to minimize both bias and noise, thereby improving the performance of future medical decision support systems.
Affiliation(s)
- Oliver Faust
  - School of Computing and Information Science, Anglia Ruskin University, Cambridge Campus, Cambridge CB1 1PT, UK
- Massimo Salvi
  - PoliToBIOMed Lab, Biolab, Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca Degli Abruzzi 24, 10129 Turin, Italy; (M.S.); (F.M.)
- Prabal Datta Barua
  - Cogninet Australia, Sydney, NSW 2010, Australia
  - School of Business (Information Systems), University of Southern Queensland, Toowoomba, QLD 4350, Australia
  - Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
  - Australian International Institute of Higher Education, Sydney, NSW 2000, Australia
  - School of Science and Technology, University of New England, Armidale, NSW 2351, Australia
  - School of Biosciences, Taylor’s University, Subang Jaya 47500, Malaysia
  - School of Computing, SRM Institute of Science and Technology, Kattankulathur 603203, India
  - School of Science and Technology, Kumamoto University, Kumamoto 860-8555, Japan
  - Sydney School of Education and Social Work, University of Sydney, Camperdown, NSW 2050, Australia
- Subrata Chakraborty
  - School of Science and Technology, University of New England, Armidale, NSW 2351, Australia
  - Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
  - Griffith Business School, Griffith University, Brisbane, QLD 4111, Australia
- Filippo Molinari
  - PoliToBIOMed Lab, Biolab, Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca Degli Abruzzi 24, 10129 Turin, Italy; (M.S.); (F.M.)
- U. Rajendra Acharya
  - School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, QLD 4300, Australia
  - Centre for Health Research, University of Southern Queensland, Ipswich, QLD 4305, Australia
2
Jingjing H, Sufang H, Xiaorong L, Yuchen L, Kexin Z, Shiya L. Leveraging healthcare professionals' insights to enhance data quality in medical big data platforms: A qualitative study. Digit Health 2025; 11:20552076251326697. [PMID: 40103645 PMCID: PMC11915297 DOI: 10.1177/20552076251326697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2024] [Accepted: 02/24/2025] [Indexed: 03/20/2025] Open
Abstract
Objective: This study aims to explore the awareness, attitudes, and actual usage of medical big data platforms among healthcare professionals, in order to provide practical guidance and theoretical support for improving data quality in the development of medical informatization. Method: Semi-structured interviews were conducted with 19 doctors and nurses from a tertiary hospital in Wuhan City between April and June 2024. Results: The analysis yielded seven major themes and nine subthemes: cognitive status, value of medical big data platforms, data trust (subjective data, objective data), purposes of data recording (patient condition observation, self-protection, task completion), practical challenges (conflict between work purposes and recording requirements, inconsistent departmental training standards, influence of leadership style and team culture), standardization of data recording, and concerns about data privacy. Conclusion: Insufficient understanding of medical big data platforms among healthcare professionals affects the quality of data recording and its research value, which underlines the need to strengthen training, standardize data recording, provide technical support, and ensure data security.
Affiliation(s)
- Huang Jingjing
  - Department of Emergency, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Huang Sufang
  - Department of Emergency, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Lang Xiaorong
  - Department of Emergency, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Liu Yuchen
  - Department of Emergency, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Zhang Kexin
  - Department of Emergency, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Liu Shiya
  - Department of Emergency, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
3
Gabaldón-Rodríguez I, de Francisco-Montero C, Menéndez-Moreno I, Balongo-Molina Á, Gómez-Lorenzo AI, Rodríguez-García R, Vilches-Arenas Á, Ortega-Calvo M. Pregnancy-Associated Plasma Protein A (PAPP-A) as a Predictor of Third Trimester Obesity: Insights from the CRIOBES Project. Pathophysiology 2024; 31:631-642. [PMID: 39585163 PMCID: PMC11587435 DOI: 10.3390/pathophysiology31040046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2024] [Revised: 11/13/2024] [Accepted: 11/14/2024] [Indexed: 11/26/2024] Open
Abstract
Introduction: Our objective in this article was to develop a predictive model for obesity in the third trimester of pregnancy using the plasma and clinical biomarkers that are managed within the Chromosomopathies Programme in the Andalusian Public Healthcare System. Methods: The epidemiological design was observational, of the unmatched case-control type. The geographical setting was the Seville Primary Healthcare District (DSAP Sevilla). The information was collected between 2011 and 2021. The reference cohort consisted of women who had carried a pregnancy to term. The variables and biomarkers studied correspond to those managed within the primary-care Pregnancy Integrated Care Pathway (ICP). Unconditional binary logistic regression (BLR) models were created, with the outcome variable being whether or not the women were obese in their third trimester of pregnancy. Results: A total of 423 controls and 104 cases of obesity were obtained for women in their third trimester who had not been obese in their first trimester. The median age (P50) of the sample was 34 years. The final, most parsimonious model included the variables PAPP-A (p = 0.074), beta-hCG (p = 0.1631), and systolic blood pressure (SBP) (p = 0.085). Area under the ROC curve = 0.75 (95% CI: 0.63-0.86). Discussion: The results of this research can only be extrapolated to primary care and to pregnancies with no complications. PAPP-A has been shown in our research to be a significant predictor of obesity risk in the third trimester of pregnancies with no complications (OR = 0.53; 95% CI: 0.39-0.66; p = 0.04 in the univariate analysis; OR = 0.58; 95% CI: 0.29-0.93; p = 0.074 in the multivariable analysis). This predictive capacity is further enhanced from an operational perspective by beta-hCG and 12-week SBP.
Affiliation(s)
- Inmaculada Gabaldón-Rodríguez
  - Andalusian Health Service, Primary Care Seville District, 41004 Seville, Spain; (I.G.-R.); (C.d.F.-M.); (I.M.-M.); (Á.B.-M.); (A.I.G.-L.); (R.R.-G.)
- Carmen de Francisco-Montero
  - Andalusian Health Service, Primary Care Seville District, 41004 Seville, Spain; (I.G.-R.); (C.d.F.-M.); (I.M.-M.); (Á.B.-M.); (A.I.G.-L.); (R.R.-G.)
- Inmaculada Menéndez-Moreno
  - Andalusian Health Service, Primary Care Seville District, 41004 Seville, Spain; (I.G.-R.); (C.d.F.-M.); (I.M.-M.); (Á.B.-M.); (A.I.G.-L.); (R.R.-G.)
- Álvaro Balongo-Molina
  - Andalusian Health Service, Primary Care Seville District, 41004 Seville, Spain; (I.G.-R.); (C.d.F.-M.); (I.M.-M.); (Á.B.-M.); (A.I.G.-L.); (R.R.-G.)
- Ana Isabel Gómez-Lorenzo
  - Andalusian Health Service, Primary Care Seville District, 41004 Seville, Spain; (I.G.-R.); (C.d.F.-M.); (I.M.-M.); (Á.B.-M.); (A.I.G.-L.); (R.R.-G.)
- Rubén Rodríguez-García
  - Andalusian Health Service, Primary Care Seville District, 41004 Seville, Spain; (I.G.-R.); (C.d.F.-M.); (I.M.-M.); (Á.B.-M.); (A.I.G.-L.); (R.R.-G.)
- Manuel Ortega-Calvo
  - Andalusian Health Service, Primary Care Seville District, 41004 Seville, Spain; (I.G.-R.); (C.d.F.-M.); (I.M.-M.); (Á.B.-M.); (A.I.G.-L.); (R.R.-G.)
4
Goules AV, Chatzis L, Pezoulas VC, Patsouras M, Mavragani C, Quartuccio L, Baldini C, De Vita S, Fotiadis DI, Tzioufas AG. Identification and evolution of predictors of Sjögren's disease-associated mucosa-associated lymphoid tissue lymphoma development over time: a case-control study. The Lancet. Rheumatology 2024; 6:e693-e702. [PMID: 39182505 DOI: 10.1016/s2665-9913(24)00183-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 06/17/2024] [Accepted: 06/17/2024] [Indexed: 08/27/2024]
Abstract
BACKGROUND Non-Hodgkin lymphomas have a substantial impact on individuals with Sjögren's disease. This study focuses on mucosal-associated lymphoid tissue (MALT) lymphomas, which constitute the majority of Sjögren's disease-associated non-Hodgkin lymphomas. We aimed to identify reliable lymphoma predictors in patients with Sjögren's disease and study their progression over time. METHODS In this case-control study, patients diagnosed with Sjögren's disease-associated MALT lymphoma, with a minimum of 3 years between Sjögren's disease diagnosis and MALT lymphoma diagnosis, were included from three centres specialising in Sjögren's disease (University of Athens, Athens, Greece; University of Pisa, Pisa, Italy; and University of Udine, Udine, Italy) and matched 1:1 with control participants with Sjögren's disease who did not have lymphoma according to age, sex, disease duration at last follow up, and treatment modality. Three harmonised datasets were constructed, curated, and analysed to identify MALT lymphoma predictors, representing three distinct timepoints in lymphomagenesis progression: V1 at Sjögren's disease diagnosis, V2 3-4 years before lymphoma diagnosis, and V3 0·5-1·5 years before lymphoma diagnosis. All recruited patients fulfilled the 2016 American College of Rheumatology-European League Against Rheumatism criteria for Sjögren's disease. The primary outcome was to identify MALT lymphoma predictors in Sjögren's disease, present at the timepoint of Sjögren's disease diagnosis and 3-4 years before the diagnosis of MALT lymphoma. A fast correlation-based feature selection and logistic regression model was used at V1 and V2 to identify MALT lymphoma predictors. The progression of potential predictors was studied across V1, V2, and V3. Histological parameters were not included in the analysis. An individual with lived experience of Sjögren's disease was involved in the study design. 
FINDINGS 80 patients with Sjögren's disease-associated MALT lymphoma were included in the V1 dataset, 68 in the V2 dataset, and 80 in the V3 dataset, and matched to control participants with Sjögren's disease who did not have lymphoma. In both groups, 72 (90%) of 80 participants were women and eight (10%) were men. The mean age at Sjögren's disease diagnosis was 48·6 years (SD 11·6) in the lymphoma group and 48·7 years (11·5) in the control group. All patients were White, with 88 (55%) of 160 individuals of Greek nationality and 72 (45%) of Italian nationality. At the V1 timepoint, rheumatoid factor was the only independent lymphoma predictor (odds ratio 3·33 [95% CI 1·96-5·64]). At the V2 timepoint, rheumatoid factor (3·66 [95% CI 2·08-6·42]) and European League Against Rheumatism Sjögren's Syndrome Disease Activity Index ≥5 (3·88 [1·69-8·90]) were identified as independent lymphoma risk factors. The high disease activity during the transition from the V1 to V2 timepoint was attributed to specific B-cell-derived manifestations, including cryoglobulinaemia and glandular, cutaneous, and hematological manifestations. INTERPRETATION Following up patients at high risk of Sjögren's disease-associated MALT lymphoma based on the temporal progression of predictors presents an opportunity for early diagnosis and potential therapeutic interventions. Rheumatoid factor was the earliest and most persistent independent predictor of lymphoma. Specific B-cell manifestations in combination with rheumatoid factor indicate a more advanced stage of the lymphomagenesis process. FUNDING European Commission-Horizon 2020.
Affiliation(s)
- Andreas V Goules
  - Department of Pathophysiology, Joint Academic Rheumatology Program, School of Medicine, National and Kapodistrian University of Athens, Laiko General Hospital, Athens, Greece; Research Institute for Systemic Autoimmune Diseases, Athens, Greece.
- Loukas Chatzis
  - Department of Pathophysiology, Joint Academic Rheumatology Program, School of Medicine, National and Kapodistrian University of Athens, Laiko General Hospital, Athens, Greece; Research Institute for Systemic Autoimmune Diseases, Athens, Greece; Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- Vasilis C Pezoulas
  - Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, Ioannina, Greece
- Markos Patsouras
  - Department of Pathophysiology, Joint Academic Rheumatology Program, School of Medicine, National and Kapodistrian University of Athens, Laiko General Hospital, Athens, Greece
- Clio Mavragani
  - Department of Physiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
- Luca Quartuccio
  - Rheumatology Clinic, Department of Medical and Biological Sciences, University of Udine, Udine, Italy
- Chiara Baldini
  - Rheumatology Unit, Department of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
- Salvatore De Vita
  - Rheumatology Clinic, Department of Medical and Biological Sciences, University of Udine, Udine, Italy
- Dimitrios I Fotiadis
  - Biomedical Research Institute, Foundation for Research and Technology, Ioannina, Greece
- Athanasios G Tzioufas
  - Research Institute for Systemic Autoimmune Diseases, Athens, Greece; Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
5
Kosvyra A, Filos DT, Fotopoulos DT, Tsave O, Chouvarda I. Toward Ensuring Data Quality in Multi-Site Cancer Imaging Repositories. Information 2024; 15:533. [DOI: 10.3390/info15090533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025] Open
Abstract
Cancer remains a major global health challenge, affecting diverse populations across various demographics. Integrating Artificial Intelligence (AI) into clinical settings to enhance disease outcome prediction presents notable challenges. This study addresses the limitations of AI-driven cancer care due to low-quality datasets by proposing a comprehensive three-step methodology to ensure high data quality in large-scale cancer-imaging repositories. Our methodology encompasses (i) developing a Data Quality Conceptual Model with specific metrics for assessment, (ii) creating a detailed data-collection protocol and a rule set to ensure data homogeneity and proper integration of multi-source data, and (iii) implementing a Data Integration Quality Check Tool (DIQCT) to verify adherence to quality requirements and suggest corrective actions. These steps are designed to mitigate biases, enhance data integrity, and ensure that integrated data meets high-quality standards. We applied this methodology within the INCISIVE project, an EU-funded initiative aimed at a pan-European cancer-imaging repository. The use-case demonstrated the effectiveness of our approach in defining quality rules and assessing compliance, resulting in improved data integration and higher data quality. The proposed methodology can assist the deployment of big data centralized or distributed repositories with data from diverse data sources, thus facilitating the development of AI tools.
Affiliation(s)
- Alexandra Kosvyra
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
- Dimitrios T. Filos
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
- Dimitris Th. Fotopoulos
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
- Olga Tsave
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
- Ioanna Chouvarda
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece
6
Zhao Y, Li X, Zhou C, Peng H, Zheng Z, Chen J, Ding W. A review of cancer data fusion methods based on deep learning. Information Fusion 2024; 108:102361. [DOI: 10.1016/j.inffus.2024.102361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2025]
7
Pezoulas V, Ehret G, Dobretz K, Fotiadis DI, Sakellarios AI. The Diagnosis of Cardiovascular Disease Using Simple Blood Biomarkers Through AI and Big Data. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference 2024; 2024:1-4. [PMID: 40039748 DOI: 10.1109/embc53108.2024.10782688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Cardiovascular disease (CVD) is the leading cause of global mortality, diagnosed primarily through costly imaging modalities which are often overused in asymptomatic patients. Our project aims to develop an AI-based solution for CVD risk stratification using routine blood biomarkers, serving as a pre-imaging test. We used anonymized data from over 500,000 UK Biobank (UKB) patients with CVD assessments. Initially, 701 features including demographics, blood tests, medical conditions, and clinical assessments were selected. The UKB dataset was refined using an automated data curation pipeline to deal with outliers, duplicated fields, and missing values. Then, a hybrid XGBoost classifier was employed, with a scalable loss function, to address overfitting effects during the training process, yielding 0.83 accuracy, 0.82 sensitivity, and 0.84 specificity in diagnosing CVD comorbidities. Key biomarkers identified included blood pressure, BMI, and age. To our knowledge, this is the first case study which utilizes the UKB data towards the identification of cost-effective CVD (non-imaging) risk factors, thus reducing the reliance on imaging modalities.
8
Thompson YT, Li Y, Silovsky J. From Scientific Research to Practical Implementations: Applications to Improve Data Quality in Child Welfare. J Behav Health Serv Res 2024; 51:289-301. [PMID: 38153681 DOI: 10.1007/s11414-023-09875-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/14/2023] [Indexed: 12/29/2023]
Abstract
Child welfare decisions have life-impacting consequences, yet they are often underpinned by limited, inadequate, or poor-quality data. Although research on data quality has gained popularity and made advances in various practical areas, it has not made significant inroads into child welfare fields or data systems. Poor data quality can hinder service decision-making, impacting child behavioral health and well-being as well as increasing unnecessary expenditure of time and resources. Poor data quality can also undermine the validity of research and slow policymaking processes. The purpose of this commentary is to summarize the data quality research base in other fields, describe the obstacles and unique challenges to improving data quality in child welfare, and propose necessary steps in scientific research and practical implementation that enable researchers and practitioners to improve the quality of child welfare services through enhanced data quality.
Affiliation(s)
- Yutian T Thompson
  - University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA.
- Yaqi Li
  - University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA.
- Jane Silovsky
  - University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
9
Nowakowska K, Sakellarios A, Kaźmierski J, Fotiadis DI, Pezoulas VC. AI-Enhanced Predictive Modeling for Identifying Depression and Delirium in Cardiovascular Patients Scheduled for Cardiac Surgery. Diagnostics (Basel) 2023; 14:67. [PMID: 38201376 PMCID: PMC10795764 DOI: 10.3390/diagnostics14010067] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/17/2023] [Revised: 12/21/2023] [Accepted: 12/26/2023] [Indexed: 01/12/2024] Open
Abstract
Several studies have demonstrated a critical association between cardiovascular disease (CVD) and mental health, revealing that approximately one-third of individuals with CVD also experience depression. This comorbidity significantly increases the risk of cardiac complications and mortality, a risk that persists regardless of traditional factors. Addressing this issue, our study pioneers a straightforward, explainable, and data-driven pipeline for predicting depression in CVD patients. METHODS Our study was conducted at a cardiac surgical intensive care unit. A total of 224 participants who were scheduled for elective coronary artery bypass graft surgery (CABG) were enrolled in the study. Prior to surgery, each patient underwent psychiatric evaluation to identify major depressive disorder (MDD) based on the DSM-5 criteria. An advanced data curation workflow was applied to eliminate outliers and inconsistencies and improve data quality. An explainable AI-empowered pipeline was developed, where sophisticated machine learning techniques, including the AdaBoost, random forest, and XGBoost algorithms, were trained and tested on the curated data based on a stratified cross-validation approach. RESULTS Our findings identified a significant correlation between the biomarker "sRAGE" and depression (r = 0.32, p = 0.038). Among the applied models, the random forest classifier demonstrated superior accuracy in predicting depression, with notable scores in accuracy (0.62), sensitivity (0.71), specificity (0.53), and area under the curve (0.67). CONCLUSIONS This study provides compelling evidence that depression in CVD patients, particularly those with elevated "sRAGE" levels, can be predicted with a 62% accuracy rate. Our AI-driven approach offers a promising way for early identification and intervention, potentially revolutionizing care strategies in this vulnerable population.
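As a reading aid for the scores reported in this abstract, the following minimal sketch shows how accuracy, sensitivity, and specificity relate to one another through a confusion matrix. The `ConfusionMatrix` class and the counts are hypothetical and chosen only for illustration (equal class sizes are assumed); they are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class = condition present)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives

    @property
    def sensitivity(self) -> float:
        # Share of actual positives that the model flags.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of actual negatives that the model clears.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all cases classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# Hypothetical counts with equal class sizes (an assumption, NOT study data).
example = ConfusionMatrix(tp=71, fn=29, tn=53, fp=47)
print(f"{example.sensitivity:.2f} {example.specificity:.2f} {example.accuracy:.2f}")
# prints "0.71 0.53 0.62"
```

With equal class sizes, accuracy is the plain average of sensitivity and specificity; with unequal classes it becomes a prevalence-weighted average, so the three numbers should be read together.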
Affiliation(s)
- Karina Nowakowska
  - Department of Old Age Psychiatry and Psychotic Disorders, Medical University of Lodz, 90-419 Lodz, Poland; (K.N.); (J.K.)
- Antonis Sakellarios
  - Laboratory of Biomechanics and Biomedical Engineering, Department of Mechanical and Aeronautics Engineering, University of Patras, 26504 Patras, Greece
  - Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, 45110 Ioannina, Greece
- Jakub Kaźmierski
  - Department of Old Age Psychiatry and Psychotic Disorders, Medical University of Lodz, 90-419 Lodz, Poland; (K.N.); (J.K.)
- Dimitrios I. Fotiadis
  - Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, 45110 Ioannina, Greece
  - Biomedical Research Institute—FORTH, University Campus of Ioannina, 45110 Ioannina, Greece
- Vasileios C. Pezoulas
  - Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, 45110 Ioannina, Greece
  - Biomedical Research Institute—FORTH, University Campus of Ioannina, 45110 Ioannina, Greece
10
Lee S, Roh GH, Kim JY, Ho Lee Y, Woo H, Lee S. Effective data quality management for electronic medical record data using SMART DATA. Int J Med Inform 2023; 180:105262. [PMID: 37871445 DOI: 10.1016/j.ijmedinf.2023.105262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 10/03/2023] [Accepted: 10/11/2023] [Indexed: 10/25/2023]
Abstract
OBJECTIVES In the medical field, we face many challenges, including the high cost of data collection and processing, difficult standards issues, and complex preprocessing techniques. It is necessary to establish an objective and systematic data quality management system that ensures data reliability, mitigates risks caused by incorrect data, reduces data management costs, and increases data utilization. We introduced the concept of SMART DATA in a data quality management system and conducted a case study using real-world data on colorectal cancer. METHODS We defined the data quality management system from three aspects (Construction - Operation - Utilization) based on the life cycle of medical data. Based on this, we proposed the "SMART DATA" concept and tested it on real-world colorectal cancer data. RESULTS We define "SMART DATA" as systematized, high-quality data collected based on the life cycle of data construction, operation, and utilization through quality control activities for medical data. In this study, we selected a scenario using data on colorectal cancer patients from a single medical institution provided by the Clinical Oncology Network (CONNECT). As SMART DATA, we curated 1,724 learning data and 27 Clinically Critical Set (CCS) data for colorectal cancer prediction. These datasets contributed to the development and fine-tuning of the colorectal cancer prediction model, and it was determined that CCS cases had unique characteristics and patterns that warranted additional clinical review and consideration in the context of colorectal cancer prediction. CONCLUSIONS In this study, we conducted primary research to develop a medical data quality management system. This will standardize medical data extraction and quality control methods and increase the utilization of medical data. Ultimately, we aim to provide an opportunity to develop a medical data quality management methodology and contribute to the establishment of a medical data quality management system.
Affiliation(s)
- Seunghee Lee
  - Healthcare Data Science Center, Konyang University Hospital, Daejeon, 35365, Republic of Korea
- Gyun-Ho Roh
  - Biomedical Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
- Jong-Yeup Kim
  - Healthcare Data Science Center, Konyang University Hospital, Daejeon, 35365, Republic of Korea; Department of Biomedical Informatics, College of Medicine, Konyang University, Daejeon, 35365, Republic of Korea
- Young Ho Lee
  - Department of Computer Engineering, Gachon University, Seongnam, Republic of Korea
- Hyekyung Woo
  - Department of Health Administration, Kongju National University, Kongju, 32588, Republic of Korea.
- Suehyun Lee
  - Department of Computer Engineering, Gachon University, Seongnam, Republic of Korea.
11
Sen J, Huynh Q, Marwick TH. Prognostic Signals From Moderate Valve Disease in Big Data: An Artefact of Digital Imaging and Communications in Medicine Structured Reporting? J Am Soc Echocardiogr 2023; 36:1190-1200. [PMID: 37321422 DOI: 10.1016/j.echo.2023.05.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/13/2023] [Revised: 05/25/2023] [Accepted: 05/27/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND Recent studies have identified an association between moderate aortic stenosis (AS) and outcome. We assessed whether Digital Imaging and Communications in Medicine (DICOM) structured reporting (SR), which captures and inserts echocardiographic measurements and text data directly into radiological reports, may lead to misclassifying patients with severe AS as moderate. METHODS Moderate or severe AS cases were filtered from an echocardiography data set based on aortic valve area (AVA) < 1.5 cm2, indexed AVA (AVAi) ≤ 0.85 cm2/m2, mean pressure gradient ≥ 25 mm Hg, dimensionless severity index (DSI) ≤ 0.5, or peak velocity > 3 m/sec. Data validation was conducted by verification of each parameter. All echocardiographic parameters and definitions of AS were compared pre- and postvalidation by taking differences in measurements. Misclassification rates were assessed by determining the percentage of cases that changed AS severity classification and impact on outcomes. Patients were followed over 4.3 ± 1.5 years. RESULTS Of 2,595 validated echocardiograms with AS, up to 36% of the echocardiographic parameters for AS criteria had a >10% difference between DICOM-SR and manual validation, the highest with mean pressure gradient (36%) and the lowest with DSI (6.5%). The validation process changed the reported degree of AS in up to 20.6% of echocardiograms with resultant changes in AS severity and its association with mortality or heart failure-related hospitalizations. In contrast to multiple quantitative metrics in DICOM-SR after manual validation, clinicians' evaluation of AS severity was unable to distinguish composite outcomes over 3 years between moderate and severe AS. The risk of composite outcomes was significantly increased when severe AS was evidenced by at least 1 echocardiographic parameter of severe AS (hazard ratio = 1.24; 95% CI, 1.12-1.37; P < .001). 
The greatest hazard was based on DSI only (hazard ratio = 1.26; 95% CI, 1.10-1.44; P < .001), which was higher after manual validation compared to DICOM-SR. Averaging of repeated echo measures including invalid values contributed the most to erroneous data. CONCLUSIONS Nonpeak data in DICOM-SR led to incorrect categorization of a high proportion of patients based on AS severity definitions. Standardization of data fields and curation to ensure that only peak values are imported from DICOM-SR data are essential.
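The inclusion filter and the averaging problem described in this abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name, argument names, and the three velocity readings are hypothetical, and only the thresholds are taken from the METHODS text.

```python
from statistics import mean


def meets_as_filter(ava=None, avai=None, mean_pg=None, dsi=None, peak_v=None):
    """Return True if any moderate-or-severe AS filter criterion is met.

    Thresholds follow the abstract; arguments left as None are treated
    as not measured. Argument names are hypothetical.
    """
    checks = [
        ava is not None and ava < 1.5,          # aortic valve area, cm^2
        avai is not None and avai <= 0.85,      # indexed AVA, cm^2/m^2
        mean_pg is not None and mean_pg >= 25,  # mean pressure gradient, mm Hg
        dsi is not None and dsi <= 0.5,         # dimensionless severity index
        peak_v is not None and peak_v > 3.0,    # peak velocity, m/s
    ]
    return any(checks)


# Hypothetical repeated peak-velocity readings from one study (m/s).
readings = [2.4, 2.6, 3.4]
averaged = mean(readings)  # what an averaging import would store
peak = max(readings)       # what the abstract says should be imported

print(meets_as_filter(peak_v=averaged), meets_as_filter(peak_v=peak))
```

With these made-up readings the averaged value falls under the velocity threshold while the peak value exceeds it, which is the kind of reclassification the CONCLUSIONS attribute to nonpeak data.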
Affiliation(s)
- Jonathan Sen
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
- Quan Huynh
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
- Thomas H Marwick
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia; Department of Western Health, Melbourne, Victoria, Australia
12
Guo LL, Calligan M, Vettese E, Cook S, Gagnidze G, Han O, Inoue J, Lemmon J, Li J, Roshdi M, Sadovy B, Wallace S, Sung L. Development and validation of the SickKids Enterprise-wide Data in Azure Repository (SEDAR). Heliyon 2023; 9:e21586. [PMID: 38027579 PMCID: PMC10661187 DOI: 10.1016/j.heliyon.2023.e21586] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/01/2022] [Revised: 09/15/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023] Open
Abstract
Objectives To describe the processes developed by The Hospital for Sick Children (SickKids) to enable utilization of electronic health record (EHR) data by creating sequentially transformed schemas for use across multiple user types. Methods We used Microsoft Azure as the cloud service provider and named this effort the SickKids Enterprise-wide Data in Azure Repository (SEDAR). Epic Clarity data from on-premises was copied to a virtual network in Microsoft Azure. Three sequential schemas were developed. The Filtered Schema added a filter to retain only SickKids and valid patients. The Curated Schema created a data structure that was easier to navigate and query. Each table contained a logical unit such as patients, hospital encounters or laboratory tests. Data validation of randomly sampled observations in the Curated Schema was performed. The SK-OMOP Schema was designed to facilitate research and machine learning. Two individuals mapped medical elements to standard Observational Medical Outcomes Partnership (OMOP) concepts. Results A copy of Clarity data was transferred to Microsoft Azure and updated each night using log shipping. The Filtered Schema and Curated Schema were implemented as stored procedures and executed each night with incremental updates or full loads. Data validation required up to 16 iterations for each Curated Schema table. OMOP concept mapping achieved at least 80 % coverage for each SK-OMOP table. Conclusions We described our experience in creating three sequential schemas to address different EHR data access requirements. Future work should consider replicating this approach at other institutions to determine whether approaches are generalizable.
Affiliation(s)
- Lin Lawrence Guo
  - Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Maryann Calligan
  - Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Emily Vettese
  - Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Sadie Cook
  - Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- George Gagnidze
  - Information Management Technology, The Hospital for Sick Children, Toronto, Canada
- Oscar Han
  - Information Management Technology, The Hospital for Sick Children, Toronto, Canada
- Jiro Inoue
  - Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Joshua Lemmon
  - Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Johnson Li
  - Information Management Technology, The Hospital for Sick Children, Toronto, Canada
- Medhat Roshdi
  - Information Management Technology, The Hospital for Sick Children, Toronto, Canada
- Bohdan Sadovy
  - Information Management Technology, The Hospital for Sick Children, Toronto, Canada
- Steven Wallace
  - Information Management Technology, The Hospital for Sick Children, Toronto, Canada
- Lillian Sung
  - Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
  - Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, Canada
13
Bernardi FA, Alves D, Crepaldi N, Yamada DB, Lima VC, Rijo R. Data Quality in Health Research: Integrative Literature Review. J Med Internet Res 2023; 25:e41446. [PMID: 37906223 PMCID: PMC10646672 DOI: 10.2196/41446] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 08/18/2022] [Revised: 04/18/2023] [Accepted: 07/14/2023] [Indexed: 11/02/2023] Open
Abstract
BACKGROUND Decision-making and strategies to improve service delivery must be supported by reliable health data to generate consistent evidence on health status. The data quality management process must ensure the reliability of collected data. Consequently, various methodologies to improve the quality of services are applied in the health field. At the same time, scientific research is constantly evolving to improve data quality through better reproducibility and empowerment of researchers and offers patient groups tools for secured data sharing and privacy compliance. OBJECTIVE Through an integrative literature review, the aim of this work was to identify and evaluate digital health technology interventions designed to support the conducting of health research based on data quality. METHODS A search was conducted in 6 electronic scientific databases in January 2022: PubMed, SCOPUS, Web of Science, Institute of Electrical and Electronics Engineers Digital Library, Cumulative Index of Nursing and Allied Health Literature, and Latin American and Caribbean Health Sciences Literature. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist and flowchart were used to visualize the search strategy results in the databases. RESULTS After analyzing and extracting the outcomes of interest, 33 papers were included in the review. The studies covered the period of 2017-2021 and were conducted in 22 countries. Key findings revealed variability and a lack of consensus in assessing data quality domains and metrics. Data quality factors included the research environment, application time, and development steps. Strategies for improving data quality involved using business intelligence models, statistical analyses, data mining techniques, and qualitative approaches. CONCLUSIONS The main barriers to health data quality are technical, motivational, economical, political, legal, ethical, organizational, human resources, and methodological. 
The data quality process and techniques, from precollection to gathering, postcollection, and analysis, are critical for the final result of a study or the quality of processes and decision-making in a health care organization. The findings highlight the need for standardized practices and collaborative efforts to enhance data quality in health research. Finally, context guides decisions regarding data quality strategies and techniques. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) RR2-10.1101/2022.05.31.22275804.
Affiliation(s)
- Domingos Alves
  - Ribeirão Preto School of Medicine, University of Sao Paulo, Ribeirão Preto, Brazil
- Nathalia Crepaldi
  - Ribeirão Preto School of Medicine, University of Sao Paulo, Ribeirão Preto, Brazil
- Diego Bettiol Yamada
  - Ribeirão Preto School of Medicine, University of Sao Paulo, Ribeirão Preto, Brazil
- Vinícius Costa Lima
  - Ribeirão Preto School of Medicine, University of Sao Paulo, Ribeirão Preto, Brazil
- Rui Rijo
  - Ribeirão Preto School of Medicine, University of Sao Paulo, Ribeirão Preto, Brazil
  - Polytechnic Institute of Leiria, Leiria, Portugal
  - Institute for Systems and Computers Engineering, Coimbra, Portugal
  - Center for Research in Health Technologies and Services, Porto, Portugal
14
Lewis AE, Weiskopf N, Abrams ZB, Foraker R, Lai AM, Payne PRO, Gupta A. Electronic health record data quality assessment and tools: a systematic review. J Am Med Inform Assoc 2023; 30:1730-1740. [PMID: 37390812 PMCID: PMC10531113 DOI: 10.1093/jamia/ocad120] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 02/07/2023] [Revised: 05/16/2023] [Accepted: 06/23/2023] [Indexed: 07/02/2023] Open
Abstract
OBJECTIVE We extended a 2013 literature review on electronic health record (EHR) data quality assessment approaches and tools to determine recent improvements or changes in EHR data quality assessment methodologies. MATERIALS AND METHODS We completed a systematic review of PubMed articles from 2013 to April 2023 that discussed the quality assessment of EHR data. We screened and reviewed papers for the dimensions and methods defined in the original 2013 manuscript. We categorized papers as data quality outcomes of interest, tools, or opinion pieces. We abstracted and defined additional themes and methods through an iterative review process. RESULTS We included 103 papers in the review, of which 73 were data quality outcomes of interest papers, 22 were tools, and 8 were opinion pieces. The most common dimension of data quality assessed was completeness, followed by correctness, concordance, plausibility, and currency. We abstracted conformance and bias as 2 additional dimensions of data quality and structural agreement as an additional methodology. DISCUSSION There has been an increase in EHR data quality assessment publications since the original 2013 review. Consistent dimensions of EHR data quality continue to be assessed across applications. Despite consistent patterns of assessment, there still does not exist a standard approach for assessing EHR data quality. CONCLUSION Guidelines are needed for EHR data quality assessment to improve the efficiency, transparency, comparability, and interoperability of data quality assessment. These guidelines must be both scalable and flexible. Automation could be helpful in generalizing this process.
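Completeness, the dimension this review found to be assessed most often, is commonly operationalized as the share of records with a non-missing value per field. The Python sketch below shows that one reading only; the review does not prescribe a formula, and the function name, the field names, and the choice to treat `None` and empty strings as missing are all assumptions.

```python
def completeness(records, fields):
    """Return the fraction of records with a non-missing value for each field.

    A value counts as missing when it is None or an empty string
    (an assumption; real EHR extracts may use other sentinels).
    """
    total = len(records)
    result = {}
    for field in fields:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = present / total if total else 0.0
    return result


# Hypothetical records with gaps in both fields.
records = [
    {"age": 5, "weight_kg": None},
    {"age": 7, "weight_kg": 20.5},
    {"age": None, "weight_kg": ""},
]
print(completeness(records, ["age", "weight_kg"]))
```

A per-field ratio like this is easy to automate, but it says nothing about the other dimensions listed in the abstract, such as correctness or plausibility.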
Affiliation(s)
- Abigail E Lewis
  - Division of Computational and Data Sciences, Washington University in St. Louis, St. Louis, Missouri, USA
  - Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Nicole Weiskopf
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
- Zachary B Abrams
  - Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Randi Foraker
  - Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Albert M Lai
  - Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Philip R O Payne
  - Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
- Aditi Gupta
  - Institute for Informatics, Data Science and Biostatistics, Washington University in St. Louis, St. Louis, Missouri, USA
15
Pezoulas VC, Exarchos TP, Tachos NS, Goules A, Tzioufas AG, Fotiadis DI. Boosting the performance of MALT lymphoma classification in patients with primary Sjögren's Syndrome through data augmentation: a case study. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023; 2023:1-4. [PMID: 38083761 DOI: 10.1109/embc40787.2023.10340802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
Sjögren's Syndrome (SS) patients with mucosa associated lymphoid tissue lymphomas (MALTLs) and diffuse large B-cell lymphomas (DLBCLs) have 10-year survival rates of 80% and 40%, respectively. This highlights the unique biologic burden of the two histologic forms, as well as the need for early detection and thorough monitoring of these patients. The lack of MALTL patients and the fact that most studies are single cohort and combine patients with different lymphoma subtypes narrow the understanding of MALTL progression. Here, we propose a data augmentation pipeline that utilizes an advanced synthetic data generator, trained on a Pan European data hub of primary SS (pSS) patients, to yield a high-quality synthetic data pool. The latter is used for the development of an enhanced MALTL classification model. Four scenarios were defined to assess the reliability of augmentation. Our results revealed an overall improvement in the accuracy, sensitivity, specificity, and AUC by 7%, 6.3%, 9%, and 6.3%, respectively. This is the first case study that utilizes data augmentation to reflect the progression of MALTL in pSS.
16
Mashoufi M, Ayatollahi H, Khorasani-Zavareh D, Talebi Azad Boni T. Data Quality in Health Care: Main Concepts and Assessment Methodologies. Methods Inf Med 2023; 62:5-18. [PMID: 36716776 DOI: 10.1055/s-0043-1761500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
INTRODUCTION In the health care environment, a huge volume of data is produced on a daily basis. However, the processes of collecting, storing, sharing, analyzing, and reporting health data usually face numerous challenges that lead to incomplete, inaccurate, and untimely data. As a result, data quality issues have received more attention than before. OBJECTIVE The purpose of this article is to provide insight into data quality definitions, dimensions, and assessment methodologies. METHODS In this article, a scoping literature review approach was used to describe and summarize the main concepts related to data quality and data quality assessment methodologies. Search terms were selected to find the relevant articles published between January 1, 2012 and September 31, 2022. The retrieved articles were then reviewed and the results were reported narratively. RESULTS In total, 23 papers were included in the study. According to the results, data quality dimensions varied and different methodologies were used to assess them. Most studies used quantitative methods to measure data quality dimensions either in paper-based or computer-based medical records. Only two studies investigated respondents' opinions about data quality. CONCLUSION In health care, high-quality data are important not only for patient care but also for improving the quality of health care services and decision making. Therefore, using technical and nontechnical solutions as well as constant assessment and supervision is suggested to improve data quality.
Affiliation(s)
- Mehrnaz Mashoufi
  - Department of Health Information Management, School of Medicine, Ardabil University of Medical Sciences, Ardabil, Iran
- Haleh Ayatollahi
  - Health Management and Economics Research Center, Health Management Research Institute, Iran University of Medical Sciences, Tehran, Iran; Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
- Davoud Khorasani-Zavareh
  - Department of Health in Emergencies and Disasters, Safety Promotion and Injury Prevention Research Center, School of Public Health and Safety, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Tahere Talebi Azad Boni
  - Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran; Social Determinants of Health Research Center, Saveh University of Medical Sciences, Saveh, Iran
17
Pezoulas VC, Liontos A, Mylona E, Papaloukas C, Milionis O, Biros D, Kyriakopoulos C, Kostikas K, Milionis H, Fotiadis DI. Predicting the need for mechanical ventilation and mortality in hospitalized COVID-19 patients who received heparin. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022; 2022:1020-1023. [PMID: 36086001 DOI: 10.1109/embc48229.2022.9871261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Although several studies have utilized AI (artificial intelligence)-based solutions to enhance decision making for mechanical ventilation, as well as for mortality in COVID-19, the extraction of explainable predictors regarding heparin's effect in intensive care and mortality has been left unresolved. In the present study, we developed an explainable AI (XAI) workflow to shed light on predictors for admission to the intensive care unit (ICU), as well as for mortality, among hospitalized COVID-19 patients who received heparin. AI-empowered classifiers, such as the hybrid Extreme gradient boosting (HXGBoost) with customized loss functions, were trained on time-series curated clinical data to develop robust AI models. Shapley additive explanation analysis (SHAP) was conducted to determine the positive or negative impact of the predictors on the model's output. The HXGBoost predicted the risk for intensive care and mortality with 0.84 and 0.85 accuracy, respectively. SHAP analysis indicated that a low percentage of lymphocytes at day 7, along with increased FiO2 at days 1 and 5 and low SatO2 at days 3 and 7, increases the probability of mortality, and highlighted the positive effect of heparin administration in the early days of hospitalization for reducing mortality.
18
Kigka VI, Sakellarios AI, Tsakanikas VD, Potsika VT, Koncar I, Fotiadis DI. Detection of Asymptomatic Carotid Artery Stenosis through Machine Learning. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022; 2022:1041-1044. [PMID: 36085692 DOI: 10.1109/embc48229.2022.9870927] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/15/2023]
Abstract
Carotid artery disease, the pathological condition of carotid arteries, is considered the most significant cause of cerebral events and stroke. Carotid artery disease is considered an inflammatory process, which involves the deposition and accumulation of atherosclerotic plaque inside the carotid intima, resulting in the narrowing of the arteries. Carotid artery stenosis (CAS) is either symptomatic or asymptomatic and its presence and location are determined by different imaging modalities, such as carotid duplex ultrasound, computed tomography angiography, magnetic resonance angiography (MRA) and cerebral angiography. The aim of this study is to present a machine learning model for the diagnosis and identification of individuals with asymptomatic carotid artery stenosis, using typical health data as input. More specifically, the overall model is trained with typical demographics, clinical data, risk factors and medical treatment data and is able to classify individuals into high risk (Class 1-CAS group) and low risk (Class 0-non CAS group). In the presented study, we implemented a statistical analysis to check the data quality and the distribution into the two classes. Different feature selection techniques, in combination with classification schemes, were applied for the development of our machine learning model. The overall methodology has been trained and tested using 881 cases (443 subjects in the low risk class and 438 in the high risk class). The highest accuracy (0.82) and area under the curve (0.9) were achieved using the relief feature selection technique and the random forest classification scheme.
19
Pezoulas VC, Tachos NS, Olivotto I, Barlocco F, Fotiadis DI. A "smart" Imputation Approach for Effective Quality Control Across Complex Clinical Data Structures. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022; 2022:1049-1052. [PMID: 36086027 DOI: 10.1109/embc48229.2022.9871919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The overwhelming need to improve the quality of complex data structures in healthcare is more important than ever. Although data quality has been the point of interest in many studies, none of them has focused on the development of quantitative and explainable methods for data imputation. In this work, we propose a "smart" imputation workflow to address missing data across complex data structures in the context of in silico clinical trials. AI algorithms were utilized to produce high-quality virtual patient profiles. A search algorithm was then developed to extract the best virtual patient profiles through the definition of a profile matching score (PMS). A case study was conducted, where the real dataset was randomly contaminated with multiple missing values (e.g., 10 to 50%). In total, 10000 virtual patient profiles with less than 0.02 Kullback-Leibler (KL) divergence were produced to estimate the PMS distribution. The best generator achieved the lowest average squared absolute difference (0.4) and average correlation difference (0.02) with the real dataset highlighting its increased effectiveness for data imputation across complex clinical data structures.
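The abstract screens virtual patient profiles by Kullback-Leibler (KL) divergence from the real data but does not say how the divergence was estimated. As a rough illustration, the Python sketch below computes the discrete form, KL(P || Q) = Σ p_i ln(p_i / q_i), over two histograms that share the same bins. The function name, the `eps` floor for empty bins, and the example histograms are assumptions, not details from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL(P || Q) in nats for two histograms over the same bins.

    Inputs are normalized to sum to 1. Bins where P is zero contribute
    nothing; Q is floored at eps to avoid division by zero (an assumption).
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_sum, q_sum = sum(p), sum(q)
    total = 0.0
    for pi, qi in zip(p, q):
        pi, qi = pi / p_sum, qi / q_sum
        if pi > 0:
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical two-bin histograms of one feature: real vs. synthetic.
real = [0.5, 0.5]
close_synthetic = [0.52, 0.48]
far_synthetic = [0.9, 0.1]

THRESHOLD = 0.02  # the cut-off quoted in the abstract
print(kl_divergence(real, close_synthetic) < THRESHOLD)
print(kl_divergence(real, far_synthetic) < THRESHOLD)
```

KL divergence is asymmetric, so which distribution plays the role of P matters; the abstract does not state the direction used.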
20
Pezoulas VC, Tachos NS, Gkois G, Olivotto I, Barlocco F, Fotiadis DI. Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials. IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY 2022; 3:108-114. [PMID: 36860496 PMCID: PMC9970043 DOI: 10.1109/ojemb.2022.3181796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/15/2022] [Revised: 05/05/2022] [Accepted: 06/06/2022] [Indexed: 12/26/2023] Open
Abstract
Goal: To develop a computationally efficient and unbiased synthetic data generator for large-scale in silico clinical trials (CTs). Methods: We propose the BGMM-OCE, an extension of the conventional BGMM (Bayesian Gaussian Mixture Models) algorithm to provide unbiased estimations regarding the optimal number of Gaussian components and yield high-quality, large-scale synthetic data at reduced computational complexity. Spectral clustering with efficient eigenvalue decomposition is applied to estimate the hyperparameters of the generator. A case study is conducted to compare the performance of BGMM-OCE against four straightforward synthetic data generators for in silico CTs in hypertrophic cardiomyopathy (HCM). Results: The BGMM-OCE generated 30000 virtual patient profiles having the lowest coefficient-of-variation (0.046), inter- and intra-correlation differences (0.017, and 0.016, respectively) with the real ones in reduced execution time. Conclusions: BGMM-OCE overcomes the lack of population size in HCM which obscures the development of targeted therapies and robust risk stratification models.
Affiliation(s)
- Vasileios C. Pezoulas
  - Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
- Nikolaos S. Tachos
  - Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
- George Gkois
  - Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
- Iacopo Olivotto
  - Department of Experimental and Clinical Medicine, University of Florence, 50121 Florence, Italy
- Fausto Barlocco
  - Department of Experimental and Clinical Medicine, University of Florence, 50121 Florence, Italy
- Dimitrios I. Fotiadis
  - Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
  - Department of Biomedical Research, FORTH-IMBB, GR45110 Ioannina, Greece
21
Pezoulas VC, Kourou KD, Mylona E, Papaloukas C, Liontos A, Biros D, Milionis OI, Kyriakopoulos C, Kostikas K, Milionis H, Fotiadis DI. ICU admission and mortality classifiers for COVID-19 patients based on subgroups of dynamically associated profiles across multiple timepoints. Comput Biol Med 2022; 141:105176. [PMID: 35007991 PMCID: PMC8711179 DOI: 10.1016/j.compbiomed.2021.105176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Revised: 12/22/2021] [Accepted: 12/23/2021] [Indexed: 01/08/2023]
Abstract
The coronavirus disease 2019 (COVID-19) which is caused by severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) is consistently causing profound wounds in the global healthcare system due to its increased transmissibility. Currently, there is an urgent unmet need to identify the underlying dynamic associations among COVID-19 patients and distinguish patient subgroups with common clinical profiles towards the development of robust classifiers for ICU admission and mortality. To address this need, we propose a four step pipeline which: (i) enhances the quality of multiple timeseries clinical data through an automated data curation workflow, (ii) deploys Dynamic Bayesian Networks (DBNs) for the detection of features with increased connectivity based on dynamic association analysis across multiple points, (iii) utilizes Self Organizing Maps (SOMs) and trajectory analysis for the early identification of COVID-19 patients with common clinical profiles, and (iv) trains robust multiple additive regression trees (MART) for ICU admission and mortality classification based on the extracted homogeneous clusters, to identify risk factors and biomarkers for disease progression. The contribution of the extracted clusters and the dynamically associated clinical data improved the classification performance for ICU admission to sensitivity 0.83 and specificity 0.83, and for mortality to sensitivity 0.74 and specificity 0.76. Additional information was included to enhance the performance of the classifiers yielding an increase by 4% in sensitivity and specificity for mortality. According to the risk factor analysis, the number of lymphocytes, SatO2, PO2/FiO2, and O2 supply type were highlighted as risk factors for ICU admission and the percentage of neutrophils and lymphocytes, PO2/FiO2, LDH, and ALP for mortality, among others. 
To our knowledge, this is the first study that combines dynamic modeling with clustering analysis to identify homogeneous groups of COVID-19 patients towards the development of robust classifiers for ICU admission and mortality.
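The sensitivity and specificity figures quoted in this abstract are ratios over confusion-matrix counts: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The Python sketch below makes the definitions concrete. The counts are invented for illustration and are not the study's data; they were chosen only so that the ratios come out at the 0.83 reported for ICU admission.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives correctly rejected."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Share of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion-matrix counts (NOT taken from the paper).
counts = {"tp": 83, "fn": 17, "tn": 83, "fp": 17}

print(sensitivity(counts["tp"], counts["fn"]))
print(specificity(counts["tn"], counts["fp"]))
print(accuracy(**counts))
```

Because accuracy depends on class balance while sensitivity and specificity do not, reporting the latter two, as the abstract does, is usually more informative for imbalanced clinical outcomes.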
Affiliation(s)
- Vasileios C Pezoulas
  - Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece
- Konstantina D Kourou
  - Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece
- Eugenia Mylona
  - Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece
- Costas Papaloukas
  - Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece; Dept. of Biological Applications and Technology, University of Ioannina, Ioannina, GR45100, Greece
- Angelos Liontos
  - Dept. of Internal Medicine, School of Medicine, University of Ioannina, Ioannina, GR45110, Greece
- Dimitrios Biros
  - Dept. of Internal Medicine, School of Medicine, University of Ioannina, Ioannina, GR45110, Greece
- Orestis I Milionis
  - Dept. of Internal Medicine, School of Medicine, University of Ioannina, Ioannina, GR45110, Greece
- Chris Kyriakopoulos
  - Respiratory Medicine Dept., School of Medicine, University of Ioannina, Ioannina, GR45110, Greece
- Kostantinos Kostikas
  - Respiratory Medicine Dept., School of Medicine, University of Ioannina, Ioannina, GR45110, Greece
- Haralampos Milionis
  - Dept. of Internal Medicine, School of Medicine, University of Ioannina, Ioannina, GR45110, Greece
- Dimitrios I Fotiadis
  - Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece; Institute of Biomedical Research, FORTH, Ioannina, GR45110, Greece
22
Pezoulas VC, Goules A, Kalatzis F, Chatzis L, Kourou KD, Venetsanopoulou A, Exarchos TP, Gandolfo S, Votis K, Zampeli E, Burmeister J, May T, Marcelino Pérez M, Lishchuk I, Chondrogiannis T, Andronikou V, Varvarigou T, Filipovic N, Tsiknakis M, Baldini C, Bombardieri M, Bootsma H, Bowman SJ, Soyfoo MS, Parisis D, Delporte C, Devauchelle-Pensec V, Pers JO, Dörner T, Bartoloni E, Gerli R, Giacomelli R, Jonsson R, Ng WF, Priori R, Ramos-Casals M, Sivils K, Skopouli F, Torsten W, A. G. van Roon J, Xavier M, De Vita S, Tzioufas AG, Fotiadis DI. Addressing the clinical unmet needs in primary Sjögren's Syndrome through the sharing, harmonization and federated analysis of 21 European cohorts. Comput Struct Biotechnol J 2022; 20:471-484. [PMID: 35070169 PMCID: PMC8760551 DOI: 10.1016/j.csbj.2022.01.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/15/2021] [Revised: 12/30/2021] [Accepted: 01/01/2022] [Indexed: 12/26/2022] Open
Abstract
For many decades, the clinical unmet needs of primary Sjögren's Syndrome (pSS) have been left unresolved due to the rareness of the disease and the complexity of the underlying pathogenic mechanisms, including the pSS-associated lymphomagenesis process. Here, we present the HarmonicSS cloud-computing exemplar which offers beyond the state-of-the-art data analytics services to address the pSS clinical unmet needs, including the development of lymphoma classification models and the identification of biomarkers for lymphomagenesis. The users of the platform have been able to successfully interlink, curate, and harmonize 21 regional, national, and international European cohorts of 7,551 pSS patients with respect to the ethical and legal issues for data sharing. Federated AI algorithms were trained across the harmonized databases, with reduced execution time complexity, yielding robust lymphoma classification models with 85% accuracy, 81.25% sensitivity, 85.4% specificity along with 5 biomarkers for lymphoma development. To our knowledge, this is the first GDPR compliant platform that provides federated AI services to address the pSS clinical unmet needs.
Affiliation(s)
- Vasileios C. Pezoulas
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
| | - Andreas Goules
- Dept. of Pathophysiology, School of Medicine, University of Athens, Athens, Greece
| | - Fanis Kalatzis
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
| | - Luke Chatzis
- Dept. of Pathophysiology, School of Medicine, University of Athens, Athens, Greece
| | - Konstantina D. Kourou
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
| | - Aliki Venetsanopoulou
- Dept. of Pathophysiology, School of Medicine, University of Athens, Athens, Greece
- University Hospital of Ioannina, Ioannina, Greece
| | - Themis P. Exarchos
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
- Dept. of Informatics, Ionian University, Corfu, Greece
| | - Saviana Gandolfo
- Clinic of Rheumatology, Dept. of Medical and Biological Sciences, Udine University, Udine, Italy
| | | | - Evi Zampeli
- Institute for Systemic Autoimmune and Neurological Diseases, Athens, Greece
| | - Jan Burmeister
- Fraunhofer Institute for Computer Graphics Research IGD, Darmstadt, Germany
| | - Thorsten May
- Fraunhofer Institute for Computer Graphics Research IGD, Darmstadt, Germany
| | | | - Iryna Lishchuk
- Institute of Legal Informatics, Leibniz Universität Hannover, Hannover, Germany
| | - Thymios Chondrogiannis
- Institute of Communication and Computer Systems, School of Electrical and Computer Engineering, National and Technical University of Athens, Athens, Greece
| | - Vassiliki Andronikou
- Institute of Communication and Computer Systems, School of Electrical and Computer Engineering, National and Technical University of Athens, Athens, Greece
| | - Theodora Varvarigou
- Institute of Communication and Computer Systems, School of Electrical and Computer Engineering, National and Technical University of Athens, Athens, Greece
| | - Nenad Filipovic
- Bioengineering Research and Development Center, Faculty of Engineering, University of Kragujevac, Kragujevac, Serbia
| | - Manolis Tsiknakis
- Biomedical Informatics and eHealth Laboratory, Dept. of Electrical and Computer Engineering, Hellenic Mediterranean University, Heraklion, Greece
| | - Chiara Baldini
- Dept. of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
| | - Michele Bombardieri
- Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Queen Mary University of London and Barts’ Health NHS Trust, London, United Kingdom
| | - Hendrika Bootsma
- Dept. of Rheumatology and Clinical Immunology, University of Groningen, University Medical Center Groningen, the Netherlands
| | - Simon J. Bowman
- Rheumatology Dept., University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
| | | | - Dorian Parisis
- Laboratory of Pathophysiological Biochemistry and Nutrition, Université Libre de Bruxelles, Brussels, Belgium
| | - Christine Delporte
- Laboratory of Pathophysiological Biochemistry and Nutrition, Université Libre de Bruxelles, Brussels, Belgium
| | | | - Jacques-Olivier Pers
- Univ Brest, Inserm, CHU de Brest, UMR1227, Lymphocytes B et Autoimmunité, Brest, France
| | - Thomas Dörner
- Dept. of Rheumatology and Clinical Immunology, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Elena Bartoloni
- Rheumatology Unit, Dept. of Medicine and Surgery, University of Perugia, Perugia, Italy
| | - Roberto Gerli
- Rheumatology Unit, Dept. of Medicine and Surgery, University of Perugia, Perugia, Italy
| | - Roberto Giacomelli
- Division of Rheumatology, Dept. of Biotechnological and Applied Clinical Sciences, University of L'Aquila, L'Aquila, Italy
| | - Roland Jonsson
- Dept. of Clinical Science, University of Bergen, Bergen, Norway
| | - Wan-Fai Ng
- Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, UK
| | - Roberta Priori
- Dept. of Internal Medicine and Medical Specialties, Rheumatology Clinic, Sapienza University of Rome, Rome, Italy
| | - Manuel Ramos-Casals
- Laboratory of Autoimmune Diseases Josep Font, IDIBAPS-CELLEX, Barcelona, Spain
| | | | - Fotini Skopouli
- Institute for Systemic Autoimmune and Neurological Diseases, Athens, Greece
- Dept. of Internal Medicine and Clinical Immunology, Euroclinic Hospital, Athens, Greece
| | - Witte Torsten
- Dept. of Rheumatology and Immunology, Hanover Medical School, Hanover, Germany
| | - Joel A. G. van Roon
- Dept. of Rheumatology and Clinical Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Mariette Xavier
- Dept. of Rheumatology, Hôpital Bicêtre, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Salvatore De Vita
- Clinic of Rheumatology, Dept. of Medical and Biological Sciences, Udine University, Udine, Italy
| | | | - Dimitrios I. Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Dept. of Materials Science and Engineering, University of Ioannina, Ioannina, Greece
- Dept. of Biomedical Research, FORTH-IMBB, Ioannina, Greece
|
23
|
Razzaghi H, Greenberg J, Bailey LC. Developing a systematic approach to assessing data quality in secondary use of clinical data based on intended use. Learn Health Syst 2022; 6:e10264. [PMID: 35036548 PMCID: PMC8753309 DOI: 10.1002/lrh2.10264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/07/2020] [Revised: 02/24/2021] [Accepted: 03/01/2021] [Indexed: 11/10/2022] Open
Abstract
INTRODUCTION Secondary use of electronic health record (EHR) data for research requires that the data are fit for use. Data quality (DQ) frameworks have traditionally focused on structural conformance and completeness of clinical data extracted from source systems. In this paper, we propose a framework for evaluating semantic DQ that will allow researchers to evaluate fitness for use prior to analyses. METHODS We reviewed current DQ literature, as well as experience from recent multisite network studies, and identified gaps in the literature and current practice. Derived principles were used to construct the conceptual framework with attention to both analytic fitness and informatics practice. RESULTS We developed a systematic framework that guides researchers in assessing whether a data source is fit for use for their intended study or project. It combines tools for evaluating clinical context with DQ principles, as well as factoring in the characteristics of the data source, in order to develop semantic DQ checks. CONCLUSIONS Our framework provides a systematic process for DQ development. Further work is needed to codify practices and metadata around both structural and semantic data quality.
Affiliation(s)
- Hanieh Razzaghi
- Department of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Metadata Research Center, College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania, USA
| | - Jane Greenberg
- Metadata Research Center, College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania, USA
| | - L. Charles Bailey
- Department of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
24
|
Pezoulas VC, Kourou KD, Papaloukas C, Triantafyllia V, Lampropoulou V, Siouti E, Papadaki M, Salagianni M, Koukaki E, Rovina N, Koutsoukou A, Andreakos E, Fotiadis DI. A Multimodal Approach for the Risk Prediction of Intensive Care and Mortality in Patients with COVID-19. Diagnostics (Basel) 2021; 12:56. [PMID: 35054223 PMCID: PMC8774804 DOI: 10.3390/diagnostics12010056] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/26/2021] [Revised: 12/22/2021] [Accepted: 12/26/2021] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Although several studies have been launched towards the prediction of risk factors for mortality and admission to the intensive care unit (ICU) in COVID-19, none of them focuses on the development of explainable AI models to define an ICU scoring index using dynamically associated biological markers. METHODS We propose a multimodal approach which combines explainable AI models with dynamic modeling methods to shed light on the clinical features of COVID-19. Dynamic Bayesian networks were used to seek associations among cytokines across four time intervals after hospitalization. Explainable gradient boosting trees were trained to predict the risk for ICU admission and mortality towards the development of an ICU scoring index. RESULTS Our results highlight LDH, IL-6, IL-8, Cr, number of monocytes, lymphocyte count, and TNF as risk predictors for ICU admission and survival, along with LDH, age, CRP, Cr, WBC, and lymphocyte count for mortality in the ICU, with prediction accuracy 0.79 and 0.81, respectively. These risk factors were combined with dynamically associated biological markers to develop an ICU scoring index with accuracy 0.9. CONCLUSIONS To our knowledge, this is the first multimodal and explainable AI model which quantifies the risk of intensive care with accuracy up to 0.9 across multiple timepoints.
Affiliation(s)
- Vasileios C. Pezoulas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece; (V.C.P.); (K.D.K.); (C.P.)
| | - Konstantina D. Kourou
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece; (V.C.P.); (K.D.K.); (C.P.)
| | - Costas Papaloukas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece; (V.C.P.); (K.D.K.); (C.P.)
- Department of Biological Applications and Technology, University of Ioannina, GR45100 Ioannina, Greece
| | - Vassiliki Triantafyllia
- Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, GR11527 Athens, Greece; (V.T.); (V.L.); (E.S.); (M.P.); (M.S.); (E.A.)
| | - Vicky Lampropoulou
- Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, GR11527 Athens, Greece; (V.T.); (V.L.); (E.S.); (M.P.); (M.S.); (E.A.)
| | - Eleni Siouti
- Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, GR11527 Athens, Greece; (V.T.); (V.L.); (E.S.); (M.P.); (M.S.); (E.A.)
| | - Maria Papadaki
- Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, GR11527 Athens, Greece; (V.T.); (V.L.); (E.S.); (M.P.); (M.S.); (E.A.)
| | - Maria Salagianni
- Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, GR11527 Athens, Greece; (V.T.); (V.L.); (E.S.); (M.P.); (M.S.); (E.A.)
| | - Evangelia Koukaki
- Intensive Care Unit (ICU), 1st Department of Respiratory Medicine, Medical School, National and Kapodistrian University of Athens, ‘Sotiria’ General Hospital of Chest Diseases, GR11527 Athens, Greece; (E.K.); (N.R.); (A.K.)
| | - Nikoletta Rovina
- Intensive Care Unit (ICU), 1st Department of Respiratory Medicine, Medical School, National and Kapodistrian University of Athens, ‘Sotiria’ General Hospital of Chest Diseases, GR11527 Athens, Greece; (E.K.); (N.R.); (A.K.)
| | - Antonia Koutsoukou
- Intensive Care Unit (ICU), 1st Department of Respiratory Medicine, Medical School, National and Kapodistrian University of Athens, ‘Sotiria’ General Hospital of Chest Diseases, GR11527 Athens, Greece; (E.K.); (N.R.); (A.K.)
| | - Evangelos Andreakos
- Laboratory of Immunobiology, Center for Clinical, Experimental Surgery and Translational Research, Biomedical Research Foundation of the Academy of Athens, GR11527 Athens, Greece; (V.T.); (V.L.); (E.S.); (M.P.); (M.S.); (E.A.)
| | - Dimitrios I. Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece; (V.C.P.); (K.D.K.); (C.P.)
- Department of Biomedical Research, Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology (FORTH-IMBB), GR45110 Ioannina, Greece
|
25
|
Pezoulas VC, Grigoriadis GI, Tachos NS, Barlocco F, Olivotto I, Fotiadis DI. Variational Gaussian Mixture Models with robust Dirichlet concentration priors for virtual population generation in hypertrophic cardiomyopathy: a comparison study. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:1674-1677. [PMID: 34891607 DOI: 10.1109/embc46164.2021.9629653] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Nowadays, there is a growing need for the development of computationally efficient virtual population generators for large-scale in-silico clinical trials. In this work, we utilize the Gaussian Mixture Models (GMM) with variational Bayesian inference (BGMM) using robust estimations of Dirichlet concentration priors for the generation of virtual populations. The estimations were based on an exponential transformation of the number of Gaussian components. The proposed method was compared against state-of-the-art virtual data generators, such as, the Bayesian networks, the supervised tree ensembles (STE), the unsupervised tree ensembles (UTE), and the artificial neural networks (ANN) towards the generation of 20000 virtual patients with hypertrophic cardiomyopathy (HCM). Our results suggest that the proposed BGMM can yield virtual distributions with small inter- and intra-correlation difference (0.013 and 0.012), in lower execution time (4.321 sec) than STE which achieved the second-best performance.
|
26
|
Kigka VI, Sakellarios AI, Mantzaris MD, Tsakanikas VD, Potsika VT, Palombo D, Montecucco F, Fotiadis DI. A Machine Learning Model for the Identification of High risk Carotid Atherosclerotic Plaques. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:2266-2269. [PMID: 34891738 DOI: 10.1109/embc46164.2021.9630654] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 06/14/2023]
Abstract
Carotid artery disease is an inflammatory condition involving the deposition and accumulation of lipid species and leucocytes from blood into the arterial wall, which causes the narrowing of the carotid arteries on either side of the neck. Different imaging modalities can be implemented to determine the presence and the location of carotid artery stenosis, such as carotid ultrasound, computed tomography angiography (CTA), magnetic resonance angiography (MRA), or cerebral angiography. However, apart from the presence and the degree of stenosis of the carotid arteries, the vulnerability of the carotid atherosclerotic plaques constitutes a significant factor for the progression of the disease and the presence of disease symptoms. In this study, our aim is to develop and present a machine learning model for the identification of high risk plaques using non-imaging based features and non-invasive imaging based features. Firstly, we implemented statistical analysis to identify the most statistically significant features according to the defined output, and subsequently, we implemented different feature selection techniques and classification schemes for the development of our machine learning model. The overall methodology has been trained and tested using 208 cases: 107 cases of low risk plaques and 101 cases of high risk plaques. The highest accuracy of 0.76 was achieved using the relief feature selection technique and the support vector machine classification scheme. The innovative aspect of the proposed machine learning model is both the different categories of the utilized input features and the definition of the problem to be solved.
|
27
|
Pezoulas VC, Exarchos TP, Tzioufas AG, Fotiadis DI. Multiple additive regression trees with hybrid loss for classification tasks across heterogeneous clinical data in distributed environments: a case study. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:1670-1673. [PMID: 34891606 DOI: 10.1109/embc46164.2021.9629912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Multiple additive regression trees (MART) have been widely used in the literature for various classification tasks. However, the overfitting effects of MART across heterogeneous and highly imbalanced big data structures within distributed environments have not yet been investigated. In this work, we utilize distributed MART with hybrid loss to resolve overfitting effects during the training of disease classification models in a case study with 10 heterogeneous and distributed clinical datasets. Lexical and semantic analysis methods were utilized to match heterogeneous terminologies with 80% overlap. Data augmentation was used to resolve class imbalance, yielding virtual data with goodness of fit 0.01 and correlation difference 0.02. Our results highlight the favorable performance of the proposed distributed MART on the augmented data, with an average increase of 7.3% in accuracy, 6.8% in sensitivity, and 10.4% in specificity, for a specific loss function topology.
|
28
|
Pezoulas VC, Kalatzis F, Exarchos TP, Chatzis L, Gandolfo S, Goules A, De Vita S, Tzioufas AG, Fotiadis DI. A federated AI strategy for the classification of patients with Mucosa Associated Lymphoma Tissue (MALT) lymphoma across multiple harmonized cohorts. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:1666-1669. [PMID: 34891605 DOI: 10.1109/embc46164.2021.9630014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Mucosa Associated Lymphoma Tissue (MALT) type is an extremely rare type of lymphoma which occurs in less than 3% of patients with primary Sjögren's Syndrome (pSS). No reported studies so far have been able to investigate risk factors for MALT development across multiple cohort databases with sufficient statistical power. Here, we present a generalized, federated AI (artificial intelligence) strategy which enables the training of AI algorithms across multiple harmonized databases. A case study is conducted towards the development of MALT classification models across 17 databases on pSS. Advanced AI algorithms were developed, including federated Multinomial Naïve Bayes (FMNB), federated gradient boosting trees (FGBT), FGBT with dropouts (FDART), and the federated Multilayer Perceptron (FMLP). The FDART with dropout rate 0.3 achieved the best performance with sensitivity 0.812, and specificity 0.829, yielding 8 biomarkers as prominent for MALT development.
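Illustrative note: the abstract above does not state how the per-database models are combined. One common pattern in federated learning, shown here purely as an assumption, is to average locally trained parameter vectors weighted by each site's sample count, so that raw patient data never leaves a site. This applies most naturally to parametric models such as a multilayer perceptron; tree-based models are usually federated differently. The function name `federated_average` and all numbers are hypothetical.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      cohort_sizes: Sequence[int]) -> List[float]:
    """Average per-site parameter vectors, weighted by each site's sample count."""
    if not local_weights or len(local_weights) != len(cohort_sizes):
        raise ValueError("need one non-empty weight vector per cohort")
    total = sum(cohort_sizes)
    if total <= 0:
        raise ValueError("cohort sizes must sum to a positive number")
    dim = len(local_weights[0])
    if any(len(w) != dim for w in local_weights):
        raise ValueError("all weight vectors must have the same length")
    # Only the parameters and the counts are shared, never the records.
    return [
        sum(w[j] * n for w, n in zip(local_weights, cohort_sizes)) / total
        for j in range(dim)
    ]


# Two hypothetical sites with 100 and 300 patients.
site_weights = [[0.0, 1.0], [1.0, 3.0]]
site_sizes = [100, 300]
global_weights = federated_average(site_weights, site_sizes)
```

The larger site contributes three times the weight of the smaller one, so the aggregate sits closer to its parameters.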
|
29
|
Abstract
A huge array of data in nephrology is collected through patient registries, large epidemiological studies, electronic health records, administrative claims, clinical trial repositories, mobile health devices and molecular databases. Application of these big data, particularly using machine-learning algorithms, provides a unique opportunity to obtain novel insights into kidney diseases, facilitate personalized medicine and improve patient care. Efforts to make large volumes of data freely accessible to the scientific community, increased awareness of the importance of data sharing and the availability of advanced computing algorithms will facilitate the use of big data in nephrology. However, challenges exist in accessing, harmonizing and integrating datasets in different formats from disparate sources, improving data quality and ensuring that data are secure and the rights and privacy of patients and research participants are protected. In addition, the optimism for data-driven breakthroughs in medicine is tempered by scepticism about the accuracy of calibration and prediction from in silico techniques. Machine-learning algorithms designed to study kidney health and diseases must be able to handle the nuances of this specialty, must adapt as medical practice continually evolves, and must have global and prospective applicability for external and future datasets.
|
30
|
Wang J, Pan C, Ma X. Assessment of the Quality Management System for Clinical Nutrition in Jiangsu: Survey Study. JMIR Form Res 2021; 5:e27285. [PMID: 34569942 PMCID: PMC8506260 DOI: 10.2196/27285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2021] [Revised: 08/02/2021] [Accepted: 08/24/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND An electronic system that automatically collects medical information can realize timely monitoring of patient health and improve the effectiveness and accuracy of medical treatment. To our knowledge, the application of artificial intelligence (AI) in medical service quality assessment has been minimally evaluated, especially for clinical nutrition departments in China. From the perspective of medical ethics, patient safety comes before any other factors within health science, and this responsibility belongs to the quality management system (QMS) within medical institutions. OBJECTIVE This study aims to evaluate the QMS for clinical nutrition in Jiangsu, monitor its performance in quality assessment and human resource management from a nutrition aspect, and investigate the application and development of AI in medical quality control. METHODS The participants for this study were the staff of 70 clinical nutrition departments of the tertiary hospitals in Jiangsu Province, China. These departments are all members of the Quality Management System of Clinical Nutrition in Jiangsu (QMSNJ). An online survey was conducted on all 341 employees within all clinical nutrition departments based on the staff information from the surveyed medical institutions. The questionnaire contains five sections, and the data analysis and AI evaluation were focused on human resource information. RESULTS A total of 330 questionnaires were collected, with a response rate of 96.77% (330/341). A QMS for clinical nutrition was built for clinical nutrition departments in Jiangsu and achieved its target of human resource improvements, especially among dietitians. The growing number of participating departments (an increase of 42.8% from 2018 to 2020) and the significant growth of dietitians (t93.4=-0.42; P=.02) both show the advancements of the QMSNJ. 
CONCLUSIONS As the first innovation of an online platform for quality management in Jiangsu, the Jiangsu Province Clinical Nutrition Management Platform was successfully implemented as a QMS for this study. This multidimensional electronic system can help the QMSNJ and clinical nutrition departments achieve quality assessment from various aspects so as to realize the continuous improvement of clinical nutrition. The use of an online platform and AI technology for quality assessment is worth recommending and promoting in the future.
Affiliation(s)
- Jin Wang
- First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Chen Pan
- First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Xianghua Ma
- First Affiliated Hospital of Nanjing Medical University, Nanjing, China
|
31
|
Pezoulas VC, Grigoriadis GI, Gkois G, Tachos NS, Smole T, Bosnić Z, Pičulin M, Olivotto I, Barlocco F, Robnik-Šikonja M, Jakovljevic DG, Goules A, Tzioufas AG, Fotiadis DI. A computational pipeline for data augmentation towards the improvement of disease classification and risk stratification models: A case study in two clinical domains. Comput Biol Med 2021; 134:104520. [PMID: 34118751 DOI: 10.1016/j.compbiomed.2021.104520] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/08/2021] [Revised: 05/13/2021] [Accepted: 05/24/2021] [Indexed: 11/20/2022]
Abstract
Virtual population generation is an emerging field in data science with numerous applications in healthcare towards the augmentation of clinical research databases with significant lack of population size. However, the impact of data augmentation on the development of AI (artificial intelligence) models to address clinical unmet needs has not yet been investigated. In this work, we assess, for the first time in the literature, whether the aggregation of real with virtual patient data can improve the performance of the existing risk stratification and disease classification models in two rare clinical domains, namely primary Sjögren's Syndrome (pSS) and hypertrophic cardiomyopathy (HCM). To do so, multivariate approaches, such as the multivariate normal distribution (MVND), and straightforward ones, such as Bayesian networks, artificial neural networks (ANNs), and tree ensembles, are compared with respect to their performance in generating high-quality virtual data. Both boosting and bagging algorithms, such as Gradient boosting trees (XGBoost), AdaBoost and Random Forests (RFs), were trained on the augmented data to evaluate the performance improvement for lymphoma classification and HCM risk stratification. Our results revealed the favorable performance of the tree ensemble generators in both domains, yielding virtual data with goodness-of-fit 0.021 and KL-divergence 0.029 in pSS and 0.029 and 0.027 in HCM, respectively. The application of XGBoost on the augmented data revealed an increase of 10.9% in accuracy, 10.7% in sensitivity, and 11.5% in specificity for lymphoma classification, and of 16.1% in accuracy, 16.9% in sensitivity, and 13.7% in specificity for HCM risk stratification.
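Illustrative note: the abstract above reports KL-divergence as one measure of how closely the virtual data follow the real data, but does not describe how it was computed. A minimal sketch of the textbook discrete form, KL(P || Q) = sum_i p_i log(p_i / q_i), is given below under the assumption that both datasets have been binned into the same histogram. The function name `kl_divergence`, the `eps` floor, and the histogram counts are hypothetical.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(P || Q) in nats for two histograms defined over the same bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("histograms must have positive total mass")
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / sum_p, q_i / sum_q  # normalise counts to probabilities
        if p_i > 0:  # bins with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))  # eps avoids log of infinity
    return total


# Hypothetical binned counts for one feature (real vs. virtual patients).
real_hist = [30, 50, 20]
virtual_hist = [28, 52, 20]
divergence = kl_divergence(real_hist, virtual_hist)
```

A value near zero means the virtual histogram is close to the real one; identical histograms give exactly zero.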
Affiliation(s)
- Vasileios C Pezoulas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece
| | - Grigoris I Grigoriadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece
| | - George Gkois
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece
| | - Nikolaos S Tachos
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece
| | - Tim Smole
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, 1000, Ljubljana, Slovenia
| | - Zoran Bosnić
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, 1000, Ljubljana, Slovenia
| | - Matej Pičulin
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, 1000, Ljubljana, Slovenia
| | - Iacopo Olivotto
- Department of Experimental and Clinical Medicine, University of Florence and Cardiomyopathies Unit, Azienda Ospedaliera Careggi, Florence, Italy
| | - Fausto Barlocco
- Department of Experimental and Clinical Medicine, University of Florence and Cardiomyopathies Unit, Azienda Ospedaliera Careggi, Florence, Italy
| | - Marko Robnik-Šikonja
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, 1000, Ljubljana, Slovenia
| | - Djordje G Jakovljevic
- Faculty of Medical Sciences, Newcastle University, Newcastle Upon Tyne, UK and with the Faculty of Health and Life Sciences, Coventry University, Coventry, UK
| | - Andreas Goules
- Department of Pathophysiology, Faculty of Medicine, National and Kapodistrian University of Athens (NKUA), GR 15772, Athens, Greece
| | - Athanasios G Tzioufas
- Department of Pathophysiology, Faculty of Medicine, National and Kapodistrian University of Athens (NKUA), GR 15772, Athens, Greece
| | - Dimitrios I Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, GR45110, Greece; Department of Biomedical Research, FORTH-IMBB, Ioannina, GR45110, Greece.
|
32
|
Pezoulas VC, Papaloukas C, Veyssiere M, Goules A, Tzioufas AG, Soumelis V, Fotiadis DI. A computational workflow for the detection of candidate diagnostic biomarkers of Kawasaki disease using time-series gene expression data. Comput Struct Biotechnol J 2021; 19:3058-3068. [PMID: 34136104 PMCID: PMC8178098 DOI: 10.1016/j.csbj.2021.05.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/11/2021] [Revised: 05/17/2021] [Accepted: 05/20/2021] [Indexed: 12/15/2022] Open
Abstract
Unlike autoimmune diseases, there is no known constitutive and disease-defining biomarker for systemic autoinflammatory diseases (SAIDs). Kawasaki disease (KD) is one of the "undiagnosed" types of SAIDs whose pathogenic mechanism and gene mutation still remain unknown. To address this issue, we have developed a sequential computational workflow which clusters KD patients with similar gene expression profiles across the three different KD phases (Acute, Subacute and Convalescent) and utilizes the resulting clustermap to detect prominent genes that can be used as diagnostic biomarkers for KD. Self-Organizing Maps (SOMs) were employed to cluster patients with similar gene expressions across the three phases through inter-phase and intra-phase clustering. Then, false discovery rate (FDR)-based feature selection was applied to detect genes that significantly deviate across the per-phase clusters. Our results revealed five genes as candidate biomarkers for KD diagnosis, namely HLA-DQB1, HLA-DRA, ZBTB48, TNFRSF13C, and CASD1. To our knowledge, these five genes are reported for the first time in the literature. The impact of the discovered genes for KD diagnosis against the known ones was demonstrated by training boosting ensembles (AdaBoost and XGBoost) for KD classification on common platform and cross-platform datasets. The classifiers which were trained on the proposed genes from the common platform data yielded an average increase of 4.40% in accuracy, 5.52% in sensitivity, and 3.57% in specificity compared with the known genes in the Acute and Subacute phases, followed by a notable increase of 2.30% in accuracy, 2.20% in sensitivity, and 4.70% in specificity in the cross-platform analysis.
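Illustrative note: the abstract above mentions FDR-based feature selection without naming the procedure. The sketch below assumes the widely used Benjamini-Hochberg step-up procedure: sort the m p-values, find the largest rank k with p_(k) <= alpha * k / m, and select every feature ranked at or below k. The function name `benjamini_hochberg` and the p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return a per-feature flag: True if selected at FDR level alpha."""
    m = len(p_values)
    # Indices of the features, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank  # step-up: keep the largest rank that passes
    selected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            selected[idx] = True
    return selected


# Hypothetical per-gene p-values, for illustration only.
gene_p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
selected_genes = benjamini_hochberg(gene_p_values, alpha=0.05)
```

Because the procedure is step-up, a p-value that misses its own threshold is still selected when a larger p-value further down the ranking passes.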
Affiliation(s)
- Vasileios C. Pezoulas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina GR45110, Greece
| | - Costas Papaloukas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina GR45110, Greece
- Department of Biological Applications and Technology, University of Ioannina, Ioannina GR45100, Greece
| | - Maëva Veyssiere
- INSERM U976, Human Immunology, Physiopathology and Immunotherapy, Paris, France
| | - Andreas Goules
- Department of Pathophysiology, School of Medicine, University of Athens, Athens GR15772, Greece
| | - Athanasios G. Tzioufas
- Department of Pathophysiology, School of Medicine, University of Athens, Athens GR15772, Greece
| | - Vassili Soumelis
- INSERM U976, Human Immunology, Physiopathology and Immunotherapy, Paris, France
- Hôpital Saint Louis, Saint Louis Research Institute, Paris, France
| | - Dimitrios I. Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina GR45110, Greece
- Department of Biomedical Research, FORTH (Foundation for Research & Technology)-IMBB (Institute of Molecular Biology and Biotechnology), Ioannina GR45110, Greece
|
33
|
Abstract
Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Next, we discuss hardware and software needed to get started with deep learning and interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.
|
34
|
Kourou K, Manikis G, Poikonen-Saksela P, Mazzocco K, Pat-Horenczyk R, Sousa B, Oliveira-Maia AJ, Mattson J, Roziner I, Pettini G, Kondylakis H, Marias K, Karademas E, Simos P, Fotiadis DI. A machine learning-based pipeline for modeling medical, socio-demographic, lifestyle and self-reported psychological traits as predictors of mental health outcomes after breast cancer diagnosis: An initial effort to define resilience effects. Comput Biol Med 2021; 131:104266. [PMID: 33607379 DOI: 10.1016/j.compbiomed.2021.104266] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 12/09/2020] [Revised: 02/01/2021] [Accepted: 02/09/2021] [Indexed: 12/19/2022]
Abstract
Displaying resilience following a diagnosis of breast cancer is crucial for successful adaptation to illness, well-being, and health outcomes. Several theoretical and computational models have been proposed toward understanding the complex process of illness adaptation, involving a large variety of patient sociodemographic, lifestyle, medical, and psychological characteristics. To date, conventional multivariate statistical methods have been used extensively to model resilience. In the present work we describe a computational pipeline designed to identify the most prominent predictors of mental health outcomes following breast cancer diagnosis. A machine learning framework was developed and tested on the baseline data (recorded immediately post diagnosis) from an ongoing prospective, multinational study. This fully annotated dataset includes socio-demographic, lifestyle, medical and self-reported psychological characteristics of women recently diagnosed with breast cancer (N = 609). Nine different feature selection and cross-validated classification schemes were compared on their performance in classifying patients into low vs high depression symptom severity. Best-performing approaches involved a meta-estimator combined with a Support Vector Machines (SVMs) classification algorithm, exhibiting balanced accuracy of 0.825, and a fair balance between sensitivity (90%) and specificity (74%). These models consistently identified a set of psychological traits (optimism, perceived ability to cope with trauma, resilience as trait, ability to comprehend the illness), and subjective perceptions of personal functionality (physical, social, cognitive) as key factors accounting for concurrent depression symptoms. A comprehensive supervised learning pipeline is proposed for the identification of predictors of depression symptoms which could severely impede adaptation to illness.
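For readers unfamiliar with the metric, balanced accuracy is the mean of sensitivity and specificity. The sketch below is illustrative only: the confusion-matrix counts are hypothetical and chosen to reproduce the rounded 90% and 74% figures, not taken from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical counts giving 90% sensitivity and 74% specificity.
score = balanced_accuracy(tp=90, fn=10, tn=74, fp=26)
print(round(score, 3))
```

With the rounded percentages this gives 0.82, close to the reported 0.825; the small gap is presumably due to rounding or to averaging across cross-validation folds.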
Affiliation(s)
- Konstantina Kourou
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, Greece; Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology, Department of Biomedical Research, Ioannina, Greece
| | - Georgios Manikis
- Computational Biomedicine Laboratory, FORTH-ICS, Heraklion, Greece
| | - Paula Poikonen-Saksela
- Helsinki University Hospital Comprehensive Cancer Center and Helsinki University, Finland
| | - Ketti Mazzocco
- Applied Research Division for Cognitive and Psychological Science, European Institute of Oncology, Milan, Italy; Department of Oncology and Hemato-oncology, University of Milan, Italy
| | - Ruth Pat-Horenczyk
- School of Social Work and Social Welfare, The Hebrew University of Jerusalem, Israel
| | - Berta Sousa
- Breast Unit, Champalimaud Clinical Centre/ Champalimaud Foundation, Champalimaud Research, Lisboa, Portugal
| | - Albino J Oliveira-Maia
- Champalimaud Research and Clinical Centre, Champalimaud Centre for the Unknown, Lisboa, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisboa, Portugal
| | - Johanna Mattson
- Helsinki University Hospital Comprehensive Cancer Center and Helsinki University, Finland
| | - Ilan Roziner
- Department of Communication Disorders, Sackler Faculty of Medicine, Tel Aviv University, Israel
| | - Greta Pettini
- Applied Research Division for Cognitive and Psychological Science, European Institute of Oncology, Milan, Italy
| | | | - Kostas Marias
- Computational Biomedicine Laboratory, FORTH-ICS, Heraklion, Greece
| | - Evangelos Karademas
- Computational Biomedicine Laboratory, FORTH-ICS, Heraklion, Greece; Department of Psychology, University of Crete, Rethymno, Greece
| | - Panagiotis Simos
- Computational Biomedicine Laboratory, FORTH-ICS, Heraklion, Greece; School of Medicine, University of Crete, Heraklion, Greece
| | - Dimitrios I Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, Greece; Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology, Department of Biomedical Research, Ioannina, Greece.
|
35
|
Goules AV, Argyropoulou OD, Pezoulas VC, Chatzis L, Critselis E, Gandolfo S, Ferro F, Binutti M, Donati V, Zandonella Callegher S, Venetsanopoulou A, Zampeli E, Mavrommati M, Voulgari PV, Exarchos T, Mavragani CP, Baldini C, Skopouli FN, Fotiadis DI, De Vita S, Moutsopoulos HM, Tzioufas AG. Primary Sjögren's Syndrome of Early and Late Onset: Distinct Clinical Phenotypes and Lymphoma Development. Front Immunol 2020; 11:594096. [PMID: 33193443 PMCID: PMC7604905 DOI: 10.3389/fimmu.2020.594096] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 08/12/2020] [Accepted: 09/28/2020] [Indexed: 11/13/2022] Open
Abstract
Objectives: To study the clinical, serological and histologic features of primary Sjögren’s syndrome (pSS) patients with early (young ≤35 years) or late (old ≥65 years) onset and to explore the differential effect on lymphoma development. Methods: From a multicentre study population of 1997 consecutive pSS patients, those with early or late disease onset were matched and compared with pSS control patients of middle age onset. Data driven analysis was applied to identify the independent variables associated with lymphoma in both age groups. Results: Young pSS patients (19%, n = 379) had higher frequency of salivary gland enlargement (SGE), lymphadenopathy, Raynaud’s phenomenon, autoantibodies, C4 hypocomplementemia, hypergammaglobulinemia, leukopenia, and lymphoma (10.3% vs. 5.7%, p = 0.030, OR = 1.91, 95% CI: 1.11–3.27), while old pSS patients (15%, n = 293) had more frequently dry mouth, interstitial lung disease, and lymphoma (6.8% vs. 2.1%, p = 0.011, OR = 3.40, 95% CI: 1.34–8.17) compared to their middle-aged pSS controls, respectively. In young pSS patients, cryoglobulinemia, C4 hypocomplementemia, lymphadenopathy, and SGE were identified as independent lymphoma associated factors, as opposed to old pSS patients in whom SGE, C4 hypocomplementemia and male gender were the independent lymphoma associated factors. Early onset pSS patients displayed two incidence peaks of lymphoma within 3 years of onset and after 10 years, while in late onset pSS patients, lymphoma occurred within the first 6 years. Conclusion: Patients with early and late disease onset constitute a significant proportion of pSS population with distinct clinical phenotypes. They possess a higher prevalence of lymphoma, with different predisposing factors and lymphoma distribution across time.
Affiliation(s)
- Andreas V Goules
- Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece; Joint Rheumatology Academic Program, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
| | - Ourania D Argyropoulou
- Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece; Joint Rheumatology Academic Program, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
| | - Vasileios C Pezoulas
- Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, Ioannina, Greece
| | - Loukas Chatzis
- Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece; Joint Rheumatology Academic Program, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
| | - Elena Critselis
- Proteomics Facility, Center for Systems Biology, Biomedical Research Foundation of the Academy of Athens, Athens, Greece; Department of Nutrition and Clinical Dietetics, Harokopio University of Athens, Athens, Greece
| | - Saviana Gandolfo
- Rheumatology Clinic, Department of Medical Area, University of Udine, Udine, Italy
| | - Francesco Ferro
- Rheumatology Unit, Department of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
| | - Marco Binutti
- Rheumatology Clinic, Department of Medical Area, University of Udine, Udine, Italy
| | - Valentina Donati
- Rheumatology Unit, Department of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
| | | | - Aliki Venetsanopoulou
- Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece; Joint Rheumatology Academic Program, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
| | - Evangelia Zampeli
- Institute for Autoimmune Systemic and Neurological Diseases, Athens, Greece
| | - Maria Mavrommati
- Department of Nutrition and Clinical Dietetics, Harokopio University of Athens, Athens, Greece
| | - Paraskevi V Voulgari
- Rheumatology Clinic, Department of Internal Medicine, Medical School, University of Ioannina, Ioannina, Greece
| | | | - Clio P Mavragani
- Department of Physiology, Medical School, National and Kapodistrian University of Athens, Athens, Greece
| | - Chiara Baldini
- Rheumatology Unit, Department of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
| | - Fotini N Skopouli
- Department of Nutrition and Clinical Dietetics, Harokopio University of Athens, Athens, Greece
| | - Dimitrios I Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, Ioannina, Greece; Department of Biomedical Research, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology - Hellas, Ioannina, Greece
| | - Salvatore De Vita
- Rheumatology Clinic, Department of Medical Area, University of Udine, Udine, Italy
| | - Haralampos M Moutsopoulos
- Institute for Autoimmune Systemic and Neurological Diseases, Athens, Greece; Chair Medical Sciences/Immunology, Academy of Athens, Athens, Greece
| | - Athanasios G Tzioufas
- Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece; Joint Rheumatology Academic Program, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
|
36
|
Wang Z, Talburt JR, Wu N, Dagtas S, Zozus MN. A Rule-Based Data Quality Assessment System for Electronic Health Record Data. Appl Clin Inform 2020; 11:622-634. [PMID: 32968999 DOI: 10.1055/s-0040-1715567] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 10/23/2022] Open
Abstract
OBJECTIVE Rule-based data quality assessment in health care facilities was explored through compilation, implementation, and evaluation of 63,397 data quality rules in a single-center case study to assess the ability of rules-based data quality assessment to identify data errors of importance to physicians and system owners. METHODS We applied a design science framework to design, demonstrate, test, and evaluate a scalable framework with which data quality rules can be managed and used in health care facilities for data quality assessment and monitoring. RESULTS We identified 63,397 rules partitioned into 28 logic templates. A total of 819,683 discrepancies were identified by 4.5% of the rules. Nine out of 11 participating clinical and operational leaders indicated that the rules identified data quality problems and articulated next steps that they wanted to take based on the reported information. DISCUSSION The combined rule template and knowledge table approach makes governance and maintenance of otherwise large rule sets manageable. Identified challenges to rule-based data quality monitoring included the lack of curated and maintained knowledge sources relevant to data error detection and lack of organizational resources to support clinical and operational leaders with investigation and characterization of data errors and pursuit of corrective and preventative actions. Limitations of our study included implementation within a single center and dependence of the results on the implemented rule set. CONCLUSION This study demonstrates a scalable framework (up to 63,397 rules) with which data quality rules can be implemented and managed in health care facilities to identify data errors. The data quality problems identified at the implementation site were important enough to prompt action requests from clinical and operational leaders.
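The combination of a logic template with a knowledge table can be pictured as follows: one parameterized check, instantiated once per table row, so that many concrete rules are maintained as data rather than code. This is a minimal sketch of that general idea, not the authors' system; the class name, field names and plausibility limits are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeRule:
    """One logic template: a numeric column must lie within [low, high]."""
    column: str
    low: float
    high: float

    def violated_by(self, record):
        value = record.get(self.column)
        # Missing values are left to a separate completeness template.
        return value is not None and not (self.low <= value <= self.high)


# Hypothetical knowledge table: each row instantiates one rule from the template.
KNOWLEDGE_TABLE = [
    ("heart_rate_bpm", 20, 300),
    ("temperature_c", 25, 45),
]


def build_rules(table):
    return [RangeRule(column, low, high) for column, low, high in table]


def find_discrepancies(records, rules):
    """Return (record index, column) pairs for every violated rule."""
    return [
        (i, rule.column)
        for i, record in enumerate(records)
        for rule in rules
        if rule.violated_by(record)
    ]


records = [
    {"heart_rate_bpm": 72, "temperature_c": 36.6},
    {"heart_rate_bpm": 720, "temperature_c": 36.9},  # likely a typing error
    {"temperature_c": 3.7},                          # likely a misplaced decimal
]
discrepancies = find_discrepancies(records, build_rules(KNOWLEDGE_TABLE))
print(discrepancies)
```

Adding a new rule then means adding a row to the table, which is what keeps a large rule set governable.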
Affiliation(s)
- Zhan Wang
- Department of Population Health Science, University of Texas Health Science Center at San Antonio, San Antonio, Texas, United States
| | - John R Talburt
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, Arkansas, United States
| | - Ningning Wu
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, Arkansas, United States
| | - Serhan Dagtas
- Department of Information Science, University of Arkansas at Little Rock, Little Rock, Arkansas, United States
| | - Meredith Nahm Zozus
- Department of Population Health Science, University of Texas Health Science Center at San Antonio, San Antonio, Texas, United States
|
37
|
Abstract
Asthma is one of the most common chronic diseases around the world and represents a serious problem in human health. Predictive models have become important in medical sciences because they provide valuable information for data-driven decision-making. In this work, a methodology of data-influence analytics based on mixed-effects logistic regression models is proposed for detecting potentially influential observations which can affect the quality of these models. Global and local influence diagnostic techniques, which are often used separately, are used simultaneously in this detection. In addition, predictive performance measures are considered for this analytics. A study with real asthma data from children and adolescents, collected from a public hospital of São Paulo, Brazil, is conducted to illustrate the proposed methodology. The results show that the influence diagnostic methodology is helpful for obtaining an accurate predictive model that provides scientific evidence for data-driven medical decision-making.
|
38
|
Zhao L, Ciallella HL, Aleksunes LM, Zhu H. Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling. Drug Discov Today 2020; 25:1624-1638. [PMID: 32663517 PMCID: PMC7572559 DOI: 10.1016/j.drudis.2020.07.005] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Received: 02/19/2020] [Revised: 06/26/2020] [Accepted: 07/06/2020] [Indexed: 02/06/2023]
Abstract
Advancing a new drug to market requires substantial investments in time as well as financial resources. Crucial bioactivities for drug candidates, including their efficacy, pharmacokinetics (PK), and adverse effects, need to be investigated during drug development. With advancements in chemical synthesis and biological screening technologies over the past decade, a large amount of biological data points for millions of small molecules have been generated and are stored in various databases. These accumulated data, combined with new machine learning (ML) approaches, such as deep learning, have shown great potential to provide insights into relevant chemical structures to predict in vitro, in vivo, and clinical outcomes, thereby advancing drug discovery and development in the big data era.
Affiliation(s)
- Linlin Zhao
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Heather L Ciallella
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Lauren M Aleksunes
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ 08854, USA
| | - Hao Zhu
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA; Department of Chemistry, Rutgers University, Camden, NJ 08102, USA.
|
39
|
Pezoulas VC, Grigoriadis GI, Tachos NS, Barlocco F, Olivotto I, Fotiadis DI. Generation of virtual patient data for in-silico cardiomyopathies drug development using tree ensembles: a comparative study. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2020; 2020:5343-5346. [PMID: 33019190 DOI: 10.1109/embc44109.2020.9176567] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/11/2023]
Abstract
In-silico clinical platforms have been recently used as a new revolutionary path for virtual patients (VP) generation and further analysis, such as drug development. Advanced individualized models have been developed to enhance flexibility and reliability of the virtual patient cohorts. This study focuses on the implementation and comparison of three different methodologies for generating virtual data for in-silico clinical trials. Towards this direction, three computational methods, namely: (i) the multivariate log-normal distribution (log-MVND), (ii) the supervised tree ensembles, and (iii) the unsupervised tree ensembles are deployed and evaluated against their performance towards the generation of high-quality virtual data using the goodness of fit (gof) and the dataset correlation matrix as performance evaluation measures. Our results reveal the dominance of the tree ensembles towards the generation of virtual data with similar distributions (gof values less than 0.2) and correlation patterns (average difference less than 0.03).
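One way to read the correlation-based measure is as the mean absolute difference between the pairwise Pearson correlations of the real and the virtual dataset. The abstract does not define the average precisely, so averaging over the off-diagonal feature pairs is an assumption here, and the toy data and function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Correlation matrix for a dataset given as a list of feature columns."""
    return [[pearson(a, b) for b in columns] for a in columns]


def mean_abs_corr_difference(real_columns, virtual_columns):
    """Average |r_real - r_virtual| over all distinct feature pairs."""
    real = correlation_matrix(real_columns)
    virtual = correlation_matrix(virtual_columns)
    k = len(real)
    diffs = [
        abs(real[i][j] - virtual[i][j])
        for i in range(k)
        for j in range(i + 1, k)
    ]
    return sum(diffs) / len(diffs)


# Toy example: two features, four real and four virtual patients.
real = [[1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.0]]
virtual = [[1.2, 2.1, 2.9, 4.3], [2.0, 4.4, 5.9, 8.5]]
print(mean_abs_corr_difference(real, virtual))
```

A value near zero indicates that the generator preserved the dependency structure between features, which marginal goodness-of-fit checks alone cannot show.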
|
40
|
Abstract
Introduction: Primary Sjögren's syndrome (pSS) is an autoimmune systemic disease characterized by a complex and not yet completely elucidated etiopathogenesis, where autoimmune manifestations coexist with different degrees of lymphoproliferation, resulting in multiple possible scenarios that are extremely heterogeneous from patient to patient. Although considerable progress has been made in the identification of potential novel therapeutic targets in recent years, the biological complexity of pSS, combined with such heterogeneous clinical manifestations, makes the treatment of pSS, even today, a great challenge. Areas covered: A therapy specifically approved for pSS is still lacking. In recent years, several novel promising agents are being tested in pSS. Based on a deep revision of drugs evaluated for pSS therapy, it is striking that several clinical trials, some of them testing very promising agents, failed. Expert opinion: A renewal of clinical trial design, including the definition of novel inclusion criteria and outcome measures, together with the development of a stratification model of pSS patients and the advance in the definition of pathogenetic mechanisms underlying peculiar pSS subsets, represent preliminary and crucial steps to overcome the current therapeutic impasse in pSS.
Affiliation(s)
- Saviana Gandolfo
- Rheumatology Clinic, Udine University Hospital, Department of Medical Area, University of Udine, Udine, Italy
| | - Salvatore De Vita
- Rheumatology Clinic, Udine University Hospital, Department of Medical Area, University of Udine, Udine, Italy
|
41
|
Hao D, Zhang L, Sumkin J, Mohamed A, Wu S. Inaccurate Labels in Weakly-Supervised Deep Learning: Automatic Identification and Correction and Their Impact on Classification Performance. IEEE J Biomed Health Inform 2020; 24:2701-2710. [PMID: 32078570 DOI: 10.1109/jbhi.2020.2974425] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/16/2023]
Abstract
In data-driven deep learning-based modeling, data quality may substantially influence classification performance. Correct data labeling for deep learning modeling is critical. In weakly-supervised learning, a challenge lies in dealing with potentially inaccurate or mislabeled training data. In this paper, we proposed an automated methodological framework to identify mislabeled data using two metric functions, namely, Cross-entropy Loss that indicates divergence between a prediction and ground truth, and Influence function that reflects the dependence of a model on data. After correcting the identified mislabels, we measured their impact on the classification performance. We also compared the mislabeling effects in three experiments on two different real-world clinical questions. A total of 10,500 images were studied in the contexts of clinical breast density category classification and breast cancer malignancy diagnosis. We used intentionally flipped labels as mislabels to evaluate the proposed method at a varying proportion of mislabeled data included in model training. We also compared the effects of our method to two published schemes for breast density category classification. Experiment results show that when the dataset contains 10% of mislabeled data, our method can automatically identify up to 98% of these mislabeled data by examining/checking the top 30% of the full dataset. Furthermore, we show that correcting the identified mislabels leads to an improvement in the classification performance. Our method provides a feasible solution for weakly-supervised deep learning modeling in dealing with inaccurate labels.
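The loss-based half of this idea can be sketched as ranking training samples by per-sample cross-entropy and reviewing only the top fraction. This is a simplified illustration, not the authors' implementation: it omits the influence-function metric entirely, and the predicted probabilities, labels and function names are hypothetical.

```python
from math import log


def cross_entropy(probabilities, label):
    """Per-sample cross-entropy between predicted class probabilities and a label."""
    eps = 1e-12  # guards against log(0)
    return -log(max(probabilities[label], eps))


def flag_suspects(predicted, labels, fraction):
    """Return indices of the highest-loss samples, covering `fraction` of the data."""
    losses = [cross_entropy(p, y) for p, y in zip(predicted, labels)]
    ranked = sorted(range(len(labels)), key=lambda i: losses[i], reverse=True)
    count = max(1, int(fraction * len(labels)))
    return ranked[:count]


# Hypothetical model outputs for four images and their (possibly wrong) labels.
predicted = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05], [0.1, 0.9]]
labels = [0, 1, 1, 1]  # sample 2 disagrees strongly with the model
suspects = flag_suspects(predicted, labels, fraction=0.25)
print(suspects)
```

The flagged indices are candidates for manual review or relabeling; a high loss can also indicate a genuinely hard example rather than a mislabel.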
|
42
|
Pezoulas VC, Kourou KD, Kalatzis F, Exarchos TP, Zampeli E, Gandolfo S, Goules A, Baldini C, Skopouli F, De Vita S, Tzioufas AG, Fotiadis DI. Overcoming the Barriers That Obscure the Interlinking and Analysis of Clinical Data Through Harmonization and Incremental Learning. IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY 2020; 1:83-90. [PMID: 35402941 PMCID: PMC8940202 DOI: 10.1109/ojemb.2020.2981258] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/01/2019] [Revised: 02/23/2020] [Accepted: 03/09/2020] [Indexed: 11/22/2022] Open
Abstract
Goal: To present a framework for data sharing, curation, harmonization and federated data analytics to solve open issues in healthcare, such as the development of robust disease prediction models. Methods: Data curation is applied to remove data inconsistencies. Lexical and semantic matching methods are used to align the structure of the heterogeneous, curated cohort data, and incremental learning algorithms, including class imbalance handling and hyperparameter optimization, are used to enable the development of disease prediction models. Results: The applicability of the framework is demonstrated in a case study of primary Sjögren's Syndrome, yielding harmonized data with increased quality and more than 85% agreement, along with lymphoma prediction models with more than 80% sensitivity and specificity. Conclusions: The framework provides data quality, harmonization and analytics workflows that can enhance the statistical power of heterogeneous clinical data and enables the development of robust models for disease prediction.
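Lexical matching for harmonization can be pictured as mapping each cohort-specific variable name to the closest term of a reference schema. The sketch below uses the standard-library `difflib.SequenceMatcher` as a stand-in similarity measure; the paper's actual matching method is not specified in the abstract, and the reference terms, threshold and function names are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(term):
    """Lower-case a variable name and unify separators."""
    return " ".join(term.lower().replace("_", " ").replace("-", " ").split())


def best_match(term, reference_terms, threshold=0.8):
    """Return the most similar reference term, or None if nothing is close enough."""
    scored = [
        (SequenceMatcher(None, normalize(term), normalize(ref)).ratio(), ref)
        for ref in reference_terms
    ]
    score, ref = max(scored)
    return ref if score >= threshold else None


# Hypothetical reference schema and cohort-specific column names.
reference = ["date of birth", "gender", "lymphoma"]
print(best_match("Date_of_Birth", reference))
print(best_match("Sex", reference))
```

A purely lexical matcher finds nothing for "Sex" even though it corresponds to "gender"; that gap is what a semantic matching step (for example, a synonym dictionary or ontology lookup) is meant to close.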
Affiliation(s)
- Vasileios C Pezoulas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
| | - Konstantina D Kourou
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
- Department of Biological Applications and Technology, University of Ioannina, GR45110 Ioannina, Greece
| | - Fanis Kalatzis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
| | - Themis P Exarchos
- Department of Informatics, Ionian University, GR49100 Corfu, Greece
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45100 Ioannina, Greece
| | - Evi Zampeli
- Institute for Systemic Autoimmune and Neurological Diseases, GR11743 Athens, Greece
| | - Saviana Gandolfo
- Clinic of Rheumatology, Department of Medical and Biological Sciences, Udine University, IT33100 Udine, Italy
| | - Andreas Goules
- Department of Pathophysiology, School of Medicine, University of Athens, GR15772 Athens, Greece
| | - Chiara Baldini
- Department of Clinical and Experimental Medicine, University of Pisa, Pisa IT56126, Italy
| | - Fotini Skopouli
- Department of Internal Medicine and Clinical Immunology, Euroclinic Hospital, GR11521 Athens, Greece
| | - Salvatore De Vita
- Clinic of Rheumatology, Department of Medical and Biological Sciences, Udine University, IT33100 Udine, Italy
| | - Athanasios G Tzioufas
- Department of Pathophysiology, School of Medicine, University of Athens, GR15772 Athens, Greece
| | - Dimitrios I Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, GR45110 Ioannina, Greece
- Department of Biomedical Research, FORTH-IMBB, GR45110 Ioannina, Greece
|
43
|
Kourou KD, Pezoulas VC, Georga EI, Exarchos T, Papaloukas C, Voulgarelis M, Goules A, Nezos A, Tzioufas AG, Moutsopoulos EM, Mavragani C, Fotiadis DI. Predicting Lymphoma Development by Exploiting Genetic Variants and Clinical Findings in a Machine Learning-Based Methodology With Ensemble Classifiers in a Cohort of Sjögren's Syndrome Patients. IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY 2020; 1:49-56. [PMID: 35402956 PMCID: PMC8979630 DOI: 10.1109/ojemb.2020.2965191] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/06/2019] [Revised: 12/20/2019] [Accepted: 12/20/2019] [Indexed: 11/18/2022] Open
Abstract
Lymphoma development constitutes one of the most serious clinico-pathological manifestations of patients with Sjögren's Syndrome (SS). Over the last decades the risk for lymphomagenesis in SS patients has been studied aiming to identify novel biomarkers and risk factors predicting lymphoma development in this patient population. Objective: The current study aims to explore whether genetic susceptibility profiles of SS patients along with known clinical, serological and histological risk factors enhance the accuracy of predicting lymphoma development in this patient population. Methods: The potential predictive role of genetic variants and of clinical and laboratory risk factors was investigated through a Machine Learning (ML)-based framework which encapsulates ensemble classifiers. Results: Ensemble methods empower the classification accuracy with approaches which are sensitive to minor perturbations in the training phase. The evaluation of the proposed methodology based on a 10-fold stratified cross validation procedure yielded considerable results in terms of balanced accuracy (GB: 0.7780 ± 0.1514, RF Gini: 0.7626 ± 0.1787, RF Entropy: 0.7590 ± 0.1837). Conclusions: The initial clinical, serological, histological and genetic findings at an early diagnosis have been exploited in an attempt to establish predictive tools in clinical practice and further enhance our understanding towards lymphoma development in SS.
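Stratified k-fold cross-validation keeps the class ratio roughly constant in every fold, which matters when the positive class (here, lymphoma) is rare. The following is a minimal sketch of the fold assignment and of a "mean ± standard deviation" summary; it assumes the reported ± values are standard deviations across folds, and the labels, scores and function names are hypothetical.

```python
import random
from statistics import mean, stdev


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def summarize(fold_scores):
    """Mean and sample standard deviation of per-fold scores."""
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical cohort: 20 controls (0) and 10 lymphoma cases (1).
labels = [0] * 20 + [1] * 10
folds = stratified_folds(labels, k=10)
cases_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(cases_per_fold)

# Hypothetical per-fold balanced accuracies.
print(summarize([0.75, 0.80, 0.70, 0.85]))
```

With few positive cases per fold, per-fold scores vary widely, which is one plausible reason for standard deviations as large as those reported.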
Affiliation(s)
- Konstantina D Kourou
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, The University of Ioannina, GR45110 Ioannina, Greece
- Department of Biological Applications and Technology, The University of Ioannina, GR45110 Ioannina, Greece
| | - Vasileios C Pezoulas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, The University of Ioannina, GR45110 Ioannina, Greece
| | - Eleni I Georga
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, The University of Ioannina, GR45110 Ioannina, Greece
| | - Themis Exarchos
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, The University of Ioannina, GR45110 Ioannina, Greece
- Department of Informatics, Ionian University, GR49100 Corfu, Greece
| | - Costas Papaloukas
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, The University of Ioannina, GR45110 Ioannina, Greece
- Department of Biological Applications and Technology, The University of Ioannina, GR45110 Ioannina, Greece
| | - Michalis Voulgarelis
- Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology, Department of Biomedical Research, Ioannina GR45110, Greece
| | - Andreas Goules
- Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology, Department of Biomedical Research, Ioannina GR45110, Greece
| | - Andrianos Nezos
- Department of Physiology, School of Medicine, National and Kapodistrian University of Athens, GR15772 Athens, Greece
| | - Athanasios G Tzioufas
- Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology, Department of Biomedical Research, Ioannina GR45110, Greece
| | | | - Clio Mavragani
- Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, GR15772 Athens, Greece
- Department of Physiology, School of Medicine, National and Kapodistrian University of Athens, GR15772 Athens, Greece
| | - Dimitrios I Fotiadis
- Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, The University of Ioannina, GR45110 Ioannina, Greece
- Foundation for Research and Technology-Hellas, Institute of Molecular Biology and Biotechnology, Department of Biomedical Research, Ioannina GR45110, Greece
- Academy of Athens, GR10679 Athens, Greece
| |
|
44
|
De Vita S, Gandolfo S. Predicting lymphoma development in patients with Sjögren's syndrome. Expert Rev Clin Immunol 2019; 15:929-938. [PMID: 31347413 DOI: 10.1080/1744666x.2019.1649596] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/27/2022]
Abstract
Introduction: The issue of predicting lymphoma in primary Sjögren's syndrome (pSS) starts from its clinical and biologic essence, i.e., an autoimmune exocrinopathy with sicca syndrome, inflammation and lymphoproliferation of MALT (mucosa-associated lymphoid tissue) in exocrine glands. Areas covered: The two major predictors to focus on first are persistent salivary gland (SG) swelling and cryoglobulinemic vasculitis with related features such as purpura and low C4, or serum cryoglobulinemia alone when repeatedly detected. They are pathogenetically linked and reflect a heavier MALT involvement by histopathology, with the expansion of peculiar rheumatoid factor (RF)-positive clones/idiotypes. Other predictors include lymphadenopathy, splenomegaly, neutropenia, lymphopenia, serum beta2-microglobulin, monoclonal immunoglobulins, light chains, and RF. Composite indexes/scores may also predict lymphoma. Expert opinion: Prediction at baseline needs improvement and must be repeated during follow-up. Careful clinical characterization, with harmonization and stratification of large cohorts, is a relevant preliminary step. Validated and new biomarkers are needed in biologic fluids and tissues. SG echography with automatic scoring could represent a future imaging biomarker, which is still lacking. Scoring MALT involvement in pSS, as an additional tool to evaluate disease activity and possibly to predict lymphoma, would be welcome. All these efforts are now ongoing within the HarmonicSS project and in other research initiatives in pSS.
Affiliation(s)
- Salvatore De Vita
- Rheumatology Clinic, Udine University Hospital, Department of Medical Area, University of Udine, Udine, Italy
| | - Saviana Gandolfo
- Rheumatology Clinic, Udine University Hospital, Department of Medical Area, University of Udine, Udine, Italy
| |
|