1
De Abreu Ferreira R, Zhong S, Moureaud C, Le MT, Rothstein A, Li X, Wang L, Patwardhan M. A Pilot, Predictive Surveillance Model in Pharmacovigilance Using Machine Learning Approaches. Adv Ther 2024; 41:2435-2445. [PMID: 38704799 PMCID: PMC11133112 DOI: 10.1007/s12325-024-02870-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/04/2024] [Indexed: 05/07/2024]
Abstract
INTRODUCTION The identification of a new adverse event (AE) caused by a drug product is one of the key activities in the pharmaceutical industry to ensure the safety profile of a drug product. Machine learning (ML) has the potential to assist with signal detection and supplement traditional pharmacovigilance (PV) surveillance methods. This pilot ML modeling study was designed to detect potential safety signals for two AbbVie products and test the model's capability of detecting safety signals earlier than humans. METHODS Drug X, a mature product with post-marketing data, and Drug Y, a recently approved drug in another therapeutic area, were selected. Gradient boosting-based ML approaches (e.g., XGBoost) were applied as the main modeling strategy. RESULTS For Drug X, eight true signals were present in the test set. Among 12 potential new signals generated, four were true signals with a 50.0% sensitivity rate and a 33.3% positive predictive value (PPV) rate. Among the remaining eight potential new signals, one was confirmed as a signal and detected six months earlier than humans. For Drug Y, nine true signals were present in the test set. Among 13 potential new signals generated, five were true signals with a 55.6% sensitivity rate and a 38.5% PPV rate. Among the remaining eight potential new signals, none were confirmed as true signals upon human review. CONCLUSION This model demonstrated acceptable accuracy for safety signal detection and potential for earlier detection when compared to humans. Expert judgment, flexibility, and critical thinking are essential human skills required for the final, accurate assessment of adverse event cases.
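The sensitivity and PPV figures reported above follow directly from the stated counts. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative assumptions and are not taken from the study.

```python
def summarize(flagged: int, true_positives: int, true_signals: int) -> tuple:
    """Return (sensitivity %, PPV %) rounded to one decimal place.

    flagged        -- potential new signals generated by the model
    true_positives -- flagged signals that were true signals
    true_signals   -- true signals present in the test set
    """
    sensitivity = true_positives / true_signals  # share of true signals recovered
    ppv = true_positives / flagged               # share of flagged signals that were true
    return (round(100 * sensitivity, 1), round(100 * ppv, 1))


# Counts as stated in the abstract.
drug_x = summarize(flagged=12, true_positives=4, true_signals=8)
drug_y = summarize(flagged=13, true_positives=5, true_signals=9)

print("Drug X (sensitivity %, PPV %):", drug_x)
print("Drug Y (sensitivity %, PPV %):", drug_y)
```

With the counts from the abstract, this yields 50.0% and 33.3% for Drug X and 55.6% and 38.5% for Drug Y, matching the reported values.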
Affiliation(s)
- Rosa De Abreu Ferreira
  - Medical Safety Evaluation, Pharmacovigilance and Patient Safety, Epidemiology, and Research and Development Quality Assurance, AbbVie, Inc., North Chicago, IL, USA
- Sheng Zhong
  - Statistical Sciences and Analytics, Data and Statistical Sciences, AbbVie, Inc., North Chicago, IL, USA
- Charlotte Moureaud
  - Safety Data Sciences, Pharmacovigilance and Patient Safety, Epidemiology, and Research and Development Quality Assurance, AbbVie, Inc., North Chicago, IL, USA
- Michelle T Le
  - Medication Safety Fellow, Purdue University College of Pharmacy, West Lafayette, IN, USA
- Adrienne Rothstein
  - Medical Safety Evaluation, Pharmacovigilance and Patient Safety, Epidemiology, and Research and Development Quality Assurance, AbbVie, Inc., North Chicago, IL, USA
- Xiaomeng Li
  - Statistical Sciences and Analytics, Data and Statistical Sciences, AbbVie, Inc., North Chicago, IL, USA
- Li Wang
  - Statistical Sciences and Analytics, Data and Statistical Sciences, AbbVie, Inc., North Chicago, IL, USA
- Meenal Patwardhan
  - Medical Safety Evaluation, Pharmacovigilance and Patient Safety, Epidemiology, and Research and Development Quality Assurance, AbbVie, Inc., North Chicago, IL, USA
  - Safety Data Sciences, Pharmacovigilance and Patient Safety, Epidemiology, and Research and Development Quality Assurance, AbbVie, Inc., North Chicago, IL, USA
2
Grdinic AG, Radovanovic S, Gleditsch J, Jørgensen CT, Asady E, Pettersen HH, Delibasic B, Ghanima W. Developing a machine learning model for bleeding prediction in patients with cancer-associated thrombosis receiving anticoagulation therapy. J Thromb Haemost 2024; 22:1094-1104. [PMID: 38184201 DOI: 10.1016/j.jtha.2023.12.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 12/07/2023] [Accepted: 12/12/2023] [Indexed: 01/08/2024]
Abstract
BACKGROUND Only 1 conventional score is available for assessing bleeding risk in patients with cancer-associated thrombosis (CAT): the CAT-BLEED score. OBJECTIVES Our aim was to develop a machine learning-based risk assessment model for predicting bleeding in CAT and to evaluate its predictive performance in comparison to that of the CAT-BLEED score. METHODS We collected 488 attributes (clinical data, biochemistry, and International Classification of Diseases, 10th Revision, diagnosis) in 1080 unique patients with CAT. We compared CAT-BLEED score, Ridge and Lasso logistic regression, random forest, and Extreme Gradient Boosting (XGBoost) algorithms for predicting major bleeding or clinically relevant nonmajor bleeding occurring 1 to 90 days, 1 to 365 days, and 90 to 455 days after venous thromboembolism (VTE). RESULTS The predictive performances of Lasso logistic regression, random forest, and XGBoost were higher than that of the CAT-BLEED score in the prediction of bleeding occurring 1 to 90 days and 1 to 365 days after VTE. For predicting major bleeding or clinically relevant nonmajor bleeding 1 to 90 days after VTE, the CAT-BLEED score achieved a mean area under the receiver operating characteristic curve (AUROC) of 0.48 ± 0.13, while Lasso logistic regression and XGBoost both achieved AUROCs of 0.64 ± 0.12. For predicting bleeding 1 to 365 days after VTE, the CAT-BLEED score achieved a mean AUROC of 0.47 ± 0.08, while Lasso logistic regression and XGBoost achieved AUROCs of 0.64 ± 0.08 and 0.59 ± 0.08, respectively. CONCLUSION This is the first machine learning-based risk model for bleeding prediction in patients with CAT receiving anticoagulation therapy. Its predictive performance was higher than that of the conventional CAT-BLEED score. With further development, this novel algorithm might enable clinicians to perform personalized anticoagulation strategies with improved clinical outcomes.
Affiliation(s)
- Aleksandra G Grdinic
  - Department of Cardiology, Østfold Hospital, Sarpsborg, Norway; Department of Research, Østfold Hospital, Sarpsborg, Norway
- Sandro Radovanovic
  - Faculty of Organizational Sciences, University of Belgrade, Belgrade, Serbia
- Jostein Gleditsch
  - Department of Radiology, Østfold Hospital, Sarpsborg, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Camilla Tøvik Jørgensen
  - Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Department of Emergency Medicine, Østfold Hospital, Sarpsborg, Norway
- Elia Asady
  - Department of Research, Østfold Hospital, Sarpsborg, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Boris Delibasic
  - Faculty of Organizational Sciences, University of Belgrade, Belgrade, Serbia
- Waleed Ghanima
  - Department of Research, Østfold Hospital, Sarpsborg, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Department of Hematology, Oslo University Hospital, Oslo, Norway
3
Priebe D, Ghani B, Stowell D. Efficient Speech Detection in Environmental Audio Using Acoustic Recognition and Knowledge Distillation. SENSORS (BASEL, SWITZERLAND) 2024; 24:2046. [PMID: 38610256 PMCID: PMC11014398 DOI: 10.3390/s24072046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 03/01/2024] [Accepted: 03/07/2024] [Indexed: 04/14/2024]
Abstract
The ongoing biodiversity crisis, driven by factors such as land-use change and global warming, emphasizes the need for effective ecological monitoring methods. Acoustic monitoring of biodiversity has emerged as an important monitoring tool. Detecting human voices in soundscape monitoring projects is useful both for analyzing human disturbance and for privacy filtering. Despite significant strides in deep learning in recent years, the deployment of large neural networks on compact devices poses challenges due to memory and latency constraints. Our approach focuses on leveraging knowledge distillation techniques to design efficient, lightweight student models for speech detection in bioacoustics. In particular, we employed the MobileNetV3-Small-Pi model to create compact yet effective student architectures to compare against the larger EcoVAD teacher model, a well-regarded voice detection architecture in eco-acoustic monitoring. The comparative analysis included examining various configurations of the MobileNetV3-Small-Pi-derived student models to identify optimal performance. Additionally, a thorough evaluation of different distillation techniques was conducted to ascertain the most effective method for model selection. Our findings revealed that the distilled models exhibited comparable performance to the EcoVAD teacher model, indicating a promising approach to overcoming computational barriers for real-time ecological monitoring.
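The abstract does not state which distillation losses were compared. As one common example only, the sketch below shows the temperature-scaled soft-target formulation of knowledge distillation in plain Python. The function names, the temperature, and the weighting `alpha` are assumptions made for illustration; they are not the settings used with EcoVAD or MobileNetV3-Small-Pi.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and a soft-target KL term.

    alpha weights the hard-label term; (1 - alpha) weights the KL divergence
    between the softened teacher and student distributions. The KL term is
    scaled by temperature**2, as is conventional for this formulation.
    """
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    return alpha * hard + (1 - alpha) * temperature ** 2 * kl


# Two classes: index 0 = speech, index 1 = no speech (hypothetical logits).
agree = distillation_loss([2.0, 0.0], [2.0, 0.0], label=0)
disagree = distillation_loss([0.0, 2.0], [2.0, 0.0], label=0)
print(f"student matches teacher: {agree:.4f}")
print(f"student contradicts teacher: {disagree:.4f}")
```

The loss is smaller when the student reproduces the teacher's output distribution, which is the property a compact student model is trained to exploit.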
Affiliation(s)
- Drew Priebe
  - Department of Cognitive Science and Artificial Intelligence, Tilburg University, 5037 Tilburg, The Netherlands
- Burooj Ghani
  - Naturalis Biodiversity Center, 2333 Leiden, The Netherlands
- Dan Stowell
  - Department of Cognitive Science and Artificial Intelligence, Tilburg University, 5037 Tilburg, The Netherlands
  - Naturalis Biodiversity Center, 2333 Leiden, The Netherlands
4
Whangbo J, Lee YS, Kim YJ, Kim J, Kim KG. Predicting Mismatch Repair Deficiency Status in Endometrial Cancer through Multi-Resolution Ensemble Learning in Digital Pathology. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024:10.1007/s10278-024-00997-z. [PMID: 38378964 DOI: 10.1007/s10278-024-00997-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 12/18/2023] [Accepted: 12/21/2023] [Indexed: 02/22/2024]
Abstract
For molecular classification of endometrial carcinoma, testing for mismatch repair (MMR) status is becoming a routine process. Mismatch repair deficiency (MMR-D) is caused by loss of expression in one or more of the 4 major MMR proteins: MLH1, MSH2, MSH6, and PMS2. Over 30% of patients with endometrial cancer have MMR-D. Determining the MMR status holds significance as individuals with MMR-D are potential candidates for immunotherapy. Pathological whole slide images (WSIs) of endometrial cancer with immunohistochemistry results of MMR proteins were gathered. Color normalization was applied to the tiles using a CycleGAN-based network. Each WSI was divided into tiles at three different magnifications (2.5×, 5×, and 10×). Three distinct networks of the same architecture were employed to include features from all three magnification levels and were stacked for ensemble learning. Three architectures, InceptionResNetV2, EfficientNetB2, and EfficientNetB3, were employed and subjected to comparison. The per-tile results were gathered to classify MMR status in the WSI, and prediction accuracy was evaluated using the following performance metrics: AUC, accuracy, sensitivity, and specificity. The EfficientNetB2 was able to make predictions with an AUC of 0.821, the highest among the three architectures, and an overall AUC range of 0.767 to 0.821 was reported across the three architectures. In summary, our study successfully predicted MMR classification from pathological WSIs in endometrial cancer through a multi-resolution ensemble learning approach, which holds the potential to facilitate swift decisions on tailored treatment, such as immunotherapy, in clinical settings.
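To make the idea of gathering per-tile results into a slide-level call concrete, the sketch below averages tile probabilities within each magnification and then across magnifications. This mean pooling is a deliberately simple stand-in, assumed here for illustration; the abstract describes stacking the three networks, and the threshold, labels, and probabilities below are hypothetical.

```python
from statistics import mean


def slide_probability(tile_probs_by_magnification):
    """Average tile-level MMR-D probabilities per magnification, then across them.

    tile_probs_by_magnification maps a magnification label to the list of
    per-tile probabilities produced by the network trained at that level.
    """
    per_magnification = [mean(probs) for probs in tile_probs_by_magnification.values()]
    return mean(per_magnification)


def classify_slide(tile_probs_by_magnification, threshold=0.5):
    """Return a slide-level label from the pooled probability (assumed threshold)."""
    pooled = slide_probability(tile_probs_by_magnification)
    return "MMR-D" if pooled >= threshold else "MMR-proficient"


# Hypothetical tile outputs for one whole slide image.
example_slide = {
    "2.5x": [0.8, 0.6],
    "5x": [0.7, 0.9, 0.8],
    "10x": [0.4, 0.6],
}
print(slide_probability(example_slide), classify_slide(example_slide))
```

Averaging per magnification first keeps a level with many tiles (typically the highest magnification) from dominating the slide-level score.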
Affiliation(s)
- Jongwook Whangbo
  - Department of Computer Science, Wesleyan University, Middletown, Connecticut, USA
  - Medical Devices R&D Center, Gachon University Gil Hospital, Incheon, Republic of Korea
- Young Seop Lee
  - Medical Devices R&D Center, Gachon University Gil Hospital, Incheon, Republic of Korea
- Young Jae Kim
  - Medical Devices R&D Center, Gachon University Gil Hospital, Incheon, Republic of Korea
  - Department of Health Sciences and Technology, Gachon Advanced Institute for Health & Sciences and Technology (GAIHST), Gachon University, Incheon, Republic of Korea
- Jisup Kim
  - Department of Pathology, Gil Medical Center, Gachon University College of Medicine, 38-13, Dokjeom-Ro 3Beon-Gil, Namdong-Gu, Incheon, Republic of Korea
- Kwang Gi Kim
  - Medical Devices R&D Center, Gachon University Gil Hospital, Incheon, Republic of Korea
  - Department of Health Sciences and Technology, Gachon Advanced Institute for Health & Sciences and Technology (GAIHST), Gachon University, Incheon, Republic of Korea
  - Department of Biomedical Engineering, College of Health Science, Gachon University, Incheon, Republic of Korea
5
He F, Fei R, Gao M, Su L, Zhang X, Xu D. Parameter-Efficient Fine-Tuning Enhances Adaptation of Single Cell Large Language Model for Cell Type Identification. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.27.577455. [PMID: 38352605 PMCID: PMC10862733 DOI: 10.1101/2024.01.27.577455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Single-cell sequencing has transformed biology and medicine, providing an unprecedented high-resolution view at the cellular level. However, the vast variability inherent in single-cell sequencing data impedes its utility for in-depth downstream analysis. Inspired by the foundation models in natural language processing, recent advancements have led to the development of single-cell Large Language Models (scLLMs). These models are designed to discern universal patterns across diverse single-cell datasets, thereby enhancing the signal-to-noise ratio. Despite their potential, multiple studies indicate existing scLLMs do not perform well in zero-shot settings, highlighting a pressing need for more effective adaptation techniques. This research proposes several adaptation techniques for scLLMs by preserving the original model parameters while selectively updating newly introduced tensors. This approach aims to overcome the limitations associated with traditional fine-tuning practices, such as catastrophic forgetting and computational inefficiencies. We introduce two Parameter-Efficient Fine-Tuning (PEFT) strategies specifically tailored to refine scLLMs for cell type identification. Our investigations utilizing scGPT demonstrate that PEFT can enhance performance, with the added benefit of up to a 90% reduction in the number of trained parameters compared to conventional fine-tuning methodologies. This work paves the way for a new direction in leveraging single-cell models with greater efficiency and efficacy in single-cell biology.
Affiliation(s)
- Fei He
  - Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA
- Ruixin Fei
  - School of Information Science and Technology, Northeast Normal University, Changchun Jilin 130017, China
- Mingyue Gao
  - School of Information Science and Technology, Northeast Normal University, Changchun Jilin 130017, China
- Li Su
  - Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA
- Xinyu Zhang
  - School of Information Science and Technology, Northeast Normal University, Changchun Jilin 130017, China
- Dong Xu
  - Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA
6
Yuan G, Zhai Y, Tang J, Zhou X. Selection of HBV key reactivation factors based on maximum information coefficient combined with cosine similarity. Technol Health Care 2024; 32:749-763. [PMID: 37393455 DOI: 10.3233/thc-230161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2023]
Abstract
BACKGROUND Hepatitis B Virus (HBV) reactivation is the most common complication for patients with primary liver cancer (PLC) after radiotherapy. How to reduce HBV reactivation has been an active topic in the study of postoperative radiotherapy for liver cancer. OBJECTIVE To identify the causes of HBV reactivation, a feature selection algorithm (MIC-CS) that combines the maximum information coefficient (MIC) with cosine similarity (CS) was proposed to screen the risk factors that may affect HBV reactivation. METHOD First, the candidate factors were coded and the MIC was calculated across patients to quantify the association between each factor and HBV reactivation. Second, a cosine similarity algorithm was constructed to calculate the similarity between factors so that redundant information could be removed. Finally, the two weights were combined to rank the potential risk factors and select the key factors leading to HBV reactivation. RESULTS The results indicated that HBV baseline, external boundary, TNM, KPS score, VD, AFP, and Child-Pugh could lead to HBV reactivation after radiotherapy. A classification model constructed from these factors achieved a highest classification accuracy of 84% and an AUC of 0.71. CONCLUSION In a comparison of multiple feature selection methods, MIC-CS performed significantly better than MIM, CMIM, and mRMR, so it has broad application prospects.
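The combination of a relevance score with a cosine-similarity redundancy check can be sketched as a greedy selection loop. The abstract does not give the exact weighting used in MIC-CS, so the scoring rule below (relevance minus a weighted maximum similarity to already selected factors) is an assumption, and the relevance values are supplied as plain numbers rather than computed MIC scores. All names and data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_features(columns, relevance, k, redundancy_weight=0.5):
    """Greedily pick k features with high relevance and low mutual similarity.

    columns   -- feature name -> list of coded values, one per patient
    relevance -- feature name -> association with the outcome (stand-in for MIC)
    """
    selected = []
    remaining = set(columns)
    while remaining and len(selected) < k:
        def score(name):
            # Penalize candidates that closely resemble an already chosen feature.
            redundancy = max(
                (abs(cosine_similarity(columns[name], columns[s])) for s in selected),
                default=0.0,
            )
            return relevance[name] - redundancy_weight * redundancy

        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical coded factors: factor_b is a scaled copy of factor_a.
columns = {"factor_a": [1, 2, 3], "factor_b": [2, 4, 6], "factor_c": [3, -1, 0]}
relevance = {"factor_a": 0.90, "factor_b": 0.85, "factor_c": 0.50}
chosen = select_features(columns, relevance, k=2)
print(chosen)
```

In this toy input the second most relevant factor is passed over because it duplicates the first, which is the redundancy-removal behavior the cosine-similarity step is meant to provide.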
Affiliation(s)
- Gaoteng Yuan
  - College of Computer and Information, Hohai University, Nanjing, Jiangsu, China
- Yi Zhai
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, Shandong, China
- Jiansong Tang
  - College of Computer and Information, Hohai University, Nanjing, Jiangsu, China
- Xiaofeng Zhou
  - College of Computer and Information, Hohai University, Nanjing, Jiangsu, China
7
Engelke M, Schmidt CS, Baldini G, Parmar V, Hosch R, Borys K, Koitka S, Turki AT, Haubold J, Horn PA, Nensa F. Optimizing platelet transfusion through a personalized deep learning risk assessment system for demand management. Blood 2023; 142:2315-2326. [PMID: 37890142 DOI: 10.1182/blood.2023021172] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/15/2023] [Revised: 09/29/2023] [Accepted: 10/17/2023] [Indexed: 10/29/2023]
Abstract
Platelet demand management (PDM) is a resource-consuming task for physicians and transfusion managers of large hospitals. Inpatient numbers and institutional standards play significant roles in PDM. However, reliance on these factors alone commonly results in platelet shortages. Using data from multiple sources, we developed, validated, tested, and implemented a patient-specific approach to support PDM that uses a deep learning-based risk score to forecast platelet transfusions for each hospitalized patient in the next 24 hours. The models were developed using retrospective electronic health record data of 34 809 patients treated between 2017 and 2022. Static and time-dependent features included demographics, diagnoses, procedures, blood counts, past transfusions, hematotoxic medications, and hospitalization duration. Using an expanding window approach, we created a training and live-prediction pipeline with a 30-day input and 24-hour forecast. Hyperparameter tuning determined the best validation area under the precision-recall curve (AUC-PR) score for long short-term memory deep learning models, which were then tested on independent data sets from the same hospital. The model tailored for hematology and oncology patients exhibited the best performance (AUC-PR, 0.84; area under the receiver operating characteristic curve [ROC-AUC], 0.98), followed by a multispecialty model covering all other patients (AUC-PR, 0.73). The model specific to cardiothoracic surgery had the lowest performance (AUC-PR, 0.42), likely because of unexpected intrasurgery bleeding. To our knowledge, this is the first deep learning-based platelet transfusion predictor enabling individualized 24-hour risk assessments at high AUC-PR. Implemented as a decision-support system, deep-learning forecasts might improve patient care by detecting platelet demand earlier and preventing critical transfusion shortages.
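One way to read the 30-day input, 24-hour forecast, and expanding window described above is sketched below: each sample pairs the 30 days before a prediction day with that day as the target, and each successive split trains on all earlier samples. The helper names, dates, and split sizes are assumptions chosen for illustration and do not describe the published pipeline.

```python
from datetime import date, timedelta


def make_samples(first_day, last_day, input_days=30):
    """Build (input_start, input_end, target_day) triples, one per forecast day."""
    samples = []
    target = first_day + timedelta(days=input_days)
    while target <= last_day:
        input_start = target - timedelta(days=input_days)
        input_end = target - timedelta(days=1)
        samples.append((input_start, input_end, target))
        target += timedelta(days=1)
    return samples


def expanding_splits(samples, initial_train, step):
    """Yield (train, test) pairs where the training set grows with each split."""
    for cutoff in range(initial_train, len(samples), step):
        yield samples[:cutoff], samples[cutoff:cutoff + step]


# Hypothetical observation period for one patient.
samples = make_samples(date(2022, 1, 1), date(2022, 2, 9))
splits = list(expanding_splits(samples, initial_train=4, step=3))

for train, test in splits:
    print(f"train on {len(train)} samples, forecast {test[0][2]} to {test[-1][2]}")
```

Because every test sample lies after all training samples in time, this layout avoids leaking future information into training.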
Affiliation(s)
- Merlin Engelke
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
- Cynthia Sabrina Schmidt
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute for Transfusion Medicine, University Medicine Essen, Essen, Germany
- Giulia Baldini
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
- Vicky Parmar
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
- René Hosch
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
- Katarzyna Borys
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
- Sven Koitka
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
- Amin T Turki
  - Computational Hematology Laboratory, Department of Hematology and Stem Cell Transplantation, West-German Cancer Center, University Medicine Essen, Essen, Germany
  - Department of Hematology and Oncology, Marienhospital University Hospital, Ruhr University Bochum, Bochum, Germany
- Johannes Haubold
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
- Peter A Horn
  - Institute for Transfusion Medicine, University Medicine Essen, Essen, Germany
- Felix Nensa
  - Institute for Artificial Intelligence in Medicine, University Medicine Essen, Essen, Germany
  - Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Medicine Essen, Essen, Germany
8
Pi SW, Lee BD, Lee MS, Lee HJ. Ensemble deep-learning networks for automated osteoarthritis grading in knee X-ray images. Sci Rep 2023; 13:22887. [PMID: 38129653 PMCID: PMC10739741 DOI: 10.1038/s41598-023-50210-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 12/16/2023] [Indexed: 12/23/2023]
Abstract
The Kellgren-Lawrence (KL) grading system is a scoring system for classifying the severity of knee osteoarthritis using X-ray images, and it is the standard X-ray-based grading system for diagnosing knee osteoarthritis. However, KL grading depends on the clinician's subjective assessment. Moreover, the accuracy varies significantly depending on the clinician's experience and can be particularly low. Therefore, in this study, we developed an ensemble network that can predict a consistent and accurate KL grade for knee osteoarthritis severity using a deep learning approach. We trained individual models on knee X-ray datasets using the most suitable image size for each model in an ensemble network rather than using datasets with a single image size. We then built the ensemble network using these models to overcome the instability of single models and further improve accuracy. We conducted various experiments using a dataset of 8260 images from the Osteoarthritis Initiative open dataset. The proposed ensemble network exhibited the best performance, achieving an accuracy of 76.93% and an F1-score of 0.7665. The Grad-CAM visualization technique was used to further evaluate the focus of the model. The results demonstrated that the proposed ensemble network outperforms existing techniques that have performed well in KL grade classification. Moreover, the proposed model focuses on the joint space around the knee to extract the imaging features required for KL grade classification, revealing its high potential for diagnosing knee osteoarthritis.
Affiliation(s)
- Sun-Woo Pi
  - Division of AI and Computer Engineering, Kyonggi University, Suwon, Republic of Korea
- Byoung-Dai Lee
  - Division of AI and Computer Engineering, Kyonggi University, Suwon, Republic of Korea
- Mu Sook Lee
  - Department of Radiology, Keimyung University Dongsan Hospital, Daegu, Republic of Korea
- Hae Jeong Lee
  - Department of Pediatrics, Samsung Changwon Hospital, Sungkyunkwan University School of Medicine, Changwon, Republic of Korea
9
Barreto TDO, Veras NVR, Cardoso PH, Fernandes FRDS, Medeiros LPDS, Bezerra MV, de Andrade FMQ, Pinheiro CDO, Sánchez-Gendriz I, Silva GJPC, Rodrigues LF, de Morais AHF, dos Santos JPQ, Paiva JC, de Andrade IGM, Valentim RADM. Artificial intelligence applied to analyzes during the pandemic: COVID-19 beds occupancy in the state of Rio Grande do Norte, Brazil. Front Artif Intell 2023; 6:1290022. [PMID: 38145230 PMCID: PMC10748397 DOI: 10.3389/frai.2023.1290022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 11/17/2023] [Indexed: 12/26/2023]
Abstract
The COVID-19 pandemic is already considered one of the biggest global health crises. In Rio Grande do Norte, a Brazilian state, the RegulaRN platform was the health information system used to regulate beds for patients with COVID-19. This article explored machine learning and deep learning techniques with RegulaRN data in order to identify the best models and parameters to predict the outcome of a hospitalized patient. A total of 25,366 bed regulations for COVID-19 patients were analyzed. The data analyzed come from the RegulaRN Platform database from April 2020 to August 2022. From these data, the nine most pertinent characteristics were selected from the twenty available, and blank or inconclusive data were excluded. The subsequent steps were data pre-processing, database balancing, training, and testing. The results showed better performance in terms of accuracy (84.01%), precision (79.57%), and F1-score (81.00%) for the Multilayer Perceptron model with Stochastic Gradient Descent optimizer. The best results for recall (84.67%), specificity (84.67%), and ROC-AUC (91.6%) were achieved by Root Mean Squared Propagation. This study compared different computational methods of machine and deep learning whose objective was to classify bed regulation data for patients with COVID-19 from the RegulaRN Platform. The results have made it possible to identify the best model to help health professionals during the process of regulating beds for patients with COVID-19. The scientific findings of this article demonstrate that the computational methods used, applied through a digital health solution, can assist in the decision-making of medical regulators and government institutions in situations of public health crisis.
Affiliation(s)
- Tiago de Oliveira Barreto
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Nícolas Vinícius Rodrigues Veras
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Pablo Holanda Cardoso
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Felipe Ricardo dos Santos Fernandes
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Maria Valéria Bezerra
  - Secretary of Public Health of Rio Grande do Norte, Natal, Rio Grande do Norte, Brazil
- Ignacio Sánchez-Gendriz
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
- Gleyson José Pinheiro Caldeira Silva
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Leandro Farias Rodrigues
  - Brazilian Company of Hospital Services (EBSERH), University Hospital of Pelotas, Federal University of Pelotas (UFPel), Pelotas, Rio Grande do Sul, Brazil
- Antonio Higor Freire de Morais
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- João Paulo Queiroz dos Santos
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Jailton Carlos Paiva
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
  - Advanced Nucleus of Technological Innovation (NAVI), Federal Institute of Rio Grande do Norte (IFRN), Natal, Rio Grande do Norte, Brazil
- Ion Garcia Mascarenhas de Andrade
  - Laboratory of Technological Innovation in Health (LAIS), Federal University of Rio Grande do Norte (UFRN), Natal, Rio Grande do Norte, Brazil
10
Overcast I, Noguerales V, Meramveliotakis E, Andújar C, Arribas P, Creedy TJ, Emerson BC, Vogler AP, Papadopoulou A, Morlon H. Inferring the ecological and evolutionary determinants of community genetic diversity. Mol Ecol 2023; 32:6093-6109. [PMID: 37221561 DOI: 10.1111/mec.16958] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/21/2022] [Revised: 04/06/2023] [Accepted: 04/12/2023] [Indexed: 05/25/2023]
Abstract
Understanding the relative contributions of ecological and evolutionary processes to the structuring of ecological communities is needed to improve our ability to predict how communities may respond to future changes in an increasingly human-modified world. Metabarcoding methods make it possible to gather population genetic data for all species within a community, unlocking a new axis of data to potentially unveil the origins and maintenance of biodiversity at local scales. Here, we present a new eco-evolutionary simulation model for investigating community assembly dynamics using metabarcoding data. The model makes joint predictions of species abundance, genetic variation, trait distributions and phylogenetic relationships under a wide range of parameter settings (e.g. high speciation/low dispersal or vice versa) and across a range of community states, from pristine and unmodified to heavily disturbed. We first demonstrate that parameters governing metacommunity and local community processes leave detectable signatures in simulated biodiversity data axes. Next, using a simulation-based machine learning approach we show that neutral and non-neutral models are distinguishable and that reasonable estimates of several model parameters within the local community can be obtained using only community-scale genetic data, while phylogenetic information is required to estimate those describing metacommunity dynamics. Finally, we apply the model to soil microarthropod metabarcoding data from the Troodos mountains of Cyprus, where we find that communities in widespread forest habitats are structured by neutral processes, while high-elevation and isolated habitats act as an abiotic filter generating non-neutral community structure. We implement our model within the ibiogen R package, a package dedicated to the investigation of island, and more generally community-scale, biodiversity using community-scale genetic data.
Affiliation(s)
- Isaac Overcast
- Institut de Biologie de l'ENS (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
- Department of Vertebrate Zoology, American Museum of Natural History, New York, New York, USA
- Víctor Noguerales
- Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), San Cristóbal de La Laguna, Spain
- Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Carmelo Andújar
- Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), San Cristóbal de La Laguna, Spain
- Paula Arribas
- Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), San Cristóbal de La Laguna, Spain
- Thomas J Creedy
- Department of Life Sciences, Natural History Museum, London, UK
- Brent C Emerson
- Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), San Cristóbal de La Laguna, Spain
- Alfried P Vogler
- Department of Life Sciences, Natural History Museum, London, UK
- Department of Life Sciences, Imperial College London, Ascot, UK
- Anna Papadopoulou
- Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Hélène Morlon
- Institut de Biologie de l'ENS (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
11
Langenberger B. Machine learning as a tool to identify inpatients who are not at risk of adverse drug events in a large dataset of a tertiary care hospital in the USA. Br J Clin Pharmacol 2023; 89:3523-3538. [PMID: 37430382 DOI: 10.1111/bcp.15846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 07/03/2023] [Accepted: 07/06/2023] [Indexed: 07/12/2023]
Abstract
AIMS Adverse drug events (ADEs) are a major threat to inpatients in the United States of America (USA). It is unknown how well machine learning (ML) can predict, from data available at hospital admission, whether an emergency department patient of any age will suffer an ADE during the hospital stay (a binary classification task). It is further unknown whether ML can outperform logistic regression (LR) in doing so, and which variables are the most important predictors. METHODS In this study, 5 ML models, namely a random forest, gradient boosting machine (GBM), ridge regression, least absolute shrinkage and selection operator (LASSO) regression, and elastic net regression, as well as an LR were trained and tested for the prediction of inpatient ADEs identified using ICD-10-CM codes based on comprehensive previous work in a diverse population. In total, 210,181 observations from patients who were admitted to a large tertiary care hospital after emergency department stay between 2011 and 2019 were included. The area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUC-PR) were used as primary performance indicators. RESULTS Tree-based models performed best with respect to AUC and AUC-PR. The GBM reached an AUC of 0.747 (95% confidence interval (CI): 0.735 to 0.759) and an AUC-PR of 0.134 (95% CI: 0.131 to 0.137) on unseen test data, while the random forest reached an AUC of 0.743 (95% CI: 0.731 to 0.755) and an AUC-PR of 0.139 (95% CI: 0.135 to 0.142). ML statistically significantly outperformed LR on both AUC and AUC-PR. Nonetheless, overall, the models did not differ much in performance. The most important predictors for the best-performing model (GBM) were admission type, temperature and chief complaint. CONCLUSIONS The study demonstrated a first application of ML to predict inpatient ADEs based on ICD-10-CM codes, and a comparison with LR. Future research should address concerns arising from low precision and related problems.
Affiliation(s)
- Benedikt Langenberger
- Department of Health Care Management, Technische Universität Berlin, Berlin, Germany
12
Xue Y, Zheng X, Wu G, Wang J. Rapid diagnosis of cervical cancer based on serum FTIR spectroscopy and support vector machines. Lasers Med Sci 2023; 38:276. [PMID: 38001244 DOI: 10.1007/s10103-023-03930-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 11/06/2023] [Indexed: 11/26/2023]
Abstract
Cervical cancer is one of the most common malignant tumors among female gynecological diseases. This paper aims to explore the feasibility of utilizing serum Fourier Transform Infrared (FTIR) spectroscopy, combined with machine learning and deep learning algorithms, to efficiently differentiate between healthy individuals, hysteromyoma patients, and cervical cancer patients. In this study, serum samples from 30 groups of hysteromyoma, 36 groups of cervical cancer, and 30 healthy groups were collected and FTIR spectra of each group were recorded. In addition, the raw datasets were averaged according to the number of scans to obtain an average dataset, and the raw datasets were spectrally enhanced to obtain an augmentation dataset, resulting in a total of three sets of data with sizes of 258, 96, and 1806, respectively. Then, the hyperparameters in the four kernel functions of the Support Vector Machine (SVM) model were optimized by grid search and leave-one-out (LOO) cross-validation. The resulting SVM models achieved recognition accuracies ranging from 85.0% to 100.0% on the test set. Furthermore, a one-dimensional convolutional neural network (1D-CNN) demonstrated a recognition accuracy of 75.0% to 90.0% on the test set. It can be concluded that the use of serum FTIR spectroscopy combined with the SVM algorithm for the diagnosis of cervical cancer has important medical significance.
Affiliation(s)
- Yunfei Xue
- College of Software, Xinjiang University, 830046, Urumqi, China
- Xiangxiang Zheng
- Tianjin Key Laboratory for Control Theory & Applications in Complicated Systems, School of Electrical Engineering and Automation, Tianjin University of Technology, 300384, Tianjin, China
- Guohua Wu
- School of Electronic Engineering, Beijing University of Posts and Telecommunications, 100876, Beijing, China.
- Jing Wang
- State Key Laboratory of Pathogenesis, Prevention and Treatment of High Incidence Diseases in Central Asia, Department of Gynecology, The First Affiliated Hospital of Xinjiang Medical University, 830054, Urumqi, China
13
Pansuwan T, Quaegebeur A, Kaalund SS, Hidari E, Briggs M, Rowe JB, Rittman T. Accurate digital quantification of tau pathology in progressive supranuclear palsy. Acta Neuropathol Commun 2023; 11:178. [PMID: 37946288 PMCID: PMC10634011 DOI: 10.1186/s40478-023-01674-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 10/20/2023] [Indexed: 11/12/2023]
Abstract
The development of novel treatments for Progressive Supranuclear Palsy (PSP) is hindered by a knowledge gap of the impact of neurodegenerative neuropathology on brain structure and function. The current standard practice for measuring postmortem tau histology is semi-quantitative assessment, which is prone to inter-rater variability, time-consuming and difficult to scale. We developed and optimized a tau aggregate type-specific quantification pipeline for cortical and subcortical regions, in human brain donors with PSP. We quantified 4 tau objects ('neurofibrillary tangles', 'coiled bodies', 'tufted astrocytes', and 'tau fragments') using a probabilistic random forest machine learning classifier. The tau pipeline achieved high classification performance (F1-score > 0.90), comparable to neuropathologist inter-rater reliability in the held-out test set. Using 240 AT8 slides from 32 postmortem brains, the tau burden was correlated against the PSP pathology staging scheme using Spearman's rank correlation. We assessed whether clinical severity (PSP rating scale, PSPRS) score reflects neuropathological severity inferred from PSP stage and tau burden using Bayesian linear mixed regression. Tufted astrocyte density in cortical regions and coiled body density in subcortical regions showed the highest correlation to PSP stage (r = 0.62 and r = 0.38, respectively). Using traditional manual staging, only PSP patients in stage 6, not earlier stages, had significantly higher clinical severity than stage 2. Cortical tau density and neurofibrillary tangle density in subcortical regions correlated with clinical severity. Overall, our data indicate the potential for highly accurate digital tau aggregate type-specific quantification for neurodegenerative tauopathies; and the importance of studying tau aggregate type-specific burden in different brain regions as opposed to overall tau, to gain insights into the pathogenesis and progression of tauopathies.
Affiliation(s)
- Tanrada Pansuwan
- Department of Clinical Neurosciences, Cambridge University Centre for Parkinson-Plus, University of Cambridge, Herchel Smith Building, Robinson Way, Cambridge, CB2 0SZ, UK.
- Annelies Quaegebeur
- Department of Clinical Neurosciences, Cambridge University Centre for Parkinson-Plus, University of Cambridge, Herchel Smith Building, Robinson Way, Cambridge, CB2 0SZ, UK
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Sanne S Kaalund
- Centre for Neuroscience and Stereology, Bispebjerg University Hospital, Copenhagen, Denmark
- Eric Hidari
- Department of Clinical Neurosciences, Cambridge University Centre for Parkinson-Plus, University of Cambridge, Herchel Smith Building, Robinson Way, Cambridge, CB2 0SZ, UK
- Mayen Briggs
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- James B Rowe
- Department of Clinical Neurosciences, Cambridge University Centre for Parkinson-Plus, University of Cambridge, Herchel Smith Building, Robinson Way, Cambridge, CB2 0SZ, UK
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK
- Timothy Rittman
- Department of Clinical Neurosciences, Cambridge University Centre for Parkinson-Plus, University of Cambridge, Herchel Smith Building, Robinson Way, Cambridge, CB2 0SZ, UK
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
14
Binet T, Padiolleau-Lefèvre S, Octave S, Avalle B, Maffucci I. Comparative Study of Single-stranded Oligonucleotides Secondary Structure Prediction Tools. BMC Bioinformatics 2023; 24:422. [PMID: 37940855 PMCID: PMC10634105 DOI: 10.1186/s12859-023-05532-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 10/13/2023] [Indexed: 11/10/2023]
Abstract
BACKGROUND Single-stranded nucleic acids (ssNAs) have important biological roles and a high biotechnological potential linked to their ability to bind to numerous molecular targets. This depends on the different spatial conformations they can assume. The first level of ssNA spatial organisation corresponds to their base-pair pattern, i.e. their secondary structure. Many computational tools have been developed to predict ssNA secondary structures, making the choice of the appropriate tool difficult, and an up-to-date guide on the limits and applicability of current secondary structure prediction tools is missing. Therefore, we performed a comparative study of the performance of 9 freely available tools (mfold, RNAfold, CentroidFold, CONTRAfold, MC-Fold, LinearFold, UFold, SPOT-RNA, and MXfold2) on a dataset of 538 ssNAs with known experimental secondary structure. RESULTS The minimum free energy-based tools, namely mfold and RNAfold, and some tools based on artificial intelligence, namely CONTRAfold and MXfold2, provided the best results, with [Formula: see text] of exact predictions, whilst MC-Fold seemed to be the worst performing tool, with only [Formula: see text] of exact predictions. In addition, UFold and SPOT-RNA are the only options for pseudoknot prediction. Including 5-10 suboptimal solutions in the analysis of the mfold and RNAfold results further improved the performance of these tools. Nevertheless, we observed issues in predicting particular motifs, such as multi-way junctions and mini-dumbbells, and ssNAs whose structure has been determined in complex with a protein. In addition, our benchmark shows that further effort is needed for ssDNA secondary structure prediction. CONCLUSIONS In general, mfold, RNAfold, and MXfold2 currently seem to be the best choice for ssNA secondary structure prediction, although they still show some limits linked to specific structural motifs. Nevertheless, current trends suggest that artificial intelligence has a high potential to overcome these remaining issues; for example, the recently developed UFold and SPOT-RNA have a high success rate in predicting pseudoknots.
Affiliation(s)
- Thomas Binet
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
- Séverine Padiolleau-Lefèvre
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
- Stéphane Octave
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
- Bérangère Avalle
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France.
- Irene Maffucci
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France.
15
Lin X, Dong R, Zhao Y, Wang R. Efficient ship noise classification with positive incentive noise and fused features using a simple convolutional network. Sci Rep 2023; 13:17905. [PMID: 37863973 PMCID: PMC10589344 DOI: 10.1038/s41598-023-45245-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 10/17/2023] [Indexed: 10/22/2023]
Abstract
Ship noise analysis is a critical area of research in hydroacoustic remote sensing due to its practical implications in identifying ship direction, type, and even specific ship identities. However, the limited availability of data poses challenges in developing accurate ship noise classification models. Previous studies have mainly focused on small-sample learning approaches, resulting in complex network structures. Nonetheless, underwater robots often have limited computing power, making it essential to develop simpler recognition networks. In this paper, we address the issue of data scarcity by introducing positive incentive noise. We propose a CNN-based hydroacoustic signal recognition method that achieves comparable or superior performance to previous studies, using a simple network structure as a back-end decision system. We describe the feature extraction process using a dataset with added noise and compare the performance of various features. Additionally, we compare our proposed method with previous studies. Experimental results demonstrate that simple neural networks can achieve high performance and excellent generalizability without the need for complex network structures like adversarial learning models.
Affiliation(s)
- Xu Lin
- College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, China.
- Ruichun Dong
- College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
- Yuqing Zhao
- College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
- Rui Wang
- College of Mechanical and Electronic Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
16
Noh J, Chang H. Data-Driven Prediction of Configurational Stability of Molecule-Adsorbed Heterogeneous Catalysts. J Chem Inf Model 2023; 63:5981-5995. [PMID: 37715300 DOI: 10.1021/acs.jcim.3c00591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023]
Abstract
The design of new heterogeneous catalysts that convert small molecules into valuable chemicals is a key challenge for constructing sustainable energy systems. Density functional theory (DFT)-based design frameworks based on the understanding of molecular adsorption on the catalytic surface have been widely proposed to accelerate experimental approaches to develop novel catalysts. In addition, a machine learning (ML)-combined design framework was recently proposed to further reduce the inherent time cost of DFT-based frameworks. However, because of the lack of prior information on chemical interactions between arbitrary surfaces and adsorbates, the efficacy of the computational screening approaches would be reduced by obtaining unexpected structural anomalies (i.e., abnormally converged surface-adsorbate geometries after the DFT calculations) during an exhaustive exploration of chemical space. To overcome this challenge, we propose an ML framework that directly predicts the configurational stability of a given initial surface-adsorbate geometry. Our benchmark experiments with the Open Catalysts 20 (OC20) dataset show promising performance on classifying stable geometry (i.e., F1-score of 0.922, the area under the receiver operating characteristics (AUROC) of 0.906, and Matthews correlation coefficient (MCC) of 0.633) with a high precision of 0.921 by utilizing an ensemble approach. We further interpret the generalizability and domain applicability of the trained model in terms of the chemical space of the OC20 dataset. Furthermore, from an experiment on the training set size dependence of model performance, we found that our ML model could be practically applicable to classify stable configurations even with a relatively small number of training data.
Affiliation(s)
- Juhwan Noh
- Korea Research Institute of Chemical Technology (KRICT), Daejeon 34114, Republic of Korea
- Hyunju Chang
- Korea Research Institute of Chemical Technology (KRICT), Daejeon 34114, Republic of Korea
17
Wang Y, Niu M, Liu K, Shen M, Qin B, Wang H. A Novel Data Augmentation Method Based on CoralGAN for Prediction of Part Surface Roughness. IEEE Trans Neural Netw Learn Syst 2023; 34:7024-7033. [PMID: 34995197 DOI: 10.1109/tnnls.2021.3137172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Deep learning networks can be applied to the field of intelligent prediction of part surface roughness. However, the surface roughness samples of parts have the problems of high collection cost, unbalanced categories, and complicated data distribution, which inevitably limit the application of deep learning network models in the field of intelligent prediction of part surface roughness. To solve these problems, this article proposes a novel data augmentation method based on CoralGAN for prediction of part surface roughness, which introduces the domain adaptive method deep coral function to help optimize the network parameters of the generator of generative adversarial network (GAN). Specifically, the vibration signal collected during processing is converted into frequency spectrum data and input into CoralGAN. The training of the generator is guided by coral loss, that is, the distance between the covariances of the real samples and generated samples features, not just the statistical consistency of the traditional GAN. Experiments have been carried out on the three-axis vertical machining center. Research shows that the proposed method can improve the prediction accuracy of part surface roughness to 95.5%.
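The coral loss described in this abstract (a distance between the covariances of real and generated sample features) can be illustrated with a minimal sketch. It assumes the commonly used Deep CORAL form, the squared Frobenius norm of the covariance difference scaled by 1/(4d^2) for d-dimensional features; whether CoralGAN uses exactly this scaling is an assumption, and the names `covariance` and `coral_loss` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def covariance(features: Matrix) -> Matrix:
    """Unbiased sample covariance of feature vectors stored one per row."""
    n = len(features)
    d = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for row in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (row[i] - means[i]) * (row[j] - means[j])
    return [[value / (n - 1) for value in cov_row] for cov_row in cov]


def coral_loss(real: Matrix, generated: Matrix) -> float:
    """Squared Frobenius distance between covariances, scaled by 1/(4 d^2).

    The scaling follows the Deep CORAL convention (an assumption here,
    not a detail confirmed by the abstract).
    """
    d = len(real[0])
    c_real = covariance(real)
    c_gen = covariance(generated)
    squared = sum(
        (c_real[i][j] - c_gen[i][j]) ** 2 for i in range(d) for j in range(d)
    )
    return squared / (4 * d * d)


# Toy feature batches (made-up numbers, for illustration only).
real_batch = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
generated_batch = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
print(coral_loss(real_batch, real_batch))       # identical batches give 0.0
print(coral_loss(real_batch, generated_batch))  # mismatched spread gives 2.25
```

In a GAN setting, a term of this kind would typically be added to the generator objective so that generated features match the second-order statistics of the real ones, not only fool the discriminator.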
18
Shyrokykh K, Girnyk M, Dellmuth L. Short text classification with machine learning in the social sciences: The case of climate change on Twitter. PLoS One 2023; 18:e0290762. [PMID: 37773969 PMCID: PMC10540966 DOI: 10.1371/journal.pone.0290762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 08/15/2023] [Indexed: 10/01/2023]
Abstract
To analyse large numbers of texts, social science researchers are increasingly confronting the challenge of text classification. When manual labeling is not possible and researchers have to find automatized ways to classify texts, computer science provides a useful toolbox of machine-learning methods whose performance remains understudied in the social sciences. In this article, we compare the performance of the most widely used text classifiers by applying them to a typical research scenario in social science research: a relatively small labeled dataset with infrequent occurrence of categories of interest, which is a part of a large unlabeled dataset. As an example case, we look at Twitter communication regarding climate change, a topic of increasing scholarly interest in interdisciplinary social science research. Using a novel dataset including 5,750 tweets from various international organizations regarding the highly ambiguous concept of climate change, we evaluate the performance of methods in automatically classifying tweets based on whether they are about climate change or not. In this context, we highlight two main findings. First, supervised machine-learning methods perform better than state-of-the-art lexicons, in particular as class balance increases. Second, traditional machine-learning methods, such as logistic regression and random forest, perform similarly to sophisticated deep-learning methods, whilst requiring much less training time and computational resources. The results have important implications for the analysis of short texts in social science research.
Affiliation(s)
- Karina Shyrokykh
- Department of Economic History and International Relations, Stockholm University, Stockholm, Sweden
- Max Girnyk
- Department of Economic History and International Relations, Stockholm University, Stockholm, Sweden
- Lisa Dellmuth
- Department of Economic History and International Relations, Stockholm University, Stockholm, Sweden
19
Adamson B, Waskom M, Blarre A, Kelly J, Krismer K, Nemeth S, Gippetti J, Ritten J, Harrison K, Ho G, Linzmayer R, Bansal T, Wilkinson S, Amster G, Estola E, Benedum CM, Fidyk E, Estévez M, Shapiro W, Cohen AB. Approach to machine learning for extraction of real-world data variables from electronic health records. Front Pharmacol 2023; 14:1180962. [PMID: 37781703 PMCID: PMC10541019 DOI: 10.3389/fphar.2023.1180962] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/06/2023] [Accepted: 08/25/2023] [Indexed: 10/03/2023]
Abstract
Background: As artificial intelligence (AI) continues to advance with breakthroughs in natural language processing (NLP) and machine learning (ML), such as the development of models like OpenAI's ChatGPT, new opportunities are emerging for efficient curation of electronic health records (EHR) into real-world data (RWD) for evidence generation in oncology. Our objective is to describe the research and development of industry methods to promote transparency and explainability. Methods: We applied NLP with ML techniques to train, validate, and test the extraction of information from unstructured documents (e.g., clinician notes, radiology reports, lab reports, etc.) to output a set of structured variables required for RWD analysis. This research used a nationwide electronic health record (EHR)-derived database. Models were selected based on performance. Variables curated with an approach using ML extraction are those where the value is determined solely based on an ML model (i.e. not confirmed by abstraction), which identifies key information from visit notes and documents. These models do not predict future events or infer missing information. Results: We developed an approach using NLP and ML for extraction of clinically meaningful information from unstructured EHR documents and found high performance of output variables compared with variables curated by manually abstracted data. These extraction methods resulted in research-ready variables including initial cancer diagnosis with date, advanced/metastatic diagnosis with date, disease stage, histology, smoking status, surgery status with date, biomarker test results with dates, and oral treatments with dates. Conclusion: NLP and ML enable the extraction of retrospective clinical data in EHR with speed and scalability to help researchers learn from the experience of every person with cancer.
Affiliation(s)
- Blythe Adamson
- Flatiron Health, Inc., New York, NY, United States
- The Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute, Department of Pharmacy, University of Washington, Seattle, WA, United States
- John Ritten
- Flatiron Health, Inc., New York, NY, United States
- George Ho
- Flatiron Health, Inc., New York, NY, United States
- Tarun Bansal
- Flatiron Health, Inc., New York, NY, United States
- Guy Amster
- Flatiron Health, Inc., New York, NY, United States
- Evan Estola
- Flatiron Health, Inc., New York, NY, United States
- Erin Fidyk
- Flatiron Health, Inc., New York, NY, United States
- Will Shapiro
- Flatiron Health, Inc., New York, NY, United States
- Aaron B. Cohen
- Flatiron Health, Inc., New York, NY, United States
- Department of Medicine, NYU Grossman School of Medicine, New York, NY, United States
20
Kim Y, Lee M, Yoon J, Kim Y, Min H, Cho H, Park J, Shin T. Predicting Future Incidences of Cardiac Arrhythmias Using Discrete Heartbeats from Normal Sinus Rhythm ECG Signals via Deep Learning Methods. Diagnostics (Basel) 2023; 13:2849. [PMID: 37685387 PMCID: PMC10487044 DOI: 10.3390/diagnostics13172849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 08/29/2023] [Accepted: 09/01/2023] [Indexed: 09/10/2023]
Abstract
This study aims to compare the effectiveness of using discrete heartbeats versus an entire 12-lead electrocardiogram (ECG) as the input for predicting future occurrences of arrhythmia and atrial fibrillation using deep learning models. Experiments were conducted using two types of inputs: a combination of discrete heartbeats extracted from 12-lead ECG and an entire 12-lead ECG signal of 10 s. This study utilized 326,904 ECG signals from 134,447 patients and categorized them into three groups: true-normal sinus rhythm (T-NSR), atrial fibrillation-normal sinus rhythm (AF-NSR), and clinically important arrhythmia-normal sinus rhythm (CIA-NSR). The T-NSR group comprised patients with at least three normal rhythms in a year and no atrial fibrillation or arrhythmias history. Clinically important arrhythmia included atrial fibrillation, atrial flutter, atrial premature contraction, atrial tachycardia, ventricular premature contraction, ventricular tachycardia, right and left bundle branch block, and atrioventricular block over the second degree. The AF-NSR group included normal sinus rhythm paired with atrial fibrillation or atrial flutter within 14 days, and the CIA-NSR group comprised normal sinus rhythm paired with CIA occurring within 14 days. Three deep learning models, ResNet-18, LSTM, and Transformer-based models, were utilized to distinguish T-NSR from AF-NSR and T-NSR from CIA-NSR. The experiments demonstrated the potential of using discrete heartbeats in predicting future arrhythmia and atrial fibrillation incidences extracted from 12-lead electrocardiogram (ECG) signals alone, without any additional patient information. The analysis reveals that these discrete heartbeats contain subtle patterns that deep learning models can identify. Focusing on discrete heartbeats may lead to more timely and accurate diagnoses of these conditions, improving patient outcomes and enabling automated diagnosis using ECG signals as a biomarker.
Collapse
Affiliation(s)
- Yehyun Kim
- Synergy A.I. Co., Ltd., Seoul 07573, Republic of Korea; (Y.K.); (M.L.); (J.Y.)
| | - Myeonggyu Lee
- Synergy A.I. Co., Ltd., Seoul 07573, Republic of Korea; (Y.K.); (M.L.); (J.Y.)
| | - Jaeung Yoon
- Synergy A.I. Co., Ltd., Seoul 07573, Republic of Korea; (Y.K.); (M.L.); (J.Y.)
| | - Yeji Kim
- Department of Cardiology, Ewha Womans University Mokdong Hospital, Seoul 07985, Republic of Korea;
| | - Hyunseok Min
- Tomocube Inc., Daejeon 34141, Republic of Korea; (H.M.); (H.C.)
| | - Hyungjoo Cho
- Tomocube Inc., Daejeon 34141, Republic of Korea; (H.M.); (H.C.)
- Junbeom Park
- Synergy A.I. Co., Ltd., Seoul 07573, Republic of Korea; (Y.K.); (M.L.); (J.Y.)
- Department of Cardiology, Ewha Womans University Mokdong Hospital, Seoul 07985, Republic of Korea;
- Taeyoung Shin
- Synergy A.I. Co., Ltd., Seoul 07573, Republic of Korea; (Y.K.); (M.L.); (J.Y.)
- Department of Urology, Ewha Womans University Mokdong Hospital, Seoul 07985, Republic of Korea
|
21
|
Liu Z, Pan L, Chen G. Link-Information Augmented Twin Autoencoders for Network Denoising. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:5585-5595. [PMID: 35358055 DOI: 10.1109/tcyb.2022.3160470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Removing noisy links from an observed network is a task commonly required for preprocessing real-world network data. However, containing both noisy and clean links, the observed network cannot be treated as a trustworthy information source for supervised learning. Therefore, it is necessary but also technically challenging to detect noisy links in the context of data contamination. To address this issue, in the present article, a two-phased computational model is proposed, called link-information augmented twin autoencoders, which is able to deal with: 1) link information augmentation; 2) link-level contrastive denoising; 3) link information correction. Extensive experiments on six real-world networks verify that the proposed model outperforms other comparable methods in removing noisy links from the observed network so as to recover the real network from the corrupted one very accurately. Extended analyses also provide interpretable evidence to support the superiority of the proposed model for the task of network denoising.
|
22
|
van Kevelaer R, Langenkämper D, Nilssen I, Buhl-Mortensen P, Nattkemper TW. A data science approach for multi-sensor marine observatory data monitoring cold water corals (Paragorgia arborea) in two campaigns. PLoS One 2023; 18:e0282723. [PMID: 37467187 DOI: 10.1371/journal.pone.0282723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 02/21/2023] [Indexed: 07/21/2023] Open
Abstract
Fixed underwater observatories (FUO), equipped with digital cameras and other sensors, are becoming more commonly used to record different kinds of time series data for marine habitat monitoring. With increasing numbers of campaigns, numbers of sensors and campaign time, the volume and heterogeneity of the data, ranging from simple temperature time series to series of HD images or video, call for new data science approaches to analyze the data. While some works have been published on the analysis of data from one campaign, we address the problem of analyzing time series data from two consecutive monitoring campaigns (starting late 2017 and late 2018) in the same habitat. While the data from campaigns in two separate years provide an interesting basis for marine biology research, they also present new data science challenges, like marine image analysis in data from more than one campaign. In this paper, we analyze the polyp activity of two Paragorgia arborea cold water coral (CWC) colonies using FUO data collected from November 2017 to June 2018 and from December 2018 to April 2019. We successfully apply convolutional neural networks (CNN) for the segmentation and classification of the coral and the polyp activities. The resulting polyp activity data alone showed interesting temporal patterns with differences and similarities between the two time periods. A one-month "sleeping" period in spring with almost no activity was observed in both coral colonies, but with a shift of approximately one month. A time series prediction experiment allowed us to predict the polyp activity from the non-image sensor data using recurrent neural networks (RNN). The results pave the way to a new multi-sensor monitoring strategy for Paragorgia arborea behaviour.
Affiliation(s)
- Robin van Kevelaer
- Biodata Mining Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Daniel Langenkämper
- Biodata Mining Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Pål Buhl-Mortensen
- Research Group Benthic Habitat, Institute of Marine Research, Bergen, Norway
- Tim W Nattkemper
- Biodata Mining Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
|
23
|
Jang MG, Cha S, Kim S, Lee S, Lee KE, Shin KH. Application of tree-based machine learning classification methods to detect signals of fluoroquinolones using the Korea Adverse Event Reporting System (KAERS) database. Expert Opin Drug Saf 2023; 22:629-636. [PMID: 36794497 DOI: 10.1080/14740338.2023.2181341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 01/23/2023] [Indexed: 02/17/2023]
Abstract
BACKGROUND Safety issues for fluoroquinolones have been provided by regulatory agencies. This study was conducted to identify signals of fluoroquinolones reported in the Korea Adverse Event Reporting System (KAERS) using tree-based machine learning (ML) methods. RESEARCH DESIGN AND METHODS All adverse events (AEs) associated with the target drugs reported in the KAERS from 2013 to 2017 were matched with drug label information. A dataset containing label-positive and label-negative AEs was arbitrarily divided into training and test sets. Decision tree, random forest (RF), bagging, and gradient boosting machine (GBM) models were fitted on the training set with hyperparameters tuned using five-fold cross-validation and applied to the test set. The ML method with the highest area under the curve (AUC) score was selected as the final ML model. RESULTS Bagging was selected as the final ML model for gemifloxacin (AUC: 1) and levofloxacin (AUC: 0.9987). RF was selected for ciprofloxacin, moxifloxacin, and ofloxacin (AUC: 0.9859, 0.9974, and 0.9999, respectively). We found that the final ML methods detected additional signals that were not detected using the disproportionality analysis (DPA) methods. CONCLUSIONS The bagging- or RF-based ML methods performed better than DPA and detected novel AE signals previously unidentified using the DPA methods.
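The selection rule described in this abstract (keep the method with the highest test-set AUC) can be illustrated with a minimal sketch. The labels, scores, and model names below are hypothetical toy values, not data from the study, and the rank-based AUC shown here is one standard way to compute the metric, not necessarily the implementation the authors used.

```python
def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive example receives the higher score (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_best(labels, scores_by_model):
    """Return the name of the model with the highest AUC, plus all AUCs."""
    aucs = {name: auc(labels, s) for name, s in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical test-set labels (1 = label-positive AE) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores_by_model = {
    "bagging": [0.9, 0.8, 0.3, 0.2, 0.7, 0.4],
    "random_forest": [0.9, 0.4, 0.5, 0.2, 0.7, 0.6],
}
best, aucs = select_best(labels, scores_by_model)
print(best, aucs)
```

In this toy case every positive outranks every negative under the first model, so it is the one selected; with real data the comparison would be made per drug, as in the abstract.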
Affiliation(s)
- Min-Gyo Jang
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, Kyungpook National University, Daegu, Republic of Korea
- SangHun Cha
- Department of Statistics, College of Natural Sciences, Kyungpook National University, Daegu, Republic of Korea
- Seunghwak Kim
- Department of Statistics, College of Natural Sciences, Kyungpook National University, Daegu, Republic of Korea
- Sojung Lee
- Department of Statistics, College of Natural Sciences, Kyungpook National University, Daegu, Republic of Korea
- Kyeong Eun Lee
- Department of Statistics, College of Natural Sciences, Kyungpook National University, Daegu, Republic of Korea
- Kwang-Hee Shin
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, Kyungpook National University, Daegu, Republic of Korea
|
24
|
Tselemponis A, Stefanis C, Giorgi E, Kalmpourtzi A, Olmpasalis I, Tselemponis A, Adam M, Kontogiorgis C, Dokas IM, Bezirtzoglou E, Constantinidis TC. Coastal Water Quality Modelling Using E. coli, Meteorological Parameters and Machine Learning Algorithms. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023; 20:6216. [PMID: 37444064 PMCID: PMC10341787 DOI: 10.3390/ijerph20136216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 06/19/2023] [Accepted: 06/21/2023] [Indexed: 07/15/2023]
Abstract
In this study, machine learning models were implemented to predict the classification of coastal waters in the region of Eastern Macedonia and Thrace (EMT) concerning Escherichia coli (E. coli) concentration and weather variables in the framework of the Directive 2006/7/EC. Six sampling stations of EMT, located on beaches of the regional units of Kavala, Xanthi, Rhodopi, Evros, Thasos and Samothraki, were selected. All 1039 samples were collected from May to September within a 14-year follow-up period (2009-2021). The weather parameters were acquired from nearby meteorological stations. The samples were analysed according to ISO 9308-1 for the detection and enumeration of E. coli. The vast majority of the samples fall into category 1 (Excellent), which is a mark of the high quality of the coastal waters of EMT. The experimental results additionally show that two-class classifiers, namely Decision Forest, Decision Jungle and Boosted Decision Tree, achieved accuracy scores over 99%. In addition, comparing our performance metrics with those of other researchers, diversity is observed in the algorithms used for water quality prediction, with algorithms such as Decision Tree, Artificial Neural Networks and Bayesian Belief Networks demonstrating satisfactory results. Machine learning approaches can provide critical information about the dynamics of E. coli contamination and, concurrently, consider the meteorological parameters for coastal waters classification.
Affiliation(s)
- Athanasios Tselemponis
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Christos Stefanis
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Elpida Giorgi
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Aikaterini Kalmpourtzi
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Ioannis Olmpasalis
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Antonios Tselemponis
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Maria Adam
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Christos Kontogiorgis
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Ioannis M. Dokas
- Department of Civil Engineering, Democritus University of Thrace, 69100 Komotini, Greece;
- Eugenia Bezirtzoglou
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
- Theodoros C. Constantinidis
- Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece; (A.T.); (E.G.); (A.K.); (I.O.); (A.T.); (M.A.); (C.K.); (E.B.); (T.C.C.)
|
25
|
Işık Ü, Güven A, Batbat T. Evaluation of Emotions from Brain Signals on 3D VAD Space via Artificial Intelligence Techniques. Diagnostics (Basel) 2023; 13:2141. [PMID: 37443535 DOI: 10.3390/diagnostics13132141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 06/12/2023] [Accepted: 06/14/2023] [Indexed: 07/15/2023] Open
Abstract
Recent achievements have made emotion studies a rising field contributing to many areas, such as health technologies, brain-computer interfaces, and psychology. Emotional states can be evaluated in valence, arousal, and dominance (VAD) domains. Most work uses only VA because it is easier to differentiate; very few studies use VAD, as this study does. Similarly, segment comparisons of emotion analysis with handcrafted features also use VA space. We therefore focused primarily on VAD space to evaluate emotions and segmentations. The DEAP dataset is used in this study. A comprehensive analytical approach is implemented with two sub-studies: first, segmentation (Segments I-VIII), and second, binary cross-comparisons and evaluations of eight emotional states, in addition to comparisons of selected segments (III, IV, and V), class separation levels (5, 4-6, and 3-7), and unbalanced and balanced data with SMOTE. In both sub-studies, Wavelet Transform is applied to electroencephalography signals to separate the brain waves into their bands (α, β, γ, and θ bands), twenty-four attributes are extracted, and Sequential Minimal Optimization, K-Nearest Neighbors, Fuzzy Unordered Rule Induction Algorithm, Random Forest, Optimized Forest, Bagging, Random Committee, and Random Subspace are used for classification. In our study, we have obtained high accuracy results, which can be seen in the figures in the second part. The best accuracy result in this study for unbalanced data is obtained for Low Arousal-Low Valence-High Dominance and High Arousal-High Valence-Low Dominance emotion comparisons (Segment III and 4.5-5.5 class separation), and an accuracy rate of 98.94% is obtained with the IBk classifier. Data-balanced results mostly seem to outperform unbalanced results.
Affiliation(s)
- Ümran Işık
- Biomedical Engineering Graduate Program, Graduate School of Natural and Applied Sciences, Erciyes University, 38039 Kayseri, Türkiye
- Ayşegül Güven
- Department of Biomedical Engineering, Engineering Faculty, Erciyes University, 38039 Kayseri, Türkiye
- Turgay Batbat
- Department of Biomedical Engineering, Engineering Faculty, Erciyes University, 38039 Kayseri, Türkiye
|
26
|
Yuce F, Öziç MÜ, Tassoker M. Detection of pulpal calcifications on bite-wing radiographs using deep learning. Clin Oral Investig 2023; 27:2679-2689. [PMID: 36564651 DOI: 10.1007/s00784-022-04839-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 12/21/2022] [Indexed: 12/25/2022]
Abstract
OBJECTIVES Pulpal calcifications are discrete hard calcified masses of varying sizes in the dental pulp cavity. This study aims to measure the performance of the YOLOv4 deep learning algorithm in automatically determining whether there is calcification in the pulp chambers in bite-wing radiographs. MATERIALS AND METHODS In this study, 2000 bite-wing radiographs were collected from the faculty database. The oral radiologists labeled the pulp chambers on the radiographs as "Present" or "Absent" according to whether there was calcification. The data were randomly divided into 80% training, 10% validation, and 10% testing. The weight file for pulpal calcification was obtained by training the YOLOv4 algorithm with the transfer learning method. Using the weights obtained, pulp chambers and calcifications were automatically detected on the test radiographs that the algorithm had never seen. Two oral radiologists evaluated the test results, and performance criteria were calculated. RESULTS The results obtained on the test data were evaluated in two stages: detection of pulp chambers and detection of pulpal calcification. The detection performance of pulp chambers was as follows: recall 86.98%, precision 98.94%, F1-score 91.60%, and accuracy 86.18%. Pulpal calcification "Absent" and "Present" detection performance was as follows: recall 86.39%, precision 85.23%, specificity 97.94%, F1-score 85.49%, and accuracy 96.54%. CONCLUSION The YOLOv4 algorithm trained with bite-wing radiographs detected pulp chambers and calcification with high success rates. CLINICAL RELEVANCE Automatic detection of pulpal calcifications with deep learning will be used in clinical practice as a decision support system for dentists, with high accuracy rates in diagnosis.
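For readers less familiar with the performance criteria listed in this abstract, the sketch below shows the standard definitions of recall, precision, specificity, F1-score, and accuracy computed from confusion-matrix counts. The counts are hypothetical and are not taken from the study; how the authors aggregated their own figures is not stated in the abstract.

```python
def detection_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. All denominators are assumed to be non-zero.
    """
    recall = tp / (tp + fn)            # share of actual positives found
    precision = tp / (tp + fp)         # share of detections that are correct
    specificity = tn / (tn + fp)       # share of actual negatives rejected
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a "Present" vs "Absent" calcification task.
metrics = detection_metrics(tp=80, fp=20, fn=20, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The toy counts show how accuracy and specificity can be high while recall and precision stay lower when the negative class dominates.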
Affiliation(s)
- Fatma Yuce
- Department of Oral and Maxillofacial Radiology, Faculty of Dentistry, Okan University, Istanbul, Turkey
- Muhammet Üsame Öziç
- Faculty of Technology Department of Biomedical Engineering, Pamukkale University, Denizli, Turkey
- Melek Tassoker
- Department of Oral and Maxillofacial Radiology, Faculty of Dentistry, Necmettin Erbakan University, Konya, Turkey.
|
27
|
Lima T, Luz D, Oseas A, Veras R, Araújo F. Automatic classification of pulmonary nodules in computed tomography images using pre-trained networks and bag of features. MULTIMEDIA TOOLS AND APPLICATIONS 2023:1-17. [PMID: 37362706 PMCID: PMC10116084 DOI: 10.1007/s11042-023-14900-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2021] [Revised: 07/26/2022] [Accepted: 02/10/2023] [Indexed: 06/28/2023]
Abstract
Lung cancer has the highest incidence in the world. The standard tests for its diagnostics are medical imaging exams, sputum cytology, and lung biopsy. Computed Tomography (CT) of the chest plays an essential role in the early detection of nodules since it can allow for more treatment options and increases patient survival. However, the analysis of these exams is a tiring and error-prone process. Thus, computational methods can help the specialist in this analysis. This work addresses the classification of pulmonary nodules as benign or malignant on CT images. Our approach uses the pre-trained VGG16, VGG19, Inception, Resnet50, and Xception, to extract features from each 2D slice of the 3D nodules. Then, we use Principal Component Analysis to reduce the dimensionality of the feature vectors and make them all the same length. Then, we use Bag of Features (BoF) to combine the feature vectors of the different 2D slices and generate only one signature representing the 3D nodule. The classification step uses Random Forest. We evaluated the proposed method with 1,405 segmented nodules from the LIDC-IDRI database and obtained an accuracy of 95.34%, F1-Score of 91.73, kappa of 0.88, sensitivity of 90.53%, specificity of 97.26% and AUC of 0.99. The main conclusion was that the combination by BoF of features extracted from 2D slices using pre-trained architectures produced better results than training 2D and 3D CNNs in the nodules. In addition, the use of BoF also makes the creation of the nodule signature independent of the number of slices.
Affiliation(s)
- Thiago Lima
- Departamento de Computação, Universidade Federal do Piauí, Teresina, PI Brasil
- Departamento de Engenharia Elétrica, Universidade Federal do Piauí, Teresina, PI Brasil
- Daniel Luz
- Departamento de Computação, Universidade Federal do Piauí, Teresina, PI Brasil
- Departamento de Engenharia Elétrica, Universidade Federal do Piauí, Teresina, PI Brasil
- Departamento de Informática, Instituto Federal de Educação, Ciência e Tecnologia do Piauí, Picos, PI Brasil
- Antonio Oseas
- Departamento de Computação, Universidade Federal do Piauí, Teresina, PI Brasil
- Departamento de Engenharia Elétrica, Universidade Federal do Piauí, Teresina, PI Brasil
- Departamento de Sistemas de Informação, Universidade Federal do Piauí, Picos, PI Brasil
- Rodrigo Veras
- Departamento de Computação, Universidade Federal do Piauí, Teresina, PI Brasil
- Flávio Araújo
- Departamento de Computação, Universidade Federal do Piauí, Teresina, PI Brasil
- Departamento de Engenharia Elétrica, Universidade Federal do Piauí, Teresina, PI Brasil
- Departamento de Sistemas de Informação, Universidade Federal do Piauí, Picos, PI Brasil
|
28
|
Grekov AN, Kabanov AA, Vyshkvarkova EV, Trusevich VV. Anomaly Detection in Biological Early Warning Systems Using Unsupervised Machine Learning. SENSORS (BASEL, SWITZERLAND) 2023; 23:2687. [PMID: 36904891 PMCID: PMC10007031 DOI: 10.3390/s23052687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 02/15/2023] [Accepted: 02/27/2023] [Indexed: 06/18/2023]
Abstract
The use of bivalve mollusks as bioindicators in automated monitoring systems can provide real-time detection of emergency situations associated with the pollution of aquatic environments. The behavioral reactions of Unio pictorum (Linnaeus, 1758) were employed in the development of a comprehensive automated monitoring system for aquatic environments by the authors. The study used experimental data obtained by an automated system from the Chernaya River in the Sevastopol region of the Crimean Peninsula. Four traditional unsupervised machine learning techniques were implemented to detect emergency signals in the activity of bivalves: elliptic envelope, isolation forest (iForest), one-class support vector machine (SVM), and local outlier factor (LOF). The results showed that the use of the elliptic envelope, iForest, and LOF methods with proper hyperparameter tuning can detect anomalies in mollusk activity data without false alarms, with an F1 score of 1. A comparison of anomaly detection times revealed that the iForest method is the most efficient. These findings demonstrate the potential of using bivalve mollusks as bioindicators in automated monitoring systems for the early detection of pollution in aquatic environments.
Affiliation(s)
- Aleksandr N. Grekov
- Institute of Natural and Technical Systems, 299011 Sevastopol, Russia
- Department of Informatics and Control in Technical Systems, Sevastopol State University, 299053 Sevastopol, Russia
- Aleksey A. Kabanov
- Department of Informatics and Control in Technical Systems, Sevastopol State University, 299053 Sevastopol, Russia
|
29
|
He Y, Li X, Zhang M, Fournier‐Viger P, Huang JZ, Salloum S. A novel observation points‐based positive‐unlabeled learning algorithm. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 2023. [DOI: 10.1049/cit2.12152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023] Open
Affiliation(s)
- Yulin He
- Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ) Shenzhen China
- College of Computer Science and Software Engineering Shenzhen University Shenzhen China
- Xu Li
- College of Computer Science and Software Engineering Shenzhen University Shenzhen China
- Manjing Zhang
- Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ) Shenzhen China
- Joshua Zhexue Huang
- Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ) Shenzhen China
- College of Computer Science and Software Engineering Shenzhen University Shenzhen China
- Salman Salloum
- School of Computing National University of Singapore Singapore Singapore
|
30
|
Albora G, Pietronero L, Tacchella A, Zaccaria A. Product progression: a machine learning approach to forecasting industrial upgrading. Sci Rep 2023; 13:1481. [PMID: 36707529 PMCID: PMC9880377 DOI: 10.1038/s41598-023-28179-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Accepted: 01/13/2023] [Indexed: 01/28/2023] Open
Abstract
Economic complexity methods, and in particular relatedness measures, lack a systematic evaluation and comparison framework. We argue that out-of-sample forecast exercises should play this role, and we compare various machine learning models to set the prediction benchmark. We find that the key object to forecast is the activation of new products, and that tree-based algorithms clearly outperform both the quite strong auto-correlation benchmark and the other supervised algorithms. Interestingly, we find that the best results are obtained in a cross-validation setting, when data about the predicted country was excluded from the training set. Our approach has direct policy implications, providing a quantitative and scientifically tested measure of the feasibility of introducing a new product in a given country.
Affiliation(s)
- Giambattista Albora
- Dipartimento di Fisica, Universitá Sapienza, Rome, Italy
- Centro Ricerche Enrico Fermi, Rome, Italy
- Andrea Zaccaria
- Centro Ricerche Enrico Fermi, Rome, Italy.
- Istituto dei Sistemi Complessi-CNR, UOS Sapienza, Rome, Italy.
|
31
|
Leveraging image complexity in macro-level neural network design for medical image segmentation. Sci Rep 2022; 12:22286. [PMID: 36566313 PMCID: PMC9790020 DOI: 10.1038/s41598-022-26482-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 12/15/2022] [Indexed: 12/25/2022] Open
Abstract
Recent progress in encoder-decoder neural network architecture design has led to significant performance improvements in a wide range of medical image segmentation tasks. However, state-of-the-art networks for a given task may be too computationally demanding to run on affordable hardware, and thus users often resort to practical workarounds by modifying various macro-level design aspects. Two common examples are downsampling of the input images and reducing the network depth or size to meet computer memory constraints. In this paper, we investigate the effects of these changes on segmentation performance and show that image complexity can be used as a guideline in choosing what is best for a given dataset. We consider four statistical measures to quantify image complexity and evaluate their suitability on ten different public datasets. For the purpose of our illustrative experiments, we use DeepLabV3+ (deep large-size), M2U-Net (deep lightweight), U-Net (shallow large-size), and U-Net Lite (shallow lightweight). Our results suggest that median frequency is the best complexity measure when deciding on an acceptable input downsampling factor and using a deep versus shallow, large-size versus lightweight network. For high-complexity datasets, a lightweight network running on the original images may yield better segmentation results than a large-size network running on downsampled images, whereas the opposite may be the case for low-complexity images.
|
32
|
HS-Gen: a hypersphere-constrained generation mechanism to improve synthetic minority oversampling for imbalanced classification. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00938-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
Mitigating the impact of class-imbalanced data on classifiers is a challenging task in machine learning. SMOTE is a well-known method to tackle this task by modifying class distribution and generating synthetic instances. However, most of the SMOTE-based methods focus on the phase of data selection, while few consider the phase of data generation. This paper proposes a hypersphere-constrained generation mechanism (HS-Gen) to improve synthetic minority oversampling. Unlike linear interpolation commonly used in SMOTE-based methods, HS-Gen generates a minority instance in a hypersphere rather than on a straight line. This mechanism expands the distribution range of minority instances with significant randomness and diversity. Furthermore, HS-Gen is coupled with a noise prevention strategy that adaptively shrinks the hypersphere by determining whether new instances fall into the majority class region. HS-Gen can be regarded as an oversampling optimization mechanism and flexibly embedded into the SMOTE-based methods. We conduct comparative experiments by embedding HS-Gen into the original SMOTE, Borderline-SMOTE, ADASYN, k-means SMOTE, and RSMOTE. Experimental results show that the embedded versions can generate higher quality synthetic instances than the original ones. Moreover, on these oversampled datasets, the conventional classifiers (C4.5 and Adaboost) obtain significant performance improvement in terms of F1 measure and G-mean.
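The contrast drawn in this abstract, between generating a synthetic instance on a straight line and generating it inside a hypersphere, can be sketched as follows. This is an illustration only: the choice of a ball centred on the seed instance with radius equal to the seed-to-neighbour distance is an assumption made here, the function names are hypothetical, and the paper's actual construction and its adaptive noise-prevention shrinking step are not reproduced.

```python
import math
import random


def smote_point(seed, neighbor, rng):
    """Classic SMOTE step: a random point on the segment seed -> neighbor."""
    gap = rng.random()
    return [s + gap * (n - s) for s, n in zip(seed, neighbor)]


def hypersphere_point(center, radius, rng):
    """Uniform sample inside a d-dimensional ball (assumed geometry, see above).

    A Gaussian vector gives a uniformly random direction; scaling the radius
    by u**(1/d) makes the sample uniform over the ball's volume.
    """
    d = len(center)
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in direction)) or 1.0
    r = radius * rng.random() ** (1.0 / d)
    return [c + r * x / norm for c, x in zip(center, direction)]


rng = random.Random(0)
seed = [0.0, 0.0]       # hypothetical minority instance
neighbor = [2.0, 2.0]   # hypothetical minority neighbour
radius = math.dist(seed, neighbor)

on_line = smote_point(seed, neighbor, rng)
in_ball = hypersphere_point(seed, radius, rng)
print(on_line, in_ball)
```

The line-based sample is confined to one dimension between the two instances, while the ball-based sample can land anywhere within the radius, which is the extra diversity the abstract refers to.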
|
33
|
Ghaderi Zefrehi H, Altınçay H. MaMiPot: a paradigm shift for the classification of imbalanced data. J Intell Inf Syst 2022. [DOI: 10.1007/s10844-022-00763-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
|
34
|
Kąkol K, Korvel G, Kostek B. Noise profiling for speech enhancement employing machine learning models. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 152:3595. [PMID: 36586827 DOI: 10.1121/10.0016495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 11/21/2022] [Indexed: 06/17/2023]
Abstract
This paper aims to propose a noise profiling method that can be performed in near real time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features derived from the Aurora noise dataset. This is to select the best-performing classifier in the context of noise profiling. Therefore, a comparison of all classifier outcomes is shown based on effectiveness metrics. Also, confusion matrices prepared for all tested models are presented. The second part of the experiment consists of selecting the algorithm that scored the best, i.e., Naive Bayes, resulting in an accuracy of 96.76%, and using it in a noise-type recognition model to demonstrate that it can perform in a stable way. Classification results are derived from the real-life recordings performed in momentary and averaging modes. The key contribution is discussed regarding speech intelligibility improvements in the presence of noise, where identifying the type of noise is crucial. Finally, conclusions deliver the overall findings and future work directions.
Affiliation(s)
- Gražina Korvel
- Institute of Data Science and Digital Technologies, Vilnius University, Vilnius, 08412, Lithuania
- Bożena Kostek
- Audio Acoustics Laboratory, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdańsk, 80-233, Poland
|
35
|
Chabane N, Bouaoune A, Tighilt R, Abdar M, Boc A, Lord E, Tahiri N, Mazoure B, Acharya UR, Makarenkov V. Intelligent personalized shopping recommendation using clustering and supervised machine learning algorithms. PLoS One 2022; 17:e0278364. [PMID: 36454766 PMCID: PMC9714752 DOI: 10.1371/journal.pone.0278364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 11/15/2022] [Indexed: 12/02/2022] Open
Abstract
Next basket recommendation is a critical task in market basket data analysis. It is particularly important in grocery shopping, where grocery lists are an essential part of shopping habits of many customers. In this work, we first present a new grocery Recommender System available on the MyGroceryTour platform. Our online system uses different traditional machine learning (ML) and deep learning (DL) algorithms, and provides recommendations to users in a real-time manner. It aims to help Canadian customers create their personalized intelligent weekly grocery lists based on their individual purchase histories, weekly specials offered in local stores, and product cost and availability information. We perform clustering analysis to partition given customer profiles into four non-overlapping clusters according to their grocery shopping habits. Then, we conduct computational experiments to compare several traditional ML algorithms and our new DL algorithm based on the use of a gated recurrent unit (GRU)-based recurrent neural network (RNN) architecture. Our DL algorithm can be viewed as an extension of DREAM (Dynamic REcurrent bAsket Model) adapted to multi-class (i.e. multi-store) classification, since a given user can purchase recommended products in different grocery stores in which these products are available. Among traditional ML algorithms, the highest average F-score of 0.516 for the considered data set of 831 customers was obtained using Random Forest, whereas our proposed DL algorithm yielded the average F-score of 0.559 for this data set. The main advantage of the presented Recommender System is that our intelligent recommendation is personalized, since a separate traditional ML or DL model is built for each customer considered. Such a personalized approach allows us to outperform the prediction results provided by general state-of-the-art DL models.
Affiliation(s)
- Nail Chabane
- Department of Computer Science, Université du Québec à Montréal, Montreal, QC, Canada
| | - Achraf Bouaoune
- Department of Computer Science, Université du Québec à Montréal, Montreal, QC, Canada
| | - Reda Tighilt
- Department of Computer Science, Université du Québec à Montréal, Montreal, QC, Canada
| | - Moloud Abdar
- Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, Victoria, Australia
| | - Alix Boc
- Department of Computer Science, Université du Québec à Montréal, Montreal, QC, Canada
| | - Etienne Lord
- Department of Computer Science, Université du Québec à Montréal, Montreal, QC, Canada
| | - Nadia Tahiri
- University of Sherbrooke, Sherbrooke, QC, Canada
| | - Bogdan Mazoure
- School of Computer Science, McGill University, Montreal, QC, Canada
- Quebec AI Institute (MILA), Montreal, QC, Canada
| | - U. Rajendra Acharya
- Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore, Singapore
- Department of Biomedical Engineering, School of Science and Technology, SUSS University, Singapore, Singapore
- Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan
| | - Vladimir Makarenkov
- Department of Computer Science, Université du Québec à Montréal, Montreal, QC, Canada
|
36
|
Temperature-robust rapid eye movement and slow wave sleep in the lizard Laudakia vulgaris. Commun Biol 2022; 5:1310. [PMID: 36446903 PMCID: PMC9709036 DOI: 10.1038/s42003-022-04261-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2022] [Accepted: 11/15/2022] [Indexed: 11/30/2022] Open
Abstract
During sleep our brain switches between two starkly different brain states - slow wave sleep (SWS) and rapid eye movement (REM) sleep. While this two-state sleep pattern is abundant across birds and mammals, its existence in other vertebrates is not universally accepted, its evolutionary emergence is unclear and it is undetermined whether it is a fundamental property of vertebrate brains or an adaptation specific to homeotherms. To address these questions, we conducted electrophysiological recordings in the Agamid lizard, Laudakia vulgaris, during sleep. We found clear signatures of two-state sleep that resemble the mammalian and avian sleep patterns. These states switched periodically throughout the night with a cycle of ~90 seconds and were remarkably similar to the states previously reported in Pogona vitticeps. Interestingly, in contrast to the high temperature sensitivity of mammalian states, state switches were robust to large variations in temperature. We also found that breathing rate, micro-movements and eye movements were locked to the REM state as they are in mammals. Collectively, these findings suggest that two-state sleep is abundant across the agamid family, shares physiological similarity to mammalian sleep, and can be maintained in poikilotherms, increasing the probability that it existed in the cold-blooded ancestor of amniotes.
|
37
|
Feng J, Wu S, Yang H, Ai C, Qiao J, Xu J, Guo F. Microbe-bridged disease-metabolite associations identification by heterogeneous graph fusion. Brief Bioinform 2022; 23:6720417. [PMID: 36168719 DOI: 10.1093/bib/bbac423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION Metabolomics has developed rapidly in recent years, and metabolism-related databases are gradually being constructed. More and more studies are being carried out on diverse microbes, metabolites and diseases. However, the logic of the various associations among microbes, metabolites and diseases is still poorly understood in the biomedicine of the gut microbial system. The collection and analysis of relevant microbial bioinformation play an important role in revealing microbe-metabolite-disease associations. Therefore, a dataset that integrates multiple relationships and a method based on complex heterogeneous graphs need to be developed. RESULTS In this study, we integrated several databases and extracted a variety of association data among microbes, metabolites and diseases. After obtaining the three interconnected bilateral association datasets (microbe-metabolite, metabolite-disease and disease-microbe), we considered building a heterogeneous graph to describe the association data. In our model, microbes were used as a bridge between diseases and metabolites. To fuse the information of the disease-microbe-metabolite graph, we used a bipartite graph attention network on the disease-microbe and metabolite-microbe bipartite graphs. The experimental results show that our model performs well in the prediction of various disease-metabolite associations. Case studies of type 2 diabetes mellitus, Parkinson's disease, inflammatory bowel disease and liver cirrhosis indicate that our proposed methodology is valuable for mining other associations and predicting biomarkers for different human diseases. AVAILABILITY AND IMPLEMENTATION https://github.com/Selenefreeze/DiMiMe.git.
Affiliation(s)
- Jitong Feng
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Shengbo Wu
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Zhejiang Shaoxing Research Institute of Tianjin University, Shaoxing, China
| | - Hongpeng Yang
- School of Computational Science and Engineering, University of South Carolina, Columbia, USA
| | - Chengwei Ai
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jianjun Qiao
- School of Chemical Engineering and Technology, Tianjin University, Tianjin, China; Zhejiang Shaoxing Research Institute of Tianjin University, Shaoxing, China
| | - Junhai Xu
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha, China
|
38
|
Mehedi MAA, Smith V, Hosseiny H, Jiao X. Unraveling the complexities of urban fluvial flood hydraulics through AI. Sci Rep 2022; 12:18738. [PMID: 36333429 PMCID: PMC9636396 DOI: 10.1038/s41598-022-23214-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2022] [Accepted: 10/26/2022] [Indexed: 11/06/2022] Open
Abstract
As urbanization increases across the globe, urban flooding is an ever-pressing concern. Urban fluvial systems are highly complex, depending on a myriad of interacting variables. Numerous hydraulic models are available for analyzing urban flooding; however, meeting the demand for large spatial extent and finer discretization while solving the physics-based numerical equations is computationally expensive. Computational effort increases drastically with model dimension and resolution, preventing current solutions from fully realizing the data revolution. In this research, we demonstrate the effectiveness of artificial intelligence (AI), in particular machine learning (ML) methods including the emerging deep learning (DL), in quantifying urban flooding in the lower part of Darby Creek, PA, USA. Training datasets comprise multiple geographic and urban hydraulic features (e.g., coordinates, elevation, water depth, flooded locations, discharge, average slope, the impervious area within the contributing region, and downstream distance from stormwater outfalls and dams). ML classifiers such as logistic regression (LR), decision tree (DT), support vector machine (SVM), and K-nearest neighbors (KNN) are used to identify the flooded locations. A deep neural network (DNN)-based regression model is used to quantify the water depth. The values of the evaluation metrics indicate satisfactory performance for both the classifiers and the DNN model (F1 scores of 0.975, 0.991, 0.892, and 0.855 for the binary classifiers; root mean squared error of 0.027 for the DNN regression). In addition, blocked K-fold cross-validation (CV) of the ML classifiers in detecting flooded locations showed satisfactory performance, with an average accuracy of 0.899, which supports the models' ability to generalize to unseen areas. This approach is a significant step towards resolving the complexities of urban fluvial flooding with a large multi-dimensional dataset in a highly computationally efficient manner.
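The blocked K-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how blocks were defined, so the code below assumes that samples are already ordered spatially (for example, along the channel) and that each test fold is one contiguous block of indices; the function name `blocked_kfold` is hypothetical and is not taken from the paper.

```python
from typing import Iterator, List, Tuple


def blocked_kfold(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs; each test fold is one contiguous block.

    Assumption (not from the paper): samples are sorted so that neighbouring
    indices are spatial neighbours, which keeps nearby points out of both the
    training and the test set at the same time.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


if __name__ == "__main__":
    for train_idx, test_idx in blocked_kfold(10, 3):
        print("test block:", test_idx, "| train size:", len(train_idx))
```

Unlike a shuffled K-fold split, every held-out block here is a region the model has not seen at all, which is closer to the "unseen area" claim in the abstract than a random split would be.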
Affiliation(s)
- Md Abdullah Al Mehedi
- Villanova Centre of Resilient Water System, Villanova University, Villanova, PA, USA
| | - Virginia Smith
- Villanova Centre of Resilient Water System, Villanova University, Villanova, PA, USA
| | - Hossein Hosseiny
- Department of Earth and Planetary Sciences, Washington University in St. Louis, St. Louis, MO, USA
| | - Xun Jiao
- Department of Electrical and Computer Engineering, Villanova University, Villanova, PA, USA
|
39
|
Mahajan UM, Oehrle B, Sirtl S, Alnatsha A, Goni E, Regel I, Beyer G, Vornhülz M, Vielhauer J, Chromik A, Bahra M, Klein F, Uhl W, Fahlbusch T, Distler M, Weitz J, Grützmann R, Pilarsky C, Weiss FU, Adam MG, Neoptolemos JP, Kalthoff H, Rad R, Christiansen N, Bethan B, Kamlage B, Lerch MM, Mayerle J. Independent Validation and Assay Standardization of Improved Metabolic Biomarker Signature to Differentiate Pancreatic Ductal Adenocarcinoma From Chronic Pancreatitis. Gastroenterology 2022; 163:1407-1422. [PMID: 35870514 DOI: 10.1053/j.gastro.2022.07.047] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/01/2022] [Revised: 06/28/2022] [Accepted: 07/14/2022] [Indexed: 12/19/2022]
Abstract
BACKGROUND & AIMS Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal malignancy requiring efficient detection when the primary tumor is still resectable. We previously developed the MxPancreasScore comprising 9 analytes and serum carbohydrate antigen 19-9 (CA19-9), achieving an accuracy of 90.6%. The necessity for 5 different analytical platforms and multiple analytical runs, however, hindered clinical applicability. We therefore aimed to develop a simpler single-analytical run, single-platform diagnostic signature. METHODS We evaluated 941 patients (PDAC, 356; chronic pancreatitis [CP], 304; nonpancreatic disease, 281) in 3 multicenter independent cohorts: an identification cohort (ID) and validation cohorts 1 (VD1) and 2 (VD2). Targeted quantitative plasma metabolite analysis was performed on a liquid chromatography-tandem mass spectrometry platform. A machine learning-aided algorithm identified an improved (i-Metabolic) and a minimalistic (m-Metabolic) metabolic signature, which were compared for performance. RESULTS The i-Metabolic signature (12 analytes plus CA19-9) distinguished PDAC from CP with area under the curve (95% confidence interval) of 97.2% (97.1%-97.3%), 93.5% (93.4%-93.7%), and 92.2% (92.1%-92.3%) in the ID, VD1, and VD2 cohorts, respectively. In the VD2 cohort, the m-Metabolic signature (4 analytes plus CA19-9) discriminated PDAC from CP with a sensitivity of 77.3% and specificity of 89.6%, with an overall accuracy of 82.4%. For the subset of 45 patients with PDAC with resectable stages IA-IIB tumors, the sensitivity, specificity, and accuracy were 73.2%, 89.6%, and 82.7%, respectively; for those with detectable CA19-9 >2 U/mL, 81.6%, 88.7%, and 84.5%, respectively; and for those with CA19-9 <37 U/mL, 39.7%, 94.1%, and 76.3%, respectively. CONCLUSIONS The single-platform, single-run m-Metabolic signature of just 4 metabolites, used in combination with serum CA19-9 levels, is an innovative, accurate diagnostic tool for PDAC at the time of clinical presentation, warranting further large-scale evaluation.
Affiliation(s)
- Ujjwal M Mahajan
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Bettina Oehrle
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Simon Sirtl
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Ahmed Alnatsha
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Elisabetta Goni
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Ivonne Regel
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Georg Beyer
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Marlies Vornhülz
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Jakob Vielhauer
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany
| | - Ansgar Chromik
- Department of General and Visceral Surgery, Asklepios Klinikum Hamburg, Hamburg, Germany
| | - Markus Bahra
- Zentrum für Onkologische Oberbauchchirurgie und Robotik, Krankenhaus Waldfriede, Berlin, Germany
| | - Fritz Klein
- Department of General, Visceral and Transplantation Surgery, Charité, Campus Virchow Klinikum, Berlin, Germany
| | - Waldemar Uhl
- Department of General and Visceral Surgery, Katholisches Klinikum Bochum, Bochum, Germany
| | - Tim Fahlbusch
- Department of General and Visceral Surgery, Katholisches Klinikum Bochum, Bochum, Germany
| | - Marius Distler
- Department for Visceral, Thoracic and Vascular Surgery, University Hospital, Technical University Dresden, Dresden, Germany
| | - Jürgen Weitz
- Department for Visceral, Thoracic and Vascular Surgery, University Hospital, Technical University Dresden, Dresden, Germany
| | - Robert Grützmann
- Department of Surgery, Erlangen University Hospital, Erlangen, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Erlangen, Germany
| | - Christian Pilarsky
- Department of Surgery, Erlangen University Hospital, Erlangen, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Erlangen, Germany
| | - Frank Ulrich Weiss
- Department of Medicine A, University Medicine Greifswald, Greifswald, Germany
| | - M Gordian Adam
- Metanomics Health GmbH, Berlin, Germany; biocrates life sciences ag, Innsbruck, Austria
| | - John P Neoptolemos
- Department of General, Visceral and Transplantation Surgery, University of Heidelberg, Heidelberg, Germany
| | - Holger Kalthoff
- Section for Molecular Oncology, Institut for Experimental Cancer Research (IET), Universitätsklinikum Schleswig-Holstein, Kiel, Germany
| | - Roland Rad
- Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany; Institute of Molecular Oncology and Functional Genomics, TUM School of Medicine and Center for Translational Cancer Research (TranslaTUM), Technische Universität München, Munich, Germany
| | - Nicole Christiansen
- Metanomics Health GmbH, Berlin, Germany; TrinamiX GmbH, Ludwigshafen am Rhein, Rheinland-Pfalz, Germany
| | - Markus M Lerch
- Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany; Department of Medicine A, University Medicine Greifswald, Greifswald, Germany; Ludwig Maximilian University Klinikum, Munich, Germany
| | - Julia Mayerle
- Department of Medicine II, University Hospital, Ludwig Maximilian University of Munich, Munich, Germany; Bavarian Centre for Cancer Research (Bayerisches Zentrum für Krebsforschung), Munich, Germany.
|
40
|
Chen X, Cheng G, Liu S, Meng S, Jiao Y, Zhang W, Liang J, Zhang W, Wang B, Xu X, Xu J. Probing 1D convolutional neural network adapted to near-infrared spectroscopy for efficient classification of mixed fish. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2022; 279:121350. [PMID: 35609391 DOI: 10.1016/j.saa.2022.121350] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/10/2022] [Revised: 05/02/2022] [Accepted: 05/03/2022] [Indexed: 06/15/2023]
Abstract
Salmon and cod are fish of high economic value worldwide. It is difficult to sort and process them accurately by appearance during harvest and transportation. Conventional chemical detection methods are time-consuming and costly, which greatly affects the cost and efficiency of fishery production. Therefore, there is an urgent need for smart-fishery methods for the classification of mixed fish. In this paper, near-infrared spectroscopy (NIRS) was used to assess salmon and cod samples. This study aims to evaluate the feasibility of a back-propagation neural network (BPNN) and a convolutional neural network (CNN) for identifying different fish species from their spectra, in comparison with the traditional chemometric method of partial least squares. After comparing the effects of different batch sizes, numbers of convolutional kernels, numbers of convolutional layers, and numbers of pooling layers on the classification of NIRS spectra across different one-dimensional (1D) CNN structures, we propose the 1D-CNN-8 model as the most suitable for the classification of mixed fish. Compared with traditional chemometric methods and the BPNN, the 1D-CNN prediction model reaches 98.00% accuracy, and its performance parameters are significantly better than those of the other methods. Meanwhile, the parameter count and floating-point operations of the optimal model are both small. Therefore, the improved CNN model based on NIRS can effectively and quickly identify different kinds of fish samples and, at the same time, contribute to realizing edge computing.
Affiliation(s)
- Xinghao Chen
- College of Artificial Intelligence, Nankai University, Tianjin 300350, China
| | - Gongyi Cheng
- The Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics, Nankai University, Tianjin 300071, China
| | - Shuhan Liu
- The Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics, Nankai University, Tianjin 300071, China
| | - Sizhuo Meng
- The Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics, Nankai University, Tianjin 300071, China
| | - Yiping Jiao
- The Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics, Nankai University, Tianjin 300071, China
| | - Wenjie Zhang
- The Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics, Nankai University, Tianjin 300071, China
| | - Jing Liang
- The Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics, Nankai University, Tianjin 300071, China
| | - Wang Zhang
- Lianyungang Customs P.R.C, Lianyungang 222042, China
| | - Bin Wang
- College of Artificial Intelligence, Nankai University, Tianjin 300350, China
| | - Xiaoxuan Xu
- College of Artificial Intelligence, Nankai University, Tianjin 300350, China.
| | - Jing Xu
- College of Artificial Intelligence, Nankai University, Tianjin 300350, China
|
41
|
Choi JW, Kim DH, Koo DL, Park Y, Nam H, Lee JH, Kim HJ, Hong SN, Jang G, Lim S, Kim B. Automated Detection of Sleep Apnea-Hypopnea Events Based on 60 GHz Frequency-Modulated Continuous-Wave Radar Using Convolutional Recurrent Neural Networks: A Preliminary Report of a Prospective Cohort Study. SENSORS (BASEL, SWITZERLAND) 2022; 22:7177. [PMID: 36236274 PMCID: PMC9570824 DOI: 10.3390/s22197177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/19/2022] [Revised: 09/12/2022] [Accepted: 09/20/2022] [Indexed: 06/16/2023]
Abstract
Radar is a promising non-contact sensor for overnight polysomnography (PSG), the gold standard for diagnosing obstructive sleep apnea (OSA). This preliminary study aimed to demonstrate the feasibility of the automated detection of apnea-hypopnea events for OSA diagnosis based on 60 GHz frequency-modulated continuous-wave radar using convolutional recurrent neural networks. The dataset comprised 44 participants from an ongoing OSA cohort, recruited from July 2021 to April 2022, who underwent overnight PSG with a radar sensor. All PSG recordings, including sleep and wakefulness, were included in the dataset. Model development and evaluation were based on a five-fold cross-validation. The area under the receiver operating characteristic curve for the classification of 1-min segments ranged from 0.796 to 0.859. Depending on OSA severity, the sensitivities for apnea-hypopnea events were 49.0-67.6%, and the number of false-positive detections per participant was 23.4-52.8. The estimated apnea-hypopnea index showed strong correlations (Pearson correlation coefficient = 0.805-0.949) and good to excellent agreement (intraclass correlation coefficient = 0.776-0.929) with the ground truth. There was substantial agreement between the estimated and ground truth OSA severity (kappa statistics = 0.648-0.736). The results demonstrate the potential of radar as a standalone screening tool for OSA.
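The agreement statistic reported for OSA severity can be made concrete with a small sketch. The abstract only says "kappa statistics" and does not state which variant was used, so the code below assumes plain (unweighted) Cohen's kappa computed from a confusion matrix; the function name `cohens_kappa` and the example counts are illustrative, not data from the study.

```python
from typing import Sequence


def cohens_kappa(confusion: Sequence[Sequence[int]]) -> float:
    """Unweighted Cohen's kappa for a square confusion matrix.

    confusion[i][j] = number of cases rated category i by the reference
    (e.g. PSG) and category j by the method under test (e.g. radar).
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up two-category counts, for illustration only.
    print(round(cohens_kappa([[20, 5], [10, 15]]), 3))
```

For ordered categories such as severity grades, a weighted kappa is also common; it penalizes near-misses less than distant disagreements, which this unweighted sketch does not do.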
Affiliation(s)
- Jae Won Choi
- Department of Radiology, Armed Forces Yangju Hospital, Yangju 11429, Korea
| | - Dong Hyun Kim
- Department of Radiology, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul 07061, Korea
| | - Dae Lim Koo
- Department of Neurology, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul 07061, Korea
| | - Yangmi Park
- Department of Neurology, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul 07061, Korea
| | - Hyunwoo Nam
- Department of Neurology, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul 07061, Korea
| | - Ji Hyun Lee
- Department of Radiology, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul 07061, Korea
| | - Hyo Jin Kim
- Department of Radiology, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul 07061, Korea
| | - Seung-No Hong
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul Metropolitan Government—Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul 07061, Korea
|
42
|
Mackie T, Al Turkestani N, Bianchi J, Li T, Ruellas A, Gurgel M, Benavides E, Soki F, Cevidanes L. Quantitative bone imaging biomarkers and joint space analysis of the articular Fossa in temporomandibular joint osteoarthritis using artificial intelligence models. FRONTIERS IN DENTAL MEDICINE 2022; 3:1007011. [PMID: 36404987 PMCID: PMC9673279 DOI: 10.3389/fdmed.2022.1007011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2024] Open
Abstract
Temporomandibular joint osteoarthritis (TMJ OA) is a disease with a multifactorial etiology, involving many pathophysiological processes, and requiring comprehensive assessments to characterize progressive cartilage degradation, subchondral bone remodeling, and chronic pain. This study aimed to integrate quantitative biomarkers of bone texture and morphometry of the articular fossa and joint space to advance the role of imaging phenotypes for diagnosis of TMJ OA in early to moderate stages by improving the performance of machine-learning algorithms to detect TMJ OA status. Ninety-two patients were prospectively enrolled (184 h-CBCT scans of the right and left mandibular condyles), divided into two groups: 46 control and 46 TMJ OA subjects. No significant difference in the articular fossa radiomic biomarkers was found between TMJ OA and control patients. The superior condyle-to-fossa distance (p < 0.05) was significantly smaller in diseased patients. The interaction effects of the articular fossa radiomic biomarkers enhanced the performance of machine-learning algorithms to detect TMJ OA status. The LightGBM model achieved an AUC of 0.842 for diagnosing TMJ OA status, with Headaches and Range of Mouth Opening Without Pain ranked as top features, and top interactions of VE-cadherin in Serum and Angiogenin in Saliva, TGF-β1 in Saliva and Headaches, Gender and Muscle Soreness, PA1 in Saliva and Range of Mouth Opening Without Pain, Lateral Condyle Grey Level Non-Uniformity and Lateral Fossa Short Run Emphasis, TGF-β1 in Serum and Lateral Fossa Trabeculae number, MMP3 in Serum and VEGF in Serum, Headaches and Lateral Fossa Trabecular spacing, Headaches and PA1 in Saliva, and Headaches and BDNF in Saliva. Our preliminary results indicate that condyle imaging features may be more important with regard to main effects, but the fossa imaging features may have a larger contribution in terms of interaction effects. More studies are needed to optimize and further enhance machine-learning algorithms to detect early markers of disease and to improve prediction of disease progression and severity, ultimately to better serve clinical decision support systems in the treatment of patients with TMJ OA.
Affiliation(s)
- Tamara Mackie
- Department of Orthodontics and Pediatric Dentistry, University of Michigan, Ann Arbor, MI, United States
| | - Najla Al Turkestani
- Department of Orthodontics and Pediatric Dentistry, University of Michigan, Ann Arbor, MI, United States
- Department of Restorative and Aesthetic Dentistry, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Jonas Bianchi
- Department of Orthodontics, University of the Pacific, Arthur Dugoni School of Dentistry, San Francisco, CA, United States
| | - Tengfei Li
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC, United States
| | - Antonio Ruellas
- Department of Orthodontics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Marcela Gurgel
- Department of Orthodontics and Pediatric Dentistry, University of Michigan, Ann Arbor, MI, United States
| | - Erika Benavides
- Department of Periodontics and Oral Medicine, University of Michigan, Ann Arbor, MI, United States
| | - Fabiana Soki
- Department of Periodontics and Oral Medicine, University of Michigan, Ann Arbor, MI, United States
| | - Lucia Cevidanes
- Department of Orthodontics and Pediatric Dentistry, University of Michigan, Ann Arbor, MI, United States
|
43
|
Threshold prediction for detecting rare positive samples using a meta-learner. Pattern Anal Appl 2022. [DOI: 10.1007/s10044-022-01103-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
|
44
|
Rudisill SS, Hornung AL, Barajas JN, Bridge JJ, Mallow GM, Lopez W, Sayari AJ, Louie PK, Harada GK, Tao Y, Wilke HJ, Colman MW, Phillips FM, An HS, Samartzis D. Artificial intelligence in predicting early-onset adjacent segment degeneration following anterior cervical discectomy and fusion. EUROPEAN SPINE JOURNAL : OFFICIAL PUBLICATION OF THE EUROPEAN SPINE SOCIETY, THE EUROPEAN SPINAL DEFORMITY SOCIETY, AND THE EUROPEAN SECTION OF THE CERVICAL SPINE RESEARCH SOCIETY 2022; 31:2104-2114. [PMID: 35543762 DOI: 10.1007/s00586-022-07238-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/24/2021] [Revised: 02/12/2022] [Accepted: 04/17/2022] [Indexed: 01/20/2023]
Abstract
PURPOSE Anterior cervical discectomy and fusion (ACDF) is a common surgical treatment for degenerative disease in the cervical spine. However, resultant biomechanical alterations may predispose to early-onset adjacent segment degeneration (EO-ASD), which may become symptomatic and require reoperation. This study aimed to develop and validate a machine learning (ML) model to predict EO-ASD following ACDF. METHODS Retrospective review of prospectively collected data of patients undergoing ACDF at a quaternary referral medical center was performed. Patients > 18 years of age with > 6 months of follow-up and complete pre- and postoperative X-ray and MRI imaging were included. An ML-based algorithm was developed to predict EO-ASD based on preoperative demographic, clinical, and radiographic parameters, and model performance was evaluated according to discrimination and overall performance. RESULTS In total, 366 ACDF patients were included (50.8% male, mean age 51.4 ± 11.1 years). Over 18.7 ± 20.9 months of follow-up, 97 (26.5%) patients developed EO-ASD. The model demonstrated good discrimination and overall performance according to precision (EO-ASD: 0.70, non-ASD: 0.88), recall (EO-ASD: 0.73, non-ASD: 0.87), accuracy (0.82), F1-score (0.79), Brier score (0.203), and AUC (0.794), with C4/C5 posterior disc bulge, C4/C5 anterior disc bulge, C6 posterior superior osteophyte, presence of osteophytes, and C6/C7 anterior disc bulge identified as the most important predictive features. CONCLUSIONS Through an ML approach, the model identified risk factors and predicted development of EO-ASD following ACDF with good discrimination and overall performance. By addressing the shortcomings of traditional statistics, ML techniques can support discovery, clinical decision-making, and precision-based spine care.
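The performance measures listed in this abstract (precision, recall, F1-score, Brier score) follow from standard definitions, sketched below. The function names and the example numbers are illustrative assumptions and do not reproduce the study's data or code.

```python
from typing import Sequence, Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 for one class from raw counts.

    Each ratio is returned as 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def brier_score(probabilities: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome.

    Lower is better; 0.0 means perfectly confident, perfectly correct forecasts.
    """
    if len(probabilities) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, labels)) / len(labels)


if __name__ == "__main__":
    # Made-up counts and forecasts, for illustration only.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
    print(brier_score([0.9, 0.1, 0.8, 0.3], [1, 0, 1, 0]))
```

Precision and recall are class-specific, which is why the abstract reports them separately for the EO-ASD and non-ASD classes, while the Brier score summarizes how well calibrated the predicted probabilities are across all patients.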
Affiliation(s)
- Samuel S Rudisill
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Alexander L Hornung
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- J Nicolás Barajas
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Jack J Bridge
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; Department of Data Science and Analytics, University of Missouri, Columbia, MO, USA
- G Michael Mallow
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Wylie Lopez
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Arash J Sayari
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Philip K Louie
- Virginia Mason Medical Center, Neuroscience Institute, Seattle, WA, USA
- Garrett K Harada
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Youping Tao
- Institute of Orthopaedic Research and Biomechanics, Ulm University Medical Centre, Ulm, Germany
- Hans-Joachim Wilke
- Institute of Orthopaedic Research and Biomechanics, Ulm University Medical Centre, Ulm, Germany
- Matthew W Colman
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Frank M Phillips
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Howard S An
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
- Dino Samartzis
- Department of Orthopaedic Surgery, Rush University Medical Center, 1611 W. Harrison Street, Chicago, IL, USA; International Spine Research and Innovation Initiative, Rush University Medical Center, Chicago, IL, USA
45
Are smartphones and machine learning enough to diagnose tremor? J Neurol 2022; 269:6104-6115. [PMID: 35861853 DOI: 10.1007/s00415-022-11293-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/27/2022] [Revised: 05/09/2022] [Accepted: 07/13/2022] [Indexed: 10/17/2022]
Abstract
BACKGROUND Patients with essential tremor (ET), Parkinson's disease (PD) and dystonic tremor (DT) can be difficult to classify and often share similar characteristics. OBJECTIVES To use ubiquitous smartphone accelerometers with and without clinical features to automate tremor classification using supervised machine learning, and to use unsupervised learning to evaluate whether natural clusterings of patients correspond to assigned clinical diagnoses. METHODS A supervised machine learning classifier was trained to classify 78 tremor patients using leave-one-out cross-validation to estimate performance on unseen accelerometer data. An independent cohort of 27 patients was also studied. Next, we focused on a subset of 48 patients with both smartphone-based tremor measurements and detailed clinical assessment metrics and compared two separate machine learning classifiers trained on these data. RESULTS The classifier yielded a total accuracy of 74.4% and F1-score of 0.74 for a trinary classification with an area under the curve of 0.904, average F1-score of 0.94, specificity of 97% and sensitivity of 84% in classifying PD from ET or DT. The algorithm classified ET from non-ET with 88% accuracy, but only classified DT from non-DT with 29% accuracy. A poorer performance was found in the independent cohort. Classifiers trained on accelerometer and clinical data respectively obtained similar results. CONCLUSIONS Machine learning classifiers achieved high accuracy for PD, but only moderate accuracy for ET and poor accuracy for DT classification. This underscores the difficulty of using AI to classify some tremors due to lack of specificity in clinical and neuropathological features, reinforcing that they may represent overlapping syndromes.
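Leave-one-out cross-validation, as named in the METHODS above, holds out each patient in turn, trains on the rest, and scores the prediction for the held-out patient. The following is a minimal sketch of that loop, assuming a simple nearest-centroid classifier and invented two-feature samples; the abstract does not state which classifier or features the authors used, so every name and value here is hypothetical.

```python
import math


def nearest_centroid_predict(train, query):
    """Return the label whose class centroid lies closest to `query`."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = math.dist(centroid, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(samples):
    """Hold out each sample once, train on the others, and count correct predictions."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_centroid_predict(train, features) == label:
            correct += 1
    return correct / len(samples)


# Invented feature pairs for illustration only; not measurements from the study.
toy_samples = [
    ((4.5, 1.0), "PD"), ((5.0, 1.2), "PD"), ((4.8, 0.9), "PD"),
    ((8.0, 0.5), "ET"), ((7.5, 0.6), "ET"), ((8.4, 0.4), "ET"),
]
loo_accuracy = leave_one_out_accuracy(toy_samples)
```

Because every sample is tested exactly once, the estimate uses all of a small cohort, at the cost of training the model as many times as there are patients.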
46
Liu R, Wang M, Zheng T, Zhang R, Li N, Chen Z, Yan H, Shi Q. An artificial intelligence-based risk prediction model of myocardial infarction. BMC Bioinformatics 2022; 23:217. [PMID: 35672659 PMCID: PMC9175344 DOI: 10.1186/s12859-022-04761-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/01/2022] [Accepted: 05/30/2022] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Myocardial infarction can lead to malignant arrhythmia, heart failure, and sudden death. Clinical studies have shown that early identification of and timely intervention for acute MI can significantly reduce mortality. The traditional MI risk assessment models are subjective, and the data that go into them are difficult to obtain. Generally, the assessment is only conducted among high-risk patient groups. OBJECTIVE To construct an artificial intelligence-based risk prediction model of myocardial infarction (MI) for continuous and active monitoring of inpatients, especially those in noncardiovascular departments, and early warning of MI. METHODS The imbalanced data contain 59 features, which were constructed into a specific dataset through proportional division, upsampling, downsampling, easy ensemble, and w-easy ensemble. Then, the dataset was traversed using supervised machine learning, with recursive feature elimination as the top-layer algorithm and random forest, gradient boosting decision tree (GBDT), logistic regression, and support vector machine as the bottom-layer algorithms, to select the best model out of many through a variety of evaluation indices. RESULTS GBDT was the best bottom-layer algorithm, and downsampling was the best dataset construction method. In the validation set, the F1 score and accuracy of the 24-feature downsampling GBDT model were both 0.84. In the test set, the F1 score and accuracy of the 24-feature downsampling GBDT model were both 0.83, and the area under the curve was 0.91. CONCLUSION Compared with traditional models, artificial intelligence-based machine learning models have better accuracy and real-time performance and can reduce the occurrence of in-hospital MI from a data-driven perspective, thereby increasing the cure rate of patients and improving their prognosis.
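Downsampling, which the RESULTS above report as the best dataset construction method, balances an imbalanced dataset by discarding randomly chosen rows from the larger class. The sketch below illustrates the general idea under assumptions: `downsample_majority`, the `label` key, and the 5-in-100 toy imbalance are hypothetical and do not reproduce the authors' pipeline or data.

```python
import random


def downsample_majority(rows, label_key="label", seed=0):
    """Randomly drop rows so that every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Toy data: 5 positive rows among 100 (an assumed imbalance, for illustration).
toy_rows = [{"id": i, "label": 1 if i < 5 else 0} for i in range(100)]
balanced_rows = downsample_majority(toy_rows)
```

In practice the resampling is usually applied to the training split only, so that validation and test sets keep the original class proportions.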
Affiliation(s)
- Ran Liu
- MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 Sichuan China
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Miye Wang
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Tao Zheng
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Rui Zhang
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Nan Li
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Zhongxiu Chen
- Department of Cardiology, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Hongmei Yan
- MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 Sichuan China
- Qingke Shi
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
47
Rezaei A, Wu LC. Automated soccer head impact exposure tracking using video and deep learning. Sci Rep 2022; 12:9282. [PMID: 35661123 PMCID: PMC9166706 DOI: 10.1038/s41598-022-13220-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/09/2022] [Accepted: 05/18/2022] [Indexed: 12/05/2022] Open
Abstract
Head impacts are highly prevalent in sports and there is a pressing need to investigate the potential link between head impact exposure and brain injury risk. Wearable impact sensors and manual video analysis have been utilized to collect impact exposure data. However, wearable sensors suffer from high deployment cost and limited accuracy, while manual video analysis is a long and resource-intensive task. Here we develop and apply DeepImpact, a computer vision algorithm to automatically detect soccer headers using soccer game videos. Our data-driven pipeline uses two deep learning networks including an object detection algorithm and temporal shift module to extract visual and temporal features of video segments and classify the segments as header or nonheader events. The networks were trained and validated using a large-scale professional-level soccer video dataset, with labeled ground truth header events. The algorithm achieved 95.3% sensitivity and 96.0% precision in cross-validation, and 92.9% sensitivity and 21.1% precision in an independent test that included videos of five professional soccer games. Video segments identified as headers in the test data set correspond to 3.5 min of total film time, which can be reviewed through additional manual video verification to eliminate false positives. DeepImpact streamlines the process of manual video analysis and can help to collect large-scale soccer head impact exposure datasets for brain injury research. The fully video-based solution is a low-cost alternative for head impact exposure monitoring and may also be expanded to other sports in future work.
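The gap between the cross-validation and independent-test figures above is easier to read when sensitivity and precision are combined. The sketch below derives two quantities from the reported percentages: the F1 score (harmonic mean of precision and sensitivity) and the number of false positives expected per true positive. These derived values are illustrative arithmetic only and are not reported in the abstract; the function names are hypothetical.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def false_positives_per_true_positive(precision):
    """With precision P, each true positive comes with (1 - P) / P false positives on average."""
    return (1 - precision) / precision


# Sensitivity and precision as quoted in the abstract.
f1_cross_validation = f1_score(precision=0.960, sensitivity=0.953)
f1_independent_test = f1_score(precision=0.211, sensitivity=0.929)

# Manual review load implied by the independent-test precision.
review_load = false_positives_per_true_positive(0.211)
```

On the independent test, the low precision means that flagged segments are mostly false positives, which is consistent with the abstract's point that the flagged footage still needs manual verification.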
Affiliation(s)
- Ahmad Rezaei
- Department of Mechanical Engineering, University of British Columbia, Vancouver, V6T 1Z4, Canada
- Lyndia C Wu
- Department of Mechanical Engineering, University of British Columbia, Vancouver, V6T 1Z4, Canada
48
Harpaz C, Russo S, Leitão JP, Penn R. Potential of supervised machine learning algorithms for estimating the impact of water efficient scenarios on solids accumulation in sewers. Water Research 2022; 216:118247. [PMID: 35344912 DOI: 10.1016/j.watres.2022.118247] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/28/2021] [Revised: 02/28/2022] [Accepted: 03/01/2022] [Indexed: 06/14/2023]
Abstract
Understanding of the negative effects that widespread implementation of optimal water-efficient solutions may have on existing centralised sewer systems is still limited; one of these effects is the accumulation of solids in sewer pipes. Predicting these effects requires setting up and simulating complex detailed hydraulic sewer network models. Often, precise details of the sewer network layout and diurnal patterns of the wastewater flows are not available, limiting the applicability of using model predictions for such phenomena. In this study, the applicability of supervised machine learning (ML) algorithms for the development of a simplified surrogate model to predict solid accumulation in sewer pipes was investigated. A large number of highly variable sewer networks were synthetically generated and used to produce results that are generalizable within the limitations of the current study. A hydrodynamic sewer model was set up and simulated for each synthetic sewer network and various scenarios in which different water-efficient solutions were considered. Simulation results indicated that the greatest impacts are expected to occur in the upstream part of the sewer networks, and that with 50% reduction in (waste-)water flows, 3-20% more pipes are expected to accumulate solids. It was further found that ML algorithms can be used to successfully predict locations of solids accumulation in sewer pipes without using hydrodynamic models. A simple tool based on the findings of this study, sparing the need to conduct complex hydraulic simulations, was developed. It allows the user to enter a set of pipe characteristics and the proportion of flow that is reduced due to the implementation of water-efficient solutions, and it predicts whether the pipe will accumulate solids or not. The study results and the proposed ML algorithms can support the implementation of optimal water-efficient solutions that will promote designing and managing the water sensitive cities of the future.
Affiliation(s)
- C Harpaz
- Department of Civil and Environmental Engineering, Technion Israel Institute of Technology, Haifa 3200, Israel
- S Russo
- ETH Zürich, Ecovision Lab, Photogrammetry and Remote Sensing, Zürich, Switzerland
- J P Leitão
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Dübendorf 8600, Switzerland
- R Penn
- Department of Civil and Environmental Engineering, Technion Israel Institute of Technology, Haifa 3200, Israel
49
Kuru N, Dereli O, Akkoyun E, Bircan A, Tastan O, Adebali O. PHACT: Phylogeny-aware computing of tolerance for missense mutations. Mol Biol Evol 2022; 39:6593375. [PMID: 35639618 PMCID: PMC9178230 DOI: 10.1093/molbev/msac114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Evolutionary conservation is a fundamental resource for predicting the substitutability of amino acids and loss of function in proteins. The use of multiple sequence alignment alone, without considering the evolutionary relationships among sequences, results in the redundant counting of evolutionarily related alteration events as if they were independent. Here we propose a new method, PHACT, which predicts the pathogenicity of missense mutations directly from the phylogenetic tree of proteins. PHACT travels through the nodes of the phylogenetic tree and evaluates the deleteriousness of a substitution based on the probability differences of ancestral amino acids between neighboring nodes in the tree. Moreover, PHACT assigns weights to each node in the tree based on their distance to the query organism. For each potential amino acid substitution, the algorithm generates a score that is used to calculate the effect of substitution on protein function. To analyze the predictive performance of PHACT, we performed various experiments over the subsets of two datasets that include 3023 proteins and 61662 variants in total. The experiments demonstrated that our method outperformed the widely used pathogenicity prediction tools (i.e., SIFT and PolyPhen-2) and achieved better predictive performance than did other conventional statistical approaches presented in dbNSFP. The PHACT source code is available at https://github.com/CompGenomeLab/PHACT.
Affiliation(s)
- Nurdan Kuru
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Onur Dereli
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Emrah Akkoyun
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Aylin Bircan
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Oznur Tastan
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
- Ogun Adebali
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
50
A BERT model generates diagnostically relevant semantic embeddings from pathology synopses with active learning. Communications Medicine 2022; 1:11. [PMID: 35602188 PMCID: PMC9053264 DOI: 10.1038/s43856-021-00008-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2021] [Accepted: 05/13/2021] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Pathology synopses consist of semi-structured or unstructured text summarizing visual information by observing human tissue. Experts write and interpret these synopses with high domain-specific knowledge to extract tissue semantics and formulate a diagnosis in the context of ancillary testing and clinical information. The limited number of specialists available to interpret pathology synopses restricts the utility of the inherent information. Deep learning offers a tool for information extraction and automatic feature generation from complex datasets. METHODS Using an active learning approach, we developed a set of semantic labels for bone marrow aspirate pathology synopses. We then trained a transformer-based deep-learning model to map these synopses to one or more semantic labels, and extracted learned embeddings (i.e., meaningful attributes) from the model's hidden layer. RESULTS Here we demonstrate that with a small amount of training data, a transformer-based natural language model can extract embeddings from pathology synopses that capture diagnostically relevant information. On average, these embeddings can be used to generate semantic labels mapping patients to probable diagnostic groups with a micro-average F1 score of 0.779 ± 0.025. CONCLUSIONS We provide a generalizable deep learning model and approach to unlock the semantic information inherent in pathology synopses toward improved diagnostics, biodiscovery and AI-assisted computational pathology.