1
Arends B, Vessies M, van Osch D, Teske A, van der Harst P, van Es R, van Es B. Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. BMC Med Inform Decis Mak 2025; 25:115. [PMID: 40050820 PMCID: PMC11887187 DOI: 10.1186/s12911-025-02897-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Accepted: 01/28/2025] [Indexed: 03/09/2025] Open
Abstract
BACKGROUND Clinical machine learning research and artificial intelligence driven clinical decision support models rely on clinically accurate labels. Manually extracting these labels with the help of clinical specialists is often time-consuming and expensive. This study tests the feasibility of automatic span- and document-level diagnosis extraction from unstructured Dutch echocardiogram reports. METHODS We included 115,692 unstructured echocardiogram reports from the University Medical Center Utrecht, a large university hospital in the Netherlands. A randomly selected subset was manually annotated for the occurrence and severity of eleven commonly described cardiac characteristics. We developed and tested several automatic labelling techniques at both span and document levels, using weighted and macro F1-score, precision, and recall for performance evaluation. We compared the performance of span labelling against document labelling methods, which included both direct document classifiers and indirect document classifiers that rely on span classification results. RESULTS The SpanCategorizer and MedRoBERTa.nl models outperformed all other span and document classifiers, respectively. The weighted F1-score varied between characteristics, ranging from 0.60 to 0.93 in SpanCategorizer and 0.96 to 0.98 in MedRoBERTa.nl. Direct document classification was superior to indirect document classification using span classifiers. SetFit achieved competitive document classification performance using only 10% of the training data. Utilizing a reduced label set yielded near-perfect document classification results. CONCLUSION We recommend using our published SpanCategorizer and MedRoBERTa.nl models for span- and document-level diagnosis extraction from Dutch echocardiography reports. For settings with limited training data, SetFit may be a promising alternative for document classification. 
Future research should be aimed at training a RoBERTa based span classifier and applying English based models on translated echocardiogram reports.
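The abstract above reports both weighted and macro F1-scores. A minimal sketch of how the two averages differ follows; it uses only the standard definitions, and the label names and toy data are illustrative assumptions, not values from the study.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's share of true labels."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Hypothetical imbalanced example: 8 "normal" reports, 2 "severe" reports.
y_true = ["normal"] * 8 + ["severe"] * 2
y_pred = ["normal"] * 8 + ["normal", "severe"]
print(round(macro_f1(y_true, y_pred), 3), round(weighted_f1(y_true, y_pred), 3))
```

With imbalanced classes, the weighted average is dominated by the frequent class, so it can sit well above the macro average when a rare class is predicted poorly.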
Affiliation(s)
- Bauke Arends
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Melle Vessies
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Dirk van Osch
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Arco Teske
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Pim van der Harst
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- René van Es
  - Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
- Bram van Es
  - Central Diagnostic Laboratory, University Medical Center Utrecht, Utrecht, The Netherlands

2
Mustafa A, Naseem U, Rahimi Azghadi M. Large language models vs human for classifying clinical documents. Int J Med Inform 2025; 195:105800. [PMID: 39848078 DOI: 10.1016/j.ijmedinf.2025.105800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Revised: 01/16/2025] [Accepted: 01/19/2025] [Indexed: 01/25/2025]
Abstract
BACKGROUND Accurate classification of medical records is crucial for clinical documentation, particularly when using the 10th revision of the International Classification of Diseases (ICD-10) coding system. The use of machine learning algorithms and Systematized Nomenclature of Medicine (SNOMED) mapping has shown promise in performing these classifications. However, challenges remain, particularly in reducing false negatives, where certain diagnoses are not correctly identified by either approach. OBJECTIVE This study explores the potential of leveraging advanced large language models to improve the accuracy of ICD-10 classifications in challenging cases of medical records where machine learning and SNOMED mapping fail. METHODS We evaluated the performance of ChatGPT 3.5 and ChatGPT 4 in classifying ICD-10 codes from discharge summaries within selected records of the Medical Information Mart for Intensive Care (MIMIC) IV dataset. These records comprised 802 discharge summaries identified as false negatives by both machine learning and SNOMED mapping methods, indicating that they were challenging cases. Each summary was assessed by ChatGPT 3.5 and 4 using a classification prompt, and the results were compared to human coder evaluations. Five human coders, with a combined experience of over 30 years, independently classified a stratified sample of 100 summaries to validate ChatGPT's performance. RESULTS ChatGPT 4 demonstrated significantly improved consistency over ChatGPT 3.5, with matching results between runs ranging from 86% to 89%, compared to 57% to 67% for ChatGPT 3.5. The classification accuracy of ChatGPT 4 was variable across different ICD-10 codes. Overall, human coders performed better than ChatGPT. However, ChatGPT matched the median performance of human coders, achieving an accuracy rate of 22%. CONCLUSION This study underscores the potential of integrating advanced language models with clinical coding processes to improve documentation accuracy. 
ChatGPT 4 demonstrated improved consistency and comparable performance to median human coders, achieving 22% accuracy in challenging cases. Combining ChatGPT with methods like SNOMED mapping could further enhance clinical coding accuracy, particularly for complex scenarios.
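The abstract above reports consistency as the percentage of matching results between runs. One plausible way to compute such a figure is simple pairwise agreement, sketched below. This is an assumption about the metric, not the authors' documented procedure, and the ICD-10 codes are used purely as illustrative labels.

```python
def run_agreement(run_a, run_b):
    """Fraction of items that received the same label in two runs."""
    if len(run_a) != len(run_b):
        raise ValueError("both runs must label the same items")
    if not run_a:
        raise ValueError("runs must not be empty")
    matches = sum(1 for a, b in zip(run_a, run_b) if a == b)
    return matches / len(run_a)


# Hypothetical outputs of two runs over the same four discharge summaries.
first_run = ["I10", "E11.9", "J18.9", "N17.9"]
second_run = ["I10", "E11.9", "J18.9", "I50.9"]
print(f"{run_agreement(first_run, second_run):.0%}")  # three of four labels match
```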
Affiliation(s)
- Akram Mustafa
  - College of Science and Engineering, James Cook University, Townsville, 4811, QLD, Australia
- Usman Naseem
  - School of Computing, Macquarie University, Sydney, 2113, NSW, Australia
- Mostafa Rahimi Azghadi
  - College of Science and Engineering, James Cook University, Townsville, 4811, QLD, Australia

3
Shujaat S. Automated Machine Learning in Dentistry: A Narrative Review of Applications, Challenges, and Future Directions. Diagnostics (Basel) 2025; 15:273. [PMID: 39941203 PMCID: PMC11817062 DOI: 10.3390/diagnostics15030273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2024] [Revised: 01/06/2025] [Accepted: 01/13/2025] [Indexed: 02/16/2025] Open
Abstract
The adoption of automated machine learning (AutoML) in dentistry is transforming clinical practices by enabling clinicians to harness machine learning (ML) models without requiring extensive technical expertise. This narrative review aims to explore the impact of autoML in dental applications. A comprehensive search of PubMed, Scopus, and Google Scholar was conducted without time and language restrictions. Inclusion criteria focused on studies evaluating autoML applications and performance for dental tasks. Exclusion criteria included non-dental studies, single-case reports, and conference abstracts. This review highlights multiple promising applications of autoML in dentistry. Diagnostic tasks showed high accuracy, such as 95.4% precision in dental implant classification and 92% accuracy in paranasal sinus disease detection. Predictive tasks also demonstrated promise, including 84% accuracy for ICU admissions due to dental infections and 93.9% accuracy in orthodontic extraction predictions. AutoML frameworks like Google Vertex AI and H2O AutoML emerged as key tools for these applications. AutoML shows great promise in transforming dentistry by facilitating data-driven decision-making and improving patient care quality through accessible, automated solutions. Future advancements should focus on enhancing model interpretability, developing large and annotated datasets, and creating pipelines tailored to dental tasks. Educating clinicians on autoML and integrating domain-specific knowledge into automated platforms could further bridge the gap between complex ML technology and practical dental applications.
Affiliation(s)
- Sohaib Shujaat
  - King Abdullah International Medical Research Center, Department of Maxillofacial Surgery and Diagnostic Sciences, College of Dentistry, King Saud Bin Abdulaziz University for Health Sciences, Ministry of National Guard Health Affairs, P.O. Box 3660, Riyadh 11481, Saudi Arabia; Tel.: +966-582940293
  - OMFS IMPATH Research Group, Department of Imaging & Pathology, Faculty of Medicine, KU Leuven & Oral and Maxillofacial Surgery, University Hospitals Leuven, 3000 Leuven, Belgium

4
Choi J, Lee H, Kim‐Godwin Y. Decoding machine learning in nursing research: A scoping review of effective algorithms. J Nurs Scholarsh 2025; 57:119-129. [PMID: 39294553 PMCID: PMC11771615 DOI: 10.1111/jnu.13026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 08/16/2024] [Accepted: 08/23/2024] [Indexed: 09/20/2024]
Abstract
INTRODUCTION The rapid evolution of artificial intelligence (AI) technology has revolutionized healthcare, particularly through the integration of AI into health information systems. This transformation has significantly impacted the roles of nurses and nurse practitioners, prompting extensive research to assess the effectiveness of AI-integrated systems. This scoping review focuses on machine learning (ML) used in nursing, specifically investigating ML algorithms, model evaluation methods, areas of focus related to nursing, and the most effective ML algorithms. DESIGN The scoping review followed the Preferred Reporting Items for Systematic Review and Meta-Analysis Extension for Scoping Reviews (PRISMA-ScR) guidelines. METHODS A structured search was performed across seven databases according to PRISMA-ScR: PubMed, EMBASE, CINAHL, Web of Science, OVID, PsycINFO, and ProQuest. The quality of the final reviewed studies was assessed using the Medical Education Research Study Quality Instrument (MERSQI). RESULTS Twenty-six articles published between 2019 and 2023 met the inclusion and exclusion criteria, and 46% of studies were conducted in the US. The average MERSQI score was 12.2, indicative of moderate- to high-quality studies. The most used ML algorithm was Random Forest. The four second-most used were logistic regression, least absolute shrinkage and selection operator, decision tree, and support vector machine. Most ML models were evaluated by calculating sensitivity (recall)/specificity, accuracy, receiver operating characteristic (ROC), area under the ROC (AUROC), and positive/negative prediction value (precision). Half of the studies focused on nursing staff or students and hospital readmission or emergency department visits. Only 11 articles reported the most effective ML algorithm(s). 
CONCLUSION The scoping review provides insights into the current status of ML research in nursing and recognition of its significance in nursing research, confirming the benefits of ML in healthcare. Recommendations include incorporating experimental designs in research studies to optimize the use of ML models across various nursing domains. CLINICAL RELEVANCE The scoping review demonstrates substantial clinical relevance of ML applications for nurses, nurse practitioners, administrators, and researchers. The integration of ML into healthcare systems and its impact on nursing practices have important implications for patient care, resource management, and the evolution of nursing research.
Affiliation(s)
- Jeeyae Choi
  - School of Nursing, College of Health and Human Services, University of North Carolina Wilmington, Wilmington, North Carolina, USA
- Hanjoo Lee
  - Joint Biomedical Engineering Department, School of Medicine, University of North Carolina Chapel Hill, Chapel Hill, North Carolina, USA
- Yeounsoo Kim‐Godwin
  - School of Nursing, College of Health and Human Services, University of North Carolina Wilmington, Wilmington, North Carolina, USA

5
Choi J, Woo S, Ferrell A. Artificial intelligence assisted telehealth for nursing: A scoping review. J Telemed Telecare 2025; 31:140-149. [PMID: 37071572 DOI: 10.1177/1357633x231167613] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 04/19/2023]
Abstract
BACKGROUND Due to the COVID-19 pandemic, telehealth resurfaced as a convenient, efficient healthcare delivery method. Researchers indicate that Artificial Intelligence (AI) could further facilitate delivering quality care in telehealth. It is essential to find supporting evidence for using AI-assisted telehealth interventions in nursing. OBJECTIVES This scoping review focuses on users' satisfaction with and perception of AI-assisted telehealth interventions, the performance of AI algorithms, and the types of AI technology used. METHODS A structured search was performed in six databases, PubMed, CINAHL, Web of Science, OVID, PsycINFO, and ProQuest, following the guidance of the Preferred Reporting Items for Systematic Review and Meta-Analysis Extension for Scoping Reviews. The quality of the final reviewed studies was assessed using the Medical Education Research Study Quality Instrument. RESULTS Eight of the 41 studies published between 2017 and 2022 were included in the final review. Six studies were conducted in the United States, one in Japan, and one in South Korea. Four studies collected data from participants (n = 3014). Two studies used image data (n = 1986), and two used sensor data from smart homes to detect patients' health events for nurses (n = 35). Study quality was moderate to high (mean = 10.1, range = 7.7-13.7). Two studies reported high user satisfaction, three assessed user perception of AI in telehealth, and only one showed high AI acceptability. Two studies revealed the high performance of AI algorithms. Five studies used machine learning algorithms. CONCLUSIONS AI-assisted telehealth interventions were efficient and promising and could be an effective care delivery method in nursing.
Affiliation(s)
- Jeeyae Choi
  - School of Nursing, College of Health and Human Services, University of North Carolina at Wilmington, Wilmington, NC, USA
- Seoyoon Woo
  - School of Nursing, College of Health and Human Services, University of North Carolina at Wilmington, Wilmington, NC, USA
- Anastasiya Ferrell
  - School of Nursing, College of Health and Human Services, University of North Carolina at Wilmington, Wilmington, NC, USA

6
Ali MJ, Essaid M, Moalic L, Idoumghar L. A review of AutoML optimization techniques for medical image applications. Comput Med Imaging Graph 2024; 118:102441. [PMID: 39489100 DOI: 10.1016/j.compmedimag.2024.102441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 09/06/2024] [Accepted: 09/30/2024] [Indexed: 11/05/2024]
Abstract
Automatic analysis of medical images using machine learning techniques has gained significant importance over the years. A large number of approaches have been proposed for solving different medical image analysis tasks using machine learning and deep learning. These approaches are quite effective thanks to their ability to analyze large volumes of medical imaging data. Moreover, they can also identify patterns that may be difficult for human experts to detect. Manually designing and tuning the parameters of these algorithms is a challenging and time-consuming task. Furthermore, designing a generalized model that can handle different imaging modalities is difficult, as each modality has specific characteristics. To solve these problems and automate the whole pipeline of different medical image analysis tasks, numerous Automatic Machine Learning (AutoML) techniques have been proposed. These techniques include Hyper-parameter Optimization (HPO), Neural Architecture Search (NAS), and Automatic Data Augmentation (ADA). This study provides an overview of several AutoML-based approaches for different medical imaging tasks in terms of optimization search strategies. The usage of optimization techniques (evolutionary, gradient-based, Bayesian optimization, etc.) is of significant importance for these AutoML approaches. We comprehensively reviewed existing AutoML approaches, categorized them, and performed a detailed analysis of different proposed approaches. Furthermore, current challenges and possible future research directions are also discussed.
Affiliation(s)
- Mokhtar Essaid
  - Université de Haute-Alsace, IRIMAS UR7499, Mulhouse, 68100, France
- Laurent Moalic
  - Université de Haute-Alsace, IRIMAS UR7499, Mulhouse, 68100, France

7
Park Y. Automated machine learning with R: AutoML tools for beginners in clinical research. JOURNAL OF MINIMALLY INVASIVE SURGERY 2024; 27:129-137. [PMID: 39300720 PMCID: PMC11416892 DOI: 10.7602/jmis.2024.27.3.129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2024] [Accepted: 09/09/2024] [Indexed: 09/22/2024]
Abstract
Recently, interest in machine learning (ML) has increased as the application fields have expanded significantly. Although ML methods excel in many fields, establishing an ML pipeline requires considerable time and human resources. Automated ML (AutoML) tools offer a solution by automating repetitive tasks, such as data preprocessing, model selection, hyperparameter optimization, and prediction analysis. This review introduces the use of AutoML tools for general research, including clinical studies. In particular, it outlines a simple approach that is accessible to beginners using the R programming language (R Foundation for Statistical Computing). In addition, the practical code and output results for binary classification are provided to facilitate direct application by clinical researchers in future studies.
Affiliation(s)
- Youngho Park
  - Department of Big Data Application, College of Smart Interdisciplinary Engineering, Hannam University, Daejeon, Korea

8
Alkhalaf M, Yu P, Yin M, Deng C. Applying generative AI with retrieval augmented generation to summarize and extract key clinical information from electronic health records. J Biomed Inform 2024; 156:104662. [PMID: 38880236 DOI: 10.1016/j.jbi.2024.104662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 05/25/2024] [Accepted: 05/28/2024] [Indexed: 06/18/2024]
Abstract
BACKGROUND Malnutrition is a prevalent issue in residential aged care facilities (RACFs), leading to adverse health outcomes. The ability to efficiently extract key clinical information from a large volume of data in electronic health records (EHR) can improve understanding of the extent of the problem and support the development of effective interventions. This research aimed to test the efficacy of zero-shot prompt engineering applied to generative artificial intelligence (AI) models, on their own and in combination with retrieval augmented generation (RAG), for automating the tasks of summarizing both structured and unstructured data in EHR and extracting important malnutrition information. METHODOLOGY We used the Llama 2 13B model with zero-shot prompting. The dataset comprises unstructured and structured EHRs related to malnutrition management in 40 Australian RACFs. We first applied zero-shot learning to the model alone, then combined it with RAG to accomplish two tasks: generate structured summaries about the nutritional status of a client and extract key information about malnutrition risk factors. We used 25 notes in the first task and 1,399 in the second task. We evaluated the model's output for each task manually against a gold standard dataset. RESULT The evaluation outcomes indicated that zero-shot learning applied to a generative AI model is highly effective in summarizing and extracting information about the nutritional status of RACFs' clients. The generated summaries provided a concise and accurate representation of the original data with an overall accuracy of 93.25%. The addition of RAG improved the summarization process, leading to a 6-percentage-point increase and achieving an accuracy of 99.25%. The model also proved its capability in extracting risk factors with an accuracy of 90%. However, adding RAG did not further improve accuracy in this task. 
Overall, the model showed robust performance when information was explicitly stated in the notes; however, it could hallucinate, particularly when details were not explicitly provided. CONCLUSION This study demonstrates the high performance and limitations of applying zero-shot learning to generative AI models for automatically generating structured summaries of EHR data and extracting key clinical information. The inclusion of the RAG approach improved the model performance and mitigated the hallucination problem.
Affiliation(s)
- Mohammad Alkhalaf
  - School of Computing and Information Technology, University of Wollongong, Wollongong, NSW 2522, Australia; School of Computer Science, Qassim University, Qassim 51452, Saudi Arabia
- Ping Yu
  - School of Computing and Information Technology, University of Wollongong, Wollongong, NSW 2522, Australia
- Mengyang Yin
  - Opal Healthcare, Level 11/420 George St, Sydney NSW 2000, Australia
- Chao Deng
  - School of Medical, Indigenous and Health Sciences, University of Wollongong, Wollongong, NSW 2522, Australia

9
Scott IA, De Guzman KR, Falconer N, Canaris S, Bonilla O, McPhail SM, Marxen S, Van Garderen A, Abdel-Hafez A, Barras M. Evaluating automated machine learning platforms for use in healthcare. JAMIA Open 2024; 7:ooae031. [PMID: 38863963 PMCID: PMC11165368 DOI: 10.1093/jamiaopen/ooae031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/06/2024] [Accepted: 04/22/2024] [Indexed: 06/13/2024] Open
Abstract
Objective To describe the development and application of a checklist of criteria for selecting an automated machine learning (Auto ML) platform for use in creating clinical ML models. Materials and Methods Evaluation criteria for selecting an Auto ML platform suited to the ML needs of a local health district were developed in 3 steps: (1) identification of key requirements, (2) a market scan, and (3) an assessment process with desired outcomes. Results The final checklist, comprising 21 functional and 6 non-functional criteria, was applied to vendor submissions in selecting a platform for creating an ML heparin dosing model as a use case. Discussion A team of clinicians, data scientists, and key stakeholders developed a checklist that can be adapted to the ML needs of healthcare organizations, with the use case providing a relevant example. Conclusion An evaluative checklist was developed for selecting Auto ML platforms, which requires validation in larger multi-site studies.
Affiliation(s)
- Ian A Scott
  - Centre for Health Services Research, University of Queensland, Brisbane, 4102, Australia
  - Department of Internal Medicine and Clinical Epidemiology, Princess Alexandra Hospital, Brisbane, 4102, Australia
- Keshia R De Guzman
  - Department of Pharmacy, Princess Alexandra Hospital, Brisbane, 4102, Australia
  - School of Pharmacy, The University of Queensland, Brisbane, 4102, Australia
- Nazanin Falconer
  - Department of Pharmacy, Princess Alexandra Hospital, Brisbane, 4102, Australia
  - School of Pharmacy, The University of Queensland, Brisbane, 4102, Australia
- Stephen Canaris
  - Digital Health and Informatics, Metro South Health, Brisbane, 4102, Australia
- Oscar Bonilla
  - Digital Health and Informatics, Metro South Health, Brisbane, 4102, Australia
- Steven M McPhail
  - Digital Health and Informatics, Metro South Health, Brisbane, 4102, Australia
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Queensland University of Technology, Brisbane, 4059, Australia
- Sven Marxen
  - Pharmacy Service, Logan and Beaudesert Hospitals, Logan, 4131, Australia
- Aaron Van Garderen
  - Digital Health and Informatics, Metro South Health, Brisbane, 4102, Australia
  - Pharmacy Service, Logan and Beaudesert Hospitals, Logan, 4131, Australia
- Ahmad Abdel-Hafez
  - Digital Health and Informatics, Metro South Health, Brisbane, 4102, Australia
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public Health and Social Work, Queensland University of Technology, Brisbane, 4059, Australia
- Michael Barras
  - Department of Pharmacy, Princess Alexandra Hospital, Brisbane, 4102, Australia
  - School of Pharmacy, The University of Queensland, Brisbane, 4102, Australia

10
Liu Y, Cao S. The analysis of aerobics intelligent fitness system for neurorobotics based on big data and machine learning. Heliyon 2024; 10:e33191. [PMID: 39022026 PMCID: PMC11253048 DOI: 10.1016/j.heliyon.2024.e33191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Revised: 05/08/2024] [Accepted: 06/16/2024] [Indexed: 07/20/2024] Open
Abstract
In modern society, people's pace of life is fast, and the pressure is enormous, leading to increasingly prominent issues such as obesity and sub-health. Traditional fitness methods cannot meet people's needs to a certain extent. Therefore, this work aims to use technology to change people's lifestyles and compensate for traditional fitness methods' shortcomings. Firstly, this work overviews neurorobotics, providing neural perception and control functions for aerobics intelligent fitness system. Secondly, the connection between big data and machine learning (ML), big data technology products, and the ML process are discussed. The Spark big data platform builds node data for calculation, and the decision tree algorithm is used for data preprocessing. These are important for future intelligent fitness analysis. This work proposes an aerobics intelligent fitness system based on neurorobotics technology and big data analysis and develops a recommendation system for the best fitness exercise. This system utilizes neural perception and control functions, combined with big data and ML technology, to solve the obesity and sub-health problems faced by people in fast-paced and high-pressure lifestyles. By harnessing the computational capabilities of the Spark big data platform and applying the decision tree algorithm for data preprocessing, the system can furnish users with personalized fitness plans and optimization recommendations. This work conducts a model performance study on 35 % aerobic fitness data on intelligent fitness Android v1.0.8 to evaluate the system's data processing ability and training effectiveness. Moreover, the aerobics intelligent fitness system models based on neurorobotics, big data, and ML are evaluated. The results indicate that normalizing the data using the Min-Max method leads to a decrease in the F1 value and a reduction in data set errors. 
Consequently, the dataset studied by the system model is beneficial to improving the work efficiency of the aerobics intelligent fitness system. After the comprehensive human quality of the system model is evaluated, the actual average score of the comprehensive human quality of the 13 users tested before the aerobics intelligent fitness system test is 91.44, and the average prediction score is 90.88. The results of the two tests are similar. Thus, using the intelligent fitness system can enable the user to obtain system feedback according to the actual training effect, thereby playing a guiding role in the intelligent fitness of aerobics for the user. This work designs and implements the aerobics intelligent fitness system close to the human body's training effect, further enhancing the specialization and individualization of sports and fitness.
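The abstract above mentions normalizing data with the Min-Max method. In its standard form, Min-Max normalization rescales each value to the range [0, 1] via (x - min) / (max - min). The sketch below shows that standard form only; the function name and sample values are hypothetical, and the paper's exact preprocessing is not described in the abstract.

```python
def min_max_normalize(values):
    """Rescale a non-empty list of numbers linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical heart-rate readings in beats per minute.
print(min_max_normalize([60, 90, 120]))
```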
Affiliation(s)
- Yuanxin Liu
  - Sports Department, Henan Medical College, Zhengzhou, 451191, China
- Shufang Cao
  - Ministry of Basic Medicine Education, Dazhou Vocational College of Chinese Medicine, Dazhou, 635000, China

11
Acharya N, Kar P, Ally M, Soar J. Predicting Co-Occurring Mental Health and Substance Use Disorders in Women: An Automated Machine Learning Approach. APPLIED SCIENCES 2024; 14:1630. [DOI: 10.3390/app14041630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
Abstract
Significant clinical overlap exists between mental health and substance use disorders, especially among women. The purpose of this research is to leverage an AutoML (Automated Machine Learning) interface to predict and distinguish co-occurring mental health (MH) and substance use disorders (SUD) among women. By employing various modeling algorithms for binary classification, including Random Forest, Gradient Boosted Trees, XGBoost, Extra Trees, SGD, Deep Neural Network, Single-Layer Perceptron, K Nearest Neighbors (grid), and a super learning model (constructed by combining the predictions of a Random Forest model and an XGBoost model), the research aims to provide healthcare practitioners with a powerful tool for earlier identification, intervention, and personalised support for women at risk. The present research presents a machine learning (ML) methodology for more accurately predicting the co-occurrence of mental health (MH) and substance use disorders (SUD) in women, utilising the Treatment Episode Data Set Admissions (TEDS-A) from the year 2020 (n = 497,175). A super learning model was constructed by combining the predictions of a Random Forest model and an XGBoost model. The model demonstrated promising predictive performance in predicting co-occurring MH and SUD in women with an AUC = 0.817, Accuracy = 0.751, Precision = 0.743, Recall = 0.926 and F1 Score = 0.825. The use of accurate prediction models can substantially facilitate the prompt identification and implementation of intervention strategies.
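The abstract above reports precision, recall, and F1 together. Since F1 is the harmonic mean of precision and recall, the three figures can be cross-checked. The sketch below uses only the standard formula, and the function name is an illustrative choice. Applied to the reported precision (0.743) and recall (0.926), it gives about 0.824, which agrees with the reported F1 of 0.825 up to rounding of the inputs.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract above.
print(round(f1_from_precision_recall(0.743, 0.926), 3))
```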
Affiliation(s)
- Nirmal Acharya
  - Australian International Institute of Higher Education, Brisbane, QLD 4000, Australia
- Padmaja Kar
  - St Vincent’s Care Services, Mitchelton, QLD 4053, Australia
- Mustafa Ally
  - School of Business, University of Southern Queensland, Toowoomba, QLD 4350, Australia
- Jeffrey Soar
  - School of Business, University of Southern Queensland, Toowoomba, QLD 4350, Australia

12
Wang Y, Xu X, Fang Y, Yang S, Wang Q, Liu W, Zhang J, Liang D, Zhai W, Qian K. Self-Assembled Hyperbranched Gold Nanoarrays Decode Serum United Urine Metabolic Fingerprints for Kidney Tumor Diagnosis. ACS NANO 2024; 18:2409-2420. [PMID: 38190455 DOI: 10.1021/acsnano.3c10717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2024]
Abstract
Serum united urine metabolic analysis comprehensively reveals the disease status for kidney diseases in particular. Thus, the precise and convenient acquisition of metabolic molecular information from united biofluids is vitally important for clinical disease diagnosis and biomarker discovery. Laser desorption/ionization mass spectrometry (LDI-MS) presents various advantages in metabolic analysis; however, there remain challenges in ionization efficiency and MS signal reproducibility. Herein, we constructed a self-assembled hyperbranched black gold nanoarray (HyBrAuNA) assisted LDI-MS platform to profile serum united urine metabolic fingerprints (S-UMFs) for diagnosis of early stage renal cell carcinoma (RCC). The closely packed HyBrAuNA afforded strong electromagnetic field enhancement and high photothermal conversion efficacy, enabling effective ionization of low abundant metabolites for S-UMF collection. With a uniform nanoarray, the platform presented excellent reproducibility to ensure the accuracy of S-UMFs obtained in seconds. When it was combined with automated machine learning analysis of S-UMFs, early stage RCC patients were discriminated from the healthy controls with an area under the curve (AUC) > 0.99. Furthermore, we screened out a panel of 9 metabolites (4 from serum and 5 from urine) and related pathways toward early stage kidney tumor. In view of its high-throughput, fast analytical speed, and low sample consumption, our platform possesses potential in metabolic profiling of united biofluids for disease diagnosis and pathogenic mechanism exploration.
Affiliation(s)
- Yuning Wang
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Xiaoyu Xu
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Yuzheng Fang
  - Department of Urology, Renji Hospital, School of Medicine in Shanghai Jiao Tong University, 160 Pujian Road, Shanghai 200127, People's Republic of China
- Shouzhi Yang
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Qirui Wang
  - Health Management Center, Renji Hospital of Medical School of Shanghai Jiao Tong University, Shanghai 200127, People's Republic of China
- Wanshan Liu
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Juxiang Zhang
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Dingyitai Liang
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
- Wei Zhai
  - Department of Urology, Renji Hospital, School of Medicine in Shanghai Jiao Tong University, 160 Pujian Road, Shanghai 200127, People's Republic of China
- Kun Qian
  - State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering and Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200030, People's Republic of China
|
13
|
Lin E, Lin CH, Lane HY. Inference of social cognition in schizophrenia patients with neurocognitive domains and neurocognitive tests using automated machine learning. Asian J Psychiatr 2024; 91:103866. [PMID: 38128351 DOI: 10.1016/j.ajp.2023.103866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 12/07/2023] [Accepted: 12/09/2023] [Indexed: 12/23/2023]
Abstract
AIM It has been suggested that a single neurocognitive domain or neurocognitive test can be used to determine the overall cognitive function in schizophrenia using machine learning algorithms. It is unknown whether social cognition in schizophrenia patients can be estimated with machine learning based on neurocognitive domains or neurocognitive tests. METHODS To predict social cognition in schizophrenia, we applied an automated machine learning (AutoML) framework to predictive factors comprising six neurocognitive domain scores and nine neurocognitive test scores of 380 schizophrenia patients in the Taiwanese population. Four clinical parameters (i.e., age, gender, subgroup, and education) were also used as predictive factors. We utilized an AutoML framework called Tree-based Pipeline Optimization Tool (TPOT) to generate predictive pipelines automatically. RESULTS The analysis revealed that all neurocognitive domains and tests except the reasoning and problem solving domain/test showed significant associations with social cognition. In addition, a TPOT-generated pipeline can best predict social cognition in schizophrenia using seven predictive factors, including five neurocognitive domains (i.e., speed of processing, sustained attention, working memory, verbal learning and memory, and visual learning and memory) and two clinical parameters (i.e., age and gender). This predictive pipeline consists of machine learning algorithms such as function transformers, an approximate feature map, independent component analysis, and linear regression. CONCLUSION The study indicates that an AutoML framework such as TPOT may provide a promising way to produce effective machine learning pipelines for predicting social cognition in schizophrenia using neurocognitive domains and/or neurocognitive tests.
Affiliation(s)
- Eugene Lin
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA; Department of Electrical & Computer Engineering, University of Washington, Seattle, WA 98195, USA; Graduate Institute of Biomedical Sciences, China Medical University, Taichung, Taiwan
- Chieh-Hsin Lin
  - Graduate Institute of Biomedical Sciences, China Medical University, Taichung, Taiwan; Department of Psychiatry, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung, Taiwan; School of Medicine, Chang Gung University, Taoyuan, Taiwan
- Hsien-Yuan Lane
  - Graduate Institute of Biomedical Sciences, China Medical University, Taichung, Taiwan; Department of Psychiatry, China Medical University Hospital, Taichung, Taiwan; Brain Disease Research Center, China Medical University Hospital, Taichung, Taiwan; Department of Psychology, College of Medical and Health Sciences, Asia University, Taichung, Taiwan
|
14
|
Omar I, Khan M, Starr A, Abou Rok Ba K. Automated Prediction of Crack Propagation Using H2O AutoML. SENSORS (BASEL, SWITZERLAND) 2023; 23:8419. [PMID: 37896512 PMCID: PMC10611134 DOI: 10.3390/s23208419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 10/06/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023]
Abstract
Crack propagation is a critical phenomenon in materials science and engineering, significantly impacting structural integrity, reliability, and safety across various applications. The accurate prediction of crack propagation behavior is paramount for ensuring the performance and durability of engineering components, as extensively explored in prior research. Nevertheless, there is a pressing demand for automated models capable of efficiently and precisely forecasting crack propagation. In this study, we address this need by developing a machine learning-based automated model using the powerful H2O library. This model aims to accurately predict crack propagation behavior in various materials by analyzing intricate crack patterns and delivering reliable predictions. To achieve this, we employed a comprehensive dataset derived from measured instances of crack propagation in Acrylonitrile Butadiene Styrene (ABS) specimens. Rigorous evaluation metrics, including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R-squared (R2) values, were applied to assess the model's predictive accuracy. Cross-validation techniques were utilized to ensure its robustness and generalizability across diverse datasets. Our results underscore the automated model's remarkable accuracy and reliability in predicting crack propagation. This study not only highlights the immense potential of the H2O library as a valuable tool for structural health monitoring but also advocates for the broader adoption of Automated Machine Learning (AutoML) solutions in engineering applications. In addition to presenting these findings, we define H2O as a powerful machine learning library and AutoML as Automated Machine Learning to ensure clarity and understanding for readers unfamiliar with these terms. 
This research not only demonstrates the significance of AutoML in future-proofing our approach to structural integrity and safety but also emphasizes the need for comprehensive reporting and understanding in scientific discourse.
Affiliation(s)
- Muhammad Khan
  - School of Aerospace, Transport and Manufacturing, Cranfield University, Bedford MK43 0AL, UK
|
15
|
Yu HQ, O’Neill S, Kermanizadeh A. AIMS: An Automatic Semantic Machine Learning Microservice Framework to Support Biomedical and Bioengineering Research. Bioengineering (Basel) 2023; 10:1134. [PMID: 37892864 PMCID: PMC10603862 DOI: 10.3390/bioengineering10101134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 09/21/2023] [Accepted: 09/25/2023] [Indexed: 10/29/2023] Open
Abstract
The fusion of machine learning and biomedical research offers novel ways to understand, diagnose, and treat various health conditions. However, the complexities of biomedical data, coupled with the intricate process of developing and deploying machine learning solutions, often pose significant challenges to researchers in these fields. Our pivotal achievement in this research is the introduction of the Automatic Semantic Machine Learning Microservice (AIMS) framework. AIMS addresses these challenges by automating various stages of the machine learning pipeline, with a particular emphasis on the ontology of machine learning services tailored to the biomedical domain. This ontology encompasses everything from task representation, service modeling, and knowledge acquisition to knowledge reasoning and the establishment of a self-supervised learning policy. Our framework has been crafted to prioritize model interpretability, integrate domain knowledge effortlessly, and handle biomedical data with efficiency. Additionally, AIMS boasts a distinctive feature: it leverages self-supervised knowledge learning through reinforcement learning techniques, paired with an ontology-based policy recording schema. This enables it to autonomously generate, fine-tune, and continually adapt to machine learning models, especially when faced with new tasks and data. Our work has two standout contributions demonstrating that machine learning processes in the biomedical domain can be automated, while integrating a rich domain knowledge base and providing a way for machines to have self-learning ability, ensuring they handle new tasks effectively. To showcase AIMS in action, we have highlighted its prowess in three case studies of biomedical tasks. These examples emphasize how our framework can simplify research routines, uplift the caliber of scientific exploration, and set the stage for notable advances.
|
16
|
Clinical Screening Prediction in the Portuguese National Health Service: Data Analysis, Machine Learning Models, Explainability and Meta-Evaluation. FUTURE INTERNET 2023. [DOI: 10.3390/fi15010026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023] Open
Abstract
This paper presents an analysis of the calls made to the Portuguese National Health Contact Center (SNS24) during a three-year period. The final goal was to develop a system to help nurse attendants select the appropriate clinical pathway (from 59 options) for each call. It examines several aspects of the call distribution, such as age and gender of the user, date and time of the call, and final referral, among others, and presents comparative results for alternative classification models (SVM and CNN) and different data samples (three-month, one-year, and two-year data models). For the task of selecting the appropriate pathway, the models, learned from the available data, achieved F1 values ranging from 0.642 (3-month CNN model) to 0.783 (2-year CNN model), with SVM showing more stable performance (between 0.743 and 0.768 for the corresponding data samples). These results are discussed with regard to error analysis and possibilities for explaining the system's decisions. A final meta-evaluation, based on a clinical expert's overview, compares the different choices: those of the nurse attendants (reference ground truth), the expert, and the automatic decisions (2 models). It reveals higher agreement between the ML models, followed by their agreement with the clinical expert, and only minor agreement with the reference.
|
17
|
Loganathan T, Priya Doss C G. The influence of machine learning technologies in gut microbiome research and cancer studies - A review. Life Sci 2022; 311:121118. [DOI: 10.1016/j.lfs.2022.121118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 10/19/2022] [Accepted: 10/19/2022] [Indexed: 11/18/2022]
|
18
|
Christidis N, Lindberg V, Jounger SL, Christidis M. Early steps towards professional clinical note-taking in a Swedish study programme in dentistry. BMC MEDICAL EDUCATION 2022; 22:676. [PMID: 36104688 PMCID: PMC9472420 DOI: 10.1186/s12909-022-03727-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Accepted: 09/02/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Higher education tends to focus on academic writing only, instead of emphasizing that professional texts are also used as a basis for communication in contexts with a variety of participants. When it comes to clinical notes, research is scarce and focused on technology and informatics. Therefore, the aim was to explore dental students' clinical notes, and specifically which aspects characterize clinical notes that are insufficient for professional purposes. METHODS The object of analysis was the students' written completion of a teacher-constructed protocol regarding oral mucosa, the dental apparatus including pathology on tooth level, oral hygiene, and a validated international clinical examination protocol of the temporomandibular region. The study was framed within the New Literacy Studies approach, and the clinical notes were analyzed using thematic analysis. RESULTS Within the clinical notes three themes were identified: a) familiar content; b) familiar content in a new context; and c) new content. The notes took the form of either categorizational or descriptive clinical notes. Most students were able to write acceptable clinical notes when the content was familiar, but as soon as the familiar content was in a new context the students had difficulty writing acceptable notes. For descriptive notes, students had difficulty writing acceptable notes both for familiar content and for familiar content in a new context. CONCLUSIONS Taken together, the results indicate that students have difficulties writing acceptable notes when they are novices to the content or context, making their notes either insufficient, too short or even wrong for professional purposes. 
With this in mind, this study suggests that there is a need to strengthen the demands on sufficient professional quality in clinical notes and focus on clinical notes already in the early stages of the different medical educations.
Affiliation(s)
- Nikolaos Christidis
  - Department of Dental Medicine, Division of Oral Diagnostics and Rehabilitation, Karolinska Institutet, SE-141 04, Huddinge, Sweden
- Viveca Lindberg
  - Department of Teaching and Learning, Stockholm University, Stockholm, Sweden
  - Department of Special Education, Stockholm University, Stockholm, Sweden
- Sofia Louca Jounger
  - Department of Dental Medicine, Division of Oral Diagnostics and Rehabilitation, Karolinska Institutet, SE-141 04, Huddinge, Sweden
- Maria Christidis
  - Department of Health Sciences, The Swedish Red Cross University, SE-151 47, Huddinge, Sweden
  - Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, SE-141 83, Huddinge, Sweden
|
19
|
A Romero RA, Y Deypalan MN, Mehrotra S, Jungao JT, Sheils NE, Manduchi E, Moore JH. Benchmarking AutoML frameworks for disease prediction using medical claims. BioData Min 2022; 15:15. [PMID: 35883154 PMCID: PMC9327416 DOI: 10.1186/s13040-022-00300-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/07/2021] [Accepted: 06/27/2022] [Indexed: 11/10/2022] Open
Abstract
Objectives Ascertain and compare the performances of Automated Machine Learning (AutoML) tools on large, highly imbalanced healthcare datasets. Materials and Methods We generated a large dataset using historical de-identified administrative claims including demographic information and flags for disease codes in four different time windows prior to 2019. We then trained three AutoML tools on this dataset to predict six different disease outcomes in 2019 and evaluated model performances on several metrics. Results The AutoML tools showed improvement from the baseline random forest model but did not differ significantly from each other. All models recorded low area under the precision-recall curve and failed to predict true positives while keeping the true negative rate high. Model performance was not directly related to prevalence. We provide a specific use-case to illustrate how to select a threshold that gives the best balance between true and false positive rates, as this is an important consideration in medical applications. Discussion Healthcare datasets present several challenges for AutoML tools, including large sample size, high imbalance, and limitations in the available features. Improvements in scalability, combinations of imbalance-learning resampling and ensemble approaches, and curated feature selection are possible next steps to achieve better performance. Conclusion Among the three explored, no AutoML tool consistently outperforms the rest in terms of predictive performance. The performances of the models in this study suggest that there may be room for improvement in handling medical claims data. Finally, selection of the optimal prediction threshold should be guided by the specific practical application. Supplementary Information The online version contains supplementary material available at (10.1186/s13040-022-00300-2).
Affiliation(s)
- Elisabetta Manduchi
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center Suite G540, West Hollywood, 90069, CA, USA
- Jason H Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center Suite G540, West Hollywood, 90069, CA, USA
|
20
|
Just Add Data: automated predictive modeling for knowledge discovery and feature selection. NPJ Precis Oncol 2022; 6:38. [PMID: 35710826 PMCID: PMC9203777 DOI: 10.1038/s41698-022-00274-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/04/2020] [Accepted: 04/13/2022] [Indexed: 01/20/2023] Open
Abstract
Fully automated machine learning (AutoML) for predictive modeling is becoming a reality, giving rise to a whole new field. We present the basic ideas and principles of Just Add Data Bio (JADBio), an AutoML platform applicable to the low-sample, high-dimensional omics data that arise in translational medicine and bioinformatics applications. In addition to predictive and diagnostic models ready for clinical use, JADBio focuses on knowledge discovery by performing feature selection and identifying the corresponding biosignatures, i.e., minimal-size subsets of biomarkers that are jointly predictive of the outcome or phenotype of interest. It also returns a palette of useful information for interpretation, clinical use of the models, and decision making. JADBio is qualitatively and quantitatively compared against Hyper-Parameter Optimization Machine Learning libraries. Results show that in typical omics dataset analysis, JADBio manages to identify signatures comprising just a handful of features while maintaining competitive predictive performance and accurate out-of-sample performance estimation.
|
21
|
Artificial Intelligence for Health. COMPUTERS 2021. [DOI: 10.3390/computers10080100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Health is one of the major research topics that has been attracting cross-disciplinary research groups [...]
|
22
|
Murovec B, Deutsch L, Stres B. General Unified Microbiome Profiling Pipeline (GUMPP) for Large Scale, Streamlined and Reproducible Analysis of Bacterial 16S rRNA Data to Predicted Microbial Metagenomes, Enzymatic Reactions and Metabolic Pathways. Metabolites 2021; 11:336. [PMID: 34074026 PMCID: PMC8225202 DOI: 10.3390/metabo11060336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2021] [Revised: 05/14/2021] [Accepted: 05/23/2021] [Indexed: 11/23/2022] Open
Abstract
General Unified Microbiome Profiling Pipeline (GUMPP) was developed for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data and prediction of microbial metagenomes, enzymatic reactions and metabolic pathways from amplicon data. GUMPP workflow introduces reproducible data analyses at each of the three levels of resolution (genus; operational taxonomic units (OTUs); amplicon sequence variants (ASVs)). The ability to support reproducible analyses enables production of datasets that ultimately identify the biochemical pathways characteristic of disease pathology. These datasets coupled to biostatistics and mathematical approaches of machine learning can play a significant role in extraction of truly significant and meaningful information from a wide set of 16S rRNA datasets. The adoption of GUMPP in the gut-microbiota related research enables focusing on the generation of novel biomarkers that can lead to the development of mechanistic hypotheses applicable to the development of novel therapies in personalized medicine.
Affiliation(s)
- Boštjan Murovec
  - Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia
- Leon Deutsch
  - Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000 Ljubljana, Slovenia
- Blaž Stres
  - Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000 Ljubljana, Slovenia
  - Faculty of Civil and Geodetic Engineering, University of Ljubljana, Jamova 2, SI-1000 Ljubljana, Slovenia
  - Department of Automation, Jožef Stefan Institute, Biocybernetics and Robotics, Jamova 39, SI-1000 Ljubljana, Slovenia
  - Department of Microbiology, University of Innsbruck, Technikerstrasse 25d, A-6020 Innsbruck, Austria
|
23
|
Deutsch L, Stres B. The Importance of Objective Stool Classification in Fecal 1H-NMR Metabolomics: Exponential Increase in Stool Crosslinking Is Mirrored in Systemic Inflammation and Associated to Fecal Acetate and Methionine. Metabolites 2021; 11:172. [PMID: 33809780 PMCID: PMC8002301 DOI: 10.3390/metabo11030172] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/31/2021] [Revised: 03/10/2021] [Accepted: 03/10/2021] [Indexed: 12/25/2022] Open
Abstract
Past studies strongly connected stool consistency, as measured by the Bristol Stool Scale (BSS), with microbial gene richness and intestinal inflammation, colonic transit time and metabolome characteristics that are of clinical relevance in numerous gastrointestinal conditions. While retention time, defecation rate, BSS but not water activity have been shown to account for BSS-associated inflammatory effects, the potential correlation with the strength of a gel in the context of intestinal forces, abrasion, mucus imprinting, and fecal pore clogging remains unexplored as a shaping factor for intestinal inflammation and has yet to be determined. Our study introduced a minimal pressure approach (MP) by probe indentation as a measure of stool material crosslinking in fecal samples. Results reported here were obtained from 170 samples collected in two independent projects, including males and females, covering a wide span of moisture contents and BSS. MP values increased exponentially with increasing consistency (i.e., lower BSS) and enabled stratification of samples exhibiting mixed BSS classes. A trade-off between lowest MP and highest dry matter content delineated the span of intermediate healthy density of gel crosslinks. The cross-sectional transects identified fecal surface layers with exceptionally high MP and of <5 mm thickness followed by internal structures with an order of magnitude lower MP, characteristic of healthy stool consistency. The MP and BSS values reported in this study were coupled to reanalysis of the PlanHab data and fecal 1H-NMR metabolomes reported before. The exponential association between stool consistency and MP determined in this study was mirrored in the elevated intestinal and also systemic inflammation and other detrimental physiological deconditioning effects observed in the PlanHab participants reported before. 
The MP approach described in this study can be used to better understand fecal hardness and its relationships to human health as it provides a simple, fine scale and objective stool classification approach for the characterization of the exact sampling locations in future microbiome and metabolome studies.
Affiliation(s)
- Leon Deutsch
  - Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000 Ljubljana, Slovenia
- Blaz Stres
  - Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000 Ljubljana, Slovenia
  - Faculty of Civil and Geodetic Engineering, University of Ljubljana, Jamova 2, SI-1000 Ljubljana, Slovenia
  - Department of Automation, Jožef Stefan Institute, Biocybernetics and Robotics, Jamova 39, SI-1000 Ljubljana, Slovenia
  - Department of Microbiology, University of Innsbruck, Technikerstrasse 25d, A-6020 Innsbruck, Austria
|
24
|
Aiding Clinical Triage with Text Classification. PROGRESS IN ARTIFICIAL INTELLIGENCE 2021. [DOI: 10.1007/978-3-030-86230-5_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|