1
Pope Q, Varma R, Tataru C, David MM, Fern X. Learning a deep language model for microbiomes: The power of large scale unlabeled microbiome data. PLoS Comput Biol 2025; 21:e1011353. [PMID: 40334224 PMCID: PMC12058177 DOI: 10.1371/journal.pcbi.1011353] [Received: 07/25/2023] [Accepted: 03/24/2025] [Indexed: 05/09/2025] Open
Abstract
We use open source human gut microbiome data to learn a microbial "language" model by adapting techniques from Natural Language Processing (NLP). Our microbial "language" model is trained in a self-supervised fashion (i.e., without additional external labels) to capture the interactions among different microbial taxa and the common compositional patterns in microbial communities. The learned model produces contextualized taxon representations that allow a single microbial taxon to be represented differently according to the specific microbial environment in which it appears. The model further provides a sample representation by collectively interpreting different microbial taxa in the sample and their interactions as a whole. We demonstrate that, while our sample representation performs comparably to baseline models in in-domain prediction tasks such as predicting Inflammatory Bowel Disease (IBD) and diet patterns, it significantly outperforms them when generalizing to test data from independent studies, even in the presence of substantial distribution shifts. Through a variety of analyses, we further show that the pre-trained, context-sensitive embedding captures meaningful biological information, including taxonomic relationships, correlations with biological pathways, and relevance to IBD expression, despite the model never being explicitly exposed to such signals.
Affiliation(s)
- Quintin Pope
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon, United States of America
- Rohan Varma
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon, United States of America
- Christine Tataru
  - Department of Pathology, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Maude M David
  - Department of Pharmaceutical Sciences, Oregon State University, Corvallis, Oregon, United States of America
- Xiaoli Fern
  - School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon, United States of America
2
Dakal TC, Xu C, Kumar A. Advanced computational tools, artificial intelligence and machine-learning approaches in gut microbiota and biomarker identification. Front Med Technol 2025; 6:1434799. [PMID: 40303946 PMCID: PMC12037385 DOI: 10.3389/fmedt.2024.1434799] [Received: 07/23/2024] [Accepted: 10/16/2024] [Indexed: 05/02/2025] Open
Abstract
The microbiome of the gut is a complex ecosystem that contains a wide variety of microbial species and functional capabilities. The microbiome has a significant impact on health and disease by affecting endocrinology, physiology, and neurology. It can change the progression of certain diseases and enhance treatment responses and tolerance. The gut microbiota plays a pivotal role in human health, influencing a wide range of physiological processes. Recent advances in computational tools and artificial intelligence (AI) have revolutionized the study of gut microbiota, enabling the identification of biomarkers that are critical for diagnosing and treating various diseases. This review surveys the cutting-edge computational methodologies that integrate multi-omics data (such as metagenomics, metaproteomics, and metabolomics), providing a comprehensive understanding of the gut microbiome's composition and function. Additionally, machine learning (ML) approaches, including deep learning and network-based methods, are explored for their ability to uncover complex patterns within microbiome data, offering unprecedented insights into microbial interactions and their link to host health. By highlighting the synergy between traditional bioinformatics tools and advanced AI techniques, this review underscores the potential of these approaches in enhancing biomarker discovery and developing personalized therapeutic strategies. The convergence of computational advancements and microbiome research marks a significant step forward in precision medicine, paving the way for novel diagnostics and treatments tailored to individual microbiome profiles. Investigators have the ability to discover connections between the composition of microorganisms, the expression of genes, and the profiles of metabolites. Individual reactions to medicines that target gut microbes can be predicted by models driven by artificial intelligence.
It is possible to obtain personalized and precision medicine by first gaining an understanding of the impact that the gut microbiota has on the development of disease. The application of machine learning allows for the customization of treatments to the specific microbial environment of an individual.
Affiliation(s)
- Tikam Chand Dakal
  - Genome and Computational Biology Lab, Department of Biotechnology, Mohanlal Sukhadia University, Udaipur, India
- Caiming Xu
  - Beckman Research Institute of City of Hope, Monrovia, CA, United States
  - Department of General Surgery, The First Affiliated Hospital of Dalian Medical University, Dalian, China
- Abhishek Kumar
  - Manipal Academy of Higher Education (MAHE), Manipal, India
  - Institute of Bioinformatics, International Technology Park, Bangalore, India
3
Novielli P, Magarelli M, Romano D, Di Bitonto P, Stellacci AM, Monaco A, Amoroso N, Bellotti R, Tangaro S. Leveraging explainable AI to predict soil respiration sensitivity and its drivers for climate change mitigation. Sci Rep 2025; 15:12527. [PMID: 40216855 PMCID: PMC11992127 DOI: 10.1038/s41598-025-96216-y] [Received: 10/19/2024] [Accepted: 03/26/2025] [Indexed: 04/14/2025] Open
Abstract
Global warming is one of the most pressing and critical problems facing the world today. It is mainly caused by the increase in greenhouse gases in the atmosphere, such as carbon dioxide (CO2). Understanding how soils respond to rising temperatures is critical for predicting carbon release and informing climate mitigation strategies. Q10, a measure of soil microbial respiration, quantifies the increase in CO2 release caused by a [Formula: see text] Celsius rise in temperature, serving as a key indicator of this sensitivity. However, predicting Q10 across diverse soil types remains a challenge, especially when considering the complex interactions between biochemical, microbiome, and environmental factors. In this study, we applied explainable artificial intelligence (XAI) to machine learning models to predict soil respiration sensitivity (Q10) and uncover the key factors driving this process. Using SHAP (SHapley Additive exPlanations) values, we identified glucose-induced soil respiration and the proportion of bacteria positively associated with Q10 as the most influential predictors. Our machine learning models achieved an accuracy of [Formula: see text], precision of [Formula: see text], an AUC-ROC of [Formula: see text], and an AUC-PRC of [Formula: see text], ensuring robust and reliable predictions. By leveraging t-SNE (t-distributed Stochastic Neighbor Embedding) and clustering techniques, we further segmented low Q10 soils into distinct subgroups, identifying soils with a higher probability of transitioning to high Q10 states. Our findings not only highlight the potential of XAI in making model predictions transparent and interpretable, but also provide actionable insights into managing soil carbon release in response to climate change. This research bridges the gap between AI-driven environmental modeling and practical applications in agriculture, offering new directions for targeted soil management and climate resilience strategies.
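For readers unfamiliar with the quantity, the conventional textbook form of the Q10 temperature coefficient can be written as below. This is the general definition, not a formula taken from the paper; the symbols R_1, R_2 (respiration rates) and T_1, T_2 (temperatures in degrees Celsius) are introduced here for illustration only.

```latex
% Standard Q10 temperature coefficient (general definition, assumed notation):
% R_1 and R_2 are respiration rates measured at temperatures T_1 and T_2.
\[
  Q_{10} = \left( \frac{R_2}{R_1} \right)^{\frac{10}{T_2 - T_1}}
\]
```

Under this definition, a value of 2 would mean the respiration rate doubles over the reference temperature interval, and a value of 1 would mean no temperature sensitivity.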
Affiliation(s)
- Pierfrancesco Novielli
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, 70125, Italy
- Michele Magarelli
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
- Donato Romano
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, 70125, Italy
- Pierpaolo Di Bitonto
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, 70125, Italy
- Anna Maria Stellacci
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
- Alfonso Monaco
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, 70125, Italy
  - Dipartimento Interateneo di Fisica 'M. Merlin', Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
- Nicola Amoroso
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, 70125, Italy
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
- Roberto Bellotti
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, 70125, Italy
  - Dipartimento Interateneo di Fisica 'M. Merlin', Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
- Sabina Tangaro
  - Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, 70125, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, 70125, Italy
4
McDonnell KJ. Operationalizing Team Science at the Academic Cancer Center Network to Unveil the Structure and Function of the Gut Microbiome. J Clin Med 2025; 14:2040. [PMID: 40142848 PMCID: PMC11943358 DOI: 10.3390/jcm14062040] [Received: 01/17/2025] [Revised: 02/28/2025] [Accepted: 03/05/2025] [Indexed: 03/28/2025] Open
Abstract
Oncologists increasingly recognize the microbiome as an important facilitator of health as well as a contributor to disease, including, specifically, cancer. Our knowledge of the etiologies, mechanisms, and modulation of microbiome states that ameliorate or promote cancer continues to evolve. The progressive refinement and adoption of "omic" technologies (genomics, transcriptomics, proteomics, and metabolomics) and utilization of advanced computational methods accelerate this evolution. The academic cancer center network, with its immediate access to extensive, multidisciplinary expertise and scientific resources, has the potential to catalyze microbiome research. Here, we review our current understanding of the role of the gut microbiome in cancer prevention, predisposition, and response to therapy. We underscore the promise of operationalizing the academic cancer center network to uncover the structure and function of the gut microbiome; we highlight the unique microbiome-related expert resources available at the City of Hope Comprehensive Cancer Center as an example of the potential of team science to achieve novel scientific and clinical discovery.
Affiliation(s)
- Kevin J McDonnell
  - Center for Precision Medicine, Department of Medical Oncology & Therapeutics Research, City of Hope Comprehensive Cancer Center, Duarte, CA 91010, USA
5
Zhang L, Li X, Shi L, Zheng Y, Ding Y, Yuan T, Hu S, Chen J, Xiao P. Bacterial diversity and biomarkers screening of station and carriage surface in Shanghai metro system, China. Curr Res Microb Sci 2025; 8:100374. [PMID: 40225043 PMCID: PMC11992389 DOI: 10.1016/j.crmicr.2025.100374] [Indexed: 04/15/2025] Open
Abstract
Background Mass transit environments, such as the metro, can facilitate the spread of bacteria between humans and their surroundings. These environments are particularly important for human health due to their potential for spreading pathogens and their impact on large populations. To gain a deeper understanding of bacterial distribution in subways, it is essential to identify variables that affect bacterial composition and microorganisms that are probably harmful to human health. Methods We conducted high-throughput 16S rRNA gene sequencing on surface samples from 5 subway stations in Shanghai, China, during the warm (summer), cold (winter), and transition (autumn) seasons. Bacterial community features across the three seasons were distinguished using random forest classification analyses, followed by in-depth diversity analyses. Results Significant differences were observed in surface bacterial communities across seasons. Highly abundant bacterial groups were generally ubiquitous. Among these highly abundant families and genera, some were unique to surface samples. Notably, the phyla Firmicutes, Proteobacteria, and Actinobacteria were predominant, with total abundances of 32.87%, 29.41%, and 16.31%, respectively. Alpha diversity indices differed significantly (P < 0.05) among seasons, with autumn exhibiting significantly higher alpha diversity metrics compared to summer and winter. Beta diversity analysis revealed significant compositional dissimilarities and distinct clustering patterns among the three seasons (P < 0.05). An analysis of similarities (ANOSIM) test indicated significant differences in bacterial patterns at the phylum, class, order, family, and genus levels among the seasons (P < 0.05). Random forest classification analyses identified the top 24 bacterial taxa at the genus level across seasons in the metro system.
Conclusions We provided a direct comparison of surface bacterial microbiomes, and a comprehensive survey of seasonal variation in subways using culture-independent methods. Our findings reveal differences in both diversity and abundance of certain taxa across seasons, with 24 top indicator bacterial genera identified. This work serves as a reference for understanding the composition and dynamics of bacterial communities and for biomarker screening in subways, a crucial public space in our increasingly urbanized and interconnected world.
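The abstract refers to alpha diversity indices without naming them. As a minimal sketch of what such an index computes, the following shows the Shannon index and observed richness for a single sample; the choice of these two indices, the function names, and the example counts are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[float]) -> float:
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    # Convert raw counts to relative abundances, skipping absent taxa.
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def observed_richness(counts: Sequence[float]) -> int:
    """Number of taxa with a non-zero count in the sample."""
    return sum(1 for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical genus-level read counts for one surface sample.
    sample = [120, 80, 40, 0, 10]
    print(f"Shannon index: {shannon_index(sample):.3f}")
    print(f"Observed richness: {observed_richness(sample)}")
```

A seasonal comparison like the one described would compute such a value per sample and then test for differences between groups; that statistical step is not sketched here.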
Affiliation(s)
- Lijun Zhang
  - Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
  - State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
- Xiaojing Li
  - Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
  - State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
- Lisha Shi
  - Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
  - State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
- Yi Zheng
  - Shanghai Shentong Metro Group Co., Ltd, Shanghai 201103, China
- Yichen Ding
  - Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
  - State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
- Tao Yuan
  - Shanghai Jiao Tong University, Shanghai 200030, China
- Shuangqing Hu
  - Shanghai Academy of Environmental Sciences, Shanghai 200233, China
- Jian Chen
  - Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
  - State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
- Ping Xiao
  - Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
  - State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
6
Jeong IJ, Hong JK, Bae YJ, Lee TK. Enhancing Bacterial Phenotype Classification Through the Integration of Autogating and Automated Machine Learning in Flow Cytometric Analysis. Cytometry A 2025; 107:203-213. [PMID: 40062709 DOI: 10.1002/cyto.a.24923] [Received: 02/05/2024] [Revised: 12/17/2024] [Accepted: 02/27/2025] [Indexed: 04/11/2025]
Abstract
Although flow cytometry produces reliable results, the data processing from gating to fingerprinting is prone to subjective bias. Here, we integrated autogating with Automated Machine Learning in flow cytometry to enhance the classification of bacterial phenotypes. We analyzed six bacterial strains prevalent in soil and groundwater: Bacillus subtilis, Burkholderia thailandensis, Corynebacterium glutamicum, Escherichia coli, Pseudomonas putida, and Pseudomonas stutzeri. Using the H2O-AutoML framework, we applied gradient-boosting machine (GBM) models to classify bacteria across different metabolic phases. Our results demonstrated an overall classification accuracy of 82.34% for GBM. Notably, accuracy varied across metabolic phases, with the highest observed during the late log (88.06%), lag (88.43%), and early log phases (89.37%), whereas the stationary phase showed a slightly lower accuracy of 80.73%. P. stutzeri exhibited consistently high sensitivity and specificity across all the phases, which indicated that it was the most distinctly identifiable strain. In contrast, E. coli showed low sensitivity, particularly in the stationary phase, which indicated challenges in its classification. Overall, by incorporating autogating and the AutoML framework, this study substantially reduces subjective biases and enhances the reproducibility and accuracy of microbial classification. Our methodology offers a robust framework for microbial classification in flow cytometric analysis, paving the way for more precise and comprehensive analyses of microbial ecology.
Affiliation(s)
- In Jae Jeong
  - Department of Environmental and Energy Engineering, Yonsei University, Wonju, Republic of Korea
- Jin-Kyung Hong
  - Department of Environmental and Energy Engineering, Yonsei University, Wonju, Republic of Korea
- Young Jun Bae
  - Department of Environmental and Energy Engineering, Yonsei University, Wonju, Republic of Korea
- Tea Kwon Lee
  - Department of Environmental and Energy Engineering, Yonsei University, Wonju, Republic of Korea
7
Przymus P, Rykaczewski K, Martín-Segura A, Truu J, Carrillo De Santa Pau E, Kolev M, Naskinova I, Gruca A, Sampri A, Frohme M, Nechyporenko A. Deep learning in microbiome analysis: a comprehensive review of neural network models. Front Microbiol 2025; 15:1516667. [PMID: 39911715 PMCID: PMC11794229 DOI: 10.3389/fmicb.2024.1516667] [Received: 10/24/2024] [Accepted: 12/16/2024] [Indexed: 02/07/2025] Open
Abstract
Microbiome research, the study of microbial communities in diverse environments, has seen significant advances due to the integration of deep learning (DL) methods. These computational techniques have become essential for addressing the inherent complexity and high-dimensionality of microbiome data, which consist of different types of omics datasets. Deep learning algorithms have shown remarkable capabilities in pattern recognition, feature extraction, and predictive modeling, enabling researchers to uncover hidden relationships within microbial ecosystems. By automating the detection of functional genes, microbial interactions, and host-microbiome dynamics, DL methods offer unprecedented precision in understanding microbiome composition and its impact on health, disease, and the environment. However, despite their potential, deep learning approaches face significant challenges in microbiome research. Additionally, the biological variability in microbiome datasets requires tailored approaches to ensure robust and generalizable outcomes. As microbiome research continues to generate vast and complex datasets, addressing these challenges will be crucial for advancing microbiological insights and translating them into practical applications with DL. This review provides an overview of different deep learning models in microbiome research, discussing their strengths, practical uses, and implications for future studies. We examine how these models are being applied to solve key problems and highlight potential pathways to overcome current limitations, emphasizing the transformative impact DL could have on the field moving forward.
Affiliation(s)
- Piotr Przymus
  - Faculty of Mathematics and Computer Science, Nicolaus Copernicus University in Toruń, Toruń, Pomeranian, Poland
- Krzysztof Rykaczewski
  - Faculty of Mathematics and Computer Science, Nicolaus Copernicus University in Toruń, Toruń, Pomeranian, Poland
- Jaak Truu
  - Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Mikhail Kolev
  - Department of Mathematics, University of Architecture, Civil Engineering and Geodesy, Sofia, Bulgaria
  - Department of Applied Computer Science and Mathematical Modeling, Faculty of Mathematics and Computer Science, University of Warmia and Mazury in Olsztyn, Olsztyn, Poland
- Irina Naskinova
  - Department of Mathematics, University of Architecture, Civil Engineering and Geodesy, Sofia, Bulgaria
- Aleksandra Gruca
  - Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, Poland
- Alexia Sampri
  - British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
  - Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
- Marcus Frohme
  - Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Brandenburg, Germany
- Alina Nechyporenko
  - Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Brandenburg, Germany
  - Department of System Engineering, Kharkiv National University of Radioelectronics, Kharkiv, Ukraine
8
Kim S, Lee HC, Sim JE, Park SJ, Oh HH. Bacterial profile-based body fluid identification using a machine learning approach. Genes Genomics 2025; 47:87-98. [PMID: 39503932 DOI: 10.1007/s13258-024-01594-8] [Received: 08/30/2024] [Accepted: 10/23/2024] [Indexed: 01/16/2025]
Abstract
BACKGROUND Identifying the origins of biological traces is critical for the reconstruction of crime scenes in forensic investigations. Traditional methods for body fluid identification rely on chemical, enzymatic, immunological, and spectroscopic techniques, which can be sample-consuming and depend on simple color-change reactions. However, these methods have limitations when residual samples are insufficient after DNA extraction. OBJECTIVE This study aimed to develop a method for body fluid identification by leveraging bacterial DNA profiling to overcome the limitations of the conventional approaches. METHODS Bacterial profiles were determined by sequencing the hypervariable region of the 16S rRNA gene, using DNA metabarcoding of evidence collected from criminal cases. Amplicon sequence variants (ASVs) were analyzed to identify significant microbial patterns in different body fluid samples. RESULTS The bacterial profile-based method demonstrated high discriminatory power with a machine learning model trained using the naïve Bayes algorithm, achieving an accuracy of over 98% in classifying samples into one of four body fluid types: blood, saliva, vaginal secretion, and mixture traces of vaginal secretions and semen. CONCLUSION Bacterial profiling enhances the accuracy and robustness of body fluid identification in forensic analysis, providing a valuable alternative to traditional methods by utilizing DNA and microbial community data despite the uncontrollable conditions. This approach offers significant improvements in the classification accuracy and practical applicability in forensic investigations.
Affiliation(s)
- Sungmin Kim
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Han Chul Lee
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Jeong Eun Sim
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Su Jeong Park
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Hye Hyun Oh
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
9
Lv Y, Xian Y, Lei X, Xie S, Zhang B. The role of the microbiota-gut-brain axis and artificial intelligence in cognitive health of pediatric obstructive sleep apnea: A narrative review. Medicine (Baltimore) 2024; 103:e40900. [PMID: 39686454 PMCID: PMC11651515 DOI: 10.1097/md.0000000000040900] [Received: 07/08/2024] [Accepted: 11/22/2024] [Indexed: 12/18/2024] Open
Abstract
Pediatric obstructive sleep apnea (OSA) is a prevalent sleep-related breathing disorder associated with significant neurocognitive and behavioral impairments. Recent studies have highlighted the role of gut microbiota and the microbiota-gut-brain axis (MGBA) in influencing cognitive health in children with OSA. This narrative review aims to summarize current knowledge on the relationship between gut microbiota, MGBA, and cognitive function in pediatric OSA. It also explores the potential of artificial intelligence and machine learning in advancing this field and identifying novel therapeutic strategies. Pediatric OSA is associated with gut dysbiosis, reduced microbial diversity, and metabolic disruptions. MGBA mechanisms, such as endocrine, immune, and neural pathways, link gut microbiota to cognitive outcomes. Artificial intelligence and machine learning methodologies offer promising tools to uncover microbial markers and mechanisms associated with cognitive deficits in OSA. Future research should focus on validating these findings through clinical trials and developing personalized therapeutic approaches targeting the gut microbiota.
Affiliation(s)
- Yunjiao Lv
  - Department of First Clinical College, Guangzhou Medical University, Guangzhou, China
- Yongtao Xian
  - Department of First Clinical College, Guangzhou Medical University, Guangzhou, China
- Xinye Lei
  - Department of First Clinical College, Guangzhou Medical University, Guangzhou, China
- Siqi Xie
  - Department of First Clinical College, Guangzhou Medical University, Guangzhou, China
- Biyun Zhang
  - Department of Pediatrics, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
10
Haykal D, Cartier H, Dréno B. Dermatological Health in the Light of Skin Microbiome Evolution. J Cosmet Dermatol 2024; 23:3836-3846. [PMID: 39248208 PMCID: PMC11626341 DOI: 10.1111/jocd.16557] [Received: 06/10/2024] [Revised: 08/12/2024] [Accepted: 08/20/2024] [Indexed: 09/10/2024]
Abstract
BACKGROUND The complex ecosystem of the skin microbiome is essential for skin health by acting as a primary defense against infections, regulating immune responses, and maintaining barrier integrity. This literature review aims to consolidate existing information on the skin microbiome, focusing on its composition, functionality, importance, and its impact on skin aging. METHODS An exhaustive exploration of scholarly literature was performed utilizing electronic databases including PubMed, Google Scholar, and ResearchGate, focusing on studies published between 2011 and 2024. Keywords included "skin microbiome," "skin microbiota," and "aging skin." Studies involving human subjects that focused on the skin microbiome's relationship with skin health were included. Out of 100 initially identified studies, 70 met the inclusion criteria and were reviewed. RESULTS Studies showed that aging is associated with a reduction in the variety of microorganisms of the skin microbiome, leading to an increased susceptibility to skin conditions. Consequently, this underlines the interest in bacteriotherapy, mainly topical probiotics, to reinforce the skin microbiome in older adults, suggesting improvements in skin health and a reduction in age-related skin conditions. Further exploration is needed into the microbiome's role in skin health and the development of innovative, microbe-based skincare products. Biotherapeutic approaches, including the use of phages, endolysins, probiotics, prebiotics, postbiotics, and microbiome transplantation, can restore balance and enhance skin health. This article also addresses regulatory standards in the EU and the USA that ensure the safety and effectiveness of microbial skincare products. CONCLUSION This review underscores the need to advance research on the skin microbiome's role in cosmetic enhancements and tailored skincare solutions, highlighting a great interest in leveraging microbial communities for dermatological benefits.
Affiliation(s)
- Brigitte Dréno
  - Department of Dermato-Cancerology, CHU Nantes - Hôtel-Dieu, CRCINA, Nantes, France
11
Probul N, Huang Z, Saak CC, Baumbach J, List M. AI in microbiome-related healthcare. Microb Biotechnol 2024; 17:e70027. [PMID: 39487766 PMCID: PMC11530995 DOI: 10.1111/1751-7915.70027] [Received: 04/30/2024] [Accepted: 09/23/2024] [Indexed: 11/04/2024] Open
Abstract
Artificial intelligence (AI) has the potential to transform clinical practice and healthcare. Following impressive advancements in fields such as computer vision and medical imaging, AI is poised to drive changes in microbiome-based healthcare while facing challenges specific to the field. This review describes the state-of-the-art use of AI in microbiome-related healthcare. It points out limitations across topics such as data handling, AI modelling and safeguarding patient privacy. Furthermore, we indicate how these current shortcomings could be overcome in the future and discuss the influence and opportunities of increasingly complex data on microbiome-based healthcare.
Affiliation(s)
- Niklas Probul
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Zihua Huang
- Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | | | - Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Markus List
- Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Munich Data Science Institute, Technical University of Munich, Garching, Germany
| |
|
12
|
Kunjalwar R, Keerti A, Chaudhari A, Sahoo K, Meshram S. Microbial Therapeutics in Oncology: A Comprehensive Review of Bacterial Role in Cancer Treatment. Cureus 2024; 16:e70920. [PMID: 39502977 PMCID: PMC11535891 DOI: 10.7759/cureus.70920] [Received: 09/26/2024] [Accepted: 10/05/2024] [Indexed: 11/08/2024]
Abstract
Conventional cancer therapies, including chemotherapy, radiotherapy, and immunotherapy, have significantly advanced cancer treatment. However, these modalities often face limitations such as systemic toxicity, lack of specificity, and the emergence of resistance. Recent advancements in genetic engineering and synthetic biology have rekindled interest in using bacteria as a novel therapeutic approach in oncology. This comprehensive review explores the potential of microbial therapeutics, particularly bacterial therapies, in the treatment of cancer. Bacterial therapies offer several unique advantages, such as the ability to selectively target and colonize hypoxic and necrotic regions of tumors, areas typically resistant to conventional treatments. The review delves into the mechanisms through which bacteria exert antitumor effects, including direct tumor cell lysis, modulation of the immune response, and delivery of therapeutic agents such as cytotoxins and enzymes. Various bacterial genera, such as Salmonella, Clostridium, Lactobacillus, and Listeria, have shown promise in preclinical and clinical studies, demonstrating diverse mechanisms of action and therapeutic potential. Moreover, the review discusses the challenges associated with bacterial therapies, such as safety concerns, immune evasion, and the need for precise targeting, and how recent advances in genetic engineering are being used to overcome these hurdles. Current clinical trials and combination strategies with conventional therapies are also highlighted to provide a comprehensive overview of ongoing developments in this field. In conclusion, while bacterial therapeutics present a novel and promising avenue in cancer treatment, further research and clinical validation are required to fully realize their potential. This review aims to inspire further exploration into microbial oncology, paving the way for innovative and more effective cancer therapies.
Affiliation(s)
- Radha Kunjalwar
- Microbiology, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
| | - Akshunna Keerti
- Internal Medicine, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
| | - Achal Chaudhari
- Microbiology, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
| | - Kaushik Sahoo
- Microbiology, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
| | - Supriya Meshram
- Microbiology, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
| |
|
13
|
Shamrat FMJM, Shakil R, Idris MYI, Akter B, Zhou X. FruitSeg30_Segmentation dataset & mask annotations: A novel dataset for diverse fruit segmentation and classification. Data Brief 2024; 56:110821. [PMID: 39252785 PMCID: PMC11381999 DOI: 10.1016/j.dib.2024.110821] [Received: 06/18/2024] [Revised: 07/16/2024] [Accepted: 08/05/2024] [Indexed: 09/11/2024]
Abstract
Fruits are mature ovaries of flowering plants that are integral to human diets, providing essential nutrients such as vitamins, minerals, fiber and antioxidants that are crucial for health and disease prevention. Accurate classification and segmentation of fruits are crucial in the agricultural sector for enhancing the efficiency of sorting and quality control processes, which significantly benefit automated systems by reducing labor costs and improving product consistency. This paper introduces the "FruitSeg30_Segmentation Dataset & Mask Annotations", a novel dataset designed to advance the capability of deep learning models in fruit segmentation and classification. Comprising 1969 high-quality images across 30 distinct fruit classes, this dataset provides the visual diversity needed to train a robust model. A U-Net model trained on this dataset achieved a training accuracy of 94.72%, validation accuracy of 92.57%, precision of 94%, recall of 91%, F1-score of 92.5%, IoU score of 86%, and maximum Dice score of 0.9472, demonstrating superior performance in segmentation tasks. The FruitSeg30 dataset fills a critical gap and sets new standards in dataset quality and diversity, enhancing agricultural technology and food industry applications.
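The segmentation metrics quoted in this abstract are linked by simple formulas. The following is a minimal sketch, not code from the paper: the function names and the toy masks are hypothetical. It shows that the F1-score is the harmonic mean of precision and recall, and that for a single pair of masks Dice and IoU satisfy Dice = 2·IoU / (1 + IoU).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def iou_and_dice(pred: set, truth: set) -> tuple:
    """IoU and Dice for two masks given as sets of foreground pixel coordinates."""
    inter = len(pred & truth)
    union = len(pred | truth)
    if union == 0:
        # Both masks empty: treat as perfect agreement (a convention, assumed here).
        return 1.0, 1.0
    iou = inter / union
    dice = 2 * inter / (len(pred) + len(truth))
    return iou, dice


# Precision and recall as reported in the abstract.
f1 = f1_score(0.94, 0.91)

# Toy one-row "masks" (hypothetical data): 6 shared pixels, 10 in the union.
pred = {(0, i) for i in range(0, 8)}
truth = {(0, i) for i in range(2, 10)}
iou, dice = iou_and_dice(pred, truth)

print(f"F1 = {f1:.3f}, IoU = {iou:.2f}, Dice = {dice:.2f}")
```

With the reported precision and recall, the F1 value rounds to 0.925, matching the abstract. The Dice/IoU identity holds per mask pair; scores averaged or maximized over a dataset need not satisfy it exactly.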
Affiliation(s)
| | - Rashiduzzaman Shakil
- Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka 1216, Bangladesh
| | - Mohd Yamani Idna Idris
- Department of Computer System and Technology, Universiti Malaya, Kuala Lumpur 50603, Malaysia
| | - Bonna Akter
- Department of Computer Science and Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka 1216, Bangladesh
| | - Xujuan Zhou
- School of Business, University of Southern Queensland, Springfield, Australia
| |
|
14
|
Porreca A, Ibrahimi E, Maturo F, Marcos Zambrano LJ, Meto M, Lopes MB. Robust prediction of colorectal cancer via gut microbiome 16S rRNA sequencing data. J Med Microbiol 2024; 73. [PMID: 39377779 DOI: 10.1099/jmm.0.001903] [Indexed: 10/09/2024]
Abstract
Introduction. The study addresses the challenge of utilizing human gut microbiome data for the early detection of colorectal cancer (CRC). The research emphasizes the potential of using machine learning techniques to analyze complex microbiome datasets, providing a non-invasive approach to identifying CRC-related microbial markers. Hypothesis/Gap Statement. The primary hypothesis is that a robust machine learning-based analysis of 16S rRNA microbiome data can identify specific microbial features that serve as effective biomarkers for CRC detection, overcoming the limitations of classical statistical models in high-dimensional settings. Aim. The primary objective of this study is to explore and validate the potential of the human microbiome, specifically in the colon, as a valuable source of biomarkers for colorectal cancer (CRC) detection and progression. The focus is on developing a classifier that effectively predicts the presence of CRC and normal samples based on the analysis of three previously published faecal 16S rRNA sequencing datasets. Methodology. To achieve the aim, various machine learning techniques are employed, including random forest (RF), recursive feature elimination (RFE) and a robust correlation-based technique known as the fuzzy forest (FF). The study utilizes these methods to analyse the three datasets, comparing their performance in predicting CRC and normal samples. The emphasis is on identifying the most relevant microbial features (taxa) associated with CRC development via partial dependence plots, i.e. a machine learning tool focused on explainability, visualizing how a feature influences the predicted outcome. Results. The analysis of the three faecal 16S rRNA sequencing datasets reveals the consistent and superior predictive performance of the FF compared to the RF and RFE. Notably, FF proves effective in addressing the correlation problem when assessing the importance of microbial taxa in explaining the development of CRC. The results highlight the potential of the human microbiome as a non-invasive means to detect CRC and underscore the significance of employing FF for improved predictive accuracy. Conclusion. In conclusion, this study underscores the limitations of classical statistical techniques in handling high-dimensional information such as human microbiome data. The research demonstrates the potential of the human microbiome, specifically in the colon, as a valuable source of biomarkers for CRC detection. Applying machine learning techniques, particularly the FF, is a promising approach for building a classifier to predict CRC and normal samples. The findings advocate for integrating FF to overcome the challenges associated with correlation when identifying crucial microbial features linked to CRC development.
Affiliation(s)
- Annamaria Porreca
- Department of Economics, Statistics and Business, Faculty of Economics and Law, Universitas Mercatorum, Rome, Italy
| | - Eliana Ibrahimi
- Department of Biology, University of Tirana, Tirana, Albania
| | - Fabrizio Maturo
- Department of Economics, Statistics and Business, Faculty of Technological and Innovation Sciences, Universitas Mercatorum, Rome, Italy
| | - Laura Judith Marcos Zambrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
| | - Melisa Meto
- Department of Biology, University of Tirana, Tirana, Albania
| | - Marta B Lopes
- Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, Portugal
- UNIDEMI, Research and Development Unit for Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, Portugal
| |
|
15
|
Shi K, Liu Q, Ji Q, He Q, Zhao XM. MicroHDF: predicting host phenotypes with metagenomic data using a deep forest-based framework. Brief Bioinform 2024; 25:bbae530. [PMID: 39446191 PMCID: PMC11500453 DOI: 10.1093/bib/bbae530] [Received: 03/13/2024] [Revised: 09/25/2024] [Accepted: 10/07/2024] [Indexed: 10/25/2024]
Abstract
The gut microbiota plays a vital role in human health, and significant effort has been made to predict human phenotypes, especially diseases, from the microbiota using machine learning (ML) methods. However, many factors affect accuracy when predicting host phenotypes from metagenomic data, e.g. small sample size, class imbalance and high-dimensional features. To address these challenges, we propose MicroHDF, an interpretable deep learning framework to predict host phenotypes, in which cascaded layers of deep forest units are designed to handle sample class imbalance and high-dimensional features. The experimental results show that the performance of MicroHDF is competitive with that of existing state-of-the-art methods on 13 publicly available datasets of six different diseases. In particular, it performs best with an area under the receiver operating characteristic curve of 0.9182 ± 0.0098 and 0.9469 ± 0.0076 for inflammatory bowel disease (IBD) and liver cirrhosis, respectively. MicroHDF also shows better performance and robustness in cross-study validation. Furthermore, MicroHDF is applied to two high-risk diseases, IBD and autism spectrum disorder, as case studies to identify potential biomarkers. In conclusion, our method provides an effective and reliable prediction of the host phenotype and discovers informative features with biological insights.
Affiliation(s)
- Kai Shi
- College of Computer Science and Engineering, Guilin University of Technology, Guilin, Guangxi 541004, China
- Guangxi Key Laboratory of Embedded Technology and Intelligent Systems, Guilin University of Technology, Guilin, Guangxi 541004, China
| | - Qiaohui Liu
- College of Computer Science and Engineering, Guilin University of Technology, Guilin, Guangxi 541004, China
| | - Qingrong Ji
- College of Computer Science and Engineering, Guilin University of Technology, Guilin, Guangxi 541004, China
| | - Qisheng He
- College of Computer Science and Engineering, Guilin University of Technology, Guilin, Guangxi 541004, China
| | - Xing-Ming Zhao
- Huzhou Central Hospital, Affiliated Central Hospital Huzhou University, Huzhou, Zhejiang 313000, China
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
| |
|
16
|
Novielli P, Romano D, Magarelli M, Diacono D, Monaco A, Amoroso N, Vacca M, De Angelis M, Bellotti R, Tangaro S. Personalized identification of autism-related bacteria in the gut microbiome using explainable artificial intelligence. iScience 2024; 27:110709. [PMID: 39286497 PMCID: PMC11402656 DOI: 10.1016/j.isci.2024.110709] [Received: 02/02/2024] [Revised: 07/05/2024] [Accepted: 08/07/2024] [Indexed: 09/19/2024]
Abstract
Autism spectrum disorder (ASD) affects social interaction and communication. Emerging evidence links ASD to gut microbiome alterations, suggesting that microbial composition may play a role in the disorder. This study employs explainable artificial intelligence (XAI) to examine the contributions of individual microbial species to ASD. By using local explanation embeddings and unsupervised clustering, the research identifies distinct ASD subgroups, underscoring the disorder's heterogeneity. Specific microbial biomarkers associated with ASD are revealed, and the best classifiers achieved an AU-ROC of 0.965 ± 0.005 and an AU-PRC of 0.967 ± 0.008. The findings support the notion that gut microbiome composition varies significantly among individuals with ASD. This work's broader significance lies in its potential to inform personalized interventions, enhancing precision in ASD management and classification. These insights highlight the importance of individualized microbiome profiles for developing tailored therapeutic strategies for ASD.
Affiliation(s)
- Pierfrancesco Novielli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, 70126 Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
| | - Donato Romano
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, 70126 Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
| | - Michele Magarelli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, 70126 Bari, Italy
| | - Domenico Diacono
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
| | - Alfonso Monaco
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Dipartimento Interateneo di Fisica "M. Merlin", Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Nicola Amoroso
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Mirco Vacca
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, 70126 Bari, Italy
| | - Maria De Angelis
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, 70126 Bari, Italy
| | - Roberto Bellotti
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Dipartimento Interateneo di Fisica "M. Merlin", Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Sabina Tangaro
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, 70126 Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
| |
|
17
|
Murovec B, Deutsch L, Osredkar D, Stres B. MetaBakery: a Singularity implementation of bioBakery tools as a skeleton application for efficient HPC deconvolution of microbiome metagenomic sequencing data to machine learning ready information. Front Microbiol 2024; 15:1426465. [PMID: 39139377 PMCID: PMC11321593 DOI: 10.3389/fmicb.2024.1426465] [Received: 05/01/2024] [Accepted: 07/16/2024] [Indexed: 08/15/2024]
Abstract
In this study, we present MetaBakery (http://metabakery.fe.uni-lj.si), an integrated application designed as a framework for synergistically executing the bioBakery workflow and associated utilities. MetaBakery streamlines the processing of any number of paired or unpaired fastq files, or a mixture of both, with optional compression (gzip, zip, bzip2, xz, or mixed) within a single run. MetaBakery uses programs such as KneadData (https://github.com/bioBakery/kneaddata), MetaPhlAn, HUMAnN and StrainPhlAn as well as integrated utilities and extends the original functionality of bioBakery. In particular, it includes MelonnPan for the prediction of metabolites and Mothur for the calculation of microbial alpha diversity. Written in Python 3 and C++, the whole pipeline is encapsulated as a Singularity container for efficient execution on various computing infrastructures, including large High-Performance Computing clusters. MetaBakery facilitates crash recovery, efficient re-execution upon parameter changes, and processing of large data sets through subset handling. It is offered in three editions, with bioBakery ingredients of versions 4, 3 and 2, and is versatile, transparent and well documented in the MetaBakery Users' Manual (http://metabakery.fe.uni-lj.si/metabakery_manual.pdf). It provides automatic handling of command line parameters and file formats, and comprehensive hierarchical storage of output to simplify navigation and debugging. MetaBakery filters out potential human contamination and excludes samples with low read counts. It calculates estimates of alpha diversity and represents a comprehensive and augmented re-implementation of the bioBakery workflow. The robustness and flexibility of the system enable efficient exploration of changing parameters and input datasets, increasing its utility for microbiome analysis. Furthermore, we have shown that the MetaBakery tool can be used in modern biostatistical and machine learning approaches, including large-scale microbiome studies.
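To make "alpha diversity" concrete, here is a minimal sketch of two common within-sample measures, observed richness and the Shannon index. This is an illustration only: the abstract does not say which indices MetaBakery reports through Mothur, so the choice of Shannon, the function names, and the toy counts are all assumptions.

```python
import math


def observed_richness(counts):
    """Number of taxa with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total          # relative abundance of this taxon
            h -= p * math.log(p)
    return h


# Two hypothetical samples with the same richness but different evenness.
even = [25, 25, 25, 25]
skewed = [97, 1, 1, 1]

print(observed_richness(even), round(shannon_index(even), 3))
print(observed_richness(skewed), round(shannon_index(skewed), 3))
```

Both samples contain four taxa, but the evenly distributed one reaches the maximum Shannon value for four taxa (ln 4), while the sample dominated by one taxon scores much lower.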
Affiliation(s)
- Boštjan Murovec
- University of Ljubljana, Faculty of Electrical Engineering, Ljubljana, Slovenia
| | - Leon Deutsch
- University of Ljubljana, Department of Animal Science, Biotechnical Faculty, Ljubljana, Slovenia
- The NU, The Nu B.V., Leiden, Netherlands
| | - Damjan Osredkar
- Department of Pediatric Neurology, University Children's Hospital, University Medical Centre Ljubljana, Ljubljana, Slovenia
- University of Ljubljana, Medical Faculty, Ljubljana, Slovenia
| | - Blaž Stres
- University of Ljubljana, Department of Animal Science, Biotechnical Faculty, Ljubljana, Slovenia
- D13 Department of Catalysis and Chemical Reaction Engineering, National Institute of Chemistry, Ljubljana, Slovenia
- University of Ljubljana, Faculty of Civil and Geodetic Engineering, Ljubljana, Slovenia
- Department of Automation, Biocybernetics and Robotics, Jožef Stefan Institute, Ljubljana, Slovenia
| |
|
18
|
Regueira-Iglesias A, Suárez-Rodríguez B, Blanco-Pintos T, Relvas M, Alonso-Sampedro M, Balsa-Castro C, Tomás I. The salivary microbiome as a diagnostic biomarker of periodontitis: a 16S multi-batch study before and after the removal of batch effects. Front Cell Infect Microbiol 2024; 14:1405699. [PMID: 39071165 PMCID: PMC11272481 DOI: 10.3389/fcimb.2024.1405699] [Received: 03/23/2024] [Accepted: 06/17/2024] [Indexed: 07/30/2024]
Abstract
Introduction Microbiome-based clinical applications that improve diagnosis related to oral health are of great interest to precision dentistry. Predictive studies on the salivary microbiome are scarce and of low methodological quality (low sample sizes, lack of biological heterogeneity, and absence of a validation process). None of them evaluates the impact of confounding factors such as batch effects (BEs). This is the first 16S multi-batch study to analyze the salivary microbiome at the amplicon sequence variant (ASV) level in terms of differential abundance and machine learning models. This is done in periodontally healthy and periodontitis patients before and after removing BEs. Methods Saliva was collected from 124 patients (50 healthy, 74 periodontitis) in our setting. Sequencing of the V3-V4 16S rRNA gene region was performed on an Illumina MiSeq. In parallel, searches were conducted on four databases to identify previous Illumina V3-V4 sequencing studies on the salivary microbiome. Investigations that met predefined criteria were included in the analysis, and our own and the external sequences were processed using the same bioinformatics protocol. The statistical analysis was performed in the R-Bioconductor environment. Results The elimination of BEs reduced the number of ASVs with differential abundance between the groups by approximately one-third (before=265; after=190). Before removing BEs, the model constructed using all study samples (796) comprised 16 ASVs (0.16%) and had an area under the curve (AUC) of 0.944, sensitivity of 90.73%, and specificity of 87.16%. The model built using two-thirds of the specimens (training=531) comprised 35 ASVs (0.36%) and had an AUC of 0.955, sensitivity of 86.54%, and specificity of 90.06% after being validated in the remaining one-third (test=265). After removing BEs, the models required more ASVs (all samples=200, 2.03%; training=100, 1.01%) to obtain slightly lower AUC (all=0.935; test=0.947), lower sensitivity (all=81.79%; test=78.85%), and similar specificity (all=91.51%; test=90.68%). Conclusions The removal of BEs controls false positive ASVs in the differential abundance analysis. However, their elimination implies that a significantly larger number of predictor taxa is needed to achieve optimal performance, creating less robust classifiers. As all the provided models can accurately discriminate health from periodontitis, with good to excellent sensitivities and specificities, the salivary microbiome demonstrates potential clinical applicability as a precision diagnostic tool for periodontitis.
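The sensitivity and specificity figures quoted in this abstract follow from the confusion matrix of a binary classifier. As a minimal sketch (the function name and the ten toy labels are hypothetical, not data from the study), with 1 standing for periodontitis and 0 for periodontally healthy:

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = disease, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    sensitivity = tp / (tp + fn)   # share of diseased samples detected
    specificity = tn / (tn + fp)   # share of healthy samples correctly cleared
    return sensitivity, specificity


# Hypothetical labels: 5 periodontitis and 5 healthy samples.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Read this way, the drop in sensitivity after batch-effect removal (for example 86.54% to 78.85% on the test set) means more periodontitis samples are missed, while the near-constant specificity means healthy samples are cleared at about the same rate.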
Affiliation(s)
- Alba Regueira-Iglesias
- Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical-Surgical Specialties, School of Medicine and Dentistry, Instituto de Investigación Sanitaria de Santiago (IDIS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Berta Suárez-Rodríguez
- Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical-Surgical Specialties, School of Medicine and Dentistry, Instituto de Investigación Sanitaria de Santiago (IDIS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Triana Blanco-Pintos
- Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical-Surgical Specialties, School of Medicine and Dentistry, Instituto de Investigación Sanitaria de Santiago (IDIS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Marta Relvas
- Instituto Universitário de Ciências da Saúde, Cooperativa de Ensino Superior Politécnico e Universitário (IUCS-CESPU), Unidade de Investigação em Patologia e Reabilitação Oral (UNIPRO), Gandra, Portugal
| | - Manuela Alonso-Sampedro
- Department of Internal Medicine and Clinical Epidemiology, Instituto de Investigación Sanitaria de Santiago (IDIS), Complejo Hospitalario Universitario, Santiago de Compostela, Spain
| | - Carlos Balsa-Castro
- Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical-Surgical Specialties, School of Medicine and Dentistry, Instituto de Investigación Sanitaria de Santiago (IDIS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Inmaculada Tomás
- Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical-Surgical Specialties, School of Medicine and Dentistry, Instituto de Investigación Sanitaria de Santiago (IDIS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| |
|
19
|
Magarelli M, Novielli P, De Filippis F, Magliulo R, Di Bitonto P, Diacono D, Bellotti R, Tangaro S. Explainable artificial intelligence and microbiome data for food geographical origin: the Mozzarella di Bufala Campana PDO Case of Study. Front Microbiol 2024; 15:1393243. [PMID: 38887708 PMCID: PMC11180736 DOI: 10.3389/fmicb.2024.1393243] [Received: 02/28/2024] [Accepted: 05/13/2024] [Indexed: 06/20/2024]
Abstract
Identifying the origin of a food product is of paramount importance in ensuring food safety, quality, and authenticity. Knowing where a food item comes from provides crucial information about its production methods, handling practices, and potential exposure to contaminants. Machine learning techniques play a pivotal role in this process by enabling the analysis of complex data sets to uncover patterns and associations that can reveal the geographical source of a food item. This study aims to investigate the potential use of explainable artificial intelligence for identifying food origin. The case study of Mozzarella di Bufala Campana PDO was considered by examining the composition of the microbiota in each sample. Three different supervised machine learning algorithms were compared, and the best classifier was Random Forest, with an Area Under the Curve (AUC) value of 0.93 and a top accuracy of 0.87. Machine learning models effectively classify origin, offering innovative ways to authenticate regional products and support local economies. Further research can explore microbiota analysis and extend applicability to diverse food products and contexts for enhanced accuracy and broader impact.
Affiliation(s)
- Michele Magarelli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Pierfrancesco Novielli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
| | - Francesca De Filippis
- Dipartimento di Agraria, Università degli Studi di Napoli Federico II, Naples, Italy
| | - Raffaele Magliulo
- Dipartimento di Agraria, Università degli Studi di Napoli Federico II, Naples, Italy
| | - Pierpaolo Di Bitonto
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Domenico Diacono
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
| | - Roberto Bellotti
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Sabina Tangaro
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
| |
|
20
|
Hagen M, Dass R, Westhues C, Blom J, Schultheiss SJ, Patz S. Interpretable machine learning decodes soil microbiome's response to drought stress. Environ Microbiome 2024; 19:35. [PMID: 38812054 PMCID: PMC11138018 DOI: 10.1186/s40793-024-00578-1] [Received: 11/30/2023] [Accepted: 05/10/2024] [Indexed: 05/31/2024]
Abstract
BACKGROUND Extreme weather events induced by climate change, particularly droughts, have detrimental consequences for crop yields and food security. Concurrently, these conditions provoke substantial changes in the soil bacterial microbiota and affect plant health. Early recognition of soil affected by drought enables farmers to implement appropriate agricultural management practices. In this context, interpretable machine learning holds immense potential for drought stress classification of soil based on marker taxa. RESULTS This study demonstrates that the 16S rRNA-based metagenomic approach of Differential Abundance Analysis methods and machine learning-based Shapley Additive Explanation values provide similar information. They exhibit their potential as complementary approaches for identifying marker taxa and investigating their enrichment or depletion under drought stress in grass lineages. Additionally, the Random Forest Classifier trained on a diverse range of relative abundance data from the soil bacterial microbiome of various plant species achieves a high accuracy of 92.3% at the genus rank for drought stress prediction. It demonstrates its generalization capacity for the lineages tested. CONCLUSIONS In the detection of drought stress in soil bacterial microbiota, this study emphasizes the potential of an optimized and generalized location-based ML classifier. By identifying marker taxa, this approach holds promising implications for microbe-assisted plant breeding programs and contributes to the development of sustainable agriculture practices. These findings are crucial for preserving global food security in the face of climate change.
Affiliation(s)
- Michelle Hagen
- Computomics GmbH, Eisenbahnstraße 1, 72072, Tübingen, Baden-Württemberg, Germany
| | - Rupashree Dass
- Computomics GmbH, Eisenbahnstraße 1, 72072, Tübingen, Baden-Württemberg, Germany
| | - Cathy Westhues
- Computomics GmbH, Eisenbahnstraße 1, 72072, Tübingen, Baden-Württemberg, Germany
- Jochen Blom
- Bioinformatics and Systems Biology, Justus Liebig University Gießen, Heinrich-Buff-Ring 58, 35390, Gießen, Hesse, Germany
- Sascha Patz
- Computomics GmbH, Eisenbahnstraße 1, 72072, Tübingen, Baden-Württemberg, Germany.
21
Shtossel O, Finkelstein S, Louzoun Y. mi-Mic: a novel multi-layer statistical test for microbiota-disease associations. Genome Biol 2024; 25:113. [PMID: 38693546 PMCID: PMC11064322 DOI: 10.1186/s13059-024-03256-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 04/22/2024] [Indexed: 05/03/2024] Open
Abstract
mi-Mic, a novel approach for microbiome differential abundance analysis, tackles the key challenges of such statistical tests: a large number of tests, sparsity, varying abundance scales, and taxonomic relationships. mi-Mic first converts microbial counts to a cladogram of means. It then applies a priori tests on the upper levels of the cladogram to detect overall relationships. Finally, it performs a Mann-Whitney test on paths that are consistently significant along the cladogram or on the leaves. mi-Mic has a much higher ratio of true to false positives than existing tests, as measured by a new real-to-shuffle positive score.
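The two building blocks named in this abstract, a cladogram of means and a Mann-Whitney test, can be illustrated with a minimal sketch. This is not the mi-Mic implementation: the function names `cladogram_means` and `mann_whitney_u` are hypothetical, each internal node is assumed to be the mean of its descendant leaves (the abstract does not say how the means are formed), and the p-value uses the normal approximation without tie correction.

```python
import math
from collections import defaultdict


def cladogram_means(leaf_values):
    """Map every taxonomy prefix (node) to per-sample means over its descendant leaves.

    leaf_values maps a lineage tuple, e.g. ("Firmicutes", "Clostridia"),
    to a list of per-sample values; all lists must have the same length.
    Assumption: a node's value is the mean of all leaves below it.
    """
    groups = defaultdict(list)
    for lineage, values in leaf_values.items():
        # Register the leaf under each of its ancestors and under itself.
        for depth in range(1, len(lineage) + 1):
            groups[lineage[:depth]].append(values)
    return {
        node: [sum(col) / len(col) for col in zip(*rows)]
        for node, rows in groups.items()
    }


def mann_whitney_u(x, y):
    """Return (U, two-sided p) for sample x versus sample y.

    U counts the pairs in which x exceeds y (ties count 0.5).
    The p-value uses the normal approximation without tie correction,
    so it is only a rough guide for small or heavily tied samples.
    """
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p
```

In use, one would build the node values once, split each node's per-sample list into case and control groups, and pass the two groups to `mann_whitney_u`. The selection of consistently significant paths and the multiple-testing strategy described in the abstract are not covered here.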
Affiliation(s)
- Oshrit Shtossel
- Department of Mathematics, Bar-Ilan University, Ramat Gan, 52900, Israel
- Shani Finkelstein
- Department of Mathematics, Bar-Ilan University, Ramat Gan, 52900, Israel
- Yoram Louzoun
- Department of Mathematics, Bar-Ilan University, Ramat Gan, 52900, Israel.
22
Kamel M, Aleya S, Alsubih M, Aleya L. Microbiome Dynamics: A Paradigm Shift in Combatting Infectious Diseases. J Pers Med 2024; 14:217. [PMID: 38392650 PMCID: PMC10890469 DOI: 10.3390/jpm14020217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/15/2024] [Accepted: 02/16/2024] [Indexed: 02/24/2024] Open
Abstract
Infectious diseases have long posed a significant threat to global health and require constant innovation in treatment approaches. However, recent groundbreaking research has shed light on a previously overlooked player in the pathogenesis of disease: the human microbiome. This review article addresses the intricate relationship between the microbiome and infectious diseases and unravels its role as a crucial mediator of host-pathogen interactions. We explore the remarkable potential of harnessing this dynamic ecosystem to develop innovative treatment strategies that could revolutionize the management of infectious diseases. By exploring the latest advances and emerging trends, this review aims to provide a new perspective on combating infectious diseases by targeting the microbiome.
Affiliation(s)
- Mohamed Kamel
- Department of Medicine and Infectious Diseases, Faculty of Veterinary Medicine, Cairo University, Giza 11221, Egypt
- Sami Aleya
- Faculty of Medecine, Université de Bourgogne Franche-Comté, Hauts-du-Chazal, 25030 Besançon, France
- Majed Alsubih
- Department of Civil Engineering, King Khalid University, Guraiger, Abha 62529, Saudi Arabia
- Lotfi Aleya
- Laboratoire de Chrono-Environnement, Université de Bourgogne Franche-Comté, UMR CNRS 6249, La Bouloie, 25030 Besançon, France
23
Novielli P, Romano D, Magarelli M, Bitonto PD, Diacono D, Chiatante A, Lopalco G, Sabella D, Venerito V, Filannino P, Bellotti R, De Angelis M, Iannone F, Tangaro S. Explainable artificial intelligence for microbiome data analysis in colorectal cancer biomarker identification. Front Microbiol 2024; 15:1348974. [PMID: 38426064 PMCID: PMC10901987 DOI: 10.3389/fmicb.2024.1348974] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 12/03/2023] [Accepted: 01/24/2024] [Indexed: 03/02/2024] Open
Abstract
Background Colorectal cancer (CRC) is a type of tumor caused by the uncontrolled growth of cells in the mucosa lining the last part of the intestine. Emerging evidence underscores an association between CRC and gut microbiome dysbiosis. The high mortality rate of this cancer has made it necessary to develop new early diagnostic methods. Machine learning (ML) techniques can represent a solution to evaluate the interaction between intestinal microbiota and host physiology. Through explainable artificial intelligence (XAI) it is possible to evaluate the individual contributions of microbial taxonomic markers for each subject. Our work also implements the Shapley Additive Explanations (SHAP) algorithm to identify for each subject which parameters are important in the context of CRC. Results The proposed study aimed to implement an explainable artificial intelligence framework using both gut microbiota data and demographic information from subjects to distinguish control subjects from those with CRC. Our analysis revealed an association between gut microbiota and this disease. We compared three machine learning algorithms, and the Random Forest (RF) algorithm emerged as the best classifier, with a precision of 0.729 ± 0.038 and an area under the Precision-Recall curve of 0.668 ± 0.016. Additionally, SHAP analysis highlighted the most crucial variables in the model's decision-making, facilitating the identification of specific bacteria linked to CRC. Our results confirmed the role of certain bacteria, such as Fusobacterium, Peptostreptococcus, and Parvimonas, whose abundance appears notably associated with the disease, as well as bacteria whose presence is linked to a non-diseased state. Discussion These findings emphasize the potential of leveraging gut microbiota data within an explainable AI framework for CRC classification. The significant association observed aligns with existing knowledge. The precision exhibited by the RF algorithm reinforces its suitability for such classification tasks. The SHAP analysis not only enhanced interpretability but also identified specific bacteria crucial in CRC determination. This approach opens avenues for targeted interventions based on microbial signatures. Further exploration is warranted to deepen our understanding of the intricate interplay between microbiota and health, providing insights for refined diagnostic and therapeutic strategies.
Affiliation(s)
- Pierfrancesco Novielli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Donato Romano
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Michele Magarelli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Pierpaolo Di Bitonto
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Domenico Diacono
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Annalisa Chiatante
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Giuseppe Lopalco
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Daniele Sabella
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Vincenzo Venerito
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Pasquale Filannino
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Roberto Bellotti
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Maria De Angelis
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Florenzo Iannone
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Sabina Tangaro
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
24
Tangaro S, Lopalco G, Sabella D, Venerito V, Novielli P, Romano D, Di Gilio A, Palmisani J, de Gennaro G, Filannino P, Latronico R, Bellotti R, De Angelis M, Iannone F. Unraveling the microbiome-metabolome nexus: a comprehensive study protocol for personalized management of Behçet's disease using explainable artificial intelligence. Front Microbiol 2024; 15:1341152. [PMID: 38410386 PMCID: PMC10895059 DOI: 10.3389/fmicb.2024.1341152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 01/31/2024] [Indexed: 02/28/2024] Open
Abstract
The presented study protocol outlines a comprehensive investigation into the interplay among the human microbiota, volatilome, and disease biomarkers, with a specific focus on Behçet's disease (BD) using methods based on explainable artificial intelligence. The protocol is structured in three phases. During the initial three-month clinical study, participants will be divided into control and experimental groups. The experimental groups will receive a soluble fiber-based dietary supplement alongside standard therapy. Data collection will encompass oral and fecal microbiota, breath samples, clinical characteristics, laboratory parameters, and dietary habits. The subsequent biological data analysis will involve gas chromatography, mass spectrometry, and metagenetic analysis to examine the volatilome and microbiota composition of salivary and fecal samples. Additionally, chemical characterization of breath samples will be performed. The third phase introduces Explainable Artificial Intelligence (XAI) for the analysis of the collected data. This novel approach aims to evaluate eubiosis and dysbiosis conditions and to identify markers associated with BD, dietary habits, and the supplement. Primary objectives include establishing correlations among microbiota, volatilome, and phenotypic BD characteristics, and identifying patient groups with shared features. The study aims to identify taxonomic units and metabolic markers predicting clinical outcomes, assess the supplement's impact, and investigate the relationship between dietary habits and patient outcomes. This protocol contributes to understanding the microbiome's role in health and disease and pioneers an XAI-driven approach for personalized BD management. With 70 recruited BD patients, XAI algorithms will analyze multi-modal clinical data, potentially revolutionizing BD management and paving the way for improved patient outcomes.
Affiliation(s)
- Sabina Tangaro
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Giuseppe Lopalco
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Daniele Sabella
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Vincenzo Venerito
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Pierfrancesco Novielli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Donato Romano
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Alessia Di Gilio
- Dipartimento di Bioscienze, Biotecnologie e Ambiente, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Jolanda Palmisani
- Dipartimento di Bioscienze, Biotecnologie e Ambiente, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Gianluigi de Gennaro
- Dipartimento di Bioscienze, Biotecnologie e Ambiente, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Pasquale Filannino
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Rosanna Latronico
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Roberto Bellotti
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
- Dipartimento Interateneo di Fisica ‘M. Merlin’, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Maria De Angelis
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
- Florenzo Iannone
- Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy
25
Ibrahimi E, Lopes MB, Dhamo X, Simeon A, Shigdel R, Hron K, Stres B, D’Elia D, Berland M, Marcos-Zambrano LJ. Overview of data preprocessing for machine learning applications in human microbiome research. Front Microbiol 2023; 14:1250909. [PMID: 37869650 PMCID: PMC10588656 DOI: 10.3389/fmicb.2023.1250909] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/30/2023] [Accepted: 09/22/2023] [Indexed: 10/24/2023] Open
Abstract
Although metagenomic sequencing is now the preferred technique to study microbiome-host interactions, analyzing and interpreting microbiome sequencing data presents challenges primarily attributed to the statistical specificities of the data (e.g., sparse, over-dispersed, compositional, inter-variable dependency). This mini review explores preprocessing and transformation methods applied in recent human microbiome studies to address microbiome data analysis challenges. Our results indicate a limited adoption of transformation methods targeting the statistical characteristics of microbiome sequencing data. Instead, there is a prevalent usage of relative and normalization-based transformations that do not account for the specific attributes of microbiome data. The information on preprocessing and transformations applied to the data before analysis was incomplete or missing in many publications, leading to reproducibility concerns, comparability issues, and questionable results. We hope this mini review will provide researchers and newcomers to the field of human microbiome research with an up-to-date point of reference for various data transformation tools and assist them in choosing the most suitable transformation method based on their research questions, objectives, and data characteristics.
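The contrast drawn in this abstract between relative transformations and transformations that target compositionality can be made concrete with a small sketch. The centered log-ratio (CLR) transform is used here only as a common example of a compositional transform; the abstract itself does not name it. The function names and the default pseudocount of 0.5 are illustrative assumptions.

```python
import math


def relative_abundance(counts):
    """Scale one sample's counts so they sum to 1 (total-sum scaling)."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount (an assumed default) keeps zero counts finite under the log.
    With pseudocount=0 every count must be strictly positive.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing by the geometric mean.
    return [v - mean_log for v in logs]
```

CLR values for a sample sum to zero and, with no pseudocount, are unchanged when all counts in that sample are multiplied by the same factor, so sequencing depth drops out. Relative abundances also remove depth but remain constrained to sum to one.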
Affiliation(s)
- Eliana Ibrahimi
- Department of Biology, Faculty of Natural Sciences, University of Tirana, Tirana, Albania
- Marta B. Lopes
- Department of Mathematics, Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, Portugal
- UNIDEMI, Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, Portugal
- Xhilda Dhamo
- Department of Applied Mathematics, Faculty of Natural Sciences, University of Tirana, Tirana, Albania
- Andrea Simeon
- BioSense Institute, University of Novi Sad, Novi Sad, Serbia
- Rajesh Shigdel
- Department of Clinical Science, University of Bergen, Bergen, Norway
- Karel Hron
- Department of Mathematical Analysis and Applications of Mathematics, Faculty of Science, Palacký University Olomouc, Olomouc, Czechia
- Blaž Stres
- Department of Catalysis and Chemical Reaction Engineering, National Institute of Chemistry, Ljubljana, Slovenia
- Faculty of Civil and Geodetic Engineering, Institute of Sanitary Engineering, Ljubljana, Slovenia
- Department of Automation, Biocybernetics and Robotics, Jožef Stefan Institute, Ljubljana, Slovenia
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Domenica D’Elia
- Department of Biomedical Sciences, National Research Council, Institute for Biomedical Technologies, Bari, Italy
- Magali Berland
- INRAE, MetaGenoPolis, Université Paris-Saclay, Jouy-en-Josas, France
- Laura Judith Marcos-Zambrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
26
D’Elia D, Truu J, Lahti L, Berland M, Papoutsoglou G, Ceci M, Zomer A, Lopes MB, Ibrahimi E, Gruca A, Nechyporenko A, Frohme M, Klammsteiner T, Pau ECDS, Marcos-Zambrano LJ, Hron K, Pio G, Simeon A, Suharoschi R, Moreno-Indias I, Temko A, Nedyalkova M, Apostol ES, Truică CO, Shigdel R, Telalović JH, Bongcam-Rudloff E, Przymus P, Jordamović NB, Falquet L, Tarazona S, Sampri A, Isola G, Pérez-Serrano D, Trajkovik V, Klucar L, Loncar-Turukalo T, Havulinna AS, Jansen C, Bertelsen RJ, Claesson MJ. Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action. Front Microbiol 2023; 14:1257002. [PMID: 37808321 PMCID: PMC10558209 DOI: 10.3389/fmicb.2023.1257002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 09/05/2023] [Indexed: 10/10/2023] Open
Abstract
The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research for novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed to fully exploit the benefits of ML in this field. In particular, there is a need to establish "gold standard" protocols for conducting ML analysis experiments and improve interactions between microbiome researchers and ML experts. The Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome) COST Action CA18131 is a European network established in 2019 to promote collaboration between discovery-oriented microbiome researchers and data-driven ML experts to optimize and standardize ML approaches for microbiome analysis. This perspective paper presents the key achievements of ML4Microbiome, which include identifying predictive and discriminatory 'omics' features, improving repeatability and comparability, developing automation procedures, and defining priority areas for the novel development of ML methods targeting the microbiome. The insights gained from ML4Microbiome will help to maximize the potential of ML in microbiome research and pave the way for new and improved healthcare practices.
Affiliation(s)
- Domenica D’Elia
- Department of Biomedical Sciences, National Research Council, Institute for Biomedical Technologies, Bari, Italy
- Jaak Truu
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Leo Lahti
- Department of Computing, University of Turku, Turku, Finland
- Magali Berland
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, France
- Georgios Papoutsoglou
- JADBio Gnosis DA S.A., Science and Technology Park of Crete, Heraklion, Greece
- Department of Computer Science, University of Crete, Heraklion, Greece
- Michelangelo Ceci
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
- Aldert Zomer
- Department of Biomolecular Health Sciences (Infectious Diseases and Immunology), Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands
- Marta B. Lopes
- Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, Portugal
- UNIDEMI, Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, Portugal
- Eliana Ibrahimi
- Department of Biology, University of Tirana, Tirana, Albania
- Aleksandra Gruca
- Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, Poland
- Alina Nechyporenko
- Systems Engineering Department, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
- Department of Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
- Marcus Frohme
- Department of Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
- Thomas Klammsteiner
- Department of Microbiology, Universität Innsbruck, Innsbruck, Austria
- Department of Ecology, Universität Innsbruck, Innsbruck, Austria
- Enrique Carrillo-de Santa Pau
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, Madrid, Spain
- Laura Judith Marcos-Zambrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, Madrid, Spain
- Karel Hron
- Department of Mathematical Analysis and Applications of Mathematics, Faculty of Science, Palacký University, Olomouc, Czechia
- Gianvito Pio
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
- Andrea Simeon
- BioSense Institute, University of Novi Sad, Novi Sad, Serbia
- Ramona Suharoschi
- Molecular Nutrition and Proteomics Research Laboratory, Department of Food Science, University of Agricultural Sciences and Veterinary Medicine of Cluj-Napoca, Cluj-Napoca, Romania
- Isabel Moreno-Indias
- Department of Endocrinology and Nutrition, Virgen de la Victoria University Hospital, the Biomedical Research Institute of Malaga and Platform in Nanomedicine (IBIMA-BIONAND Platform), University of Malaga, Malaga, Spain
- Andriy Temko
- Department of Electrical and Electronic Engineering, University College Cork, Cork, Ireland
- Elena-Simona Apostol
- Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
- Ciprian-Octavian Truică
- Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
- Rajesh Shigdel
- Department of Clinical Science, University of Bergen, Bergen, Norway
- Jasminka Hasić Telalović
- Department of Computer Science, University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
- Erik Bongcam-Rudloff
- Swedish University of Agricultural Sciences, Department of Animal Breeding and Genetics, Uppsala, Sweden
- Naida Babić Jordamović
- Computational Biology, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Verlab Research Institute for BIomedical Engineering, Medical Devices and Artificial Intelligence, Sarajevo, Bosnia and Herzegovina
- Laurent Falquet
- University of Fribourg and Swiss Institute of Bioinformatics, Fribourg, Switzerland
- Sonia Tarazona
- Department of Applied Statistics and Operations Research and Quality, Universitat Politècnica de València, València, Spain
- Alexia Sampri
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
- Gaetano Isola
- Department of General Surgery and Surgical-Medical Specialties, School of Dentistry, University of Catania, Catania, Italy
- David Pérez-Serrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, CEI UAM+CSIC, Madrid, Spain
- Lubos Klucar
- Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
- Aki S. Havulinna
- Finnish Institute for Health and Welfare, Helsinki, Finland
- Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland
- Christian Jansen
- Biome Diagnostics GmbH, Vienna, Austria
- Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria