1
Li P, Li M, Chen WH. Best practices for developing microbiome-based disease diagnostic classifiers through machine learning. Gut Microbes 2025; 17:2489074. [PMID: 40186338 PMCID: PMC11980492 DOI: 10.1080/19490976.2025.2489074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2024] [Revised: 03/13/2025] [Accepted: 03/28/2025] [Indexed: 04/07/2025]
Abstract
The human gut microbiome, crucial in various diseases, can be utilized to develop diagnostic models through machine learning (ML). The specific tools and parameters used in model construction, such as data preprocessing, batch effect removal and modeling algorithms, can impact model performance and generalizability. To establish a generally applicable workflow, we divided the ML process into the three above-mentioned steps and optimized each sequentially using 83 gut microbiome cohorts across 20 diseases. We tested a total of 156 tool-parameter-algorithm combinations and benchmarked them according to internal and external AUCs. At the data preprocessing step, we identified four data preprocessing methods that performed well for regression-type algorithms and one method that excelled for non-regression-type algorithms. At the batch effect removal step, we identified the "ComBat" function from the sva R package as an effective batch effect removal method and compared the performance of various algorithms. Finally, at the ML algorithm selection step, we found that Ridge and Random Forest ranked the best. Our optimized workflow performed comparably to previous exhaustive methods for disease-specific optimization; it is therefore generally applicable and can provide a comprehensive guideline for constructing diagnostic models for a range of diseases, potentially serving as a powerful tool for future medical diagnostics.
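The abstract ranks workflows by internal and external AUC. As a rough illustration of what that metric measures, the sketch below computes AUC as the fraction of (case, control) pairs that a classifier ranks correctly. The function name `auc` and all labels and scores are hypothetical toy values, not data or code from the paper; "internal" here stands for scores on the training cohort and "external" for scores on an independent cohort.

```python
def auc(labels, scores):
    """AUC as the share of (case, control) pairs where the case scores higher.

    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical toy data: 1 = disease, 0 = healthy control.
internal_labels = [1, 1, 1, 0, 0, 0]
internal_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
external_labels = [1, 1, 0, 0]
external_scores = [0.7, 0.3, 0.6, 0.2]

internal_auc = auc(internal_labels, internal_scores)
external_auc = auc(external_labels, external_scores)
print(f"internal AUC = {internal_auc:.3f}, external AUC = {external_auc:.3f}")
```

A gap between the two values, as in this toy case, is the kind of loss in generalizability that cross-cohort benchmarking is meant to expose.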
Affiliation(s)
- Peikun Li
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Min Li
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Wei-Hua Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- School of Biological Science, Jining Medical University, Rizhao, China
2
Guo J, Lin X, Xiao Y. Integration of smart sensors and phytoremediation for real-time pollution monitoring and ecological restoration in agricultural waste management. Frontiers in Plant Science 2025; 16:1550302. [PMID: 40433163 PMCID: PMC12106413 DOI: 10.3389/fpls.2025.1550302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2024] [Accepted: 04/11/2025] [Indexed: 05/29/2025]
Abstract
Global climate change and ecological degradation highlight the urgency of dealing with agricultural waste and ecological restoration. Traditional pollutant monitoring and ecological restoration methods face challenges in accuracy and adaptability, especially when dealing with complex environmental data. This paper proposes the Bio-DANN model, which combines biogeochemical models and deep learning techniques to improve the accuracy of pollutant monitoring and ecological restoration prediction. The model uses deep neural networks (DNNs) and attention mechanisms to process multidimensional environmental data in various agricultural and ecological scenarios in real time. Experimental results based on Open Soil Data and NEON datasets show that Bio-DANN performs well in pollutant prediction, with mean square errors (MSE) of 0.012 and 0.018, root mean square errors (RMSE) of 0.109 and 0.134, and accuracy of 0.92 and 0.90, respectively. In terms of ecological restoration assessment, Bio-DANN achieved ΔF and PIPGR of 0.15 and 18%, and 0.20 and 22%, respectively, and H' values of 1.5 and 1.7, which are better than other models. Bio-DANN provides a promising technical solution for environmental protection, resource recovery and sustainable agriculture, especially showing significant potential in pollutant monitoring, soil health assessment and ecological restoration evaluation.
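Because RMSE is by definition the square root of MSE, the two pairs of error figures quoted in the abstract can be checked against each other. The short sketch below does that arithmetic with the reported values only; the dictionary layout and the tolerance are illustrative assumptions, and the dataset order follows the abstract's "respectively".

```python
import math

# Values as reported in the abstract (Open Soil Data first, NEON second).
reported = {
    "Open Soil Data": {"mse": 0.012, "rmse": 0.109},
    "NEON": {"mse": 0.018, "rmse": 0.134},
}

# Assumed tolerance: the reported figures are given to three decimals.
TOLERANCE = 1e-3

derived = {name: math.sqrt(m["mse"]) for name, m in reported.items()}

for name, m in reported.items():
    gap = abs(derived[name] - m["rmse"])
    status = "consistent" if gap < TOLERANCE else "inconsistent"
    print(f"{name}: sqrt(MSE) = {derived[name]:.4f}, "
          f"reported RMSE = {m['rmse']:.3f} ({status})")
```

Both pairs agree to within the rounding of the published figures, so the MSE and RMSE values carry the same information.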
Affiliation(s)
- Jinsong Guo
- School of Economics, Guangdong Ocean University, Zhanjiang, China
- Xiaoxin Lin
- School of Mathematics and Computer Science, Guangdong Ocean University, Zhanjiang, China
- Yingjun Xiao
- School of Economics, Guangdong Ocean University, Zhanjiang, China
3
Liu J, Yang Z, Li L, Chu X, Wei S, Lian J. Predicting Aboveground Carbon Storage in Different Types of Forests in South Subtropical Regions Using Machine Learning Models. Ecol Evol 2025; 15:e71499. [PMID: 40433201 PMCID: PMC12105939 DOI: 10.1002/ece3.71499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2024] [Revised: 04/24/2025] [Accepted: 05/14/2025] [Indexed: 05/29/2025]
Abstract
Motivated by the need to enhance the accuracy of forest aboveground carbon storage (ACS) assessments, this study aimed to explore the effectiveness of different machine learning models in predicting ACS across various subtropical forest types in southern China. The study was conducted in southern China, focusing on different types of subtropical forests. This region harbors several types of subtropical forests, which are rarely found at similar latitudes in the world. Variance inflation factor was employed to screen independent variables, resulting in the selection of 13 significant predictors. Four machine learning models were constructed to estimate carbon storage: support vector machine (SVM), random forest (RF), multi-layer perceptron (MLP), and extreme gradient boosting (XGB). Model performance was evaluated using root mean square error (RMSE), coefficient of determination (R²), and mean absolute error (MAE). The model with the best generalization ability was selected to calculate SHAP values for each predictor. The XGB model demonstrated superior performance across all forest types, with R² values ranging from 0.898 to 0.974. In mountainous evergreen broad-leaved forests, the prediction accuracy followed the order of XGB > MLP > SVM > RF. In valley rainforests, MLP showed the highest R² value, but with higher MAE and RMSE, making it the second-best choice. The RF model performed moderately, while the SVM model showed the poorest performance. The SHAP values indicated that maximum diameter at breast height (DBH), slope, mean DBH, species evenness, altitude, and maximum tree height had significant effects on ACS. The XGB model exhibits the best prediction performance and strongest adaptability for estimating ACS in subtropical southern China forests. Additionally, the MLP model can serve as an effective model for assessing carbon storage in valley rainforests within this region. Machine learning methods provide valuable references for predicting and assessing ACS in different types of zonal forests.
Affiliation(s)
- Jiarun Liu
- School of Life & Environmental Sciences, Guilin University of Electronic Technology, Guilin, Guangxi, China
- Zihang Yang
- School of Life & Environmental Sciences, Guilin University of Electronic Technology, Guilin, Guangxi, China
- Lin Li
- School of Life & Environmental Sciences, Guilin University of Electronic Technology, Guilin, Guangxi, China
- Xiaoxue Chu
- School of Life & Environmental Sciences, Guilin University of Electronic Technology, Guilin, Guangxi, China
- Shiguang Wei
- Key Laboratory of Ecology of Rare and Endangered Species and Environmental Protection, Ministry of Education – Guangxi Key Laboratory of Landscape Resources Conservation and Sustainable Utilization in Lijiang River Basin, Guangxi Normal University, Guilin, Guangxi, China
- Juyu Lian
- Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
4
Dong J, Al‐Issa M, Feeney JS, Shelp GV, Poole EM, Cho CE. Prenatal Intake of High Multivitamins or Folic Acid With or Without Choline Contributes to Gut Microbiota-Associated Dysregulation of Serotonin in Offspring. Mol Nutr Food Res 2025; 69:e70044. [PMID: 40123263 PMCID: PMC12050513 DOI: 10.1002/mnfr.70044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2024] [Revised: 02/23/2025] [Accepted: 03/11/2025] [Indexed: 03/25/2025]
Abstract
The gut microbiota is amenable to early nutrition, including micronutrients, but intakes above and below the recommendations commonly occur with unknown consequences. Serotonin (5-hydroxytryptamine [5-HT]) is a monoamine found centrally and peripherally with diverse functions such as food intake regulation via the hypothalamic 5-HT receptor 2C (5-HTR2C). This study determined the impact of prenatal micronutrients on the gut microbiota and serotonergic system in offspring. Pregnant Wistar rats were fed either recommended vitamins (RV), high vitamins (HV), high folic acid with recommended choline (HFRC), or high folic acid with no choline (HFNC). Offspring were fed a high-fat diet for 12 weeks postweaning. HV, HFRC, and HFNC males and females had lower hypothalamic 5-HTR2C protein expression compared to RV. Brain 5-HT concentrations were lower but colon 5-HT concentrations were higher in HV and HFNC males and females and HFRC males compared to RV. Refeeding response after 5-HTR2C agonist was negatively correlated with hypothalamic 5-HTR2C protein expression in males and with brain 5-HT concentrations in females. Random forest analysis revealed the top bacterial taxa, among which Lactococcus, Ruminococcus, Bacteroides, and Oscillospira showed significant correlations with refeeding response and concentrations of brain and colon 5-HT. In conclusion, excess or imbalanced prenatal consumption of micronutrients leads to gut microbiota-associated disturbances in the serotonergic system in offspring.
Affiliation(s)
- Jianzhang Dong
- Department of Human Health and Nutritional Sciences, University of Guelph, Guelph, Ontario, Canada
- Mali Al‐Issa
- Department of Human Health and Nutritional Sciences, University of Guelph, Guelph, Ontario, Canada
- Jenny S. Feeney
- Department of Human Health and Nutritional Sciences, University of Guelph, Guelph, Ontario, Canada
- Gia V. Shelp
- Department of Human Health and Nutritional Sciences, University of Guelph, Guelph, Ontario, Canada
- Elizabeth M. Poole
- Department of Family Relations and Applied Nutrition, University of Guelph, Guelph, Ontario, Canada
- Clara E. Cho
- Department of Human Health and Nutritional Sciences, University of Guelph, Guelph, Ontario, Canada
5
Temesgen SA, Ahmad B, Grace-Mercure BK, Liu M, Liu L, Lin H, Deng K. Exploring species taxonomic kingdom using information entropy and nucleotide compositional features of coding sequences based on machine learning methods. Methods 2025; 240:165-179. [PMID: 40280261 DOI: 10.1016/j.ymeth.2025.03.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2025] [Revised: 03/08/2025] [Accepted: 03/31/2025] [Indexed: 04/29/2025]
Abstract
The flow of genetic information from DNA to protein is governed by the central dogma of molecular biology. Genetic drift and mutations usually lead to changes in DNA composition, thereby affecting the coding sequences (CDS) that encode functional proteins. Analyzing the nucleotide distribution in the coding regions of species is crucial for understanding their evolution. In this study, we applied Markov processes to analyze codon formation in 37,031,061 CDSs across 3,735 species genomes, spanning viruses, archaea, bacteria, and eukaryotes, to explore compositional changes. Our results revealed species preferences for different nucleotides. Information entropies and Markov information densities show that eukaryotes exhibit higher redundancy, followed by viruses, suggesting more gene duplication in eukaryotes and high mutation rates in viruses. Evolutionary trends showed an increase in information entropy and a decrease in Markov entropy, with negative correlations between first- and second-order Markov information densities. Furthermore, uniform manifold approximation and projection (UMAP) was used to reduce information redundancy for revealing unique evolutionary patterns in species classification. The machine learning methods demonstrated excellent performance in species classification accuracy, providing profound insights into CDS evolution and protein synthesis.
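The abstract does not define its "information entropy" or "Markov information density". As a generic textbook illustration only, and not the authors' formulation, the sketch below estimates the Shannon entropy of a sequence's nucleotide composition and a first-order conditional entropy from adjacent-nucleotide frequencies. The function names and the example coding sequence are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per symbol) of the symbol composition of seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    total = len(seq)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(seq).values())


def first_order_conditional_entropy(seq):
    """Entropy (bits) of a symbol given the preceding one, from pair counts."""
    if len(seq) < 2:
        raise ValueError("need at least two symbols")
    pair_counts = Counter(zip(seq, seq[1:]))
    prev_counts = Counter(seq[:-1])
    n_pairs = len(seq) - 1
    entropy = 0.0
    for (prev, _nxt), count in pair_counts.items():
        p_pair = count / n_pairs               # joint frequency of the pair
        p_next_given_prev = count / prev_counts[prev]
        entropy -= p_pair * math.log2(p_next_given_prev)
    return entropy


# Hypothetical short coding sequence, for illustration only.
cds = "ATGGCGGCGTAA"
print(f"composition entropy: {shannon_entropy(cds):.3f} bits")
print(f"first-order conditional entropy: "
      f"{first_order_conditional_entropy(cds):.3f} bits")
```

A conditional entropy well below the composition entropy means that neighbouring nucleotides are strongly dependent, which is one simple way to express sequence redundancy.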
Affiliation(s)
- Sebu Aboma Temesgen
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Basharat Ahmad
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Minghao Liu
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Li Liu
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- Hao Lin
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Kejun Deng
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
6
Guo J, Shi A, Sun Y, Zhang S, Feng X, Chen Y, Yao Z. Network Pharmacology and Experimental Validation of the Effects of Shenling Baizhu San, Quzhi Ruangan Fang and Gexia Zhuyu Tang on the Intestinal Flora of Rats with NAFLD. Diabetes Metab Syndr Obes 2025; 18:1165-1194. [PMID: 40260263 PMCID: PMC12011051 DOI: 10.2147/dmso.s507039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2024] [Accepted: 04/03/2025] [Indexed: 04/23/2025]
Abstract
Objective: In this study, we investigated the effect of Shenling Baizhu San (SLBZS), Quzhi Ruangan Fang (QZRGF) and Gexia Zhuyu Tang (GXZYT) on the intestinal flora of NAFLD rats through network pharmacology and experimental validation. Materials and Methods: Protein-protein interaction (PPI), Gene Ontology (GO), and molecular docking analyses were performed. Male Sprague-Dawley (SD) rats were divided into 6 groups: Normal, Model, SLBZS (7.2 g/kg), QZRGF (27.72 g/kg), GXZYT (28.8 g/kg) and positive control (fenofibrate, 18 mg/kg); the NAFLD model was established by a high-fat diet. After one week of acclimatisation feeding, continuous gavage was given for 8 and 12 weeks. Serum, liver and faeces were collected and biochemical and pathological indices were determined. The diversity and abundance of intestinal flora were also analyzed using 16S rDNA amplified sequencing. Results: Screening yielded 132 active ingredients for SLBZS, 202 for GXZYT, and 34 for QZRGF. Nine common hub genes were screened from the PPI network. GO functional analysis reported that these targets were mainly closely related to the response to bacterial molecules. The molecular docking results indicated that the 11 core constituents in the three compound prescriptions have good binding ability with MAPK1, AKT1, CASP3, FOS, TP53, STAT3, and MAPK3. Conclusion: The Chinese herbal compounds SLBZS, QZRGF and GXZYT may exert lipid-lowering effects through multiple components, targets and methods for the treatment of NAFLD while improving the diversity and abundance of the intestinal flora of the rats, and the best effect was achieved with SLBZS.
Affiliation(s)
- Jia Guo
- School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Dongtai Hospital of Traditional Chinese Medicine, Dongtai, Jiangsu, 224200, People’s Republic of China
- Anhua Shi
- School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- The Key Laboratory of Microcosmic Syndrome Differentiation, Education Department of Yunnan, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Yunnan Key Laboratory of Integrated Traditional Chinese and Western Medicine for Chronic Disease in Prevention and Treatment, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Yanhong Sun
- School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Shunzhen Zhang
- School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Xiaoyi Feng
- School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- The Key Laboratory of Microcosmic Syndrome Differentiation, Education Department of Yunnan, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Yunnan Key Laboratory of Integrated Traditional Chinese and Western Medicine for Chronic Disease in Prevention and Treatment, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Yifan Chen
- School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Zheng Yao
- School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- The Key Laboratory of Microcosmic Syndrome Differentiation, Education Department of Yunnan, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
- Yunnan Key Laboratory of Integrated Traditional Chinese and Western Medicine for Chronic Disease in Prevention and Treatment, Yunnan University of Chinese Medicine, Kunming, Yunnan, 650500, People’s Republic of China
7
Zschaubitz E, Schröder H, Glackin CC, Vogel L, Labrenz M, Sperlea T. A benchmark analysis of feature selection and machine learning methods for environmental metabarcoding datasets. Comput Struct Biotechnol J 2025; 27:1636-1647. [PMID: 40322584 PMCID: PMC12049816 DOI: 10.1016/j.csbj.2025.04.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2024] [Revised: 04/10/2025] [Accepted: 04/11/2025] [Indexed: 05/08/2025]
Abstract
Next-Generation Sequencing methods like DNA metabarcoding enable the generation of large community composition datasets and have grown instrumental in many branches of ecology in recent years. However, the sparsity, compositionality, and high dimensionality of metabarcoding datasets pose challenges in data analysis. In theory, feature selection methods improve the analyzability of eDNA metabarcoding datasets by identifying a subset of informative taxa that are relevant for a certain task and discarding those that are redundant or irrelevant. However, general guidelines on selecting a feature selection method for application to a given setting are lacking. Here, we report a comparison of feature selection methods in a supervised machine learning setup across 13 environmental metabarcoding datasets with differing characteristics. We evaluate workflows that consist of data preprocessing, feature selection and a machine learning model by their ability to capture the ecological relationship between the microbial community composition and environmental parameters. Our results demonstrate that, while the optimal feature selection approach depends on dataset characteristics, feature selection is more likely to impair model performance than to improve it for tree ensemble models like Random Forests. Furthermore, our results show that calculating relative counts impairs model performance, which suggests that novel methods to combat the compositionality of metabarcoding data are required.
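The compositionality problem mentioned in the abstract can be seen in a few lines. The sketch below converts raw counts to relative counts and shows that a taxon whose absolute count is unchanged still appears to decrease when another taxon grows. The helper `relative_counts`, the taxon names, and the counts are hypothetical; this is an illustration of the general effect, not a reproduction of the paper's benchmark.

```python
def relative_counts(counts):
    """Scale raw read counts so that they sum to 1 within one sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: count / total for taxon, count in counts.items()}


# Hypothetical samples: only taxon_1 differs in absolute count.
sample_a = {"taxon_1": 100, "taxon_2": 100, "taxon_3": 100}
sample_b = {"taxon_1": 400, "taxon_2": 100, "taxon_3": 100}

rel_a = relative_counts(sample_a)
rel_b = relative_counts(sample_b)

for taxon in sample_a:
    print(f"{taxon}: raw {sample_a[taxon]} -> {sample_b[taxon]}, "
          f"relative {rel_a[taxon]:.3f} -> {rel_b[taxon]:.3f}")
```

Because every sample is forced to sum to one, the features are no longer independent, which is one plausible reason a downstream model can be misled by relative counts.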
Affiliation(s)
- Erik Zschaubitz
- Department of Biological Oceanography, Leibniz Institute for Baltic Sea Research, Seestraße 15, Rostock, 18119, Germany
- Conor Christopher Glackin
- Department of Biological Oceanography, Leibniz Institute for Baltic Sea Research, Seestraße 15, Rostock, 18119, Germany
- Lukas Vogel
- Department of Biological Oceanography, Leibniz Institute for Baltic Sea Research, Seestraße 15, Rostock, 18119, Germany
- Matthias Labrenz
- Department of Biological Oceanography, Leibniz Institute for Baltic Sea Research, Seestraße 15, Rostock, 18119, Germany
- Theodor Sperlea
- Department of Biological Oceanography, Leibniz Institute for Baltic Sea Research, Seestraße 15, Rostock, 18119, Germany
8
Coskuner-Weber O, Alpsoy S, Yolcu O, Teber E, de Marco A, Shumka S. Metagenomics studies in aquaculture systems: Big data analysis, bioinformatics, machine learning and quantum computing. Comput Biol Chem 2025; 118:108444. [PMID: 40187295 DOI: 10.1016/j.compbiolchem.2025.108444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2025] [Revised: 03/15/2025] [Accepted: 03/25/2025] [Indexed: 04/07/2025]
Abstract
The burgeoning field of aquaculture has become a pivotal contributor to global food security and economic growth, presently surpassing capture fisheries in aquatic animal production as evidenced by recent statistics. However, the dense fish populations inherent in aquaculture systems exacerbate abiotic stressors and promote pathogenic spread, posing a risk to sustainability and yield. This study delves into the transformative potential of metagenomics, a method that directly retrieves genetic material from environmental samples, in elucidating microbial dynamics within aquaculture ecosystems. Our findings affirm that metagenomics, bolstered by tools in big data analytics, bioinformatics, and machine learning, can significantly enhance the precision of microbial assessment and pathogen detection. Furthermore, we explore quantum computing's emergent role, which promises unparalleled efficiency in data processing and model construction, poised to address the limitations of conventional computational techniques. Distinct from metabarcoding, metagenomics offers an expansive, unbiased profile of microbial biodiversity, revolutionizing our capacity to monitor, predict, and manage aquaculture systems with high accuracy and adaptability. Despite the challenges of computational demands and variability in data standardization, this study advocates for continued technological integration, thereby fostering resilient and sustainable aquaculture practices in a climate of escalating global food requirements.
Affiliation(s)
- Orkid Coskuner-Weber
- Turkish-German University, Molecular Biotechnology, Sahinkaya Caddesi, No. 106, Beykoz, Istanbul 34820, Turkey
- Semih Alpsoy
- Turkish-German University, Molecular Biotechnology, Sahinkaya Caddesi, No. 106, Beykoz, Istanbul 34820, Turkey
- Ozgur Yolcu
- Turkish-German University, Molecular Biotechnology, Sahinkaya Caddesi, No. 106, Beykoz, Istanbul 34820, Turkey
- Egehan Teber
- Turkish-German University, Molecular Biotechnology, Sahinkaya Caddesi, No. 106, Beykoz, Istanbul 34820, Turkey
- Ario de Marco
- Laboratory of Environmental and Life Sciences, University of Nova Gorica, Vipavska cesta 13, Nova Gorica 5000, Slovenia
- Spase Shumka
- Faculty of Biotechnology and Food, Agricultural University of Tirana, 1019 Koder Kamza, Tirana, Albania
9
Zhou PY, Takeuchi A, Martinez-Lopez F, Ehghaghi M, Wong AKC, Lee ESA. Benchmarking Interpretability in Healthcare Using Pattern Discovery and Disentanglement. Bioengineering (Basel) 2025; 12:308. [PMID: 40150773 PMCID: PMC11939797 DOI: 10.3390/bioengineering12030308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2025] [Revised: 03/07/2025] [Accepted: 03/12/2025] [Indexed: 03/29/2025]
Abstract
The healthcare industry seeks to integrate AI into clinical applications, yet understanding AI decision making remains a challenge for healthcare practitioners as these systems often function as black boxes. Our work benchmarks the Pattern Discovery and Disentanglement (PDD) system's unsupervised learning algorithm, which provides interpretable outputs and clustering results from clinical notes to aid decision making. Using the MIMIC-IV dataset, we process free-text clinical notes and ICD-9 codes with Term Frequency-Inverse Document Frequency and Topic Modeling. The PDD algorithm discretizes numerical features into event-based features, discovers association patterns from a disentangled statistical feature value association space, and clusters clinical records. The output is an interpretable knowledge base linking knowledge, patterns, and data to support decision making. Despite being unsupervised, PDD demonstrated performance comparable to supervised deep learning models, validating its clustering ability and knowledge representation. We benchmark interpretability techniques (Feature Permutation, Gradient SHAP, and Integrated Gradients) on the best-performing models (in terms of F1, ROC AUC, balanced accuracy, etc.), evaluating them on sufficiency, comprehensiveness, and sensitivity metrics. Our findings highlight the limitations of feature importance ranking and post hoc analysis for clinical diagnosis. Meanwhile, PDD's global interpretability effectively compensates for these issues, helping healthcare practitioners understand the decision-making process and providing suggestive clusters of diseases to assist their diagnosis.
Affiliation(s)
- Pei-Yuan Zhou
- System Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Amane Takeuchi
- Department of Computer Science, University of Toronto, Toronto, ON M5S 1A1, Canada; (A.T.); (M.E.); (E.-S.A.L.)
- Malikeh Ehghaghi
- Department of Computer Science, University of Toronto, Toronto, ON M5S 1A1, Canada; (A.T.); (M.E.); (E.-S.A.L.)
- Andrew K. C. Wong
- System Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- En-Shiun Annie Lee
- Department of Computer Science, University of Toronto, Toronto, ON M5S 1A1, Canada; (A.T.); (M.E.); (E.-S.A.L.)
- Faculty of Science, Ontario Tech University, Oshawa, ON L1G 0C5, Canada
10
Zhang L, Li X, Shi L, Zheng Y, Ding Y, Yuan T, Hu S, Chen J, Xiao P. Bacterial diversity and biomarkers screening of station and carriage surface in Shanghai metro system, China. Current Research in Microbial Sciences 2025; 8:100374. [PMID: 40225043 PMCID: PMC11992389 DOI: 10.1016/j.crmicr.2025.100374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/15/2025]
Abstract
Background: Mass transit environments, such as the metro, can facilitate the spread of bacteria between humans and their surroundings. These environments are particularly important for human health due to their potential for spreading pathogens and their impact on large populations. To gain a deeper understanding of bacterial distribution in subways, it is essential to identify variables that affect bacterial composition and microorganisms that are potentially harmful to human health. Methods: We conducted high-throughput 16S rRNA gene sequencing on surface samples from 5 subway stations in Shanghai, China, during the warm (summer), cold (winter) and transition (autumn) seasons. Bacterial community features across the three seasons were distinguished using random forest classification analyses, followed by in-depth diversity analyses. Results: Significant differences were observed in surface bacterial communities across seasons. Highly abundant bacterial groups were generally ubiquitous. Among these highly abundant families and genera, some were unique to surface samples. Notably, the phyla Firmicutes, Proteobacteria, and Actinobacteria were predominant, with total abundances of 32.87%, 29.41%, and 16.31%, respectively. Alpha diversity indices differed significantly (P < 0.05) among seasons, with autumn exhibiting significantly higher alpha diversity metrics compared to summer and winter. Beta diversity analysis revealed significant compositional dissimilarities and distinct clustering patterns among the three seasons (P < 0.05). An analysis of similarities (ANOSIM) test indicated significant differences in bacterial patterns at the phylum, class, order, family, and genus levels among the seasons (P < 0.05). Random forest classification analyses identified the top 24 bacterial taxa at the genus level across seasons in the metro system.
Conclusions: We provided a direct comparison of surface bacterial microbiomes, and a comprehensive survey of seasonal variation in subways using culture-independent methods. Our findings reveal differences in both diversity and abundance of certain taxa across seasons, with 24 top indicator bacterial genera identified. This work serves as a reference for understanding the composition and dynamics of bacterial communities and for biomarker screening in subways, a crucial public space in our increasingly urbanized and interconnected world.
Affiliation(s)
- Lijun Zhang
- Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
- State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
- Xiaojing Li
- Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
- State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
- Lisha Shi
- Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
- State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
| | - Yi Zheng
- Shanghai Shentong Metro Group Co.,Ltd, Shanghai 201103, China
| | - Yichen Ding
- Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
- State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
| | - Tao Yuan
- Shanghai Jiao Tong University, Shanghai 200030, China
| | - Shuangqing Hu
- Shanghai Academy of Environmental Sciences, Shanghai 200233,China
| | - Jian Chen
- Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
- State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
| | - Ping Xiao
- Division of Public Health Service and Safety Assessment, Shanghai Municipal Center for Disease Control and Prevention, Shanghai 201107, China
- State Environmental Protection Key Laboratory of Environmental Health Impact Assessment of Emerging Contaminants, Shanghai 201107, China
| |
Collapse
|
11
|
Tripathi P, Render R, Nidhi S, Tripathi V. Microbial genomics: a potential toolkit for forensic investigations. Forensic Sci Med Pathol 2025; 21:417-429. [PMID: 38878110 DOI: 10.1007/s12024-024-00830-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 05/08/2024] [Indexed: 03/29/2025]
Abstract
Microbial forensics is a new discipline of science that analyzes evidence related to biological crime through the uniqueness and abundance of microorganisms and their toxins. Microorganisms remain alive longer than any other trace of biological evidence, such as DNA, fingerprints, and fibers, because of the protective cell membrane or capsules. Microbiological research has opened up various possibilities for forensic investigations of microbial flora. Current molecular technologies, including DNA sequencing, whole-genome sequencing, metagenomics, DNA fingerprinting, and molecular phylogeny, provide valid results for forensic investigations. Recent advancements in genome sequencing technologies, genetic data generation, and bioinformatic tools have significantly improved microbial sampling methods and forensic analyses. In this review, we discuss the applications of microbial genomic tools and technologies in forensic investigations, including human identification, geolocation, and causes of death.
Affiliation(s)
- Pooja Tripathi
  - Department of Computational Biology and Bioinformatics, Jacob Institute of Biotechnology and Bioengineering, Sam Higginbottom University of Agriculture, Technology and Sciences, Prayagraj, Uttar Pradesh, 211007, India
- Riya Render
  - Department of Forensic Sciences, National Forensic Sciences University, Ponda, Goa, 430401, India
- Sweta Nidhi
  - Department of Forensic Sciences, National Forensic Sciences University, Ponda, Goa, 430401, India
- Vijay Tripathi
  - Department of Microbiology, Graphic Era Deemed to be University, Clement Town, Dehradun, 248002, India
|
12
|
Liu J, Xu C, Wang R, Huang J, Zhao R, Wang R. Microbiota and metabolomic profiling coupled with machine learning to identify biomarkers and drug targets in nasopharyngeal carcinoma. Front Pharmacol 2025; 16:1551411. [PMID: 40078290 PMCID: PMC11897916 DOI: 10.3389/fphar.2025.1551411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2024] [Accepted: 01/28/2025] [Indexed: 03/14/2025] Open
Abstract
Background Nasopharyngeal carcinoma (NPC) is a prevalent malignancy in certain regions, with radiotherapy as the standard treatment. However, resistance to radiotherapy remains a critical challenge, necessitating the identification of novel biomarkers and therapeutic targets. The tumor-associated microbiota and metabolites have emerged as potential modulators of radiotherapy outcomes. Methods This study included 22 NPC patients stratified into radiotherapy-responsive (R, n = 12) and radiotherapy-non-responsive (NR, n = 10) groups. Tumor tissue and fecal samples were subjected to 16S rRNA sequencing to profile microbiota composition and targeted metabolomics to quantify short-chain fatty acids (SCFAs). The XGBoost algorithm was applied to identify microbial taxa associated with radiotherapy response, and quantitative PCR (qPCR) was used to validate key findings. Statistical analyses were conducted to assess differences in microbial diversity, relative abundance, and metabolite levels between the groups. Results Significant differences in alpha diversity at the species level were observed between the R and NR groups. Bacteroides acidifaciens was enriched in the NR group, while Propionibacterium acnes and Clostridium magna were more abundant in the R group. Machine learning identified Acidosoma, Propionibacterium acnes, and Clostridium magna as key predictors of radiotherapy response. Metabolomic profiling revealed elevated acetate levels in the NR group, implicating its role in tumor growth and immune evasion. Validation via qPCR confirmed the differential abundance of these microbial taxa in both tumor tissue and fecal samples. Discussion Our findings highlight the interplay between microbiota and metabolite profiles in influencing radiotherapy outcomes in NPC. These results suggest that targeting the microbiota-metabolite axis may enhance radiotherapy efficacy in NPC.
Affiliation(s)
- Junsong Liu
  - Department of Otorhinolaryngology-Head and Neck Surgery, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Chongwen Xu
  - Department of Otorhinolaryngology-Head and Neck Surgery, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Rui Wang
  - Department of Thoracic Surgery, The First Affiliated Hospital of Xi’an Jiaotong University, Cancer Centre, Xi’an, Shaanxi, China
- Jianhua Huang
  - Department of Otorhinolaryngology-Head and Neck Surgery, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Ruimin Zhao
  - Department of Otorhinolaryngology-Head and Neck Surgery, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Rui Wang
  - Department of Anesthesiology, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
|
13
|
Azouggagh L, Ibáñez-Escriche N, Martínez-Álvaro M, Varona L, Casellas J, Negro S, Casto-Rebollo C. Characterization of microbiota signatures in Iberian pig strains using machine learning algorithms. Anim Microbiome 2025; 7:13. [PMID: 39901297 PMCID: PMC11789298 DOI: 10.1186/s42523-025-00378-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Accepted: 01/20/2025] [Indexed: 02/05/2025] Open
Abstract
BACKGROUND There is a growing interest in uncovering the factors that shape microbiome composition due to its association with complex phenotypic traits in livestock. Host genetic variation is increasingly recognized as a major factor influencing the microbiome. The Iberian pig breed, known for its high-quality meat products, includes various strains with recognized genetic and phenotypic variability. However, despite the microbiome's known impact on pigs' productive phenotypes such as meat quality traits, comparative analyses of gut microbial composition across Iberian pig strains are lacking. This study aims to explore the gut microbiota of two Iberian pig strains, Entrepelado (n = 74) and Retinto (n = 63), and their reciprocal crosses (n = 100), using machine learning (ML) models to identify key microbial taxa relevant for distinguishing their genetic backgrounds, which holds potential application in the pig industry. Nine ML algorithms, including tree-based, kernel-based, probabilistic, and linear algorithms, were used. RESULTS Beta diversity analysis on 16S rRNA microbiome data revealed compositional divergence among genetic, age, and batch groups. ML models exploring maternal, paternal, and heterosis effects showed varying levels of classification performance, with the paternal effect scenario being the best, achieving a mean area under the ROC curve (AUROC) of 0.74 using the CatBoost (CB) algorithm. However, the most genetically distant animals, the purebreds, were more easily discriminated using the ML models. The classification of the two Iberian strains reached the highest mean AUROC of 0.83 using a support vector machine (SVM) model. The most relevant genera in this classification performance were Acetitomaculum, Butyricicoccus, and Limosilactobacillus, all of which exhibited relevant differential abundance between purebred animals using a Bayesian linear model.
CONCLUSIONS The study confirms variations in gut microbiota among Iberian pig strains and their crosses, influenced by genetic and non-genetic factors. ML models, particularly CB and random forest (RF), as well as SVM in certain scenarios, combined with a feature selection process, effectively classified genetic groups based on microbiome data and identified key microbial taxa. These taxa were linked to short-chain fatty acid production and lipid metabolism, suggesting that microbial composition differences may contribute to variations in fat-related traits among Iberian genetic groups.
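The AUROC values quoted above (0.74 and 0.83) summarize how well a classifier's scores rank one group above the other. As a rough illustration, and not the authors' pipeline, AUROC can be computed as the fraction of positive/negative pairs in which the positive sample scores higher, with ties counted as one half. The function name and the toy labels and scores below are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative.

    labels: iterable of 0/1 class labels; scores: classifier scores (higher = more positive).
    Tied pairs contribute 0.5, matching the Mann-Whitney U formulation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four animals, two from each strain (illustrative only).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.4, 0.1]
print(auroc(toy_labels, toy_scores))
```

Under this reading, an AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.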
Affiliation(s)
- Lamiae Azouggagh
  - Institute for Animal Science and Technology, Universitat Politècnica de Valencia, Valencia, 46022, Spain
- Noelia Ibáñez-Escriche
  - Institute for Animal Science and Technology, Universitat Politècnica de Valencia, Valencia, 46022, Spain
- Marina Martínez-Álvaro
  - Institute for Animal Science and Technology, Universitat Politècnica de Valencia, Valencia, 46022, Spain
- Luis Varona
  - Instituto Agroalimentario de Aragón (IA2), Universidad de Zaragoza, Zaragoza, 50013, Spain
- Joaquim Casellas
  - Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, 08193, Spain
- Cristina Casto-Rebollo
  - Institute for Animal Science and Technology, Universitat Politècnica de Valencia, Valencia, 46022, Spain
|
14
|
Li Y, Tao C, Li S, Chen W, Fu D, Jafvert CT, Zhu T. Feasibility study of machine learning to explore relationships between antimicrobial resistance and microbial community structure in global wastewater treatment plant sludges. BIORESOURCE TECHNOLOGY 2025; 417:131878. [PMID: 39603473 DOI: 10.1016/j.biortech.2024.131878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Revised: 11/19/2024] [Accepted: 11/23/2024] [Indexed: 11/29/2024]
Abstract
Wastewater sludges (WSs) are major reservoirs and emission sources of antibiotic resistance genes (ARGs) in cities. Identifying antimicrobial resistance (AMR) host bacteria in WSs is crucial for understanding AMR formation and mitigating biological and ecological risks. Here, 24 sludge data sets from wastewater treatment plants in Jiangsu Province, China, and 1559 sludge data sets from genetic databases were analyzed to explore the relationship between 7 AMRs and bacterial distribution. The results of the Procrustes and Spearman correlation analyses were unsatisfactory, with p-values exceeding the threshold of 0.05 and no strong correlation (r > 0.8). In contrast, explainable machine learning (EML) using SHapley Additive exPlanation (SHAP) revealed Pseudomonadota as a major contributor (39.3 %-74.2 %) to sludge AMR. Overall, the application of ML is promising for analyzing AMR-bacteria relationships. Because different analysis methods have different strengths and suit different situations, using ML as one of the correlation analysis tools is strongly recommended.
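The abstract above judges Spearman correlations against a strength threshold of r > 0.8. A minimal sketch of that check, assuming average ranks for ties and a Pearson correlation on the ranks, is given below. The function names and the toy vectors are hypothetical, and the significance test (p-value) reported in the abstract is not reproduced here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical ARG abundance vs. relative abundance of one taxon (illustrative only).
arg_abundance = [1, 2, 3, 4, 5]
taxon_abundance = [2, 1, 4, 3, 5]
r = spearman_r(arg_abundance, taxon_abundance)
print(r, "strong" if r > 0.8 else "not strong")
```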
Affiliation(s)
- Yi Li
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Cuicui Tao
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Shuyin Li
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Wenxuan Chen
  - Department of Applied Microbial Ecology, Helmholtz Centre for Environmental Research (UFZ), 04318 Leipzig, Germany; School of Civil Engineering, Southeast University, Nanjing 210096, China
- Dafang Fu
  - School of Civil Engineering, Southeast University, Nanjing 210096, China
- Chad T Jafvert
  - Lyles School of Civil Engineering, and Environmental & Ecological Engineering, Purdue University, West Lafayette, IN 47907, USA
- Tengyi Zhu
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
|
15
|
Yin L, Yang M, Teng A, Ni C, Wang P, Tang S. Unraveling Microplastic Effects on Gut Microbiota across Various Animals Using Machine Learning. ACS NANO 2025; 19:369-380. [PMID: 39723918 DOI: 10.1021/acsnano.4c07885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2024]
Abstract
Microplastics, a rapidly expanding and durable class of pollutants, have been shown to significantly impact gut microbiota across a spectrum of animal species. However, comprehensive analyses comparing microplastic effects on gut microbiota among these species are still limited, and the critical factors driving these effects remain to be clarified. To address these issues, we compiled 1352 gut microbiota samples from six animal categories, employing machine learning to conduct an in-depth meta-analysis. Our study revealed that mice, compared with other animals, not only exhibit a heightened susceptibility to the toxic effects of microplastics (evidenced by decreased gut microbiota diversity, increased Firmicutes/Bacteroidetes ratios, destabilized microbial networks, and disruption in the equilibrium of beneficial and harmful bacteria) but also possess limited potential to degrade microplastics, unlike earthworms and insects. Furthermore, machine learning models confirmed that exposure duration is the key factor driving changes induced by microplastics in gut microbiota. We also identified Lactobacillus, Helicobacter, and Pseudomonas as potential biomarkers for detecting microplastic toxicity in the animal gut. Overall, these findings provide valuable insights into the health risks and driving factors associated with microplastic exposure across multiple animal species.
Affiliation(s)
- Lingzi Yin
  - Bioscience and Biomedical Engineering Thrust, Systems Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong 511453, China
- Minghao Yang
  - Bioscience and Biomedical Engineering Thrust, Systems Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong 511453, China
- Anqi Teng
  - Bioscience and Biomedical Engineering Thrust, Systems Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong 511453, China
- Can Ni
  - Department of Ocean Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR 999077, China
- Pandeng Wang
  - State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-sen University, Guangzhou 510275, China
- Shaojun Tang
  - Bioscience and Biomedical Engineering Thrust, Systems Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong 511453, China
  - Division of Emerging Interdisciplinary Areas, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR 999077, China
|
16
|
Kim S, Lee HC, Sim JE, Park SJ, Oh HH. Bacterial profile-based body fluid identification using a machine learning approach. Genes Genomics 2025; 47:87-98. [PMID: 39503932 DOI: 10.1007/s13258-024-01594-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Accepted: 10/23/2024] [Indexed: 01/16/2025]
Abstract
BACKGROUND Identifying the origins of biological traces is critical for the reconstruction of crime scenes in forensic investigations. Traditional methods for body fluid identification rely on chemical, enzymatic, immunological, and spectroscopic techniques, which can be sample-consuming and depend on simple color-change reactions. However, these methods have limitations when residual samples are insufficient after DNA extraction. OBJECTIVE This study aimed to develop a method for body fluid identification by leveraging bacterial DNA profiling to overcome the limitations of the conventional approaches. METHODS Bacterial profiles were determined by sequencing the hypervariable region of the 16S rRNA gene, using DNA metabarcoding of evidence collected from criminal cases. Amplicon sequence variants (ASVs) were analyzed to identify significant microbial patterns in different body fluid samples. RESULTS The bacterial profile-based method demonstrated high discriminatory power with a machine learning model trained using the naïve Bayes algorithm, achieving an accuracy of over 98% in classifying samples into one of four body fluid types: blood, saliva, vaginal secretion, and mixed traces of vaginal secretion and semen. CONCLUSION Bacterial profiling enhances the accuracy and robustness of body fluid identification in forensic analysis, providing a valuable alternative to traditional methods by utilizing DNA and microbial community data despite uncontrollable conditions. This approach offers significant improvements in classification accuracy and practical applicability in forensic investigations.
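The abstract above names the naïve Bayes algorithm but does not specify the variant or the feature encoding. As a hedged sketch, assuming a multinomial naïve Bayes model with Laplace smoothing over ASV count vectors, the idea can be illustrated as follows. The function names, the three anonymous ASV features, and all counts are hypothetical; only the class labels echo the body fluid types listed in the abstract.

```python
import math
from collections import defaultdict


def train_multinomial_nb(samples, labels, alpha=1.0):
    """Fit class log-priors and Laplace-smoothed per-feature log-likelihoods."""
    n_features = len(samples[0])
    class_counts = defaultdict(int)
    feature_totals = defaultdict(lambda: [0.0] * n_features)
    for x, y in zip(samples, labels):
        class_counts[y] += 1
        for j, c in enumerate(x):
            feature_totals[y][j] += c
    model = {}
    for y, totals in feature_totals.items():
        denom = sum(totals) + alpha * n_features
        log_prior = math.log(class_counts[y] / len(labels))
        log_like = [math.log((t + alpha) / denom) for t in totals]
        model[y] = (log_prior, log_like)
    return model


def predict(model, x):
    """Return the class with the highest posterior log-score for count vector x."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(c * ll for c, ll in zip(x, log_like))
    return max(model, key=score)


# Hypothetical read counts for three anonymous ASVs (illustrative only).
train_x = [[90, 5, 5], [80, 10, 10], [5, 90, 5], [10, 85, 5], [5, 5, 90], [10, 10, 80]]
train_y = ["saliva", "saliva", "vaginal secretion", "vaginal secretion", "blood", "blood"]

model = train_multinomial_nb(train_x, train_y)
print(predict(model, [70, 20, 10]))
```

The smoothing term `alpha` keeps an ASV that was never observed in a class from forcing that class's probability to zero, which matters for sparse amplicon data.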
Affiliation(s)
- Sungmin Kim
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Han Chul Lee
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Jeong Eun Sim
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Su Jeong Park
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
- Hye Hyun Oh
  - Forensic Genetics and Chemistry Division, Supreme Prosecutors' Office, 157 Banpo daero, Seocho gu, Seoul, 06590, Republic of Korea
|
17
|
Hosseiniyan Khatibi SM, Dimaano NG, Veliz E, Sundaresan V, Ali J. Exploring and exploiting the rice phytobiome to tackle climate change challenges. PLANT COMMUNICATIONS 2024; 5:101078. [PMID: 39233440 PMCID: PMC11671768 DOI: 10.1016/j.xplc.2024.101078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 08/07/2024] [Accepted: 09/02/2024] [Indexed: 09/06/2024]
Abstract
The future of agriculture is uncertain under the current climate change scenario. Climate change directly and indirectly affects the biotic and abiotic elements that control agroecosystems, jeopardizing the safety of the world's food supply. A new area that focuses on characterizing the phytobiome is emerging. The phytobiome comprises plants and their immediate surroundings, involving numerous interdependent microscopic and macroscopic organisms that affect the health and productivity of plants. Phytobiome studies primarily focus on the microbial communities associated with plants, which are referred to as the plant microbiome. The development of high-throughput sequencing technologies over the past 10 years has dramatically advanced our understanding of the structure, functionality, and dynamics of the phytobiome; however, comprehensive methods for using this knowledge are lacking, particularly for major crops such as rice. Considering the impact of rice production on world food security, gaining fresh perspectives on the interdependent and interrelated components of the rice phytobiome could enhance rice production and crop health, sustain rice ecosystem function, and combat the effects of climate change. Our review re-conceptualizes the complex dynamics of the microscopic and macroscopic components in the rice phytobiome as influenced by human interventions and changing environmental conditions driven by climate change. We also discuss interdisciplinary and systematic approaches to decipher and reprogram the sophisticated interactions in the rice phytobiome using novel strategies and cutting-edge technology. Merging the gigantic datasets and complex information on the rice phytobiome and their application in the context of regenerative agriculture could lead to sustainable rice farming practices that are resilient to the impacts of climate change.
Affiliation(s)
- Niña Gracel Dimaano
  - International Rice Research Institute, Los Baños, Laguna, Philippines; College of Agriculture and Food Science, University of the Philippines Los Baños, Los Baños, Laguna, Philippines
- Esteban Veliz
  - College of Biological Sciences, University of California, Davis, Davis, CA, USA
- Venkatesan Sundaresan
  - College of Biological Sciences, University of California, Davis, Davis, CA, USA; College of Agricultural and Environmental Sciences, University of California, Davis, Davis, CA, USA
- Jauhar Ali
  - International Rice Research Institute, Los Baños, Laguna, Philippines
|
18
|
Rugji J, Erol Z, Taşçı F, Musa L, Hamadani A, Gündemir MG, Karalliu E, Siddiqui SA. Utilization of AI - reshaping the future of food safety, agriculture and food security - a critical review. Crit Rev Food Sci Nutr 2024:1-45. [PMID: 39644464 DOI: 10.1080/10408398.2024.2430749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/09/2024]
Abstract
Artificial intelligence is an emerging technology which harbors a suite of mechanisms that have the potential to be leveraged for reaping value across multiple domains. Lately, there is an increased interest in embracing applications associated with Artificial Intelligence to positively contribute to food safety. These applications such as machine learning, computer vision, predictive analytics algorithms, sensor networks, robotic inspection systems, and supply chain optimization tools have been established to contribute to several domains of food safety such as early warning of outbreaks, risk prediction, detection and identification of food associated pathogens. Simultaneously, the ambition toward establishing a sustainable food system has motivated the adoption of cutting-edge technologies such as Artificial Intelligence to strengthen food security. Given the myriad challenges confronting stakeholders in their endeavors to safeguard food security, Artificial Intelligence emerges as a promising tool capable of crafting holistic management strategies for food security. This entails maximizing crop yields, mitigating losses, and trimming operational expenses. AI models present notable benefits in efficiency, precision, uniformity, automation, pattern identification, accessibility, and scalability for food security endeavors. The escalation in the global trend for adopting alternative protein sources such as edible insects and microalgae as a sustainable food source reflects a growing recognition of the need for sustainable and resilient food systems to address the challenges of population growth, environmental degradation, and food insecurity. Artificial Intelligence offers a range of capabilities to enhance food safety in the production and consumption of alternative proteins like microalgae and edible insects, contributing to a sustainable and secure food system.
Affiliation(s)
- Jerina Rugji
  - Department of Food Hygiene and Technology, Burdur Mehmet Akif Ersoy University, Burdur, Turkey
  - Department of Food Science, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Zeki Erol
  - Department of Food Hygiene and Technology, Necmettin Erbakan University, Ereğli, Konya, Turkey
- Fulya Taşçı
  - Department of Food Hygiene and Technology, Burdur Mehmet Akif Ersoy University, Burdur, Turkey
- Laura Musa
  - Department of Veterinary Medicine and Animal Sciences, University of Milan, Milan, Italy
- Ambreen Hamadani
  - Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Esa Karalliu
  - Department of Infectious Diseases and Public Health, City University of Hong Kong, Hong Kong
|
19
|
Liu R, Zou Z, Zhang Z, He H, Xi M, Liang Y, Ye J, Dai Q, Wu Y, Tan H, Zhong W, Wang Z, Liang Y. Evaluation of glucocorticoid-related genes reveals GPD1 as a therapeutic target and regulator of sphingosine 1-phosphate metabolism in CRPC. Cancer Lett 2024; 605:217286. [PMID: 39413958 DOI: 10.1016/j.canlet.2024.217286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 09/08/2024] [Accepted: 10/03/2024] [Indexed: 10/18/2024]
Abstract
Prostate cancer (PCa) is an androgen-dependent disease, with castration-resistant prostate cancer (CRPC) being an advanced stage that no longer responds to androgen deprivation therapy (ADT). Mounting evidence suggests that glucocorticoid receptors (GR) confer resistance to ADT in CRPC patients by bypassing androgen receptor (AR) blockade. GR, as a novel therapeutic target in CRPC, has attracted substantial attention worldwide. This study utilized bioinformatic analysis of publicly available CRPC single-cell data to develop a consensus glucocorticoid-related signature (Glu-sig) that can serve as an independent predictor for relapse-free survival. Our results revealed that the signature demonstrated consistent and robust performance across seven publicly accessible datasets and an internal cohort. Furthermore, our findings demonstrated that glycerol-3-phosphate dehydrogenase 1 (GPD1) in Glu-sig can significantly promote CRPC progression by mediating the cell cycle pathway. Additionally, GPD1 was shown to be regulated by GR, with the GR antagonist mifepristone enhancing the anti-tumorigenic effects of GPD1 in CRPC cells. Mechanistically, targeting GPD1 induced the production of sphingosine 1-phosphate (S1P) and enhanced histone acetylation, thereby inducing the transcription of p21, which is involved in cell cycle regulation. In conclusion, Glu-sig could serve as a robust and promising tool to improve the clinical outcomes of PCa patients, and modulating the GR/GPD1 axis that promotes tumor growth may be a promising approach for delaying CRPC progression.
Affiliation(s)
- Ren Liu
  - Department of Urology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Zhihao Zou
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China; Guangzhou Laboratory, Guangzhou, China
- Zhengrong Zhang
  - Department of Urology, Zhuhai Hospital Affiliated with Jinan University, Zhuhai, China
- Huichan He
  - State Key Laboratory of Respiratory Disease, Guangzhou Medical University, Guangzhou, China
- Ming Xi
  - Department of Urology, Huadu District People's Hospital, Southern Medical University, Guangzhou, China
- Yingke Liang
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Jianheng Ye
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Qishan Dai
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Yongding Wu
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Huijing Tan
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Weide Zhong
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China; Guangzhou Laboratory, Guangzhou, China; Macau Institute of Systems Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau, China
- Zongren Wang
  - Department of Urology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Yuxiang Liang
  - Department of Urology, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
|
20
|
Han J, Zhang H, Ning K. Techniques for learning and transferring knowledge for microbiome-based classification and prediction: review and assessment. Brief Bioinform 2024; 26:bbaf015. [PMID: 39820436 PMCID: PMC11737891 DOI: 10.1093/bib/bbaf015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2024] [Revised: 12/10/2024] [Accepted: 01/06/2025] [Indexed: 01/19/2025] Open
Abstract
The volume of microbiome data is growing at an exponential rate, and the current methodologies for big data mining are encountering substantial obstacles. Effectively managing and extracting valuable insights from these vast microbiome datasets has emerged as a significant challenge in the field of contemporary microbiome research. This comprehensive review delves into the utilization of foundation models and transfer learning techniques within the context of microbiome-based classification and prediction tasks, advocating for a transition away from traditional task-specific or scenario-specific models towards more adaptable, continuous learning models. The article underscores the practicality and benefits of initially constructing a robust foundation model, which can then be fine-tuned using transfer learning to tackle specific context tasks. In real-world scenarios, the application of transfer learning empowers models to leverage disease-related data from one geographical area and enhance diagnostic precision in different regions. This transition from relying on "good models" to embracing "adaptive models" resonates with the philosophy of "teaching a man to fish" thereby paving the way for advancements in personalized medicine and accurate diagnosis. Empirical research suggests that the integration of foundation models with transfer learning methodologies substantially boosts the performance of models when dealing with large-scale and diverse microbiome datasets, effectively mitigating the challenges posed by data heterogeneity.
Collapse
Affiliation(s)
- Jin Han
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of AI Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Luoyu Road 1037, Wuhan 430074, Hubei, China
| | - Haohong Zhang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of AI Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Luoyu Road 1037, Wuhan 430074, Hubei, China
- Kang Ning
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center of AI Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Luoyu Road 1037, Wuhan 430074, Hubei, China
21
Wang L, Lu W, Song Y, Liu S, Fu YV. Using machine learning to identify environmental factors that collectively determine microbial community structure of activated sludge. Environmental Research 2024; 260:119635. [PMID: 39025351 DOI: 10.1016/j.envres.2024.119635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/12/2024] [Accepted: 07/15/2024] [Indexed: 07/20/2024]
Abstract
Activated sludge (AS) microbial communities are influenced by various environmental variables. However, a comprehensive analysis of how these variables jointly and nonlinearly shape the AS microbial community remains challenging. In this study, we employed advanced machine learning techniques to elucidate the collective effects of environmental variables on the structure and function of AS microbial communities. Applying Dirichlet multinomial mixtures analysis to 311 global AS samples, we identified four distinct microbial community types (AS-types), each characterized by unique microbial compositions and metabolic profiles. We used 14 classical linear and nonlinear machine learning methods to select a baseline model. The extremely randomized trees demonstrated optimal performance in learning the relationship between environmental factors and AS types (with an accuracy of 71.43%). Feature selection identified critical environmental factors and their importance rankings, including latitude (Lat), longitude (Long), precipitation during sampling (Precip), solids retention time (SRT), effluent total nitrogen (Effluent TN), average temperature during sampling month (Avg Temp), mixed liquor temperature (Mixed Temp), influent biochemical oxygen demand (Influent BOD), and annual precipitation (Annual Precip). Significantly, Lat, Long, Precip, Avg Temp, and Annual Precip, influenced metabolic variations among AS types. These findings emphasize the pivotal role of environmental variables in shaping microbial community structures and enhancing metabolic pathways within activated sludge. Our study encourages the application of machine learning techniques to design artificial activated sludge microbial communities for specific environmental purposes.
Affiliation(s)
- Lu Wang
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
- Weilai Lu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China
- Yang Song
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China
- Shuangjiang Liu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China
- Yu Vincent Fu
- State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, 100101, China; Savaid Medical School, University of Chinese Academy of Sciences, Beijing, 100049, China.
22
Oh S, Byeon H, Wijaya J. Machine learning surveillance of foodborne infectious diseases using wastewater microbiome, crowdsourced, and environmental data. Water Research 2024; 265:122282. [PMID: 39178596 DOI: 10.1016/j.watres.2024.122282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 08/14/2024] [Accepted: 08/15/2024] [Indexed: 08/26/2024]
Abstract
Clostridium perfringens (CP) is a common cause of foodborne infection, leading to significant human health risks and a high economic burden. Thus, effective CP disease surveillance is essential for preventive and therapeutic interventions; however, conventional practices often entail complex, resource-intensive, and costly procedures. This study introduced a data-driven machine learning (ML) modeling framework for CP-related disease surveillance. It leveraged an integrated dataset of municipal wastewater microbiome (e.g., CP abundance), crowdsourced (CP-related web search keywords), and environmental data. Various optimization strategies, including data integration, data normalization, model selection, and hyperparameter tuning, were implemented to improve the ML modeling performance, leading to enhanced predictions of CP cases over time. Explainable artificial intelligence methods identified CP abundance as the most reliable predictor of CP disease cases. Multi-omics subsequently revealed the presence of CP and its genotypes/toxinotypes in wastewater, validating the utility of microbiome-data-enabled ML surveillance for foodborne diseases. This ML-based framework thus exhibits significant potential for complementing and reinforcing existing disease surveillance systems.
Affiliation(s)
- Seungdae Oh
- Department of Civil Engineering, College of Engineering, Kyung Hee University, Yongin, Republic of Korea.
- Haeil Byeon
- Department of Civil Engineering, College of Engineering, Kyung Hee University, Yongin, Republic of Korea
- Jonathan Wijaya
- Department of Civil Engineering, College of Engineering, Kyung Hee University, Yongin, Republic of Korea
23
Dos Santos GR, Fagnani E. Statistical assessment for reusing permeate from dewatering water treatment sludge process: more sustainability in environmental sanitation. Environmental Monitoring and Assessment 2024; 196:1066. [PMID: 39419907 DOI: 10.1007/s10661-024-13257-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Accepted: 10/10/2024] [Indexed: 10/19/2024]
Abstract
Sludge from water treatment plants (WTPs) is usually processed by physicochemical clarification followed by thickening, which results in the production of an effluent from dewatering/drying sludge processes that can potentially impact the environment. This paper assessed the viability of employing sludge dewatering water from a water treatment sludge plant (WTSP) in São Paulo State, Brazil, for reuse purposes. Water quality variables were monitored in the effluent and receiving stream. The data were analyzed by paired samples Student t-test (parametric significance test), paired samples Wilcoxon signed rank test (non-parametric significance test), and principal component analysis (multivariate analysis). Although the data were typically not Gaussian, the Student and Wilcoxon methods agreed in 9 out of 10 studied parameters regarding the influence of WTSP discharge on the waterbody; only for total manganese did the Wilcoxon approach provide a better fit than the Student test. Principal component analysis helped to reveal correlations among all variables. Results provided useful information for understanding the suitability of WTSP effluent for reuse. For direct non-potable reuse, recirculating the final effluent back to the WTP for two months saved 92,000 m³ of water, the volume of drinking water demanded by the city (n = 292,000 inhabitants) in approximately 30 h.
Affiliation(s)
- Giovana R Dos Santos
- Research Group for Optimization of Analytical Technologies Applied to Environmental and Sanitary Samples (GOTAS), School of Technology, Universidade Estadual de Campinas, Limeira, SP, 13484-332, Brazil
- Enelton Fagnani
- Research Group for Optimization of Analytical Technologies Applied to Environmental and Sanitary Samples (GOTAS), School of Technology, Universidade Estadual de Campinas, Limeira, SP, 13484-332, Brazil.
24
Bobbo T, Biscarini F, Yaddehige SK, Alberghini L, Rigoni D, Bianchi N, Taccioli C. Machine learning classification of archaea and bacteria identifies novel predictive genomic features. BMC Genomics 2024; 25:955. [PMID: 39402493 PMCID: PMC11472548 DOI: 10.1186/s12864-024-10832-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 09/24/2024] [Indexed: 10/19/2024] Open
Abstract
BACKGROUND: Archaea and Bacteria are distinct domains of life that are adapted to a variety of ecological niches. Several genome-based methods have been developed for their accurate classification, yet many aspects of the specific genomic features that determine these differences are not fully understood. In this study, we used publicly available whole-genome sequences from bacteria (N = 2546) and archaea (N = 109). From these, a set of genomic features (nucleotide frequencies and proportions, coding sequences (CDS), non-coding, ribosomal and transfer RNA genes (ncRNA, rRNA, tRNA), Chargaff's, topological entropy and Shannon's entropy scores) was extracted and used as input data to develop machine learning models for the classification of archaea and bacteria. RESULTS: The classification accuracy ranged from 0.993 (Random Forest) to 0.998 (Neural Networks). Over the four models, only 11 examples were misclassified, especially those belonging to the minority class (Archaea). From variable importance, tRNA topological and Shannon's entropy, nucleotide frequencies in tRNA, rRNA and ncRNA, CDS, tRNA and rRNA Chargaff's scores have emerged as the top discriminating factors. In particular, tRNA entropy (both topological and Shannon's) was the most important genomic feature for classification, pointing at the complex interactions between the genetic code, tRNAs and the translational machinery. CONCLUSIONS: tRNA, rRNA and ncRNA genes emerged as the key genomic elements that underpin the classification of archaea and bacteria. In particular, higher nucleotide diversity was found in tRNA from bacteria compared to archaea. The analysis of the few classification errors reflects the complex phylogenetic relationships between bacteria, archaea and eukaryotes.
Affiliation(s)
- Tania Bobbo
- Institute for Biomedical Technologies, National Research Council (CNR), Via Fratelli Cervi 93, Segrate (MI), 20054, Italy
- Filippo Biscarini
- Institute of Agricultural Biology and Biotechnology, National Research Council (CNR), Via Edoardo Bassini 15, Milano, 20133, Italy.
- Sachithra K Yaddehige
- Department of Animal Medicine, Health and Production, University of Padova, Viale dell'Universitá 16, Legnaro, 35020, Italy
- Leonardo Alberghini
- Department of Animal Medicine, Health and Production, University of Padova, Viale dell'Universitá 16, Legnaro, 35020, Italy
- Davide Rigoni
- Department of Pharmaceutical and Pharmacological Sciences, University of Padova, Via Francesco Marzolo 5, Padova, 35131, Italy
- Nicoletta Bianchi
- Department of Translational Medicine, University of Ferrara, Via Luigi Borsari 46, Ferrara, 44121, Italy.
- Cristian Taccioli
- Department of Animal Medicine, Health and Production, University of Padova, Viale dell'Universitá 16, Legnaro, 35020, Italy.
25
Mason AR, McKee-Zech HS, Steadman DW, DeBruyn JM. Environmental predictors impact microbial-based postmortem interval (PMI) estimation models within human decomposition soils. PLoS One 2024; 19:e0311906. [PMID: 39392823 PMCID: PMC11469530 DOI: 10.1371/journal.pone.0311906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 09/13/2024] [Indexed: 10/13/2024] Open
Abstract
Microbial succession has been suggested to supplement established postmortem interval (PMI) estimation methods for human remains. Due to limitations of entomological and morphological PMI methods, microbes are an intriguing target for forensic applications as they are present at all stages of decomposition. Previous machine learning models from soil necrobiome data have produced PMI error rates from two and a half to six days; however, these models are built solely on amplicon sequencing of biomarkers (e.g., 16S, 18S rRNA genes) and do not consider environmental factors that influence the presence and abundance of microbial decomposers. This study builds upon current research by evaluating the inclusion of environmental data on microbial-based PMI estimates from decomposition soil samples. Random forest regression models were built to predict PMI using relative taxon abundances obtained from different biological markers (bacterial 16S, fungal ITS, 16S-ITS combined) and taxonomic levels (phylum, class, order, OTU), both with and without environmental predictors (ambient temperature, soil pH, soil conductivity, and enzyme activities) from 19 deceased human individuals that decomposed on the soil surface (Tennessee, USA). Model performance was evaluated by calculating the mean absolute error (MAE). MAE ranged from 804 to 997 accumulated degree hours (ADH) across all models. 16S models outperformed ITS models (p = 0.006), while combining 16S and ITS did not improve upon 16S models alone (p = 0.47). Inclusion of environmental data in PMI prediction models had varied effects on MAE depending on the biological marker and taxonomic level conserved. Specifically, inclusion of the measured environmental features reduced MAE for all ITS models, but improved 16S models at higher taxonomic levels (phylum and class). 
Overall, we demonstrated some level of predictability in soil microbial succession during human decomposition; however, error rates were high when considering a moderate population of donors.
Affiliation(s)
- Allison R. Mason
- Department of Microbiology, University of Tennessee-Knoxville, Knoxville, TN, United States of America
- Hayden S. McKee-Zech
- Department of Anthropology, University of Tennessee-Knoxville, Knoxville, TN, United States of America
- Dawnie W. Steadman
- Department of Anthropology, University of Tennessee-Knoxville, Knoxville, TN, United States of America
- Jennifer M. DeBruyn
- Department of Microbiology, University of Tennessee-Knoxville, Knoxville, TN, United States of America
- Department of Biosystems Engineering and Soil Science, University of Tennessee-Knoxville, Knoxville, TN, United States of America
26
MacGregor H, Fukai I, Ash K, Arkin AP, Hazen TC. Potential applications of microbial genomics in nuclear non-proliferation. Front Microbiol 2024; 15:1410820. [PMID: 39360321 PMCID: PMC11445143 DOI: 10.3389/fmicb.2024.1410820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Accepted: 09/04/2024] [Indexed: 10/04/2024] Open
Abstract
As nuclear technology evolves in response to increased demand for diversification and decarbonization of the energy sector, new and innovative approaches are needed to effectively identify and deter the proliferation of nuclear arms, while ensuring safe development of global nuclear energy resources. Preventing the use of nuclear material and technology for unsanctioned development of nuclear weapons has been a long-standing challenge for the International Atomic Energy Agency and signatories of the Treaty on the Non-Proliferation of Nuclear Weapons. Environmental swipe sampling has proven to be an effective technique for characterizing clandestine proliferation activities within and around known locations of nuclear facilities and sites. However, limited tools and techniques exist for detecting nuclear proliferation in unknown locations beyond the boundaries of declared nuclear fuel cycle facilities, representing a critical gap in non-proliferation safeguards. Microbiomes, defined as "characteristic communities of microorganisms" found in specific habitats with distinct physical and chemical properties, can provide valuable information about the conditions and activities occurring in the surrounding environment. Microorganisms are known to inhabit radionuclide-contaminated sites, spent nuclear fuel storage pools, and cooling systems of water-cooled nuclear reactors, where they can cause radionuclide migration and corrosion of critical structures. Microbial transformation of radionuclides is a well-established process that has been documented in numerous field and laboratory studies. These studies helped to identify key bacterial taxa and microbially-mediated processes that directly and indirectly control the transformation, mobility, and fate of radionuclides in the environment. 
Expanding on this work, other studies have used microbial genomics integrated with machine learning models to successfully monitor and predict the occurrence of heavy metals, radionuclides, and other process wastes in the environment, indicating the potential role of nuclear activities in shaping microbial community structure and function. Results of this previous body of work suggest fundamental geochemical-microbial interactions occurring at nuclear fuel cycle facilities could give rise to microbiomes that are characteristic of nuclear activities. These microbiomes could provide valuable information for monitoring nuclear fuel cycle facilities, planning environmental sampling campaigns, and developing biosensor technology for the detection of undisclosed fuel cycle activities and proliferation concerns.
Affiliation(s)
- Isis Fukai
- Bredesen Center, University of Tennessee, Knoxville, TN, United States
- Kurt Ash
- Department of Civil and Environmental Engineering, University of Tennessee, Knoxville, TN, United States
- Adam Paul Arkin
- University of California, Berkeley, Berkeley, CA, United States
- Terry C. Hazen
- Bredesen Center, University of Tennessee, Knoxville, TN, United States
- Department of Civil and Environmental Engineering, University of Tennessee, Knoxville, TN, United States
- Department of Microbiology, University of Tennessee, Knoxville, TN, United States
- Department of Earth and Planetary Sciences, University of Tennessee, Knoxville, TN, United States
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, United States
27
Ligda P, Mittas N, Kyzas GZ, Claerebout E, Sotiraki S. Machine learning and explainable artificial intelligence for the prevention of waterborne cryptosporidiosis and giardiosis. Water Research 2024; 262:122110. [PMID: 39042970 DOI: 10.1016/j.watres.2024.122110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 06/21/2024] [Accepted: 07/15/2024] [Indexed: 07/25/2024]
Abstract
Cryptosporidium and Giardia are important parasitic protozoa due to their zoonotic potential and impact on human health, and have often caused waterborne outbreaks of disease. Detection of (oo)cysts in water matrices is challenging and extremely costly, thus only a few countries have legislated for regular monitoring of drinking water for their presence. Several attempts have been made to investigate the association between the presence of such (oo)cysts in waters and other biotic or abiotic factors, with inconclusive findings. In this regard, the aim of this study was the development of a holistic approach leveraging Machine Learning (ML) and eXplainable Artificial Intelligence (XAI) techniques, in order to provide empirical evidence related to the presence and prediction of Cryptosporidium oocysts and Giardia cysts in water samples. To meet this objective, we initially modelled the complex relationship between Cryptosporidium and Giardia (oo)cysts and a set of parasitological, microbiological, physicochemical and meteorological parameters via a model-agnostic meta-learner algorithm that provides flexibility regarding the selection of the ML model executing the fitting task. Based on this generic approach, a set of four well-known ML candidates were empirically evaluated in terms of their predictive capabilities. Then, the best-performing algorithms were further examined through XAI techniques for gaining meaningful insights related to the explainability and interpretability of the derived solutions. The findings reveal that the Random Forest achieves the highest prediction performance when the objective is the prediction of both contamination and contamination intensity with Cryptosporidium oocysts in a given water sample, with meteorological/physicochemical and microbiological markers being informative, respectively. 
For the prediction of contamination with Giardia, the eXtreme Gradient Boosting with physicochemical parameters was the most efficient algorithm, while the Support Vector Regression that takes into consideration both microbiological and meteorological markers was more efficient for evaluating the contamination intensity with cysts. The results of the study indicate that ML and XAI approaches can be a valuable tool for unveiling the complicated correlation of the presence and contamination intensity with these zoonotic parasites, which could in turn constitute a basis for the development of monitoring platforms and early warning systems for the prevention of waterborne disease outbreaks.
Affiliation(s)
- Panagiota Ligda
- Laboratory of Parasitology, Veterinary Research Institute, Hellenic Agricultural Organization - DIMITRA, Thermi, Thessaloniki 57001, Greece.
- Nikolaos Mittas
- Hephaestus Laboratory, School of Chemistry, Faculty of Sciences, Democritus University of Thrace, Kavala GR-65404, Greece
- George Z Kyzas
- Hephaestus Laboratory, School of Chemistry, Faculty of Sciences, Democritus University of Thrace, Kavala GR-65404, Greece
- Edwin Claerebout
- Laboratory of Parasitology, Faculty of Veterinary Medicine, Ghent University, Salisburylaan 133, Merelbeke B-9820, Belgium
- Smaragda Sotiraki
- Laboratory of Parasitology, Veterinary Research Institute, Hellenic Agricultural Organization - DIMITRA, Thermi, Thessaloniki 57001, Greece
28
Liang Y, Khanthaphixay B, Reynolds J, Leigh PJ, Lim ML, Yoon JY. A smartphone-based approach for comprehensive soil microbiome profiling. Applied Physics Reviews 2024; 11:031412. [PMID: 39221035 PMCID: PMC11307194 DOI: 10.1063/5.0174176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 07/09/2024] [Indexed: 09/04/2024]
Abstract
The soil microbiome is crucial for nutrient cycling, health, and plant growth. This study presents a smartphone-based approach as a low-cost and portable alternative to traditional methods for classifying bacterial species and characterizing microbial communities in soil samples. By harnessing bacterial autofluorescence detection and machine learning algorithms, the platform achieved an average accuracy of 88% in distinguishing common soil-related bacterial species despite the lack of biomarkers, nucleic acid amplification, or gene sequencing. Furthermore, it successfully identified dominant species within various bacterial mixtures with an accuracy of 76% and three-level soil health identification at an accuracy of 80%-82%, providing insights into microbial community dynamics. The influence of other soil conditions (pH and moisture) was relatively minor, showcasing the platform's robustness. Various field soil samples were also tested with this platform at 80% accuracy compared with the laboratory analyses, demonstrating the practicality and usability of this approach for on-site soil analysis. This study highlights the potential of the smartphone-based system as a valuable tool for soil assessment, microbial monitoring, and environmental management.
Affiliation(s)
- Yan Liang
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, Arizona 85721, USA
- Bradley Khanthaphixay
- Department of Biomedical Engineering, The University of Arizona, Tucson, Arizona 85721, USA
- Jocelyn Reynolds
- Department of Biomedical Engineering, The University of Arizona, Tucson, Arizona 85721, USA
- Preston J. Leigh
- Department of Biomedical Engineering, The University of Arizona, Tucson, Arizona 85721, USA
- Melissa L. Lim
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, Arizona 85721, USA
29
Huang L, Huang H, Liang X, Su Q, Ye L, Zhai C, Huang E, Pang J, Zhong X, Shi M, Chen L. Skin locations inference and body fluid identification from skin microbial patterns for forensic applications. Forensic Sci Int 2024; 362:112152. [PMID: 39067177 DOI: 10.1016/j.forsciint.2024.112152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 03/15/2024] [Accepted: 07/15/2024] [Indexed: 07/30/2024]
Abstract
Given that microbiological analysis can be an alternative method that overcomes the shortcomings of traditional forensic technology, and skin samples may be the most common source of cases, the analysis of skin microbiome was investigated in this study. High-throughput sequencing targeting the V3-V4 region of the 16S rRNA gene was performed to reveal the skin microbiome of healthy individuals in Guangdong Han. The bacterial diversity of the palm, navel, groin and plantar of the same individual was analyzed. The overall classification based on 16S rRNA gene amplicons revealed that the microbial composition of skin samples differed between anatomical parts, and the dominant bacterial genera of the navel, plantar, groin and palm skin were Cutibacterium, Staphylococcus, Corynebacterium and Staphylococcus, respectively. PCoA analysis showed that the skin at these four anatomical locations could only be grouped into three clusters. A predictive model based on random forest algorithm showed the potential to accurately distinguish these four anatomical locations, which indicated that specific bacteria with low abundance were the key taxa. In addition, the skin microbiome in this study is significantly different from the dominant microbiome in saliva and vaginal secretions identified in our previous study, and can be distinguished from these two tissue fluids. In conclusion, the present findings on the community and microbial structure details of the human skin may reveal its potential application value in assessing the location of skin samples and the type of body fluids in forensic medicine.
Affiliation(s)
- Litao Huang
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Hongyan Huang
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Xiaomin Liang
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Qin Su
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Linying Ye
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Chuangyan Zhai
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Enping Huang
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Junjie Pang
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- XingYu Zhong
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China
- Meisen Shi
- Criminal Justice College of China University of Political Science and Law, Beijing 100088, China.
- Ling Chen
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, Guangdong 510515, China.
30
Raajaraam L, Raman K. Modeling Microbial Communities: Perspective and Challenges. ACS Synth Biol 2024; 13:2260-2270. [PMID: 39148432 DOI: 10.1021/acssynbio.4c00116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Microbial communities are immensely important due to their widespread presence and profound impact on various facets of life. Understanding these complex systems necessitates mathematical modeling, a powerful tool for simulating and predicting microbial community behavior. This review offers a critical analysis of metabolic modeling and highlights key areas that would greatly benefit from broader discussion and collaboration. Moreover, we explore the challenges and opportunities linked to the intricate nature of these communities, spanning data generation, modeling, and validation. We are confident that ongoing advancements in modeling techniques, such as machine learning, coupled with interdisciplinary collaborations, will unlock the full potential of microbial communities across diverse applications.
Affiliation(s)
- Lavanya Raajaraam
- Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
- Centre for Integrative Biology and Systems mEdicine, IIT Madras, Chennai 600 036, India
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Karthik Raman
- Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
- Centre for Integrative Biology and Systems mEdicine, IIT Madras, Chennai 600 036, India
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Department of Data Science and AI, Wadhwani School of Data Science and Artificial Intelligence, IIT Madras, Chennai 600 036, India
31
Yu D, Andersson-Li M, Maes S, Andersson-Li L, Neumann NF, Odlare M, Jonsson A. Development of a logic regression-based approach for the discovery of host- and niche-informative biomarkers in Escherichia coli and their application for microbial source tracking. Appl Environ Microbiol 2024; 90:e0022724. [PMID: 38940567 PMCID: PMC11267920 DOI: 10.1128/aem.00227-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 06/07/2024] [Indexed: 06/29/2024] Open
Abstract
Microbial source tracking leverages a wide range of approaches designed to trace the origins of fecal contamination in aquatic environments. Although source tracking methods are typically employed within the laboratory setting, computational techniques can be leveraged to advance microbial source tracking methodology. Herein, we present a logic regression-based supervised learning approach for the discovery of source-informative genetic markers within intergenic regions across the Escherichia coli genome that can be used for source tracking. With just single intergenic loci, logic regression was able to identify highly source-specific (i.e., exceeding 97.00%) biomarkers for a wide range of host and niche sources, with sensitivities reaching as high as 30.00%-50.00% for certain source categories, including pig, sheep, mouse, and wastewater, depending on the specific intergenic locus analyzed. Restricting the source range to reflect the most prominent zoonotic sources of E. coli transmission (i.e., bovine, chicken, human, and pig) allowed for the generation of informative biomarkers for all host categories, with specificities of at least 90.00% and sensitivities between 12.50% and 70.00%, using the sequence data from key intergenic regions, including emrKY-evgAS, ibsB-(mdtABCD-baeSR), ompC-rcsDB, and yedS-yedR, that appear to be involved in antibiotic resistance. Remarkably, we were able to use this approach to classify 48 out of 113 river water E. coli isolates collected in Northwestern Sweden as either beaver, human, or reindeer in origin with a high degree of consensus, thus highlighting the potential of logic regression modeling as a novel approach for augmenting current source tracking efforts.
IMPORTANCE: The presence of microbial contaminants, particularly from fecal sources, within water poses a serious risk to public health. The health and economic burden of waterborne pathogens can be substantial; as such, the ability to detect and identify the sources of fecal contamination in environmental waters is crucial for the control of waterborne diseases. This can be accomplished through microbial source tracking, which involves the use of various laboratory techniques to trace the origins of microbial pollution in the environment. Building on current source tracking methodology, we describe a novel workflow that uses logic regression, a supervised machine learning method, to discover genetic markers in Escherichia coli, a common fecal indicator bacterium, that can be used for source tracking efforts. Importantly, our research provides an example of how increasingly prominent machine learning algorithms can be applied to improve upon current microbial source tracking methodology.
Affiliation(s)
- Daniel Yu
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
- Sharon Maes
  - Department of Natural Sciences, Design and Sustainable Development, Mid Sweden University, Östersund, Sweden
- Lili Andersson-Li
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Norman F. Neumann
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
- Monica Odlare
  - Department of Natural Sciences, Design and Sustainable Development, Mid Sweden University, Östersund, Sweden
- Anders Jonsson
  - Department of Natural Sciences, Design and Sustainable Development, Mid Sweden University, Östersund, Sweden
|
32
|
Cheng T, Zhang T, Zhang P, He X, Sadiq FA, Li J, Sang Y, Gao J. The complex world of kefir: Structural insights and symbiotic relationships. Compr Rev Food Sci Food Saf 2024; 23:e13364. [PMID: 38847746 DOI: 10.1111/1541-4337.13364] [Received: 12/30/2023] [Revised: 04/04/2024] [Accepted: 05/21/2024] [Indexed: 06/13/2024]
Abstract
Kefir milk, known for its high nutritional value and health benefits, is traditionally produced by fermenting milk with kefir grains. These grains are a complex symbiotic community of lactic acid bacteria, acetic acid bacteria, yeasts, and other microorganisms. However, the intricate coexistence mechanisms within these microbial colonies remain a mystery, posing challenges in predicting their biological and functional traits. This uncertainty often leads to variability in kefir milk's quality and safety. This review delves into the unique structural characteristics of kefir grains, particularly their distinctive hollow structure. We propose hypotheses on their formation, which appears to be influenced by the aggregation behaviors of the community members and their alliances. In kefir milk, a systematic colonization process is driven by metabolite release, orchestrating the spatiotemporal rearrangement of ecological niches. We place special emphasis on the dynamic spatiotemporal changes within the kefir microbial community. Spatially, we observe variations in species morphology and distribution across different locations within the grain structure. Temporally, the review highlights the succession patterns of the microbial community, shedding light on their evolving interactions. Furthermore, we explore the ecological mechanisms underpinning the formation of a stable community composition. The interplay of cooperative and competitive species within these microorganisms ensures a dynamic balance, contributing to the community's richness and stability. In the kefir community, competitive species foster diversity and stability, whereas cooperative species bolster mutualistic symbiosis. By deepening our understanding of the behaviors of these complex microbial communities, we can pave the way for future advancements in the development and diversification of starter cultures for food fermentation processes.
Affiliation(s)
- Tiantian Cheng
  - Department of Food Science and Technology, Hebei Agricultural University, Baoding, Hebei, China
- Tuo Zhang
  - Department of Food Science and Technology, Hebei Agricultural University, Baoding, Hebei, China
- Pengmin Zhang
  - Department of Food Science and Technology, Hebei Agricultural University, Baoding, Hebei, China
- Xiaowei He
  - Department of Food Science and Technology, Hebei Agricultural University, Baoding, Hebei, China
- Faizan Ahmed Sadiq
  - Advanced Therapies Group, School of Dentistry, Cardiff University, Cardiff, UK
- Jiale Li
  - Department of Food Science and Technology, Hebei Agricultural University, Baoding, Hebei, China
- Yaxin Sang
  - Department of Food Science and Technology, Hebei Agricultural University, Baoding, Hebei, China
- Jie Gao
  - Department of Food Science and Technology, Hebei Agricultural University, Baoding, Hebei, China
|
33
|
Duran R, Cravo-Laureau C. The hydrocarbon pollution crisis: Harnessing the earth hydrocarbon-degrading microbiome. Microb Biotechnol 2024; 17:e14526. [PMID: 39003601 PMCID: PMC11246598 DOI: 10.1111/1751-7915.14526] [Received: 06/07/2024] [Accepted: 07/02/2024] [Indexed: 07/15/2024] Open
Affiliation(s)
- Robert Duran
  - Universite de Pau et Des Pays de l'Adour, E2S UPPA, CNRS, IPREM, Pau, France
|
34
|
Schmidt S. Microbial Buffer? The Human Lung Microbiome and Immune Responses to Diesel Exhaust. Environmental Health Perspectives 2024; 132:74002. [PMID: 39073991 PMCID: PMC11285853 DOI: 10.1289/ehp15252] [Received: 04/30/2024] [Accepted: 07/01/2024] [Indexed: 07/31/2024]
Abstract
Having a more diverse lung microbiome was associated with better lung capacity and lower measures of airway inflammation among a small group of volunteers exposed to diesel exhaust, even in those with COPD.
|
35
|
Tamayo M, Olivares M, Ruas-Madiedo P, Margolles A, Espín JC, Medina I, Moreno-Arribas MV, Canals S, Mirasso CR, Ortín S, Beltrán-Sanchez H, Palloni A, Tomás-Barberán FA, Sanz Y. How Diet and Lifestyle Can Fine-Tune Gut Microbiomes for Healthy Aging. Annu Rev Food Sci Technol 2024; 15:283-305. [PMID: 38941492 DOI: 10.1146/annurev-food-072023-034458] [Indexed: 06/30/2024]
Abstract
Many physical, social, and psychological changes occur during aging that raise the risk of developing chronic diseases, frailty, and dependency. These changes adversely affect the gut microbiota, a phenomenon known as microbe-aging. Those microbiota alterations are, in turn, associated with the development of age-related diseases. The gut microbiota is highly responsive to lifestyle and dietary changes, displaying a flexibility that also provides an actionable tool by which healthy aging can be promoted. This review covers, firstly, the main lifestyle and socioeconomic factors that modify the gut microbiota composition and function during healthy or unhealthy aging and, secondly, the advances being made in defining and promoting healthy aging, including microbiome-informed artificial intelligence tools, personalized dietary patterns, and food probiotic systems.
Affiliation(s)
- M Tamayo
  - Institute of Agrochemistry and Food Technology, Spanish National Research Council (IATA-CSIC), Valencia, Spain
  - Faculty of Medicine, Autonomous University of Madrid (UAM), Spain
- M Olivares
  - Institute of Agrochemistry and Food Technology, Spanish National Research Council (IATA-CSIC), Valencia, Spain
- A Margolles
  - Health Research Institute (ISPA), Asturias, Spain
- J C Espín
  - Laboratory of Food & Health, Group of Quality, Safety, and Bioactivity of Plant Foods, Centro de Edafología y Biología Aplicada del Segura (CEBAS-CSIC), Murcia, Spain
- I Medina
  - Instituto de Investigaciones Marinas, Spanish National Research Council (IIM-CSIC), Vigo, Spain
- S Canals
  - Instituto de Neurociencias, Universidad Miguel Hernández-CSIC, Sant Joan d'Alacant, Spain
- C R Mirasso
  - Instituto de Física Interdisciplinar y Sistemas Complejos IFISC (UIB-CSIC), Campus Universitat de les Illes Balears, Palma de Mallorca, Spain
- S Ortín
  - Instituto de Física Interdisciplinar y Sistemas Complejos IFISC (UIB-CSIC), Campus Universitat de les Illes Balears, Palma de Mallorca, Spain
- H Beltrán-Sanchez
  - Department of Community Health Sciences, Fielding School of Public Health and California Center for Population Research, University of California, Los Angeles, California, USA
- A Palloni
  - Department of Sociology, University of Wisconsin, Madison, Wisconsin, USA
- F A Tomás-Barberán
  - Laboratory of Food & Health, Group of Quality, Safety, and Bioactivity of Plant Foods, Centro de Edafología y Biología Aplicada del Segura (CEBAS-CSIC), Murcia, Spain
- Y Sanz
  - Institute of Agrochemistry and Food Technology, Spanish National Research Council (IATA-CSIC), Valencia, Spain
|
36
|
Haque S, Mengersen K, Barr I, Wang L, Yang W, Vardoulakis S, Bambrick H, Hu W. Towards development of functional climate-driven early warning systems for climate-sensitive infectious diseases: Statistical models and recommendations. Environmental Research 2024; 249:118568. [PMID: 38417659 DOI: 10.1016/j.envres.2024.118568] [Received: 11/27/2023] [Revised: 02/22/2024] [Accepted: 02/25/2024] [Indexed: 03/01/2024]
Abstract
Climate, weather and environmental change have significantly influenced patterns of infectious disease transmission, necessitating the development of early warning systems to anticipate potential impacts and respond in a timely and effective way. Statistical modelling plays a pivotal role in understanding the intricate relationships between climatic factors and infectious disease transmission. For example, time series regression modelling and spatial cluster analysis have been employed to identify risk factors and predict spatial and temporal patterns of infectious diseases. Recently advanced spatio-temporal models and machine learning offer an increasingly robust framework for modelling uncertainty, which is essential in climate-driven disease surveillance due to the dynamic and multifaceted nature of the data. Moreover, Artificial Intelligence (AI) techniques, including deep learning and neural networks, excel in capturing intricate patterns and hidden relationships within climate and environmental data sets. Web-based data has emerged as a powerful complement to other datasets encompassing climate variables and disease occurrences. However, given the complexity and non-linearity of climate-disease interactions, advanced techniques are required to integrate and analyse these diverse data to obtain more accurate predictions of impending outbreaks, epidemics or pandemics. This article presents an overview of an approach to creating climate-driven early warning systems with a focus on statistical model suitability and selection, along with recommendations for utilizing spatio-temporal and machine learning techniques. By addressing the limitations and embracing the recommendations for future research, we could enhance preparedness and response strategies, ultimately contributing to the safeguarding of public health in the face of evolving climate challenges.
Affiliation(s)
- Shovanur Haque
  - Ecosystem Change and Population Health Research Group, School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia
- Kerrie Mengersen
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia; Centre for Data Science (CDS), Queensland University of Technology (QUT), Brisbane, Australia
- Ian Barr
  - World Health Organization Collaborating Centre for Reference and Research on Influenza, VIDRL, Doherty Institute, Melbourne, Australia; Department of Microbiology and Immunology, University of Melbourne, Victoria, Australia
- Liping Wang
  - National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Division of Infectious disease, Chinese Centre for Disease Control and Prevention, China
- Weizhong Yang
  - School of Population Medicine and Public Health, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100730, China
- Sotiris Vardoulakis
  - HEAL Global Research Centre, Health Research Institute, University of Canberra, ACT Canberra, 2601, Australia
- Hilary Bambrick
  - National Centre for Epidemiology and Population Health, The Australian National University, ACT 2601 Canberra, Australia
- Wenbiao Hu
  - Ecosystem Change and Population Health Research Group, School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia.
|
37
|
Shen Z, Zhong Y, Wang Y, Zhu H, Liu R, Yu S, Zhang H, Wang M, Yang T, Zhang M. A computational approach to estimate postmortem interval using postmortem computed tomography of multiple tissues based on animal experiments. Int J Legal Med 2024; 138:1093-1107. [PMID: 37999765 DOI: 10.1007/s00414-023-03127-6] [Received: 05/05/2023] [Accepted: 10/27/2023] [Indexed: 11/25/2023]
Abstract
The estimation of postmortem interval (PMI) is a complex and challenging problem in forensic medicine. In recent years, many studies have begun to use machine learning methods to estimate PMI. However, research combining postmortem computed tomography (PMCT) with machine learning models for PMI estimation is still in its early stages. This study aims to establish a multi-tissue machine learning model for PMI estimation using PMCT data from various tissues. We collected PMCT data of seven tissues, including brain, eyeballs, myocardium, liver, kidneys, erector spinae, and quadriceps femoris, from 10 rabbits after death. CT images were taken every 12 h until 192 h after death, and HU values were extracted from the CT images of each tissue as a dataset. Support vector machine, random forest, and K-nearest neighbors were used to establish PMI estimation models, and after adjusting the parameters of each model, they were used as first-level classifiers to build a stacking model to further improve the PMI estimation accuracy. The accuracy and generalized area under the receiver operating characteristic curve of the multi-tissue stacking model reached 93% and 0.96, respectively. Results indicated that PMCT detection could be used to capture postmortem changes in the densities of different tissues, and the stacking model demonstrated strong predictive and generalization abilities. This approach provides new research methods and ideas for the study of PMI estimation.
Affiliation(s)
- Zefang Shen
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China
- Yue Zhong
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China
- Yucong Wang
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China
- Haibiao Zhu
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China
- Ran Liu
  - Forensic Science Center of Beijing Huatong Junjian Science and Technology Company Limited, Beijing, 100016, China
- Shengnan Yu
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China
- Haidong Zhang
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China
- Min Wang
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China
- Tiantong Yang
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China.
- Mengzhou Zhang
  - Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education, No. 25 Xitucheng Road, Haidian District, Beijing, 100088, China.
|
38
|
Manrique P, Montero I, Fernandez-Gosende M, Martinez N, Cantabrana CH, Rios-Covian D. Past, present, and future of microbiome-based therapies. Microbiome Research Reports 2024; 3:23. [PMID: 38841413 PMCID: PMC11149097 DOI: 10.20517/mrr.2023.80] [Received: 12/26/2023] [Revised: 03/07/2024] [Accepted: 03/12/2024] [Indexed: 06/07/2024]
Abstract
Technological advances in studying the human microbiome in depth have enabled the identification of microbial signatures associated with health and disease. This confirms the crucial role of the microbiota in maintaining homeostasis and host health. Several ways now exist to modulate the microbiota composition to effectively improve host health; therefore, the development of therapeutic treatments based on the gut microbiota is experiencing rapid growth. In this review, we summarize the influence of the gut microbiota on the development of infectious disease and cancer, which are two of the main targets of microbiome-based therapies currently being developed. We analyze the two-way interaction between the gut microbiota and traditional drugs in order to emphasize the influence of gut microbial composition on drug efficacy and treatment response. We explore the different strategies currently available for modulating this ecosystem to our benefit, ranging from 1st generation intervention strategies to more complex 2nd generation microbiome-based therapies and their regulatory framework. Lastly, we finish with a quick overview of what we believe is the future of these strategies, that is, 3rd generation microbiome-based therapies developed with the use of artificial intelligence (AI) algorithms.
|
39
|
Cui J, Zhou F, Li J, Shen Z, Zhou J, Yang J, Jia Z, Zhang Z, Du F, Yao D. Amendment-driven soil health restoration through soil pH and microbial robustness in a Cd/Cu-combined acidic soil: A ten-year in-situ field experiment. Journal of Hazardous Materials 2024; 465:133109. [PMID: 38071771 DOI: 10.1016/j.jhazmat.2023.133109] [Received: 08/30/2023] [Revised: 11/06/2023] [Accepted: 11/26/2023] [Indexed: 02/08/2024]
Abstract
Soil health arguably depends on biodiversity and has received wide attention in heavy-metal (HM) contaminated farmland remediation in recent years. However, the long-term effects and mechanisms of soil amendment remain poorly understood with respect to the soil microbial community. In this in-situ field study, four soil amendments (attapulgite, At; apatite, Ap; montmorillonite, M; lime, L) at three rates were applied once only for ten years in a cadmium (Cd)-copper (Cu) contaminated paddy soil deprecated for over five years. Results showed that after ten years, compared with CK (no amendment), total Cd concentration and its risk in plot soils were not altered by amendments (p > 0.05), but total Cu concentration and its risk were significantly increased by both Ap and L, especially the former, rather than At and M (p < 0.05), through increased soil pH and enhanced bacterial alpha diversity as well as plant community. Soil microbial communities were more affected by amendment type (30%) than dosage (11%), microbial network characteristics were dominated by rare taxa, and soil multifunctionality was improved in Ap- and L-amended soils. A structural equation model (SEM) indicated that 57.3% of soil multifunctionality variances were accounted for by soil pH (+0.696) and microbial network robustness (-0.301). Moreover, microbial robustness could potentially be used as an indicator of soil multifunctionality, and Ap could be optimized to improve soil health in combination with biomass removal. These findings would advance the understanding of soil microbial roles, especially network robustness, in soil multifunctionality for the remediation of metal-contaminated soils and metal control management strategies in acidic soils.
ENVIRONMENTAL IMPLICATION: Farmland soil contamination by heavy metals (HMs) has become a serious global environmental challenge. However, most studies have been conducted over the short term, leaving a gap in knowledge of the long-term remediation efficiency and ecological benefits of soil amendments. For the successful deployment of immobilization technologies, it is critical to understand the long-term stability of the immobilized HMs and soil health. Our study, to the best of our knowledge, is the first to state the long-term effects and mechanisms of soil amendments on soil health and optimize an effective and eco-friendly amendment for long-term Cd/Cu immobilization.
Affiliation(s)
- Jian Cui
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing 210014, China.
- Fengwu Zhou
  - Jiangsu Provincial Key Laboratory of Materials Cycling and Pollution Control, School of Geography, Nanjing Normal University, Nanjing 210023, China
- Jinfeng Li
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing 210014, China
- Ziyao Shen
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing 210014, China
- Jing Zhou
  - Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China
- John Yang
  - Department of Agriculture and Environmental Science, Lincoln University of Missouri, Jefferson City, MO 65201, USA
- Zhongjun Jia
  - Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China
- Zhen Zhang
  - School of the Environment and Safety Engineering, Jiangsu University, Zhenjiang 212013, China
- Fengfeng Du
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing 210014, China
- Dongrui Yao
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing 210014, China.
|
40
|
Wen S, Huang J, Li W, Wu M, Steyskal F, Meng J, Xu X, Hou P, Tang J. Henna plant biomass enhanced azo dye removal: Operating performance, microbial community and machine learning modeling. Chemosphere 2024; 352:141471. [PMID: 38373445 DOI: 10.1016/j.chemosphere.2024.141471] [Received: 09/06/2023] [Revised: 12/17/2023] [Accepted: 02/14/2024] [Indexed: 02/21/2024]
Abstract
The bio-reduction of azo dyes depends significantly on the availability of electron donors and external redox mediators. In this study, natural henna plant biomass was supplemented to promote the biological reduction of the azo dye Acid Orange 7 (AO7). In addition, a machine learning (ML) approach was applied to decipher the intricate process of henna-assisted azo dye removal. The experimental results indicated that the hydrolysis and fermentation of henna plant biomass provided both electron donors, such as volatile fatty acids (VFA), and the redox mediator lawsone to drive the bio-reduction of AO7 to sulfanilic acid (SA). The high henna dosage selectively enriched certain bacteria, such as the phylum Firmicutes and the genera Levilinea and Paludibacter, functioning in both the henna fermentation and AO7 reduction processes simultaneously. Among the three tested ML algorithms, eXtreme Gradient Boosting (XGBoost) presented exceptional accuracy and generalization ability in predicting the effluent AO7 concentrations with pH, oxidation-reduction potential (ORP), soluble chemical oxygen demand (SCOD), VFA, lawsone, henna dosage, and cumulative henna as input variables. The validation experiments with tailored optimal operating conditions and henna dosage (pH 7.5, henna dosage of 2 g/L, and cumulative henna of 14 g/L) confirmed that XGBoost was an effective ML model to predict the efficient AO7 removal (91.6%), with a negligible calculation error of 3.95%. Overall, henna plant biomass addition was a cost-effective and robust method to improve the bio-reduction of AO7, as demonstrated by long-term operation, ML modeling, and experimental validation.
Affiliation(s)
- Shilin Wen
  - College of Materials and Environmental Engineering, Hangzhou Dianzi University, Hangzhou, 310018, PR China
- Jingang Huang
  - College of Materials and Environmental Engineering, Hangzhou Dianzi University, Hangzhou, 310018, PR China; China-Austria Belt and Road Joint Laboratory on Artificial Intelligence and Advanced Manufacturing, Hangzhou Dianzi University, Hangzhou, 310018, PR China.
- Weishuai Li
  - College of Materials and Environmental Engineering, Hangzhou Dianzi University, Hangzhou, 310018, PR China
- Mengke Wu
  - College of Materials and Environmental Engineering, Hangzhou Dianzi University, Hangzhou, 310018, PR China
- Felix Steyskal
  - China-Austria Belt and Road Joint Laboratory on Artificial Intelligence and Advanced Manufacturing, Hangzhou Dianzi University, Hangzhou, 310018, PR China; M-U-T Maschinen-Umwelttechnik-Transportanlagen GmbH, Stockerau, 2000, Austria
- Jianfang Meng
  - China-Austria Belt and Road Joint Laboratory on Artificial Intelligence and Advanced Manufacturing, Hangzhou Dianzi University, Hangzhou, 310018, PR China; M-U-T Maschinen-Umwelttechnik-Transportanlagen GmbH, Stockerau, 2000, Austria
- Xiaobin Xu
  - China-Austria Belt and Road Joint Laboratory on Artificial Intelligence and Advanced Manufacturing, Hangzhou Dianzi University, Hangzhou, 310018, PR China
- Pingzhi Hou
  - China-Austria Belt and Road Joint Laboratory on Artificial Intelligence and Advanced Manufacturing, Hangzhou Dianzi University, Hangzhou, 310018, PR China
- Junhong Tang
  - College of Materials and Environmental Engineering, Hangzhou Dianzi University, Hangzhou, 310018, PR China
|
41
|
Bai Y, Lin H, Wang C, Wang Q, Qu J. Digitalizing river aquatic ecosystems. J Environ Sci (China) 2024; 137:677-680. [PMID: 37980050 DOI: 10.1016/j.jes.2023.03.012] [Received: 10/20/2022] [Revised: 03/07/2023] [Accepted: 03/08/2023] [Indexed: 11/20/2023]
Abstract
Traditional river health assessment relies on limited water quality indices and representative organism activity, but does not comprehensively obtain biotic and abiotic information of the ecosystem. Here, we propose a new approach to evaluate the ecological and health risks of river aquatic ecosystems. First, detailed physicochemical and biological characterization of a river ecosystem can be obtained through pollutant determination (especially emerging pollutants) and DNA/RNA sequencing. Second, supervised machine learning can be applied to perform classification analysis of characterization data and ascertain river ecosystem ecology and health. Our proposed methodology transforms river ecosystem health assessment and can be applied in river management.
Affiliation(s)
- Yaohui Bai
  - Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China.
- Hui Lin
  - Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Chenchen Wang
  - Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China; School of Environmental and Municipal Engineering, Tianjin Chengjian University, Tianjin 300384, China
- Qiaojuan Wang
  - Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jiuhui Qu
  - Key Laboratory of Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China; Center for Water and Ecology, Tsinghua University, Beijing 100084, China.
|
42
|
Paes da Costa D, das Graças Espíndola da Silva T, Sérgio Ferreira Araujo A, Prudêncio de Araujo Pereira A, William Mendes L, Dos Santos Borges W, Felix da França R, Alberto Fragoso de Souza C, Alves da Silva B, Oliveira Silva R, Valente de Medeiros E. Soil fertility impact on recruitment and diversity of the soil microbiome in sub-humid tropical pastures in Northeastern Brazil. Sci Rep 2024; 14:3919. [PMID: 38365962 PMCID: PMC10873301 DOI: 10.1038/s41598-024-54221-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 02/09/2024] [Indexed: 02/18/2024] Open
Abstract
Soil fertility is key to pasture systems and drives their microbial communities and functionality. Therefore, understanding the interaction between soil fertility and microbial communities can improve our ability to manage pasturelands and maintain their soil functioning and productivity. This study probed the influence of soil fertility on microbial communities in tropical pastures in Brazil. Soil samples, gathered from the top 20 cm of twelve distinct areas with diverse fertility levels, were analyzed via 16S rRNA sequencing. The soils were subsequently classified into two categories, high fertility (HF) and low fertility (LF), using K-Means clustering. Random forest analysis revealed that HF soils had more bacterial diversity, predominantly Proteobacteria, Nitrospira, Chloroflexi, and Bacteroidetes, while Acidobacteria increased in LF soils. HF soils exhibited more complex network interactions and an enrichment of nitrogen-cycling bacterial groups. Additionally, functional annotation based on 16S rRNA varied between clusters. Microbial groups in HF soil demonstrated enhanced functions such as nitrate reduction, aerobic ammonia oxidation, and aromatic compound degradation. In contrast, in the LF soil, the predominant processes were ureolysis, cellulolysis, methanol oxidation, and methanotrophy. Our findings expand our knowledge about how soil fertility drives bacterial communities in pastures.
Affiliation(s)
- Diogo Paes da Costa
  - Microbiology and Enzimology Lab., Federal University of Agreste Pernambuco, Garanhuns, PE, 55292-270, Brazil.
- Lucas William Mendes
  - Center for Nuclear Energy in Agriculture, University of Sao Paulo, Piracicaba, SP, 13400-970, Brazil
- Wisraiane Dos Santos Borges
  - Microbiology and Enzimology Lab., Federal University of Agreste Pernambuco, Garanhuns, PE, 55292-270, Brazil
- Rafaela Felix da França
  - Microbiology and Enzimology Lab., Federal University of Agreste Pernambuco, Garanhuns, PE, 55292-270, Brazil
- Bruno Alves da Silva
  - Microbiology and Enzimology Lab., Federal University of Agreste Pernambuco, Garanhuns, PE, 55292-270, Brazil
- Renata Oliveira Silva
  - Microbiology and Enzimology Lab., Federal University of Agreste Pernambuco, Garanhuns, PE, 55292-270, Brazil
- Erika Valente de Medeiros
  - Microbiology and Enzimology Lab., Federal University of Agreste Pernambuco, Garanhuns, PE, 55292-270, Brazil
|
43
|
Walsh C, Stallard-Olivera E, Fierer N. Nine (not so simple) steps: a practical guide to using machine learning in microbial ecology. mBio 2024; 15:e0205023. [PMID: 38126787 PMCID: PMC10865974 DOI: 10.1128/mbio.02050-23] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 12/23/2023] Open
Abstract
Due to the complex nature of microbiome data, the field of microbial ecology has many current and potential uses for machine learning (ML) modeling. With the increased use of predictive ML models across many disciplines, including microbial ecology, there is extensive published information on the specific ML algorithms available and how those algorithms have been applied. Thus, our goal is not to summarize the breadth of ML models available or compare their performances. Rather, our goal is to provide more concrete and actionable information to guide microbial ecologists in how to select, run, and interpret ML algorithms to predict the taxa or genes associated with particular sample categories or environmental gradients of interest. Such microbial data often have unique characteristics that require careful consideration of how to apply ML models and how to interpret the associated results. This review is intended for practicing microbial ecologists who may be unfamiliar with some of the intricacies of ML models. We provide examples and discuss common opportunities and pitfalls specific to applying ML models to the types of data sets most frequently collected by microbial ecologists.
Affiliation(s)
- Corinne Walsh
  - Cooperative Institute of Research in Environmental Sciences, CU Boulder, Boulder, Colorado, USA
  - Ecology and Evolutionary Biology Department, CU Boulder, Boulder, Colorado, USA
- Elías Stallard-Olivera
  - Cooperative Institute of Research in Environmental Sciences, CU Boulder, Boulder, Colorado, USA
  - Ecology and Evolutionary Biology Department, CU Boulder, Boulder, Colorado, USA
- Noah Fierer
  - Cooperative Institute of Research in Environmental Sciences, CU Boulder, Boulder, Colorado, USA
  - Ecology and Evolutionary Biology Department, CU Boulder, Boulder, Colorado, USA
|
44
|
Yang MQ, Wang ZJ, Zhai CB, Chen LQ. Research progress on the application of 16S rRNA gene sequencing and machine learning in forensic microbiome individual identification. Front Microbiol 2024; 15:1360457. [PMID: 38371926 PMCID: PMC10869621 DOI: 10.3389/fmicb.2024.1360457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 01/23/2024] [Indexed: 02/20/2024] Open
Abstract
Forensic microbiome research is a field with a wide range of applications, and a number of protocols have been developed for its use in this area of research. As individuals host radically different microbiota, the human microbiome is expected to become a new biomarker for forensic identification. To use this procedure effectively, an understanding of the factors that can alter the human microbiome, and a determination of its stable and changing elements, will be critical in selecting appropriate targets for investigation. The 16S rRNA gene, which is notable for its conservation and specificity, represents a potentially ideal marker for forensic microbiome identification. Gene sequencing involving 16S rRNA is currently the method of choice for investigating microbiomes. While the sequencing involved in microbiome determinations can generate large multi-dimensional datasets that can be difficult to analyze and interpret, machine learning methods can be useful in surmounting this analytical challenge. In this review, we describe the research methods and related sequencing technologies currently available for application of 16S rRNA gene sequencing and machine learning in the field of forensic identification. In addition, we assess the potential value of 16S rRNA and machine learning in forensic microbiome science.
Affiliation(s)
- Mai-Qing Yang
  - Department of Pathology, Weifang People's Hospital (First Affiliated Hospital of Shandong Second Medical University), Weifang, China
- Zheng-Jiang Wang
  - Department of Pathology, Weifang People's Hospital (First Affiliated Hospital of Shandong Second Medical University), Weifang, China
- Chun-Bo Zhai
  - Department of Second Ward of Thoracic Surgery, Weifang People's Hospital (First Affiliated Hospital of Shandong Second Medical University), Weifang, China
- Li-Qian Chen
  - Department of Pathology, Weifang People's Hospital (First Affiliated Hospital of Shandong Second Medical University), Weifang, China
|
45
|
Schaerer L, Ghannam R, Olson A, Van Camp A, Techtmann S. Persistence of location-specific microbial signatures on boats during voyages. Mar Pollut Bull 2024; 199:115884. [PMID: 38118397 DOI: 10.1016/j.marpolbul.2023.115884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Revised: 11/21/2023] [Accepted: 12/01/2023] [Indexed: 12/22/2023]
Abstract
Objects collect microorganisms from their surroundings and develop a microbial "fingerprint" that may be useful for determining an object's past location (provenance). It may be possible to use ubiquitous microorganisms for forensics or as environmental sensors. Here, we use microbial communities in the Chesapeake Bay region to demonstrate the use of natural microorganisms as biological sensors to determine the past location of boats. The microbiomes of two boats and of the open water were sampled as these vessels traveled from the Port of Baltimore to the Port of Norfolk, and back to Baltimore. 16S rRNA sequencing was performed to identify microorganisms. Differential abundance and machine learning analyses were used to identify microbial signatures and predicted probabilities, which were then used to determine each vessel's previous location. The work presented here provides a better understanding of how microbes in aquatic systems can be leveraged as object biosensors.
Affiliation(s)
- Laura Schaerer
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA
- Ryan Ghannam
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA
- Allison Olson
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA
- Annika Van Camp
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA
- Stephen Techtmann
  - Department of Biological Sciences, Michigan Technological University, Houghton, MI 49931, USA.
|
46
|
Zhang Y, Wu H, Xu R, Wang Y, Chen L, Wei C. Machine learning modeling for the prediction of phosphorus and nitrogen removal efficiency and screening of crucial microorganisms in wastewater treatment plants. Sci Total Environ 2024; 907:167730. [PMID: 37852495 DOI: 10.1016/j.scitotenv.2023.167730] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 06/12/2023] [Revised: 10/08/2023] [Accepted: 10/08/2023] [Indexed: 10/20/2023]
Abstract
The effectiveness of wastewater treatment plants (WWTPs) is largely determined by the microbial community structure in their activated sludge (AS). Interactions among microbial communities in AS systems and their indirect effects on water quality changes are crucial for WWTP performance. However, there is currently no quantitative method to evaluate the contribution of microorganisms to the operating efficiency of WWTPs. Traditional assessments of WWTP performance are limited by experimental conditions, methods, and other factors, resulting in increased costs and experimental pollutants. Therefore, an effective method is needed to predict WWTP efficiency based on AS community structure and quantitatively evaluate the contribution of microorganisms in the AS system. This study evaluated and compared microbial communities and water quality changes from WWTPs worldwide by meta-analysis of published high-throughput sequencing data. Six machine learning (ML) models were utilized to predict the efficiency of phosphorus and nitrogen removal in WWTPs; among them, XGBoost showed the highest prediction accuracy. Cross-entropy was used to screen the crucial microorganisms related to phosphorus and nitrogen removal efficiency, and the modeling confirmed the reasonableness of the results. Thirteen genera with nitrogen and phosphorus cycling pathways obtained from the screening were considered highly appropriate for the simultaneous removal of phosphorus and nitrogen. The results showed that the microbes Haliangium, Vicinamibacteraceae, Tolumonas, and SWB02 are potentially crucial for phosphorus and nitrogen removal, as they may be involved in the process of phosphorus and nitrogen removal in sewage treatment plants. Overall, these findings have deepened our understanding of the relationship between microbial community structure and performance of WWTPs, indicating that microbial data should play a critical role in the future design of sewage treatment plants. The ML model of this study can efficiently screen crucial microbes associated with WWTP system performance, and it is promising for the discovery of potential microbial metabolic pathways.
Affiliation(s)
- Yinan Zhang
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China
- Haizhen Wu
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China.
- Rui Xu
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China
- Ying Wang
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, PR China
- Liping Chen
  - School of Environment and Energy, South China University of Technology, Guangzhou Higher Education Mega Centre, Guangzhou 510006, PR China
- Chaohai Wei
  - School of Environment and Energy, South China University of Technology, Guangzhou Higher Education Mega Centre, Guangzhou 510006, PR China
|
47
|
Kida M, Pochwat K, Ziembowicz S. Assessment of machine learning-based methods predictive suitability for migration pollutants from microplastics degradation. J Hazard Mater 2024; 461:132565. [PMID: 37722325 DOI: 10.1016/j.jhazmat.2023.132565] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 08/03/2023] [Revised: 09/04/2023] [Accepted: 09/14/2023] [Indexed: 09/20/2023]
Abstract
The aim of this work was to assess the usefulness of machine learning in predicting the migration of pollutants from microplastics. Finding methods that reduce unnecessary laboratory analyses is necessary both to protect the environment and from an economic perspective. Multiple regression, artificial neural networks, the support vector method, and random forest regression were used in the study to predict leaching of plasticizers and other contaminants from microplastics. The development of the methods was based on the results of laboratory tests obtained by GC-MS. The results confirm the potential of artificial neural networks and the support vector method for effective modelling and prediction of chemical compounds leached from microplastics. Correlations between model outputs and laboratory data for the analyzed parameters were in the range of 0.96-0.98 and 0.93-0.99 for artificial neural networks and the support vector method, respectively. Multiple regression showed the lowest performance in all cases in predicting plastic phthalic acid esters (coefficient of determination (R2) in the range of 0.03-0.24). ENVIRONMENTAL IMPLICATION: The results presented in this paper provide new insight into the influence of different parameters and factors on the leaching of plastic additives. This information is necessary to assess the harmfulness of these materials. The collected data are unique on a global scale. For the first time, machine learning was used to predict the leaching rate of plasticizers from different polymers under different environmental conditions. The use of machine learning makes it possible to reduce unnecessary laboratory tests, reduce costs, and protect the environment. Currently, there are no research results in this field in the scientific literature.
Affiliation(s)
- Małgorzata Kida
  - Department of Chemistry and Environmental Engineering, Faculty of Civil and Environmental Engineering and Architecture, Rzeszow University of Technology, Ave Powstańców Warszawy 6, 35-959 Rzeszów, Poland.
- Kamil Pochwat
  - Department of Infrastructure and Water Management, Faculty of Civil and Environmental Engineering and Architecture, Rzeszow University of Technology, Ave Powstańców Warszawy 6, 35-959 Rzeszów, Poland
- Sabina Ziembowicz
  - Department of Chemistry and Environmental Engineering, Faculty of Civil and Environmental Engineering and Architecture, Rzeszow University of Technology, Ave Powstańców Warszawy 6, 35-959 Rzeszów, Poland
|
48
|
Fonseca DC, Marques Gomes da Rocha I, Depieri Balmant B, Callado L, Aguiar Prudêncio AP, Tepedino Martins Alves J, Torrinhas RS, da Rocha Fernandes G, Linetzky Waitzberg D. Evaluation of gut microbiota predictive potential associated with phenotypic characteristics to identify multifactorial diseases. Gut Microbes 2024; 16:2297815. [PMID: 38235595 PMCID: PMC10798365 DOI: 10.1080/19490976.2023.2297815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Accepted: 12/18/2023] [Indexed: 01/19/2024] Open
Abstract
Gut microbiota has been implicated in various clinical conditions, yet the substantial heterogeneity in gut microbiota research results necessitates a more sophisticated approach than merely identifying statistically different microbial taxa between healthy and unhealthy individuals. Our study seeks to not only select microbial taxa but also explore their synergy with phenotypic host variables to develop novel predictive models for specific clinical conditions. DESIGN: We assessed 50 healthy and 152 unhealthy individuals for phenotypic variables (PV) and gut microbiota (GM) composition by 16S rRNA gene sequencing. The entire modeling process was conducted in the R environment using the Random Forest algorithm. Model performance was assessed through ROC curve construction. RESULTS: We evaluated 52 bacterial taxa and pre-selected PV (p < 0.05) for their contribution to the final models. Across all diseases, the models achieved their best performance when GM and PV data were integrated. Notably, the integrated predictive models demonstrated exceptional performance for rheumatoid arthritis (AUC = 88.03%), type 2 diabetes (AUC = 96.96%), systemic lupus erythematosus (AUC = 98.4%), and type 1 diabetes (AUC = 86.19%). CONCLUSION: Our findings underscore that the selection of bacterial taxa based solely on differences in relative abundance between groups is insufficient to serve as clinical markers. Machine learning techniques are essential for mitigating the considerable variability observed within gut microbiota. In our study, the use of microbial taxa alone exhibited limited predictive power for health outcomes, while the integration of phenotypic variables into predictive models substantially enhanced their predictive capabilities.
Affiliation(s)
- Danielle Cristina Fonseca
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- Ilanna Marques Gomes da Rocha
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- Bianca Depieri Balmant
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- Leticia Callado
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- Ana Paula Aguiar Prudêncio
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- Juliana Tepedino Martins Alves
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- Raquel Susana Torrinhas
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- Gabriel da Rocha Fernandes
  - Biosystems Informatics and Genomics Group, Instituto René Rachou - Fiocruz Minas, Belo Horizonte, Brazil
- Dan Linetzky Waitzberg
  - Laboratory of Nutrition and Metabolic Surgery of the Digestive System, LIM 35, Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
  - Department of Gastroenterology, Hospital das Clínicas HCFMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
|
49
|
Marcos-Zambrano LJ, López-Molina VM, Bakir-Gungor B, Frohme M, Karaduzovic-Hadziabdic K, Klammsteiner T, Ibrahimi E, Lahti L, Loncar-Turukalo T, Dhamo X, Simeon A, Nechyporenko A, Pio G, Przymus P, Sampri A, Trajkovik V, Lacruz-Pleguezuelos B, Aasmets O, Araujo R, Anagnostopoulos I, Aydemir Ö, Berland M, Calle ML, Ceci M, Duman H, Gündoğdu A, Havulinna AS, Kaka Bra KHN, Kalluci E, Karav S, Lode D, Lopes MB, May P, Nap B, Nedyalkova M, Paciência I, Pasic L, Pujolassos M, Shigdel R, Susín A, Thiele I, Truică CO, Wilmes P, Yilmaz E, Yousef M, Claesson MJ, Truu J, Carrillo de Santa Pau E. A toolbox of machine learning software to support microbiome analysis. Front Microbiol 2023; 14:1250806. [PMID: 38075858 PMCID: PMC10704913 DOI: 10.3389/fmicb.2023.1250806] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/30/2023] [Accepted: 09/11/2023] [Indexed: 05/14/2025] Open
Abstract
The human microbiome has become an area of intense research due to its potential impact on human health. However, the analysis and interpretation of these data have proven to be challenging due to their complexity and high dimensionality. Machine learning (ML) algorithms can process vast amounts of data to uncover informative patterns and relationships within the data, even with limited prior knowledge. Therefore, there has been a rapid growth in the development of software specifically designed for the analysis and interpretation of microbiome data using ML techniques. These software tools incorporate a wide range of ML algorithms for clustering, classification, regression, or feature selection, to identify microbial patterns and relationships within the data and generate predictive models. This rapid development, with a constant need for new developments and integration of new features, requires efforts to compile, catalog, and classify these tools to create infrastructures and services with easy, transparent, and trustworthy standards. Here we review the state of the art for ML tools applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on ML-based software and framework resources currently available for the analysis of microbiome data in humans. The aim is to help microbiologists and biomedical scientists go deeper into specialized resources that integrate ML techniques and to facilitate future benchmarking to create standards for the analysis of microbiome data. The software resources are organized based on the type of analysis they were developed for and the ML techniques they implement. A description of each software tool with examples of usage is provided, including comments about pitfalls and gaps in the usage of ML-based software for microbiome data that need to be considered by developers and users. This review represents an extensive compilation to date, offering valuable insights and guidance for researchers interested in leveraging ML approaches for microbiome analysis.
Affiliation(s)
- Laura Judith Marcos-Zambrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
| | - Víctor Manuel López-Molina
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
| | - Burcu Bakir-Gungor
- Department of Computer Engineering, Abdullah Gül University, Kayseri, Türkiye
| | - Marcus Frohme
- Division Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
| | | | - Thomas Klammsteiner
- Department of Microbiology and Department of Ecology, University of Innsbruck, Innsbruck, Austria
| | - Eliana Ibrahimi
- Department of Biology, University of Tirana, Tirana, Albania
| | - Leo Lahti
- Department of Computing, University of Turku, Turku, Finland
| | | | - Xhilda Dhamo
- Department of Applied Mathematics, Faculty of Natural Sciences, University of Tirana, Tirana, Albania
| | - Andrea Simeon
- BioSense Institute, University of Novi Sad, Novi Sad, Serbia
| | - Alina Nechyporenko
- Division Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
- Department of Systems Engineering, Kharkiv National University of Radioelectronics, Kharkiv, Ukraine
| | - Gianvito Pio
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
- Big Data Lab, National Interuniversity Consortium for Informatics, Rome, Italy
| | - Piotr Przymus
- Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
| | - Alexia Sampri
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
| | - Vladimir Trajkovik
- Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia
| | - Blanca Lacruz-Pleguezuelos
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
| | - Oliver Aasmets
- Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia
- Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Ricardo Araujo
- Nephrology and Infectious Diseases R & D Group, i3S—Instituto de Investigação e Inovação em Saúde; INEB—Instituto de Engenharia Biomédica, Universidade do Porto, Porto, Portugal
| | - Ioannis Anagnostopoulos
- Department of Informatics, University of Piraeus, Piraeus, Greece
- Computer Science and Biomedical Informatics Department, University of Thessaly, Lamia, Greece
| | - Önder Aydemir
- Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Türkiye
| | - Magali Berland
- INRAE, MetaGenoPolis, Université Paris-Saclay, Jouy-en-Josas, France
| | - M. Luz Calle
- Faculty of Sciences, Technology and Engineering, University of Vic – Central University of Catalonia, Vic, Barcelona, Spain
- IRIS-CC, Fundació Institut de Recerca i Innovació en Ciències de la Vida i la Salut a la Catalunya Central, Vic, Barcelona, Spain
| | - Michelangelo Ceci
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
- Big Data Lab, National Interuniversity Consortium for Informatics, Rome, Italy
| | - Hatice Duman
- Department of Molecular Biology and Genetics, Çanakkale Onsekiz Mart University, Çanakkale, Türkiye
| | - Aycan Gündoğdu
- Department of Microbiology and Clinical Microbiology, Faculty of Medicine, Erciyes University, Kayseri, Türkiye
- Metagenomics Laboratory, Genome and Stem Cell Center (GenKök), Erciyes University, Kayseri, Türkiye
| | - Aki S. Havulinna
- Finnish Institute for Health and Welfare - THL, Helsinki, Finland
- Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland
| | | | - Eglantina Kalluci
- Department of Applied Mathematics, Faculty of Natural Sciences, University of Tirana, Tirana, Albania
| | - Sercan Karav
- Department of Molecular Biology and Genetics, Çanakkale Onsekiz Mart University, Çanakkale, Türkiye
| | - Daniel Lode
- Division Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
| | - Marta B. Lopes
- Department of Mathematics, Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, Portugal
- UNIDEMI, Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, Portugal
| | - Patrick May
- Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Bram Nap
- School of Medicine, University of Galway, Galway, Ireland
| | - Miroslava Nedyalkova
- Department of Inorganic Chemistry, Faculty of Chemistry and Pharmacy, University of Sofia, Sofia, Bulgaria
| | - Inês Paciência
- Center for Environmental and Respiratory Health Research (CERH), Research Unit of Population Health, University of Oulu, Oulu, Finland
- Biocenter Oulu, University of Oulu, Oulu, Finland
| | - Lejla Pasic
- Sarajevo Medical School, University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
| | - Meritxell Pujolassos
- Faculty of Sciences, Technology and Engineering, University of Vic – Central University of Catalonia, Vic, Barcelona, Spain
| | - Rajesh Shigdel
- Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Antonio Susín
- Mathematical Department, UPC-Barcelona Tech, Barcelona, Spain
| | - Ines Thiele
- School of Medicine, University of Galway, Galway, Ireland
- APC Microbiome Ireland, University College Cork, Cork, Ireland
| | - Ciprian-Octavian Truică
- Computer Science and Engineering Department, Faculty of Automatic Control and Computers, National University of Science and Technology Politehnica, Bucharest, Romania
| | - Paul Wilmes
- Systems Ecology Group, Luxembourg Centre for Systems Biomedicine, Esch-sur-Alzette, Luxembourg
- Department of Life Sciences and Medicine, Faculty of Science, Technology and Medicine, University of Luxembourg, Belvaux, Luxembourg
| | - Ercument Yilmaz
- Department of Computer Technologies, Karadeniz Technical University, Trabzon, Türkiye
| | - Malik Yousef
- Department of Information Systems, Zefat Academic College, Zefat, Israel
- Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel
| | - Marcus Joakim Claesson
- APC Microbiome Ireland, University College Cork, Cork, Ireland
- School of Microbiology, University College Cork, Cork, Ireland
| | - Jaak Truu
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
|
50
|
Toussaint PA, Leiser F, Thiebes S, Schlesner M, Brors B, Sunyaev A. Explainable artificial intelligence for omics data: a systematic mapping study. Brief Bioinform 2023; 25:bbad453. [PMID: 38113073 PMCID: PMC10729786 DOI: 10.1093/bib/bbad453] [Received: 12/23/2022] [Revised: 07/28/2023] [Accepted: 11/08/2023] [Indexed: 12/21/2023] Open
Abstract
Researchers increasingly turn to explainable artificial intelligence (XAI) to analyze omics data and gain insights into the underlying biological processes. Yet, given the interdisciplinary nature of the field, many findings have only been shared in their respective research community. An overview of XAI for omics data is needed to highlight promising approaches and help detect common issues. Toward this end, we conducted a systematic mapping study. To identify relevant literature, we queried Scopus, PubMed, Web of Science, BioRxiv, MedRxiv and arXiv. Based on keywording, we developed a coding scheme with 10 facets regarding the studies' AI methods, explainability methods and omics data. Our mapping study resulted in 405 included papers published between 2010 and 2023. The inspected papers analyze DNA-based (mostly genomic), transcriptomic, proteomic or metabolomic data by means of neural networks, tree-based methods, statistical methods and further AI methods. The preferred post-hoc explainability methods are feature relevance (n = 166) and visual explanation (n = 52), while papers using interpretable approaches often resort to the use of transparent models (n = 83) or architecture modifications (n = 72). With many research gaps still apparent for XAI for omics data, we deduced eight research directions and discuss their potential for the field. We also provide exemplary research questions for each direction. Many problems with the adoption of XAI for omics data in clinical practice are yet to be resolved. This systematic mapping study outlines extant research on the topic and provides research directions for researchers and practitioners.
Affiliation(s)
- Philipp A Toussaint
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
- HIDSS4Health – Helmholtz Information and Data Science School for Health, Karlsruhe, Heidelberg, Germany
| | - Florian Leiser
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Scott Thiebes
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Matthias Schlesner
- Biomedical Informatics, Data Mining and Data Analytics, Faculty of Applied Computer Science and Medical Faculty, University of Augsburg, Augsburg, Germany
| | - Benedikt Brors
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Translational Oncology, National Center for Tumor Diseases, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Ali Sunyaev
- Department of Economics and Management, Karlsruhe Institute of Technology, Karlsruhe, Germany
|