Reference Citation Analysis

For: LaPierre N, Ju CJT, Zhou G, Wang W. MetaPheno: A critical evaluation of deep learning and machine learning in metagenome-based disease prediction. Methods 2019;166:74-82. [PMID: 30885720 PMCID: PMC6708502 DOI: 10.1016/j.ymeth.2019.03.003] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 12/01/2018] [Revised: 02/14/2019] [Accepted: 03/04/2019] [Indexed: 01/21/2023]

Cited by Other Article(s)

Yi X, He Y, Gao S, Li M. A review of the application of deep learning in obesity: From early prediction aid to advanced management assistance. Diabetes Metab Syndr 2024;18:103000. [PMID: 38604060 DOI: 10.1016/j.dsx.2024.103000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Revised: 01/23/2024] [Accepted: 03/29/2024] [Indexed: 04/13/2024]

Roy G, Prifti E, Belda E, Zucker JD. Deep learning methods in metagenomics: a review. Microb Genom 2024;10. [PMID: 38630611 DOI: 10.1099/mgen.0.001231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]

Sharma D, Lou W, Xu W. phylaGAN: data augmentation through conditional GANs and autoencoders for improving disease prediction accuracy using microbiome data. Bioinformatics 2024;40:btae161. [PMID: 38569898 DOI: 10.1093/bioinformatics/btae161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 02/18/2024] [Accepted: 04/01/2024] [Indexed: 04/05/2024]

Asher EE, Bashan A. Model-free prediction of microbiome compositions. Microbiome 2024;12:17. [PMID: 38303006 PMCID: PMC10832217 DOI: 10.1186/s40168-023-01721-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Accepted: 11/15/2023] [Indexed: 02/03/2024]

Curry KD, Yu FB, Vance SE, Segarra S, Bhaya D, Chikhi R, Rocha EP, Treangen TJ. Reference-free Structural Variant Detection in Microbiomes via Long-read Coassembly Graphs. bioRxiv 2024:2024.01.25.577285. [PMID: 38352454 PMCID: PMC10862772 DOI: 10.1101/2024.01.25.577285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]

Muller E, Shiryan I, Borenstein E. Multi-omic integration of microbiome data for identifying disease-associated modules. bioRxiv 2024:2023.07.03.547607. [PMID: 37461534 PMCID: PMC10349976 DOI: 10.1101/2023.07.03.547607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2023]

Abstract

The human gut microbiome is a complex ecosystem with profound implications for health and disease. This recognition has led to a surge in multi-omic microbiome studies, employing various molecular assays to elucidate the microbiome's role in diseases across multiple functional layers. However, despite the clear value of these multi-omic datasets, rigorous integrative analysis of such data poses significant challenges, hindering a comprehensive understanding of microbiome-disease interactions. Perhaps most notably, multiple approaches, including univariate and multivariate analyses, as well as machine learning, have been applied to such data to identify disease-associated markers, namely, specific features (e.g., species, pathways, metabolites) that are significantly altered in disease state. These methods, however, often yield extensive lists of features associated with the disease without effectively capturing the multi-layered structure of multi-omic data or offering clear, interpretable hypotheses about underlying microbiome-disease mechanisms. Here, we address this challenge by introducing MintTea - an intermediate integration-based method for analyzing multi-omic microbiome data. MintTea combines a canonical correlation analysis (CCA) extension, consensus analysis, and an evaluation protocol to robustly identify disease-associated multi-omic modules. Each such module consists of a set of features from the various omics that both shift in concord, and collectively associate with the disease. Applying MintTea to diverse case-control cohorts with multi-omic data, we show that this framework is able to capture modules with high predictive power for disease, significant cross-omic correlations, and alignment with known microbiome-disease associations. 
For example, analyzing samples from a metabolic syndrome (MS) study, we found an MS-associated module comprising a highly correlated cluster of serum glutamate- and TCA cycle-related metabolites, as well as bacterial species previously implicated in insulin resistance. In another cohort, we identified a module associated with late-stage colorectal cancer, featuring Peptostreptococcus and Gemella species and several fecal amino acids, in agreement with these species' reported role in the metabolism of these amino acids and their coordinated increase in abundance during disease development. Finally, comparing modules identified in different datasets, we detected multiple significant overlaps, suggesting common interactions between microbiome features. Combined, this work serves as a proof of concept for the potential benefits of advanced integration methods in generating integrated multi-omic hypotheses underlying microbiome-disease interactions and a promising avenue for researchers seeking systems-level insights into coherent mechanisms governing microbiome-related diseases.

Liao H, Shang J, Sun Y. GDmicro: classifying host disease status with GCN and deep adaptation network based on the human gut microbiome data. Bioinformatics 2023;39:btad747. [PMID: 38085234 PMCID: PMC10749762 DOI: 10.1093/bioinformatics/btad747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 11/16/2023] [Accepted: 12/11/2023] [Indexed: 12/27/2023]

Hossain PS, Kim K, Uddin J, Samad MA, Choi K. Enhancing Taxonomic Categorization of DNA Sequences with Deep Learning: A Multi-Label Approach. Bioengineering (Basel) 2023;10:1293. [PMID: 38002417 PMCID: PMC10669241 DOI: 10.3390/bioengineering10111293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 11/02/2023] [Accepted: 11/05/2023] [Indexed: 11/26/2023]

Liu Y, Zhang YZ, Imoto S. Microbial Gene Ontology informed deep neural network for microbe functionality discovery in human diseases. PLoS One 2023;18:e0290307. [PMID: 37603579 PMCID: PMC10441785 DOI: 10.1371/journal.pone.0290307] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/27/2023] [Accepted: 08/04/2023] [Indexed: 08/23/2023]

Venkatachala Appa Swamy M, Periyasamy J, Thangavel M, Khan SB, Almusharraf A, Santhanam P, Ramaraj V, Elsisi M. Design and Development of IoT and Deep Ensemble Learning Based Model for Disease Monitoring and Prediction. Diagnostics (Basel) 2023;13:1942. [PMID: 37296794 DOI: 10.3390/diagnostics13111942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Revised: 05/04/2023] [Accepted: 05/11/2023] [Indexed: 06/12/2023]

Abstract

With the rapidly increasing reliance on advances in IoT, we persist towards pushing technology to new heights. From ordering food online to gene editing-based personalized healthcare, disruptive technologies like ML and AI continue to grow beyond our wildest dreams. Early detection and treatment through AI-assisted diagnostic models have outperformed human intelligence. In many cases, these tools can act upon the structured data containing probable symptoms, offer medication schedules based on the appropriate code related to diagnosis conventions, and predict adverse drug effects, if any, in accordance with medications. Utilizing AI and IoT in healthcare has facilitated innumerable benefits like minimizing cost, reducing hospital-obtained infections, decreasing mortality and morbidity etc. DL algorithms have opened up several frontiers by contributing towards healthcare opportunities through their ability to understand and learn from different levels of demonstration and generalization, which is significant in data analysis and interpretation. In contrast to ML which relies more on structured, labeled data and domain expertise to facilitate feature extractions, DL employs human-like cognitive abilities to extract hidden relationships and patterns from uncategorized data. Through the efficient application of DL techniques to medical datasets, infectious and rare diseases can be predicted and classified precisely, preventable surgeries can be avoided, and over-dosage of harmful contrast agents for scans and biopsies can be reduced to a greater extent in the future. Our study is focused on deploying ensemble deep learning algorithms and IoT devices to design and develop a diagnostic model that can effectively analyze medical Big Data and diagnose diseases by identifying abnormalities in early stages through medical images provided as input.
This AI-assisted diagnostic model based on Ensemble Deep learning aims to be a valuable tool for healthcare systems and patients through its ability to diagnose diseases in the initial stages and present valuable insights to facilitate personalized treatment by aggregating the prediction of each base model and generating a final prediction.

Hallsworth JE, Udaondo Z, Pedrós‐Alió C, Höfer J, Benison KC, Lloyd KG, Cordero RJB, de Campos CBL, Yakimov MM, Amils R. Scientific novelty beyond the experiment. Microb Biotechnol 2023;16:1131-1173. [PMID: 36786388 PMCID: PMC10221578 DOI: 10.1111/1751-7915.14222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 01/09/2023] [Accepted: 01/11/2023] [Indexed: 02/15/2023]

Abstract

Practical experiments drive important scientific discoveries in biology, but theory-based research studies also contribute novel, sometimes paradigm-changing, findings. Here, we appraise the roles of theory-based approaches focusing on the experiment-dominated wet-biology research areas of microbial growth and survival, cell physiology, host-pathogen interactions, and competitive or symbiotic interactions. Additional examples relate to analyses of genome-sequence data, climate change and planetary health, habitability, and astrobiology. We assess the importance of thought at each step of the research process; the roles of natural philosophy, and inconsistencies in logic and language, as drivers of scientific progress; the value of thought experiments; the use and limitations of artificial intelligence technologies, including their potential for interdisciplinary and transdisciplinary research; and other instances when theory is the most-direct and most-scientifically robust route to scientific novelty including the development of techniques for practical experimentation or fieldwork. We highlight the intrinsic need for human engagement in scientific innovation, an issue pertinent to the ongoing controversy over papers authored using/authored by artificial intelligence (such as the large language model/chatbot ChatGPT). Other issues discussed are the way in which aspects of language can bias thinking towards the spatial rather than the temporal (and how this biased thinking can lead to skewed scientific terminology); receptivity to research that is non-mainstream; and the importance of theory-based science in education and epistemology. Whereas we briefly highlight classic works (those by Oakes Ames, Francis H.C. Crick and James D. Watson, Charles R. Darwin, Albert Einstein, James E. Lovelock, Lynn Margulis, Gilbert Ryle, Erwin R.J.A. Schrödinger, Alan M. Turing, and others), the focus is on microbiology studies that are more-recent, discussing these in the context of the scientific process and the types of scientific novelty that they represent. These include several studies carried out during the 2020 to 2022 lockdowns of the COVID-19 pandemic when access to research laboratories was disallowed (or limited). We interviewed the authors of some of the featured microbiology-related papers and, although we ourselves are involved in laboratory experiments and practical fieldwork, also drew from our own research experiences showing that such studies can not only produce new scientific findings but can also transcend barriers between disciplines, act counter to scientific reductionism, integrate biological data across different timescales and levels of complexity, and circumvent constraints imposed by practical techniques. In relation to urgent research needs, we believe that climate change and other global challenges may require approaches beyond the experiment.

Fung DLX, Li X, Leung CK, Hu P. A self-knowledge distillation-driven CNN-LSTM model for predicting disease outcomes using longitudinal microbiome data. Bioinform Adv 2023;3:vbad059. [PMID: 37228387 PMCID: PMC10203376 DOI: 10.1093/bioadv/vbad059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Revised: 04/03/2023] [Accepted: 05/01/2023] [Indexed: 05/27/2023]

Khachatryan L, Xiang Y, Ivanov A, Glaab E, Graham G, Granata I, Giordano M, Maddalena L, Piccirillo M, Manipur I, Baruzzo G, Cappellato M, Avot B, Stan A, Battey J, Lo Sasso G, Boue S, Ivanov NV, Peitsch MC, Hoeng J, Falquet L, Di Camillo B, Guarracino MR, Ulyantsev V, Sierro N, Poussin C. Results and lessons learned from the sbv IMPROVER metagenomics diagnostics for inflammatory bowel disease challenge. Sci Rep 2023;13:6303. [PMID: 37072468 PMCID: PMC10113391 DOI: 10.1038/s41598-023-33050-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/29/2022] [Accepted: 04/06/2023] [Indexed: 05/03/2023]

Affiliation(s)

Lusine Khachatryan: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Yang Xiang: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Artem Ivanov: ITMO University, St. Petersburg, Russian Federation
Enrico Glaab: University of Luxembourg, Luxembourg, Luxembourg
Garrett Graham: Georgetown University, Washington, DC, USA
Ilaria Granata: Consiglio Nazionale delle Ricerche, Naples, Italy
Maurizio Giordano: Consiglio Nazionale delle Ricerche, Naples, Italy
Lucia Maddalena: Consiglio Nazionale delle Ricerche, Naples, Italy
Marina Piccirillo: Consiglio Nazionale delle Ricerche, Naples, Italy
Ichcha Manipur: Consiglio Nazionale delle Ricerche, Naples, Italy
Giacomo Baruzzo: University of Padua, Padua, Italy
Marco Cappellato: University of Padua, Padua, Italy
Batiste Avot: University of Fribourg, Fribourg, Switzerland
Adrian Stan: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
James Battey: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Giuseppe Lo Sasso: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Stephanie Boue: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Nikolai V Ivanov: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Manuel C Peitsch: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Julia Hoeng: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Laurent Falquet: University of Fribourg, Fribourg, Switzerland
Barbara Di Camillo: University of Padua, Padua, Italy
Mario R Guarracino: Consiglio Nazionale delle Ricerche, Naples, Italy
Vladimir Ulyantsev: ITMO University, St. Petersburg, Russian Federation
Nicolas Sierro: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland
Carine Poussin: PMI R&D, Philip Morris Products S.A., Quai Jeanrenaud 5, 2000, Neuchâtel, Switzerland

Boodaghidizaji M, Jungles T, Chen T, Zhang B, Landay A, Keshavarzian A, Hamaker B, Ardekani A. Machine learning based gut microbiota pattern and response to fiber as a diagnostic tool for chronic inflammatory diseases. bioRxiv 2023:2023.03.27.534466. [PMID: 37034781 PMCID: PMC10081192 DOI: 10.1101/2023.03.27.534466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]

Liang C, Wagstaff J, Aharony N, Schmit V, Manheim D. Managing the Transition to Widespread Metagenomic Monitoring: Policy Considerations for Future Biosurveillance. Health Secur 2023;21:34-45. [PMID: 36629860 PMCID: PMC9940815 DOI: 10.1089/hs.2022.0029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2023]

Yang L, Wang S, Altman RB. POPDx: an automated framework for patient phenotyping across 392 246 individuals in the UK Biobank study. J Am Med Inform Assoc 2023;30:245-255. [PMID: 36469791 PMCID: PMC9846671 DOI: 10.1093/jamia/ocac226] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/24/2022] [Revised: 10/19/2022] [Accepted: 11/18/2022] [Indexed: 12/12/2022]

Loganathan T, Priya Doss C G. The influence of machine learning technologies in gut microbiome research and cancer studies - A review. Life Sci 2022;311:121118. [DOI: 10.1016/j.lfs.2022.121118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 10/19/2022] [Accepted: 10/19/2022] [Indexed: 11/18/2022]

Hernández Medina R, Kutuzova S, Nielsen KN, Johansen J, Hansen LH, Nielsen M, Rasmussen S. Machine learning and deep learning applications in microbiome research. ISME Commun 2022;2:98. [PMID: 37938690 PMCID: PMC9723725 DOI: 10.1038/s43705-022-00182-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 03/24/2022] [Revised: 09/12/2022] [Accepted: 09/16/2022] [Indexed: 05/27/2023]

Wen LY, Wang X, Min F. Cost-sensitive microbial data augmentation through matrix factorization. Appl Intell 2022. [DOI: 10.1007/s10489-022-04187-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

Díez López C, Montiel González D, Vidaki A, Kayser M. Prediction of Smoking Habits From Class-Imbalanced Saliva Microbiome Data Using Data Augmentation and Machine Learning. Front Microbiol 2022;13:886201. [PMID: 35928158 PMCID: PMC9343866 DOI: 10.3389/fmicb.2022.886201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Accepted: 06/21/2022] [Indexed: 11/24/2022]

Abstract

Human microbiome research is moving from characterization and association studies to translational applications in medical research, clinical diagnostics, and others. One of these applications is the prediction of human traits, where machine learning (ML) methods are often employed, but face practical challenges. Class imbalance in available microbiome data is one of the major problems, which, if unaccounted for, leads to spurious prediction accuracies and limits the classifier's generalization. Here, we investigated the predictability of smoking habits from class-imbalanced saliva microbiome data by combining data augmentation techniques to account for class imbalance with ML methods for prediction. We collected publicly available saliva 16S rRNA gene sequencing data and smoking habit metadata demonstrating a serious class imbalance problem, i.e., 175 current vs. 1,070 non-current smokers. Three data augmentation techniques (synthetic minority over-sampling technique, adaptive synthetic, and tree-based associative data augmentation) were applied together with seven ML methods: logistic regression, k-nearest neighbors, support vector machine with linear and radial kernels, decision trees, random forest, and extreme gradient boosting. K-fold nested cross-validation was used with the different augmented data types and baseline non-augmented data to validate the prediction outcome. Combining data augmentation with ML generally outperformed baseline methods in our dataset. The final prediction model combined tree-based associative data augmentation and support vector machine with linear kernel, and achieved a classification performance expressed as Matthews correlation coefficient of 0.36 and AUC of 0.81. Our method successfully addresses the problem of class imbalance in microbiome data for reliable prediction of smoking habits.
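The abstract's point that class imbalance "leads to spurious prediction accuracies" can be illustrated with the class sizes it reports (175 current vs. 1,070 non-current smokers). The following is a minimal sketch, not the authors' code: the helper names and the trivial "always non-current" classifier are assumptions made for illustration. It shows why a metric such as the Matthews correlation coefficient (MCC) is more informative than accuracy here.

```python
import math

def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

def accuracy(tp, fp, fn, tn):
    """Fraction of samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

# Class sizes taken from the abstract.
N_POS, N_NEG = 175, 1070

# Hypothetical trivial classifier that labels every sample "non-current".
trivial = dict(tp=0, fp=0, fn=N_POS, tn=N_NEG)
trivial_accuracy = accuracy(**trivial)    # high, driven only by the majority class
trivial_mcc = matthews_corrcoef(**trivial)  # zero: no discriminative ability

print(f"accuracy={trivial_accuracy:.3f}, MCC={trivial_mcc:.3f}")
```

Under these assumptions the trivial classifier reaches about 86% accuracy while its MCC is zero, which is the kind of misleading accuracy the abstract warns about.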

Zhou X, Chen L, Liu HX. Applications of Machine Learning Models to Predict and Prevent Obesity: A Mini-Review. Front Nutr 2022;9:933130. [PMID: 35866076 PMCID: PMC9294383 DOI: 10.3389/fnut.2022.933130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 05/19/2022] [Indexed: 11/28/2022]

Li B, Zhong D, Qiao J, Jiang X. GNPI: Graph normalization to integrate phylogenetic information for metagenomic host phenotype prediction. Methods 2022;205:11-17. [PMID: 35636652 DOI: 10.1016/j.ymeth.2022.05.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 05/17/2022] [Accepted: 05/26/2022] [Indexed: 11/24/2022]

Bakir-Gungor B, Hacılar H, Jabeer A, Nalbantoglu OU, Aran O, Yousef M. Inflammatory bowel disease biomarkers of human gut microbiota selected via different feature selection methods. PeerJ 2022;10:e13205. [PMID: 35497193 PMCID: PMC9048649 DOI: 10.7717/peerj.13205] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/03/2021] [Accepted: 03/10/2022] [Indexed: 01/12/2023]

Abstract

The tremendous boost in next generation sequencing and in the "omics" technologies makes it possible to characterize the human gut microbiome, the collective genomes of the microbial community that reside in our gastrointestinal tract. Although some of these microorganisms are considered to be essential regulators of our immune system, the alteration of the complexity and eubiotic state of microbiota might promote autoimmune and inflammatory disorders such as diabetes, rheumatoid arthritis, inflammatory bowel diseases (IBD), obesity, and carcinogenesis. IBD, comprising Crohn's disease and ulcerative colitis, is a gut-related, multifactorial disease with an unknown etiology. IBD presents defects in the detection and control of the gut microbiota, associated with unbalanced immune reactions, genetic mutations that confer susceptibility to the disease, and complex environmental conditions such as westernized lifestyle. Although some existing studies attempt to unveil the composition and functional capacity of the gut microbiome in relation to IBD, a comprehensive picture of the gut microbiome in IBD patients is far from being complete. Due to the complexity of metagenomic studies, the applications of the state-of-the-art machine learning techniques became popular to address a wide range of questions in the field of metagenomic data analysis. In this regard, using an IBD-associated metagenomics dataset, this study utilizes both supervised and unsupervised machine learning algorithms (i) to generate a classification model that aids IBD diagnosis, (ii) to discover IBD-associated biomarkers, and (iii) to discover subgroups of IBD patients using k-means and hierarchical clustering approaches.
To deal with the high dimensionality of features, we applied robust feature selection algorithms such as Conditional Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF), min redundancy max relevance (mRMR), Select K Best (SKB), Information Gain (IG) and Extreme Gradient Boosting (XGBoost). In our experiments with 100-fold Monte Carlo cross-validation (MCCV), XGBoost, IG, and SKB methods showed a considerable effect in terms of minimizing the microbiota used for the diagnosis of IBD and thus reducing the cost and time. We observed that compared to Decision Tree, Support Vector Machine, Logitboost, Adaboost, and stacking ensemble classifiers, our Random Forest classifier resulted in better performance measures for the classification of IBD. Our findings revealed potential microbiome-mediated mechanisms of IBD and these findings might be useful for the development of microbiome-based diagnostics.
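The Monte Carlo cross-validation (MCCV) mentioned in the abstract can be sketched as repeated random train/test splits. The example below is an illustrative sketch only: the function name, the 20% test fraction, the seed, and the sample count are assumptions, since the abstract states only that 100 repetitions were used.

```python
import random

def monte_carlo_splits(n_samples, n_repeats=100, test_fraction=0.2, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits.

    Unlike k-fold cross-validation, each repetition draws a fresh random
    test set, so test sets from different repetitions may overlap.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of 50 samples, split 100 times.
splits = list(monte_carlo_splits(n_samples=50))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

In practice a classifier would be fitted on each training subset and scored on the matching test subset, with the performance measures averaged over all repetitions.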

Giliberti R, Cavaliere S, Mauriello IE, Ercolini D, Pasolli E. Host phenotype classification from human microbiome data is mainly driven by the presence of microbial taxa. PLoS Comput Biol 2022;18:e1010066. [PMID: 35446845 DOI: 10.1371/journal.pcbi.1010066] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/13/2021] [Revised: 05/03/2022] [Accepted: 03/29/2022] [Indexed: 12/14/2022]

Abstract

Machine learning-based classification approaches are widely used to predict host phenotypes from microbiome data. Classifiers are typically employed by considering operational taxonomic units or relative abundance profiles as input features. Such types of data are intrinsically sparse, which opens the opportunity to make predictions from the presence/absence rather than the relative abundance of microbial taxa. This also poses the question whether it is the presence rather than the abundance of particular taxa to be relevant for discrimination purposes, an aspect that has been so far overlooked in the literature. In this paper, we aim at filling this gap by performing a meta-analysis on 4,128 publicly available metagenomes associated with multiple case-control studies. At species-level taxonomic resolution, we show that it is the presence rather than the relative abundance of specific microbial taxa to be important when building classification models. Such findings are robust to the choice of the classifier and confirmed by statistical tests applied to identifying differentially abundant/present taxa. Results are further confirmed at coarser taxonomic resolutions and validated on 4,026 additional 16S rRNA samples coming from 30 public case-control studies.

The composition of the human microbiome has been linked to a large number of different diseases. In this context, classification methodologies based on machine learning approaches have represented a promising tool for diagnostic purposes from metagenomics data. The link between microbial population composition and host phenotypes has been usually performed by considering taxonomic profiles represented by relative abundances of microbial species. In this study, we show that it is more the presence rather than the relative abundance of microbial taxa to be relevant to maximize classification accuracy. This is accomplished by conducting a meta-analysis on more than 4,000 shotgun metagenomes coming from 25 case-control studies and in which original relative abundance data are degraded to presence/absence profiles. Findings are also extended to 16S rRNA data and advance the research field in building prediction models directly from human microbiome data.
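The step in which "relative abundance data are degraded to presence/absence profiles" can be sketched as a simple thresholding of each abundance value. This is a minimal illustration under stated assumptions: the function name, the zero threshold, and the example values are hypothetical and are not taken from the study.

```python
def to_presence_absence(profiles, threshold=0.0):
    """Binarize abundance profiles: 1 if a taxon's abundance exceeds threshold, else 0."""
    return [[1 if value > threshold else 0 for value in sample] for sample in profiles]

# Hypothetical relative abundances (rows: samples, columns: taxa).
abundances = [
    [0.60, 0.00, 0.40],
    [0.05, 0.95, 0.00],
]

presence = to_presence_absence(abundances)
print(presence)
```

The binarized matrix can then be passed to the same classifiers as the original abundance matrix, which is what allows the two representations to be compared.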

Rashid J, Batool S, Kim J, Wasif Nisar M, Hussain A, Juneja S, Kushwaha R. An Augmented Artificial Intelligence Approach for Chronic Diseases Prediction. Front Public Health 2022;10:860396. [PMID: 35433587 PMCID: PMC9008324 DOI: 10.3389/fpubh.2022.860396] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/22/2022] [Accepted: 02/22/2022] [Indexed: 12/23/2022]

Michel-Mata S, Wang XW, Liu YY, Angulo MT. Predicting microbiome compositions from species assemblages through deep learning. Imeta 2022;1:e3. [PMID: 35757098 PMCID: PMC9221840 DOI: 10.1002/imt2.3] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 05/13/2023]

Curry KD, Nute MG, Treangen TJ. It takes guts to learn: machine learning techniques for disease detection from the gut microbiome. Emerg Top Life Sci 2021;5:815-827. [PMID: 34779841 PMCID: PMC8786294 DOI: 10.1042/etls20210213] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2021] [Revised: 09/29/2021] [Accepted: 10/06/2021] [Indexed: 02/01/2023]

Narayana JK, Mac Aogáin M, Goh WWB, Xia K, Tsaneva-Atanasova K, Chotirmall SH. Mathematical-based microbiome analytics for clinical translation. Comput Struct Biotechnol J 2021;19:6272-6281. [PMID: 34900137 PMCID: PMC8637001 DOI: 10.1016/j.csbj.2021.11.029] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2021] [Revised: 11/17/2021] [Accepted: 11/17/2021] [Indexed: 12/20/2022] Open

Deng Z, Zhang J, Li J, Zhang X. Application of Deep Learning in Plant-Microbiota Association Analysis. Front Genet 2021;12:697090. [PMID: 34691142 PMCID: PMC8531731 DOI: 10.3389/fgene.2021.697090] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2021] [Accepted: 08/31/2021] [Indexed: 01/04/2023] Open

Gao J, Zhang X, Tian L, Liu Y, Wang J, Li Z, Hu X. MTGNN: Multi-Task Graph Neural Network based few-shot learning for disease similarity measurement. Methods 2021;198:88-95. [PMID: 34700014 DOI: 10.1016/j.ymeth.2021.10.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2021] [Revised: 10/16/2021] [Accepted: 10/18/2021] [Indexed: 11/24/2022] Open

Zhao Z, Woloszynek S, Agbavor F, Mell JC, Sokhansanj BA, Rosen GL. Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network. PLoS Comput Biol 2021;17:e1009345. [PMID: 34550967 PMCID: PMC8496832 DOI: 10.1371/journal.pcbi.1009345] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2020] [Revised: 10/07/2021] [Accepted: 08/12/2021] [Indexed: 01/04/2023] Open

Abstract

Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short and long term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data, which exploits convolutional neural networks, recurrent neural networks, and attention mechanisms to predict taxonomic classifications and sample-associated attributes, such as the relationship between the microbiome and host phenotype, on the read/sequence level. In this paper, we develop this novel deep learning approach and evaluate its application to amplicon sequences. We apply our approach to short DNA reads and full sequences of 16S ribosomal RNA (rRNA) marker genes, which identify the heterogeneity of a microbial community sample. We demonstrate that our implementation of a novel attention-based deep network architecture, Read2Pheno, achieves read-level phenotypic prediction. Training Read2Pheno models will encode sequences (reads) into dense, meaningful representations: learned embedded vectors output from the intermediate layer of the network model, which can provide biological insight when visualized. The attention layer of Read2Pheno models can also automatically identify nucleotide regions in reads/sequences which are particularly informative for classification. As such, this novel approach can avoid pre/post-processing and manual interpretation required with conventional approaches to microbiome sequence classification. We further show, as proof-of-concept, that aggregating read-level information can robustly predict microbial community properties, host phenotype, and taxonomic classification, with performance at least comparable to conventional approaches. An implementation of the attention-based deep learning network is available at https://github.com/EESI/sequence_attention (a python package) and https://github.com/EESI/seq2att (a command line tool).


Sun Q, Peng Y, Liu J. A reference-free approach for cell type classification with scRNA-seq. iScience 2021;24:102855. [PMID: 34381979 PMCID: PMC8335627 DOI: 10.1016/j.isci.2021.102855] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2021] [Revised: 05/07/2021] [Accepted: 07/08/2021] [Indexed: 11/29/2022] Open

Sharma D, Xu W. phyLoSTM: a novel deep learning model on disease prediction from longitudinal microbiome data. Bioinformatics 2021;37:3707-3714. [PMID: 34213529 DOI: 10.1093/bioinformatics/btab482] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2021] [Revised: 05/24/2021] [Accepted: 06/30/2021] [Indexed: 11/12/2022] Open

García-Jiménez B, Muñoz J, Cabello S, Medina J, Wilkinson MD. Predicting microbiomes through a deep latent space. Bioinformatics 2021;37:1444-1451. [PMID: 33289510 PMCID: PMC8208755 DOI: 10.1093/bioinformatics/btaa971] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2020] [Revised: 10/21/2020] [Accepted: 11/06/2020] [Indexed: 12/28/2022] Open

Abstract

Motivation

Microbial communities influence their environment by modifying the availability of compounds, such as nutrients or chemical elicitors. Knowing the microbial composition of a site is therefore relevant to improve productivity or health. However, sequencing facilities are not always available, or may be prohibitively expensive in some cases. Thus, it would be desirable to computationally predict the microbial composition from more accessible, easily-measured features.

Results

Integrating deep learning techniques with microbiome data, we propose an artificial neural network architecture based on heterogeneous autoencoders to condense the long vector of microbial abundance values into a deep latent space representation. Then, we design a model to predict the deep latent space and, consequently, to predict the complete microbial composition using environmental features as input. The performance of our system is examined using the rhizosphere microbiome of Maize. We reconstruct the microbial composition (717 taxa) from the deep latent space (10 values) with high fidelity (>0.9 Pearson correlation). We then successfully predict microbial composition from environmental variables, such as plant age, temperature or precipitation (0.73 Pearson correlation, 0.42 Bray–Curtis). We extend this to predict microbiome composition under hypothetical scenarios, such as future climate change conditions. Finally, via transfer learning, we predict microbial composition in a distinct scenario with only 100 sequences, and distinct environmental features. We propose that our deep latent space may assist microbiome-engineering strategies when technical or financial resources are limited, through predicting current or future microbiome compositions.
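
The two agreement measures quoted above, Pearson correlation and Bray–Curtis, can be computed between an observed and a predicted abundance vector roughly as follows. This is a plain-Python sketch using the standard textbook definitions, with Bray–Curtis taken as a dissimilarity (0 means identical); the helper names and the example vectors are assumptions, and the abstract does not state which exact implementation the authors used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    return (sum(abs(a - b) for a, b in zip(x, y))
            / sum(a + b for a, b in zip(x, y)))


# Hypothetical observed vs. predicted relative abundances for four taxa.
observed = [0.50, 0.30, 0.15, 0.05]
predicted = [0.45, 0.35, 0.10, 0.10]
print(pearson(observed, predicted), bray_curtis(observed, predicted))
```

Higher Pearson correlation and lower Bray–Curtis dissimilarity both indicate a predicted composition closer to the observed one.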

Availability and implementation

Software, results and data are available at https://github.com/jorgemf/DeepLatentMicrobiome

Supplementary information

Supplementary data are available at Bioinformatics online.


Chen X, Liu L, Zhang W, Yang J, Wong KC. Human host status inference from temporal microbiome changes via recurrent neural networks. Brief Bioinform 2021;22:6307015. [PMID: 34151933 DOI: 10.1093/bib/bbab223] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2021] [Revised: 04/21/2021] [Accepted: 04/21/2021] [Indexed: 01/04/2023] Open

DiMucci D, Kon M, Segrè D. BowSaw: Inferring Higher-Order Trait Interactions Associated With Complex Biological Phenotypes. Front Mol Biosci 2021;8:663532. [PMID: 34222331 PMCID: PMC8245782 DOI: 10.3389/fmolb.2021.663532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2021] [Accepted: 05/24/2021] [Indexed: 11/15/2022] Open

Khan K, Ramsahai E. Maintaining proper health records improves machine learning predictions for novel 2019-nCoV. BMC Med Inform Decis Mak 2021;21:172. [PMID: 34044839 PMCID: PMC8159067 DOI: 10.1186/s12911-021-01537-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2020] [Accepted: 05/23/2021] [Indexed: 11/19/2022] Open

Wu S, Chen Y, Li Z, Li J, Zhao F, Su X. Towards multi-label classification: Next step of machine learning for microbiome research. Comput Struct Biotechnol J 2021;19:2742-2749. [PMID: 34093989 PMCID: PMC8131981 DOI: 10.1016/j.csbj.2021.04.054] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2021] [Revised: 04/21/2021] [Accepted: 04/22/2021] [Indexed: 11/22/2022] Open

Zhang W, Chen X, Wong KC. Noninvasive early diagnosis of intestinal diseases based on artificial intelligence in genomics and microbiome. J Gastroenterol Hepatol 2021;36:823-831. [PMID: 33880763 DOI: 10.1111/jgh.15500] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/10/2021] [Revised: 03/15/2021] [Accepted: 03/17/2021] [Indexed: 12/15/2022]

Wei ZG, Zhang XD, Cao M, Liu F, Qian Y, Zhang SW. Comparison of Methods for Picking the Operational Taxonomic Units From Amplicon Sequences. Front Microbiol 2021;12:644012. [PMID: 33841367 PMCID: PMC8024490 DOI: 10.3389/fmicb.2021.644012] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2020] [Accepted: 02/17/2021] [Indexed: 12/31/2022] Open

Moreno-Indias I, Lahti L, Nedyalkova M, Elbere I, Roshchupkin G, Adilovic M, Aydemir O, Bakir-Gungor B, Santa Pau ECD, D’Elia D, Desai MS, Falquet L, Gundogdu A, Hron K, Klammsteiner T, Lopes MB, Marcos-Zambrano LJ, Marques C, Mason M, May P, Pašić L, Pio G, Pongor S, Promponas VJ, Przymus P, Saez-Rodriguez J, Sampri A, Shigdel R, Stres B, Suharoschi R, Truu J, Truică CO, Vilne B, Vlachakis D, Yilmaz E, Zeller G, Zomer AL, Gómez-Cabrero D, Claesson MJ. Statistical and Machine Learning Techniques in Human Microbiome Studies: Contemporary Challenges and Solutions. Front Microbiol 2021;12:635781. [PMID: 33692771 PMCID: PMC7937616 DOI: 10.3389/fmicb.2021.635781] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Accepted: 01/28/2021] [Indexed: 12/23/2022] Open

Affiliation(s)

Isabel Moreno-Indias Instituto de Investigación Biomédica de Málaga (IBIMA), Unidad de Gestión Clínica de Endocrinología y Nutrición, Hospital Clínico Universitario Virgen de la Victoria, Universidad de Málaga, Málaga, Spain Centro de Investigación Biomédica en Red de Fisiopatología de la Obesidad y la Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
Leo Lahti Department of Computing, University of Turku, Turku, Finland
Miroslava Nedyalkova Human Genetics and Disease Mechanisms, Latvian Biomedical Research and Study Centre, Riga, Latvia
Ilze Elbere Latvian Biomedical Research and Study Centre, Riga, Latvia
Gennady Roshchupkin Department of Epidemiology, Erasmus Medical Center, Rotterdam, Netherlands
Muhamed Adilovic Department of Genetics and Bioengineering, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina
Onder Aydemir Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Turkey
Burcu Bakir-Gungor Department of Computer Engineering, Abdullah Gul University, Kayseri, Turkey
Enrique Carrillo-de Santa Pau Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
Domenica D’Elia Department for Biomedical Sciences, Institute for Biomedical Technologies, National Research Council, Bari, Italy
Mahesh S. Desai Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg Odense Research Center for Anaphylaxis, Department of Dermatology and Allergy Center, Odense University Hospital, University of Southern Denmark, Odense, Denmark
Laurent Falquet Department of Biology, University of Fribourg, Fribourg, Switzerland Swiss Institute of Bioinformatics, Lausanne, Switzerland
Aycan Gundogdu Department of Microbiology and Clinical Microbiology, Faculty of Medicine, Erciyes University, Kayseri, Turkey Metagenomics Laboratory, Genome and Stem Cell Center (GenKök), Erciyes University, Kayseri, Turkey
Karel Hron Department of Mathematical Analysis and Applications of Mathematics, Palacký University, Olomouc, Czechia
Thomas Klammsteiner Department of Microbiology, University of Innsbruck, Innsbruck, Austria
Marta B. Lopes NOVA Laboratory for Computer Science and Informatics (NOVA LINCS), FCT, UNL, Caparica, Portugal Centro de Matemática e Aplicações (CMA), FCT, UNL, Caparica, Portugal
Laura Judith Marcos-Zambrano Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
Cláudia Marques CINTESIS, NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal
Michael Mason Computational Oncology, Sage Bionetworks, Seattle, WA, United States
Patrick May Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Lejla Pašić Sarajevo Medical School, University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
Gianvito Pio Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
Sándor Pongor Faculty of Information Technology and Bionics, Pázmány University, Budapest, Hungary
Vasilis J. Promponas Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
Piotr Przymus Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
Julio Saez-Rodriguez Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine and Heidelberg University Hospital, Heidelberg, Germany
Alexia Sampri Division of Informatics, Imaging and Data Sciences, School of Health Sciences, University of Manchester, Manchester, United Kingdom
Rajesh Shigdel Department of Clinical Science, University of Bergen, Bergen, Norway
Blaz Stres Jozef Stefan Institute, Ljubljana, Slovenia Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia Faculty of Civil and Geodetic Engineering, University of Ljubljana, Ljubljana, Slovenia
Ramona Suharoschi Molecular Nutrition and Proteomics Lab, Faculty of the Food Science and Technology, Institute of Life Sciences, University of Agricultural Sciences and Veterinary Medicine of Cluj-Napoca, Cluj-Napoca, Romania
Jaak Truu Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
Ciprian-Octavian Truică Department of Computer Science and Engineering, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
Baiba Vilne Bioinformatics Research Unit, Riga Stradins University, Riga, Latvia
Dimitrios Vlachakis Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
Ercument Yilmaz Department of Computer Technologies, Karadeniz Technical University, Trabzon, Turkey
Georg Zeller European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
Aldert L. Zomer Department of Infectious Diseases and Immunology, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands
David Gómez-Cabrero Navarrabiomed, Complejo Hospitalario de Navarra (CHN), IdiSNA, Universidad Pública de Navarra (UPNA), Pamplona, Spain
Marcus J. Claesson School of Microbiology and APC Microbiome Ireland, University College Cork, Cork, Ireland


Marcos-Zambrano LJ, Karaduzovic-Hadziabdic K, Loncar Turukalo T, Przymus P, Trajkovik V, Aasmets O, Berland M, Gruca A, Hasic J, Hron K, Klammsteiner T, Kolev M, Lahti L, Lopes MB, Moreno V, Naskinova I, Org E, Paciência I, Papoutsoglou G, Shigdel R, Stres B, Vilne B, Yousef M, Zdravevski E, Tsamardinos I, Carrillo de Santa Pau E, Claesson MJ, Moreno-Indias I, Truu J. Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment. Front Microbiol 2021;12:634511. [PMID: 33737920 PMCID: PMC7962872 DOI: 10.3389/fmicb.2021.634511] [Citation(s) in RCA: 113] [Impact Index Per Article: 37.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2020] [Accepted: 02/01/2021] [Indexed: 12/19/2022] Open

Abstract

The number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to deeply explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all information from these biological datasets, taking into account the peculiarities of microbiome data, i.e., compositional, heterogeneous and sparse nature of these datasets. The possibility of predicting host-phenotypes based on taxonomy-informed feature selection to establish an association between microbiome and predict disease states is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, infer host phenotypes to predict diseases and use microbial communities to stratify patients by their characterization of state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here is more related to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review covering this broad topic is aligned with the scoping review methodology. 
The manual identification of data sources has been complemented with: (1) automated publication search through digital libraries of the three major publishers using natural language processing (NLP) Toolkit, and (2) an automated identification of relevant software repositories on GitHub and ranking of the related research papers relying on learning to rank approach.


Affiliation(s)

Laura Judith Marcos-Zambrano Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
Kanita Karaduzovic-Hadziabdic Faculty of Engineering and Natural Sciences, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina
Tatjana Loncar Turukalo Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
Piotr Przymus Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
Vladimir Trajkovik Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia
Oliver Aasmets Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
Magali Berland Université Paris-Saclay, INRAE, MGP, Jouy-en-Josas, France
Aleksandra Gruca Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, Poland
Jasminka Hasic University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
Karel Hron Department of Mathematical Analysis and Applications of Mathematics, Palacký University, Olomouc, Czechia
Thomas Klammsteiner Department of Microbiology, University of Innsbruck, Innsbruck, Austria
Mikhail Kolev South West University “Neofit Rilski”, Blagoevgrad, Bulgaria
Leo Lahti Department of Computing, University of Turku, Turku, Finland
Marta B. Lopes NOVA Laboratory for Computer Science and Informatics (NOVA LINCS), FCT, UNL, Caparica, Portugal Centro de Matemática e Aplicações (CMA), FCT, UNL, Caparica, Portugal
Victor Moreno Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), Barcelona, Spain Colorectal Cancer Group, Institut de Recerca Biomedica de Bellvitge (IDIBELL), Barcelona, Spain Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Barcelona, Spain Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
Irina Naskinova South West University “Neofit Rilski”, Blagoevgrad, Bulgaria
Elin Org Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia
Inês Paciência EPIUnit – Instituto de Saúde Pública da Universidade do Porto, Porto, Portugal
Georgios Papoutsoglou Department of Computer Science, University of Crete, Heraklion, Greece
Rajesh Shigdel Department of Clinical Science, University of Bergen, Bergen, Norway
Blaz Stres Group for Microbiology and Microbial Biotechnology, Department of Animal Science, University of Ljubljana, Ljubljana, Slovenia
Baiba Vilne Bioinformatics Research Unit, Riga Stradins University, Riga, Latvia
Malik Yousef Department of Information Systems, Zefat Academic College, Zefat, Israel Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel
Eftim Zdravevski Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia
Ioannis Tsamardinos Department of Computer Science, University of Crete, Heraklion, Greece
Enrique Carrillo de Santa Pau Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
Marcus J. Claesson School of Microbiology & APC Microbiome Ireland, University College Cork, Cork, Ireland
Isabel Moreno-Indias Unidad de Gestión Clínica de Endocrinología y Nutrición, Instituto de Investigación Biomédica de Málaga (IBIMA), Hospital Clínico Universitario Virgen de la Victoria, Universidad de Málaga, Málaga, Spain Centro de Investigación Biomédica en Red de Fisiopatología de la Obesidad y la Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
Jaak Truu Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia


Li W, Liu H, Cheng F, Li Y, Li S, Yan J. Artificial intelligence applications for oncological positron emission tomography imaging. Eur J Radiol 2020;134:109448. [PMID: 33307463 DOI: 10.1016/j.ejrad.2020.109448] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2020] [Revised: 10/07/2020] [Accepted: 11/26/2020] [Indexed: 12/16/2022]

Iadanza E, Fabbri R, Bašić-Čičak D, Amedei A, Telalovic JH. Gut microbiota and artificial intelligence approaches: A scoping review. Health Technol 2020;10:1343-58. [DOI: 10.1007/s12553-020-00486-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]

Cammarota G, Ianiro G, Ahern A, Carbone C, Temko A, Claesson MJ, Gasbarrini A, Tortora G. Gut microbiome, big data and machine learning to promote precision medicine for cancer. Nat Rev Gastroenterol Hepatol 2020;17:635-648. [PMID: 32647386 DOI: 10.1038/s41575-020-0327-3] [Citation(s) in RCA: 135] [Impact Index Per Article: 33.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 06/02/2020] [Indexed: 12/13/2022]

Su X, Jing G, Zhang Y, Wu S. Method development for cross-study microbiome data mining: Challenges and opportunities. Comput Struct Biotechnol J 2020;18:2075-2080. [PMID: 32802279 PMCID: PMC7419250 DOI: 10.1016/j.csbj.2020.07.020] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2020] [Revised: 07/22/2020] [Accepted: 07/24/2020] [Indexed: 01/26/2023] Open

Seneviratne CJ, Balan P, Suriyanarayanan T, Lakshmanan M, Lee DY, Rho M, Jakubovics N, Brandt B, Crielaard W, Zaura E. Oral microbiome-systemic link studies: perspectives on current limitations and future artificial intelligence-based approaches. Crit Rev Microbiol 2020;46:288-299. [PMID: 32434436 DOI: 10.1080/1040841x.2020.1766414] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]

Reiman D, Metwally AA, Sun J, Dai Y. PopPhy-CNN: A Phylogenetic Tree Embedded Architecture for Convolutional Neural Networks to Predict Host Phenotype From Metagenomic Data. IEEE J Biomed Health Inform 2020;24:2993-3001. [PMID: 32396115 DOI: 10.1109/jbhi.2020.2993761] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]

Khan S, Kelly L. Multiclass Disease Classification from Microbial Whole-Community Metagenomes. Pac Symp Biocomput 2020;25:55-66. [PMID: 31797586 PMCID: PMC7120658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

van den Bogert B, Boekhorst J, Pirovano W, May A. On the Role of Bioinformatics and Data Science in Industrial Microbiome Applications. Front Genet 2019;10:721. [PMID: 31447883 PMCID: PMC6696986 DOI: 10.3389/fgene.2019.00721] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2019] [Accepted: 07/09/2019] [Indexed: 01/08/2023] Open