1
Tseng KK, Koehler H, Becker DJ, Gibb R, Carlson CJ, Pilar Fernandez MD, Seifert SN. Viral genomic features predict Orthopoxvirus reservoir hosts. Commun Biol 2025; 8:309. [PMID: 40000824 PMCID: PMC11862092 DOI: 10.1038/s42003-025-07746-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Accepted: 02/14/2025] [Indexed: 02/27/2025]
Abstract
Orthopoxviruses (OPVs), including the causative agents of smallpox and mpox, have led to devastating outbreaks in human populations worldwide. However, the discontinuation of smallpox vaccination, which also provides cross-protection against related OPVs, has diminished global immunity to OPVs more broadly. We apply machine learning models incorporating both host ecological and viral genomic features to predict likely reservoirs of OPVs. We demonstrate that incorporating viral genomic features in addition to host ecological traits enhanced the accuracy of potential OPV host predictions, highlighting the importance of host-virus molecular interactions in predicting potential host species. We identify hotspots for geographic regions rich with potential OPV hosts in parts of southeast Asia, equatorial Africa, and the Amazon, revealing high overlap between regions predicted to have a high number of potential OPV host species and those with the lowest smallpox vaccination coverage, indicating a heightened risk for the emergence or establishment of zoonotic OPVs. Our findings can be used to target wildlife surveillance, particularly related to concerns about mpox establishment beyond its historical range.
Affiliation(s)
- Katie K Tseng
  - Paul G. Allen School for Global Health, Washington State University, Pullman, WA, USA
- Heather Koehler
  - School of Molecular Biosciences, Washington State University, Pullman, WA, USA
- Daniel J Becker
  - School of Biological Sciences, University of Oklahoma, Norman, OK, USA
- Rory Gibb
  - Centre for Biodiversity and Environment Research, Department of Genetics, Evolution and Environment, University College London, London, UK
  - People & Nature Lab, UCL East, University College London, London, UK
- Colin J Carlson
  - Department of Epidemiology of Microbial Diseases, Yale University School of Public Health, New Haven, CT, USA
- Stephanie N Seifert
  - Paul G. Allen School for Global Health, Washington State University, Pullman, WA, USA
2
Karwowska Z, Aasmets O, Kosciolek T, Org E. Effects of data transformation and model selection on feature importance in microbiome classification data. Microbiome 2025; 13:2. [PMID: 39754220 PMCID: PMC11699698 DOI: 10.1186/s40168-024-01996-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Accepted: 12/04/2024] [Indexed: 01/06/2025]
Abstract
BACKGROUND Accurate classification of host phenotypes from microbiome data is crucial for advancing microbiome-based therapies, with machine learning offering effective solutions. However, the complexity of the gut microbiome, data sparsity, compositionality, and population-specificity present significant challenges. Microbiome data transformations can alleviate some of the aforementioned challenges, but their usage in machine learning tasks has largely been unexplored.
RESULTS Our analysis of over 8500 samples from 24 shotgun metagenomic datasets showed that it is possible to classify healthy and diseased individuals using microbiome data with minimal dependence on the choice of algorithm or transformation. Presence-absence transformations performed comparably to abundance-based transformations, and only a small subset of predictors is necessary for accurate classification. However, while different transformations resulted in comparable classification performance, the most important features varied significantly, which highlights the need to reevaluate machine learning-based biomarker detection.
CONCLUSIONS Microbiome data transformations can significantly influence feature selection but have a limited effect on classification accuracy. Our findings suggest that while classification is robust across different transformations, the variation in feature selection necessitates caution when using machine learning for biomarker identification. This research provides valuable insights for applying machine learning to microbiome data and identifies important directions for future work.
Affiliation(s)
- Zuzanna Karwowska
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Doctoral School of Exact and Natural Sciences, Jagiellonian University, Krakow, Poland
  - Sano Centre for Computational Medicine, Krakow, Poland
- Oliver Aasmets
  - Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Tomasz Kosciolek
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland
  - Sano Centre for Computational Medicine, Krakow, Poland
- Elin Org
  - Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
3
Tseng KK, Koehler H, Becker DJ, Gibb R, Carlson CJ, del Pilar Fernandez M, Seifert SN. Viral genomic features predict Orthopoxvirus reservoir hosts. bioRxiv: the preprint server for biology 2024:2023.10.26.564211. [PMID: 37961540 PMCID: PMC10634857 DOI: 10.1101/2023.10.26.564211] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/15/2023]
Abstract
Orthopoxviruses (OPVs), including the causative agents of smallpox and mpox, have led to devastating outbreaks in human populations worldwide. However, the discontinuation of smallpox vaccination, which also provides cross-protection against related OPVs, has diminished global immunity to OPVs more broadly. We apply machine learning models incorporating both host ecological and viral genomic features to predict likely reservoirs of OPVs. We demonstrate that incorporating viral genomic features in addition to host ecological traits enhanced the accuracy of potential OPV host predictions, highlighting the importance of host-virus molecular interactions in predicting potential host species. We identify hotspots for geographic regions rich with potential OPV hosts in parts of southeast Asia, equatorial Africa, and the Amazon, revealing high overlap between regions predicted to have a high number of potential OPV host species and those with the lowest smallpox vaccination coverage, indicating a heightened risk for the emergence or establishment of zoonotic OPVs. Our findings can be used to target wildlife surveillance, particularly related to concerns about mpox establishment beyond its historical range.
Affiliation(s)
- Katie K. Tseng
  - Paul G. Allen School for Global Health, Washington State University, Pullman, Washington, United States of America
- Heather Koehler
  - School of Molecular Biosciences, Washington State University, Pullman, Washington, United States of America
- Daniel J. Becker
  - School of Biological Sciences, University of Oklahoma, Norman, Oklahoma, United States of America
- Rory Gibb
  - Centre for Biodiversity and Environment Research, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
  - People & Nature Lab, UCL East, University College London, Stratford, London, United Kingdom
- Colin J. Carlson
  - Center for Global Health Science and Security, Georgetown University, Washington, DC, United States of America
- Maria del Pilar Fernandez
  - Paul G. Allen School for Global Health, Washington State University, Pullman, Washington, United States of America
- Stephanie N. Seifert
  - Paul G. Allen School for Global Health, Washington State University, Pullman, Washington, United States of America
4
Gorman ED, Lladser ME. Interpretable metric learning in comparative metagenomics: The adaptive Haar-like distance. PLoS Comput Biol 2024; 20:e1011543. [PMID: 38768195 PMCID: PMC11142682 DOI: 10.1371/journal.pcbi.1011543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 05/31/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
Random forests have emerged as a promising tool in comparative metagenomics because they can predict environmental characteristics based on microbial composition in datasets where β-diversity metrics fall short of revealing meaningful relationships between samples. Nevertheless, despite this efficacy, they lack biological insight in tandem with their predictions, potentially hindering scientific advancement. To overcome this limitation, we leverage a geometric characterization of random forests to introduce a data-driven phylogenetic β-diversity metric, the adaptive Haar-like distance. This new metric assigns a weight to each internal node (i.e., split or bifurcation) of a reference phylogeny, indicating the relative importance of that node in discerning environmental samples based on their microbial composition. Alongside this, a weighted nearest-neighbors classifier, constructed using the adaptive metric, can be used as a proxy for the random forest while maintaining accuracy on par with that of the original forest and another state-of-the-art classifier, CoDaCoRe. As shown in datasets from diverse microbial environments, however, the new metric and classifier significantly enhance the biological interpretability and visualization of high-dimensional metagenomic samples.
Affiliation(s)
- Evan D. Gorman
  - Department of Applied Mathematics, University of Colorado, Boulder, Colorado, United States of America
- Manuel E. Lladser
  - Department of Applied Mathematics, University of Colorado, Boulder, Colorado, United States of America
5
Quiroga MV, Stegen JC, Mataloni G, Cowan D, Lebre PH, Valverde A. Microdiverse bacterial clades prevail across Antarctic wetlands. Mol Ecol 2024; 33:e17189. [PMID: 37909659 DOI: 10.1111/mec.17189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 10/06/2023] [Accepted: 10/16/2023] [Indexed: 11/03/2023]
Abstract
Antarctica's extreme environmental conditions impose selection pressures on microbial communities. Indeed, a previous study revealed that bacterial assemblages at the Cierva Point Wetland Complex (CPWC) are shaped by strong homogeneous selection. Yet which bacterial phylogenetic clades are shaped by selection processes and their ecological strategies to thrive in such extreme conditions remain unknown. Here, we applied the phyloscore and feature-level βNTI indexes coupled with phylofactorization to successfully detect bacterial monophyletic clades subjected to homogeneous (HoS) and heterogenous (HeS) selection. Remarkably, only the HoS clades showed high relative abundance across all samples and signs of putative microdiversity. The majority of the amplicon sequence variants (ASVs) within each HoS clade clustered into a unique 97% sequence similarity operational taxonomic unit (OTU) and inhabited a specific environment (lotic, lentic or terrestrial). Our findings suggest the existence of microdiversification leading to sub-taxa niche differentiation, with putative distinct ecotypes (consisting of groups of ASVs) adapted to a specific environment. We hypothesize that HoS clades thriving in the CPWC have phylogenetically conserved traits that accelerate their rate of evolution, enabling them to adapt to strong spatio-temporally variable selection pressures. Variable selection appears to operate within clades to cause very rapid microdiversification without losing key traits that lead to high abundance. Variable and homogeneous selection, therefore, operate simultaneously but on different aspects of organismal ecology. The result is an overall signal of homogeneous selection due to rapid within-clade microdiversification caused by variable selection. It is unknown whether other systems experience this dynamic, and we encourage future work evaluating the transferability of our results.
Affiliation(s)
- María V Quiroga
  - Instituto Tecnológico de Chascomús (CONICET-UNSAM), Buenos Aires, Argentina
  - Escuela de Bio y Nanotecnologías (UNSAM), Buenos Aires, Argentina
- James C Stegen
  - Pacific Northwest National Laboratory, Ecosystem Science Team, Richland, Washington, USA
- Gabriela Mataloni
  - Instituto de Investigación e Ingeniería Ambiental (IIIA, CONICET-UNSAM), Buenos Aires, Argentina
- Don Cowan
  - Centre for Microbial Ecology and Genomics (CMEG), Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- Pedro H Lebre
  - Centre for Microbial Ecology and Genomics (CMEG), Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- Angel Valverde
  - Instituto de Recursos Naturales y Agrobiología de Salamanca (IRNASA), Consejo Superior de Investigaciones Científicas (CSIC), Salamanca, Spain
6
Fusi M, Ngugi DK, Marasco R, Booth JM, Cardinale M, Sacchi L, Clementi E, Yang X, Garuglieri E, Fodelianakis S, Michoud G, Daffonchio D. Gill-associated bacteria are homogeneously selected in amphibious mangrove crabs to sustain host intertidal adaptation. Microbiome 2023; 11:189. [PMID: 37612775 PMCID: PMC10463870 DOI: 10.1186/s40168-023-01629-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/12/2022] [Accepted: 07/20/2023] [Indexed: 08/25/2023]
Abstract
BACKGROUND The transition from water to air is a key event in the evolution of many marine organisms to access new food sources, escape water hypoxia, and exploit the higher and temperature-independent oxygen concentration of air. Despite the importance of microorganisms in host adaptation, their contribution to overcoming the challenges posed by the lifestyle changes from water to land is not well understood. To address this, we examined how microbial association with a key multifunctional organ, the gill, is involved in the intertidal adaptation of fiddler crabs, a dual-breathing organism.
RESULTS Electron microscopy revealed a rod-shaped bacterial layer tightly connected to the gill lamellae of the five crab species sampled across a latitudinal gradient from the central Red Sea to the southern Indian Ocean. The gill bacterial community diversity assessed with 16S rRNA gene amplicon sequencing was consistently low across crab species, and the same actinobacterial group, namely Ilumatobacter, was dominant regardless of the geographic location of the host. Using metagenomics and metatranscriptomics, we detected that these members of actinobacteria are potentially able to convert ammonia to amino acids and may help eliminate toxic sulphur compounds and carbon monoxide to which crabs are constantly exposed.
CONCLUSIONS These results indicate that bacteria selected on gills can play a role in the adaptation of animals in dynamic intertidal ecosystems. Hence, this relationship is likely to be important in the ecological and evolutionary processes of the transition from water to air and deserves further attention, including the ontogenetic onset of this association.
Affiliation(s)
- Marco Fusi
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
  - Centre for Conservation and Restoration Science, Edinburgh Napier University, Edinburgh, UK
- David K Ngugi
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
  - Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Inhoffenstrasse 7B, D-38124, Braunschweig, Germany
- Ramona Marasco
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Jenny Marie Booth
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Massimiliano Cardinale
  - Institute of Applied Microbiology Research Center for BioSystems, Land Use, and Nutrition (IFZ) Justus-Liebig-University Giessen, D-35392, Giessen, Germany
  - Department of Biological and Environmental Sciences and Technologies, University of Salento, via Prov.le Lecce-Monteroni, I-73100, Lecce, Italy
- Luciano Sacchi
  - Dipartimento di Biologia e Biotecnologie "L. Spallanzani", Università di Pavia, I-27100, Pavia, Italy
- Emanuela Clementi
  - Dipartimento di Biologia e Biotecnologie "L. Spallanzani", Università di Pavia, I-27100, Pavia, Italy
- Xinyuan Yang
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Elisa Garuglieri
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Stilianos Fodelianakis
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Grégoire Michoud
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Daniele Daffonchio
  - Red Sea Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
7
Poisot T, Ouellet MA, Mollentze N, Farrell MJ, Becker DJ, Brierley L, Albery GF, Gibb RJ, Seifert SN, Carlson CJ. Network embedding unveils the hidden interactions in the mammalian virome. Patterns (New York, N.Y.) 2023; 4:100738. [PMID: 37409053 PMCID: PMC10318366 DOI: 10.1016/j.patter.2023.100738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Revised: 01/19/2023] [Accepted: 03/31/2023] [Indexed: 07/07/2023]
Abstract
Predicting host-virus interactions is fundamentally a network science problem. We develop a method for bipartite network prediction that combines a recommender system (linear filtering) with an imputation algorithm based on low-rank graph embedding. We test this method by applying it to a global database of mammal-virus interactions and thus show that it makes biologically plausible predictions that are robust to data biases. We find that the mammalian virome is under-characterized anywhere in the world. We suggest that future virus discovery efforts could prioritize the Amazon Basin (for its unique coevolutionary assemblages) and sub-Saharan Africa (for its poorly characterized zoonotic reservoirs). Graph embedding of the imputed network improves predictions of human infection from viral genome features, providing a shortlist of priorities for laboratory studies and surveillance. Overall, our study indicates that the global structure of the mammal-virus network contains a large amount of information that is recoverable, and this provides new insights into fundamental biology and disease emergence.
Affiliation(s)
- Timothée Poisot
  - Département de Sciences Biologiques, Université de Montréal, Montréal, QC, Canada
- Marie-Andrée Ouellet
  - Département de Sciences Biologiques, Université de Montréal, Montréal, QC, Canada
- Nardus Mollentze
  - School of Biodiversity, One Health and Veterinary Medicine, University of Glasgow, Glasgow, UK
  - MRC – University of Glasgow Centre for Virus Research, Glasgow, UK
- Maxwell J. Farrell
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Liam Brierley
  - Department of Health Data Science, University of Liverpool, Liverpool, UK
- Rory J. Gibb
  - Center for Biodiversity & Environment Research, University College, London, UK
- Stephanie N. Seifert
  - Paul G. Allen School for Global Health, Washington State University, Pullman, WA, USA
- Colin J. Carlson
  - Center for Global Health Science and Security, Georgetown University, Washington, DC, USA
8
Cohen LE, Fagre AC, Chen B, Carlson CJ, Becker DJ. Coronavirus sampling and surveillance in bats from 1996-2019: a systematic review and meta-analysis. Nat Microbiol 2023; 8:1176-1186. [PMID: 37231088 PMCID: PMC10234814 DOI: 10.1038/s41564-023-01375-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/15/2022] [Accepted: 03/24/2023] [Indexed: 05/27/2023]
Abstract
The emergence of SARS-CoV-2 highlights a need for evidence-based strategies to monitor bat viruses. We performed a systematic review of coronavirus sampling (testing for RNA positivity) in bats globally. We identified 110 studies published between 2005 and 2020 that collectively reported positivity from 89,752 bat samples. We compiled 2,274 records of infection prevalence at the finest methodological, spatiotemporal and phylogenetic level of detail possible from public records into an open, static database named datacov, together with metadata on sampling and diagnostic methods. We found substantial heterogeneity in viral prevalence across studies, reflecting spatiotemporal variation in viral dynamics and methodological differences. Meta-analysis identified sample type and sampling design as the best predictors of prevalence, with virus detection maximized in rectal and faecal samples and by repeat sampling of the same site. Fewer than one in five studies collected and reported longitudinal data, and euthanasia did not improve virus detection. We show that bat sampling before the SARS-CoV-2 pandemic was concentrated in China, with research gaps in South Asia, the Americas and sub-Saharan Africa, and in subfamilies of phyllostomid bats. We propose that surveillance strategies should address these gaps to improve global health security and enable the origins of zoonotic coronaviruses to be identified.
Affiliation(s)
- Lily E Cohen
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Anna C Fagre
  - Department of Microbiology, Immunology, and Pathology, College of Veterinary Medicine and Biomedical Sciences, Colorado State University, Fort Collins, CO, USA
- Binqi Chen
  - Center for Global Health Science and Security, Georgetown University Medical Center, Washington, DC, USA
- Colin J Carlson
  - Center for Global Health Science and Security, Georgetown University Medical Center, Washington, DC, USA
- Daniel J Becker
  - Department of Biology, University of Oklahoma, Norman, OK, USA
9
Homogeneous Environmental Selection Structures the Bacterial Communities of Benthic Biofilms in Proglacial Floodplain Streams. Appl Environ Microbiol 2023; 89:e0201022. [PMID: 36847567 PMCID: PMC10053691 DOI: 10.1128/aem.02010-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
Abstract
In proglacial floodplains, glacier recession promotes biogeochemical and ecological gradients across relatively small spatial scales. The resulting environmental heterogeneity induces remarkable microbial biodiversity among proglacial stream biofilms. Yet the relative importance of environmental constraints in forming biofilm communities remains largely unknown. Extreme environmental conditions in proglacial streams may lead to the homogenizing selection of biofilm-forming microorganisms. However, environmental differences between proglacial streams may impose different selective forces, resulting in nested, spatially structured assembly processes. Here, we investigated bacterial community assembly processes by unraveling ecologically successful phylogenetic clades in two stream types (glacier-fed mainstems and non-glacier-fed tributaries) draining three proglacial floodplains in the Swiss Alps. Clades with low phylogenetic turnover rates were present in all stream types, including Gammaproteobacteria and Alphaproteobacteria, while the other clades were specific to one stream type. These clades constituted up to 34.8% and 31.1% of the community diversity and up to 61.3% and 50.9% of the relative abundances in mainstems and tributaries, respectively, highlighting their importance and success in these communities. Furthermore, the proportion of bacteria under homogeneous selection was inversely related to the abundance of photoautotrophs, and these clades may therefore decrease in abundance with the future "greening" of proglacial habitats. Finally, we found little effect of physical distance from the glacier on clades under selection in glacier-fed streams, probably due to the high hydrological connectivity of our study reaches. Overall, these findings shed new light on the mechanisms of microbial biofilm assembly in proglacial streams and help us to predict their future in a rapidly changing environment. 
IMPORTANCE Streams draining proglacial floodplains harbor benthic biofilms comprised of diverse microbial communities. These high-mountain ecosystems are rapidly changing with climate warming, and it is therefore critical to better understand the mechanisms underlying the assembly of their microbial communities. We found that homogeneous selection dominates the structuring of bacterial communities in benthic biofilms in both glacier-fed mainstems and nonglacier tributary streams within three proglacial floodplains in the Swiss Alps. However, differences between glacier-fed and tributary ecosystems may impose differential selective forces. Here, we uncovered nested, spatially structured assembly processes for proglacial floodplain communities. Our analyses additionally provided insights into linkages between aquatic photoautotrophs and the bacterial taxa under homogeneous selection, potentially by providing a labile source of carbon in these otherwise carbon-deprived systems. In the future, we expect a shift in the bacterial communities under homogeneous selection in glacier-fed streams as primary production becomes more important and streams become "greener".
10
Heckley AM, Becker DJ. Tropical bat ectoparasitism in continuous versus fragmented forests: A gap analysis and preliminary meta-analysis. Ecol Evol 2023; 13:e9784. [PMID: 36744075 PMCID: PMC9891993 DOI: 10.1002/ece3.9784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 01/06/2023] [Accepted: 01/16/2023] [Indexed: 02/04/2023]
Abstract
Tropical regions are experiencing rapid rates of forest fragmentation, which can have several effects on wildlife, including altered parasite dynamics. Bats are a useful host group to consider the effects of fragmentation, because they are abundant in the tropics, serve important ecological roles, and harbor many parasites. Nevertheless, research on the effects of fragmentation on bat ectoparasites is still limited. To help guide ongoing and future research efforts, this study had two objectives: (1) conduct a gap analysis to characterize the state of currently available research on fragmentation effects on bat ectoparasites and (2) conduct a preliminary meta-analysis to identify current trends. We systematically highlighted several research gaps: Studies comparing the effects of fragmented versus continuous forests on ectoparasites are limited and have primarily been conducted in the Neotropics, with a focus on bats in the superfamily Noctilionoidea (especially frugivorous phyllostomids). Our preliminary meta-analysis suggested that ectoparasite prevalence (but not the mean or variance in intensity) was higher in fragments than in continuous forests. Moreover, prevalence increased with increasing roost duration, and mean intensity was higher for bats with higher wing aspect ratios. Intensity variance was affected by an interaction between forest type and wing aspect ratio, such that variance increased for bats with high wing aspect ratios in continuous forests but decreased in fragments. These results suggest that fragmentation can shape aspects of bat ectoparasitism and could have implications for the ecology, health, and conservation of bats in fragmented landscapes. However, existing research gaps could bias our current understanding of habitat change and bat health, and future research should thus investigate these effects in the Paleotropics and with other bat families.
Affiliation(s)
- Alexis M. Heckley
  - Department of Biology and the Redpath Museum, McGill University, Montreal, Quebec, Canada
11
Krohn C, Khudur L, Dias DA, van den Akker B, Rees CA, Crosbie ND, Surapaneni A, O'Carroll DM, Stuetz RM, Batstone DJ, Ball AS. The role of microbial ecology in improving the performance of anaerobic digestion of sewage sludge. Front Microbiol 2022; 13:1079136. [PMID: 36590430 PMCID: PMC9801413 DOI: 10.3389/fmicb.2022.1079136] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/25/2022] [Accepted: 11/28/2022] [Indexed: 12/15/2022]
Abstract
The use of next-generation diagnostic tools to optimise the anaerobic digestion of municipal sewage sludge has the potential to increase renewable natural gas recovery, improve the reuse of biosolid fertilisers and help operators expand circular economies globally. This review aims to provide perspectives on the role of microbial ecology in improving digester performance in wastewater treatment plants, highlighting that a systems biology approach is fundamental for monitoring mesophilic anaerobic sewage sludge in continuously stirred reactor tanks. We further highlight the potential applications arising from investigations into sludge ecology. The principal limitations for improvements in methane recoveries or in process stability of anaerobic digestion, especially after pre-treatment or during co-digestion, are ecological knowledge gaps related to the front-end metabolism (hydrolysis and fermentation). Operational problems such as stable biological foaming are a key concern, for which ecological markers are a suitable approach. However, no biomarkers exist yet to assist in monitoring and management of clade-specific foaming potentials along with other risks, such as pollutants and pathogens. Fundamental ecological principles apply to anaerobic digestion, which presents opportunities to predict and manipulate reactor functions. The path ahead for mapping ecological markers on process endpoints and risk factors of anaerobic digestion will involve numerical ecology, an expanding field that employs metrics derived from alpha, beta, phylogenetic, taxonomic, and functional diversity, as well as from phenotypes or life strategies derived from genetic potentials. In contrast to addressing operational issues (as noted above), which are effectively addressed by whole population or individual biomarkers, broad improvement and optimisation of function will require enhancement of hydrolysis and acidogenic processes. This will require a discovery-based approach, which will involve integrative research involving the proteome and metabolome. This will utilise, but overcome current limitations of, DNA-centric approaches, and likely have broad application outside the specific field of anaerobic digestion.
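As an illustration of the alpha and beta diversity metrics that the abstract above groups under numerical ecology, the following is a minimal sketch of two widely used indices: the Shannon index for within-sample (alpha) diversity and Bray-Curtis dissimilarity for between-sample (beta) diversity. The function names and the choice of these two indices are assumptions made for illustration; the review does not prescribe specific metrics or code.

```python
import math


def shannon_index(counts):
    """Shannon alpha diversity H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one non-zero value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples listed in the same taxon order."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same taxa in the same order")
    denominator = sum(sample_a) + sum(sample_b)
    if denominator == 0:
        raise ValueError("at least one sample must contain non-zero counts")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / denominator
```

Two samples with identical counts give a Bray-Curtis dissimilarity of 0, and two samples with no shared taxa give 1. Phylogenetic and functional diversity metrics need a tree or a trait table in addition to counts and are not covered by this sketch.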
Affiliation(s)
- Christian Krohn
- ARC Training Centre for the Transformation of Australia's Biosolids Resource, RMIT University, Bundoora, VIC, Australia. *Correspondence: Christian Krohn
| | - Leadin Khudur
- ARC Training Centre for the Transformation of Australia's Biosolids Resource, RMIT University, Bundoora, VIC, Australia
| | - Daniel Anthony Dias
- School of Health and Biomedical Sciences, Discipline of Laboratory Medicine, STEM College, RMIT University, Bundoora, VIC, Australia
| | - Aravind Surapaneni
- ARC Training Centre for the Transformation of Australia's Biosolids Resource, RMIT University, Bundoora, VIC, Australia
| | - Denis M. O'Carroll
- Water Research Centre, School of Civil and Environmental Engineering, University of New South Wales, Sydney, NSW, Australia
| | - Richard M. Stuetz
- Water Research Centre, School of Civil and Environmental Engineering, University of New South Wales, Sydney, NSW, Australia
| | - Damien J. Batstone
- ARC Training Centre for the Transformation of Australia's Biosolids Resource, RMIT University, Bundoora, VIC, Australia; Australian Centre for Water and Environmental Biotechnology, Gehrmann Building, The University of Queensland, Brisbane, QLD, Australia
| | - Andrew S. Ball
- ARC Training Centre for the Transformation of Australia's Biosolids Resource, RMIT University, Bundoora, VIC, Australia
|
12
|
Douglas GM, Hayes MG, Langille MGI, Borenstein E. Integrating phylogenetic and functional data in microbiome studies. Bioinformatics 2022; 38:5055-5063. [PMID: 36179077 PMCID: PMC9665866 DOI: 10.1093/bioinformatics/btac655] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2022] [Revised: 09/10/2022] [Accepted: 09/29/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Microbiome functional data are frequently analyzed to identify associations between microbial functions (e.g. genes) and sample groups of interest. However, it is challenging to distinguish between different possible explanations for variation in community-wide functional profiles by considering functions alone. To help address this problem, we have developed POMS, a package that implements multiple phylogeny-aware frameworks to more robustly identify enriched functions. RESULTS The key contribution is an extended balance-tree workflow that incorporates functional and taxonomic information to identify functions that are consistently enriched in sample groups across independent taxonomic lineages. Our package also includes a workflow for running phylogenetic regression. Based on simulated data we demonstrate that these approaches more accurately identify gene families that confer a selective advantage compared with commonly used tools. We also show that POMS in particular can identify enriched functions in real-world metagenomics datasets that are potential targets of strong selection on multiple members of the microbiome. AVAILABILITY AND IMPLEMENTATION These workflows are freely available in the POMS R package at https://github.com/gavinmdouglas/POMS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gavin M Douglas
- Department of Microbiology and Immunology, McGill University, Montréal, QC H3A 2B4, Canada
| | - Molly G Hayes
- Department of Mathematics and Statistics, Dalhousie University, Halifax, NS B3H 4R2, Canada
|
13
|
Becker DJ, Albery GF, Sjodin AR, Poisot T, Bergner LM, Chen B, Cohen LE, Dallas TA, Eskew EA, Fagre AC, Farrell MJ, Guth S, Han BA, Simmons NB, Stock M, Teeling EC, Carlson CJ. Optimising predictive models to prioritise viral discovery in zoonotic reservoirs. THE LANCET. MICROBE 2022; 3:e625-e637. [PMID: 35036970 PMCID: PMC8747432 DOI: 10.1016/s2666-5247(21)00245-7] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Indexed: 01/03/2023]
Abstract
Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. Here, we use the bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of probable reservoir hosts. In early 2020, we generated an ensemble of eight statistical models that predicted host-virus associations and developed priority sampling recommendations for potential bat reservoirs of betacoronaviruses and bridge hosts for SARS-CoV-2. During a time frame of more than a year, we tracked the discovery of 47 new bat hosts of betacoronaviruses, validated the initial predictions, and dynamically updated our analytical pipeline. We found that ecological trait-based models performed well at predicting these novel hosts, whereas network methods consistently performed approximately as well as or worse than expected at random. These findings illustrate the importance of ensemble modelling as a buffer against mixed-model quality and highlight the value of including host ecology in predictive models. Our revised models showed an improved performance compared with the initial ensemble, and predicted more than 400 bat species globally that could be undetected betacoronavirus hosts. We show, through systematic validation, that machine learning models can help to optimise wildlife sampling for undiscovered viruses and illustrate how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.
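To make the idea of an ensemble as a buffer against uneven model quality concrete, here is a minimal sketch of one simple way to combine several models' host predictions: convert each model's scores to proportional ranks and average them per species. This is an assumed, illustrative combination rule, not the procedure documented by the authors; the function names and the input layout (one dictionary of species scores per model) are hypothetical.

```python
def proportional_ranks(scores):
    """Map each species to a rank in (0, 1]; the highest-scoring species gets 1/n.

    Ties are broken by the insertion order of the dictionary.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {species: (i + 1) / n for i, species in enumerate(ordered)}


def ensemble_rank(model_scores):
    """Average proportional ranks over the models that scored each species."""
    collected = {}
    for scores in model_scores:
        for species, rank in proportional_ranks(scores).items():
            collected.setdefault(species, []).append(rank)
    return {species: sum(ranks) / len(ranks) for species, ranks in collected.items()}
```

Lower averaged ranks mark species that most models place near the top, so a single poorly performing model has limited influence on the final priority list.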
Affiliation(s)
- Daniel J Becker
- Department of Biology, University of Oklahoma, Norman, OK, USA
| | - Gregory F Albery
- Department of Biology, Georgetown University, Washington, DC, USA
| | - Anna R Sjodin
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
| | - Timothée Poisot
- Université de Montréal, Département de Sciences Biologiques, Montréal, QC, Canada
| | - Laura M Bergner
- Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
- Medical Research Centre, University of Glasgow Centre for Virus Research, Glasgow, UK
| | - Binqi Chen
- Center for Global Health Science and Security, Georgetown University Medical Center, Washington, DC, USA
| | - Lily E Cohen
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Tad A Dallas
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Evan A Eskew
- Department of Biology, Pacific Lutheran University, Tacoma, WA, USA
| | - Anna C Fagre
- Department of Microbiology, Immunology, and Pathology, College of Veterinary Medicine and Biomedical Sciences, Colorado State University, Fort Collins, CO, USA
- Bat Health Foundation, Fort Collins, CO, USA
| | - Maxwell J Farrell
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada
| | - Sarah Guth
- Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
| | - Barbara A Han
- Cary Institute of Ecosystem Studies, Millbrook, NY, USA
| | - Nancy B Simmons
- Department of Mammalogy, Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, USA
| | - Michiel Stock
- Research Unit Knowledge-based Systems, Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
| | - Emma C Teeling
- School of Biology and Environmental Science, Science Centre West, University College Dublin, Dublin, Ireland
| | - Colin J Carlson
- Department of Biology, Georgetown University, Washington, DC, USA
- Center for Global Health Science and Security, Georgetown University Medical Center, Washington, DC, USA
- Department of Microbiology and Immunology, Georgetown University Medical Center, Washington, DC, USA
|
14
|
Mull N, Carlson CJ, Forbes KM, Becker DJ. Virus isolation data improve host predictions for New World rodent orthohantaviruses. J Anim Ecol 2022; 91:1290-1302. [PMID: 35362148 DOI: 10.1111/1365-2656.13694] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2021] [Accepted: 03/16/2022] [Indexed: 11/30/2022]
Abstract
Identifying reservoir host species is crucial for understanding the ecology of multi-host pathogens and predicting risks of pathogen spillover from wildlife to people. Predictive models are increasingly used for identifying ecological traits and prioritizing surveillance of likely zoonotic reservoirs, but these often employ different types of evidence for establishing host associations. Comparisons between models with different infection evidence are necessary to guide inferences about the trait profiles of likely hosts and identify which hosts and geographical regions are likely sources of spillover. Here, we use New World rodent-orthohantavirus associations to explore differences in the performance and predictions of models trained on two types of evidence for infection and onward transmission: RT-PCR and live virus isolation data, representing active infections versus host competence, respectively. Orthohantaviruses are primarily carried by muroid rodents and cause the diseases haemorrhagic fever with renal syndrome (HFRS) and hantavirus cardiopulmonary syndrome (HCPS) in humans. We show that although boosted regression tree (BRT) models trained on RT-PCR and live virus isolation data both performed well and capture generally similar trait profiles, rodent phylogeny influenced previously collected RT-PCR data, and BRTs using virus isolation data displayed a narrower list of predicted reservoirs than those using RT-PCR data. BRT models trained on RT-PCR data identified 138 undiscovered hosts and virus isolation models identified 92 undiscovered hosts, with 27 undiscovered hosts identified by both models. Distributions of predicted hosts were concentrated in several different regions for each model, with large discrepancies between evidence types. As a form of validation, virus isolation models independently predicted several orthohantavirus-rodent host associations that had been previously identified through empirical research using RT-PCR. 
Our model predictions provide a priority list of species and locations for future orthohantavirus sampling. More broadly, these results demonstrate the value of multiple data types for predicting zoonotic pathogen hosts. These methods can be applied across a range of systems to improve our understanding of pathogen maintenance and increase efficiency of pathogen surveillance.
Affiliation(s)
- Nathaniel Mull
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
| | - Colin J Carlson
- Center for Global Health Science and Security, Georgetown University Medical Center, Washington, DC, USA
| | - Kristian M Forbes
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
| | - Daniel J Becker
- Department of Biology, University of Oklahoma, Norman, OK, USA
|
15
|
Kohler TJ, Fodelianakis S, Michoud G, Ezzat L, Bourquin M, Peter H, Busi SB, Pramateftaki P, Deluigi N, Styllas M, Tolosano M, de Staercke V, Schön M, Brandani J, Marasco R, Daffonchio D, Wilmes P, Battin TJ. Glacier shrinkage will accelerate downstream decomposition of organic matter and alters microbiome structure and function. GLOBAL CHANGE BIOLOGY 2022; 28:3846-3859. [PMID: 35320603 PMCID: PMC9323552 DOI: 10.1111/gcb.16169] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/09/2022] [Accepted: 03/06/2022] [Indexed: 05/22/2023]
Abstract
The shrinking of glaciers is among the most iconic consequences of climate change. Despite this, the downstream consequences for ecosystem processes and related microbiome structure and function remain poorly understood. Here, using a space-for-time substitution approach across 101 glacier-fed streams (GFSs) from six major regions worldwide, we investigated how glacier shrinkage is likely to impact the organic matter (OM) decomposition rates of benthic biofilms. To do this, we measured the activities of five common extracellular enzymes and estimated decomposition rates by using enzyme allocation equations based on stoichiometry. We found decomposition rates to average 0.0129% d⁻¹, and that decreases in glacier influence (estimated by percent glacier catchment coverage, turbidity, and a glacier index) accelerate decomposition rates. To explore mechanisms behind these relationships, we further compared decomposition rates with biofilm and stream water characteristics. We found that chlorophyll-a, temperature, and stream water N:P together explained 61% of the variability in decomposition. Algal biomass, which is also increasing with glacier shrinkage, showed a particularly strong relationship with decomposition, likely indicating its importance in contributing labile organic compounds to these carbon-poor habitats. We also found high relative abundances of chytrid fungi in GFS sediments, which putatively parasitize these algae, promoting decomposition through a fungal shunt. Exploring the biofilm microbiome, we then sought to identify bacterial phylogenetic clades significantly associated with decomposition, and found numerous positively (e.g., Saprospiraceae) and negatively (e.g., Nitrospira) related clades. Lastly, using metagenomics, we found evidence of different bacterial classes possessing different proportions of EEA-encoding genes, potentially informing some of the microbial associations with decomposition rates. Our results, therefore, present new mechanistic insights into OM decomposition in GFSs by demonstrating that an algal-based "green food web" is likely to increase in importance in the future and will promote important biogeochemical shifts in these streams as glaciers vanish.
Affiliation(s)
- Tyler J. Kohler
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Stilianos Fodelianakis
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Grégoire Michoud
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Leïla Ezzat
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Massimo Bourquin
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Hannes Peter
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Susheel Bhanu Busi
- Systems Ecology Research Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Paraskevi Pramateftaki
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Nicola Deluigi
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Michail Styllas
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Matteo Tolosano
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Vincent de Staercke
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Martina Schön
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Jade Brandani
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Ramona Marasco
- Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Daniele Daffonchio
- Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Paul Wilmes
- Systems Ecology Research Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Tom J. Battin
- River Ecosystems Laboratory, Alpine and Polar Environmental Research Center, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
|
16
|
Czech L, Stamatakis A, Dunthorn M, Barbera P. Metagenomic Analysis Using Phylogenetic Placement-A Review of the First Decade. FRONTIERS IN BIOINFORMATICS 2022; 2:871393. [PMID: 36304302 PMCID: PMC9580882 DOI: 10.3389/fbinf.2022.871393] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 02/08/2022] [Accepted: 04/11/2022] [Indexed: 12/20/2022] Open
Abstract
Phylogenetic placement refers to a family of tools and methods to analyze, visualize, and interpret the tsunami of metagenomic sequencing data generated by high-throughput sequencing. Compared to alternative (e.g., similarity-based) methods, it puts metabarcoding sequences into a phylogenetic context using a set of known reference sequences and taking evolutionary history into account. Thereby, one can increase the accuracy of metagenomic surveys and eliminate the requirement for having exact or close matches with existing sequence databases. Phylogenetic placement constitutes a valuable analysis tool per se, but also entails a plethora of downstream tools to interpret its results. A common use case is to analyze species communities obtained from metagenomic sequencing, for example via taxonomic assignment, diversity quantification, sample comparison, and identification of correlations with environmental variables. In this review, we provide an overview of the methods developed during the first 10 years. In particular, the goals of this review are 1) to motivate the usage of phylogenetic placement and illustrate some of its use cases, 2) to outline the full workflow, from raw sequences to publishable figures, including best practices, 3) to introduce the most common tools and methods and their capabilities, 4) to point out common placement pitfalls and misconceptions, 5) to showcase typical placement-based analyses, and how they can help to analyze, visualize, and interpret phylogenetic placement data.
Affiliation(s)
- Lucas Czech
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, United States
| | - Alexandros Stamatakis
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Micah Dunthorn
- Natural History Museum, University of Oslo, Oslo, Norway
|
17
|
Zhu Q, Huang S, Gonzalez A, McGrath I, McDonald D, Haiminen N, Armstrong G, Vázquez-Baeza Y, Yu J, Kuczynski J, Sepich-Poore GD, Swafford AD, Das P, Shaffer JP, Lejzerowicz F, Belda-Ferre P, Havulinna AS, Méric G, Niiranen T, Lahti L, Salomaa V, Kim HC, Jain M, Inouye M, Gilbert JA, Knight R. Phylogeny-Aware Analysis of Metagenome Community Ecology Based on Matched Reference Genomes while Bypassing Taxonomy. mSystems 2022; 7:e0016722. [PMID: 35369727 PMCID: PMC9040630 DOI: 10.1128/msystems.00167-22] [Citation(s) in RCA: 54] [Impact Index Per Article: 18.0] [Received: 02/19/2022] [Accepted: 02/25/2022] [Indexed: 02/06/2023] Open
Abstract
We introduce the operational genomic unit (OGU) method, a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach is independent of taxonomic classification, granting the possibility of maximal resolution of community composition, and organizes features into an accurate hierarchy using a phylogenomic tree. The outputs are suitable for contemporary analytical protocols for community ecology, differential abundance, and supervised learning while supporting phylogenetic methods, such as UniFrac and phylofactorization, that are seldom applied to shotgun metagenomics despite being prevalent in 16S rRNA gene amplicon studies. As demonstrated in two real-world case studies, the OGU method produces biologically meaningful patterns from microbiome data sets. Such patterns further remain detectable at very low metagenomic sequencing depths. Compared with taxonomic unit-based analyses implemented in currently adopted metagenomics tools, and the analysis of 16S rRNA gene amplicon sequence variants, this method shows superiority in informing biologically relevant insights, including stronger correlation with body environment and host sex on the Human Microbiome Project data set and more accurate prediction of human age by the gut microbiomes of Finnish individuals included in the FINRISK 2002 cohort. We provide Woltka, a bioinformatics tool to implement this method, with full integration with the QIIME 2 package and the Qiita web platform, to facilitate adoption of the OGU method in future metagenomics studies. IMPORTANCE Shotgun metagenomics is a powerful, yet computationally challenging, technique compared to 16S rRNA gene amplicon sequencing for decoding the composition and structure of microbial communities. 
Current analyses of metagenomic data are primarily based on taxonomic classification, which is limited in feature resolution. To solve these challenges, we introduce operational genomic units (OGUs), which are the individual reference genomes derived from sequence alignment results, without further assigning them taxonomy. The OGU method advances current read-based metagenomics in two dimensions: (i) providing maximal resolution of community composition and (ii) permitting use of phylogeny-aware tools. Our analysis of real-world data sets shows that it is advantageous over currently adopted metagenomic analysis methods and the finest-grained 16S rRNA analysis methods in predicting biological traits. We thus propose the adoption of OGUs as an effective practice in metagenomic studies.
Affiliation(s)
- Qiyun Zhu
- School of Life Sciences, Arizona State University, Tempe, Arizona, USA
- Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, Arizona, USA
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
| | - Shi Huang
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
- Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China
| | - Antonio Gonzalez
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
| | - Imran McGrath
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
- Division of Biological Sciences, University of California San Diego, La Jolla, California, USA
| | - Daniel McDonald
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
| | - Niina Haiminen
- IBM T. J. Watson Research Center, Yorktown Heights, New York, USA
| | - George Armstrong
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, California, USA
| | - Yoshiki Vázquez-Baeza
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
| | - Julian Yu
- School of Life Sciences, Arizona State University, Tempe, Arizona, USA
- Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, Arizona, USA
| | - Austin D. Swafford
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
| | - Promi Das
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, USA
| | - Justin P. Shaffer
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
| | - Franck Lejzerowicz
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
| | - Pedro Belda-Ferre
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
| | - Aki S. Havulinna
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
- Institute for Molecular Medicine Finland, HiLIFE, University of Helsinki, Helsinki, Finland
| | - Guillaume Méric
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Victoria, Australia
| | - Teemu Niiranen
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
- Department of Internal Medicine, University of Turku, Turku, Finland
- Division of Medicine, Turku University Hospital, Finland
| | - Leo Lahti
- Department of Computing, University of Turku, Turku, Finland
| | - Veikko Salomaa
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Ho-Cheol Kim
- IBM Almaden Research Center, San Jose, California, USA
| | - Mohit Jain
- Department of Medicine, University of California San Diego, La Jolla, California, USA
- Department of Pharmacology, University of California San Diego, La Jolla, California, USA
| | - Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Department of Public Health and Primary Care, Cambridge University, Cambridge, United Kingdom
| | - Jack A. Gilbert
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
- Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, La Jolla, California, USA
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, USA
| | - Rob Knight
- Department of Pediatrics, School of Medicine, University of California San Diego, La Jolla, California, USA
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, USA
|
18
|
Abstract
Cognitive impairment (CI) is among the most common non-motor symptoms of Parkinson’s disease (PD), with a substantially negative impact on patient management and outcome. The development and progression of CI exhibits high interindividual variability, which requires better diagnostic and monitoring strategies. PD patients often display sweating disorders resulting from autonomic dysfunction, which has been associated with CI. Because the axillary microbiota is known to change with humidity level and sweat composition, we hypothesized that the axillary microbiota of PD patients shifts in association with CI progression, and thus can be used as a proxy for classification of CI stages in PD. We compared the axillary microbiota compositions of 103 PD patients (55 PD patients with dementia [PDD] and 48 PD patients with mild cognitive impairment [PD-MCI]) and 26 cognitively normal healthy controls (HC). We found that axillary microbiota profiles differentiate HC, PD-MCI, and PDD groups based on differential ranking analysis, and detected an increasing trend in the log ratio of Corynebacterium to Anaerococcus in progression from HC to PDD. In addition, phylogenetic factorization revealed that the depletion of the Anaerococcus, Peptoniphilus, and W5053 genera is associated with PD-MCI and PDD. Moreover, functional predictions suggested significant increases in myo-inositol degradation, ergothioneine biosynthesis, propionate biosynthesis, menaquinone biosynthesis, and the proportion of aerobic bacteria and biofilm formation capacity, in parallel to increasing CI. Our results suggest that alterations in axillary microbiota are associated with CI in PD. Thus, axillary microbiota has the potential to be exploited as a noninvasive tool in the development of novel strategies. IMPORTANCE Parkinson's disease (PD) is the second most common neurodegenerative disease. Cognitive impairment (CI) in PD has significant negative impacts on life quality of patients. 
The emergence and progression of cognitive impairment shows high variability among PD patients, and thus requires better diagnostic and monitoring strategies. Recent findings indicate a close link between autonomic dysfunction and cognitive impairment. Since thermoregulatory dysfunction and skin changes are among the main manifestations of autonomic dysfunction in PD, we hypothesized that alterations in the axillary microbiota may be useful for tracking cognitive impairment stages in PD. To our knowledge, this is the first study characterizing the axillary microbiota of PD patients and exploring its association with cognitive impairment stages in PD. Future studies should include larger cohorts and multicenter studies to validate our results and investigate potential biological mechanisms.
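The log ratio of Corynebacterium to Anaerococcus mentioned in the abstract above can be illustrated with a short calculation. The sketch below computes a natural-log ratio of two taxa from a table of read counts, adding a pseudocount so that a zero count does not make the ratio undefined. The function name, the pseudocount of 1, and the counts in the usage example are hypothetical choices for illustration; they are not taken from the study.

```python
import math


def log_ratio(counts, numerator, denominator, pseudocount=1.0):
    """Natural log of (numerator count + pseudocount) / (denominator count + pseudocount)."""
    num = counts.get(numerator, 0) + pseudocount
    den = counts.get(denominator, 0) + pseudocount
    return math.log(num / den)


# Hypothetical read counts for one sample (illustrative numbers, not study data).
sample = {"Corynebacterium": 99, "Anaerococcus": 9}
ratio = log_ratio(sample, "Corynebacterium", "Anaerococcus")
```

A positive value means the numerator taxon is more abundant than the denominator taxon in that sample. Because the ratio depends only on the two taxa, it is unaffected by changes in the rest of the community, which is one reason log ratios are a common choice for compositional microbiome data.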
|
19
|
Jin J, Krohn C, Franks AE, Wang X, Wood JL, Petrovski S, McCaskill M, Batinovic S, Xie Z, Tang C. Elevated atmospheric CO2 alters the microbial community composition and metabolic potential to mineralize organic phosphorus in the rhizosphere of wheat. MICROBIOME 2022; 10:12. [PMID: 35074003 PMCID: PMC8785599 DOI: 10.1186/s40168-021-01203-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/04/2021] [Accepted: 11/25/2021] [Indexed: 06/14/2023]
Abstract
BACKGROUND Understanding how elevated atmospheric CO2 (eCO2) impacts phosphorus (P) transformation in the plant rhizosphere is critical for maintaining ecological sustainability in response to climate change, especially in agricultural systems where soil P availability is low. METHODS This study used rhizoboxes to physically separate rhizosphere regions (plant root-soil interface) into 1.5-mm segments. Wheat plants were grown in rhizoboxes under eCO2 (800 ppm) and ambient CO2 (400 ppm) in two farming soils, Chromosol and Vertosol, supplemented with phytate (organic P). Photosynthetic carbon flow in the plant-soil continuum was traced with 13CO2 labeling. Amplicon sequencing was performed on the rhizosphere-associated microbial community in the root-growth zone, and 1.5 mm and 3 mm away from the root. RESULTS Elevated CO2 accelerated the mineralization of phytate in the rhizosphere zones, which corresponded with increases in plant-derived 13C enrichment and the relative abundances of discrete phylogenetic clades containing Bacteroidetes and Gemmatimonadetes in the bacterial community, and Funneliformis affiliated to arbuscular mycorrhizas in the fungal community. Although the amplicon sequence variants (ASVs) associated with the stimulation of phytate mineralization under eCO2 differed between the two soils, these ASVs belonged to the same phyla associated with phytase and phosphatase production. The symbiotic mycorrhizas in the rhizosphere of wheat under eCO2 benefited from increased plant C supply and increased P access from soil. Further supportive evidence was the eCO2-induced increase in the genetic pool expressing the pentose phosphate pathway, which is the central pathway for biosynthesis of RNA/DNA precursors. CONCLUSIONS The results suggested that an increased belowground carbon flow under eCO2 stimulated bacterial growth, changing community composition in favor of phylotypes capable of degrading aromatic P compounds.
It is proposed that energy investments by bacteria into anabolic processes increase under eCO2 to level microbial P-use efficiencies and that synergies with symbiotic mycorrhizas further enhance the competition for and mineralization of organic P. Video Abstract.
Affiliation(s)
- Jian Jin
- Department of Animal, Plant and Soil Sciences, Centre for AgriBioscience, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia.
- Key Laboratory of Mollisols Agroecology, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, 150081, China.
| | - Christian Krohn
- Department of Animal, Plant and Soil Sciences, Centre for AgriBioscience, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
| | - Ashley E Franks
- Department of Physiology, Anatomy and Microbiology, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
- Centre for Future Landscapes, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
| | - Xiaojuan Wang
- Department of Animal, Plant and Soil Sciences, Centre for AgriBioscience, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
| | - Jennifer L Wood
- Department of Physiology, Anatomy and Microbiology, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
- Centre for Future Landscapes, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
| | - Steve Petrovski
- Department of Physiology, Anatomy and Microbiology, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
| | - Malcolm McCaskill
- Agriculture Victoria Research, Department of Jobs, Precincts and Regions, Hamilton, Victoria, 3300, Australia
| | - Steven Batinovic
- Department of Physiology, Anatomy and Microbiology, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia
| | - Zhihuang Xie
- Key Laboratory of Mollisols Agroecology, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, 150081, China
| | - Caixian Tang
- Department of Animal, Plant and Soil Sciences, Centre for AgriBioscience, La Trobe University, Melbourne Campus, Bundoora, Victoria, 3086, Australia.
| |
|
20
|
Millán J, Di Cataldo S, Volokhov DV, Becker DJ. Worldwide occurrence of haemoplasmas in wildlife: Insights into the patterns of infection, transmission, pathology and zoonotic potential. Transbound Emerg Dis 2021; 68:3236-3256. [PMID: 33210822 DOI: 10.1111/tbed.13932] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 09/17/2020] [Revised: 11/13/2020] [Accepted: 11/14/2020] [Indexed: 12/17/2022]
Abstract
Haemotropic mycoplasmas (haemoplasmas) have increasingly attracted the attention of wildlife disease researchers due to a combination of wide host range, high prevalence and genetic diversity. A systematic review identified 75 articles that investigated haemoplasma infection in wildlife by molecular methods (chiefly targeting partial 16S rRNA gene sequences), which included 131 host genera across six orders. Studies were less common in the Eastern Hemisphere (especially Africa and Asia) and more frequent in the Artiodactyla and Carnivora. Meta-analysis showed that infection prevalence did not vary by geographic region nor host order, but wild hosts showed significantly higher prevalence than captive hosts. Using a taxonomically flexible machine learning algorithm, we also found vampire bats and cervids to have greater prevalence, whereas mink, a subclade of vesper bats, and true foxes all had lower prevalence compared to the remaining sampled mammal phylogeny. Haemoplasma genotype and nucleotide diversity varied little among wild mammals but were marginally lower in primates and bats. Coinfection with more than one haemoplasma species or genotype was always confirmed when assessed. Risk factors of infection identified were sociality, age, male sex and high trophic level, and both prevalence and diversity were often higher in undisturbed environments. Haemoplasmas likely use different and concurrent transmission routes and typically display enzootic dynamics when wild populations are studied longitudinally. Haemoplasma pathology is poorly known in wildlife but appears subclinical. Candidatus Mycoplasma haematohominis, which causes disease in humans, probably has its natural host in bats. Haemoplasmas can serve as a model system in ecological and evolutionary studies, and future research on these pathogens in wildlife must focus on increasing the geographic range and taxa of studies and elucidating pathology, transmission and zoonotic potential.
To facilitate such work, we recommend using universal PCR primers or NGS protocols to detect novel haemoplasmas and other genetic markers to differentiate among species and infer cross-species transmission.
Affiliation(s)
- Javier Millán
- Instituto Agroalimentario de Aragón-IA2 (Universidad de Zaragoza-CITA), Zaragoza, Spain
- Fundación ARAID, Zaragoza, Spain
- Facultad de Ciencias de la Vida, Universidad Andres Bello, Santiago, Chile
| | - Sophia Di Cataldo
- Programa de Doctorado en Medicina de la Conservación, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago, Chile
| | - Dmitriy V Volokhov
- Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
| | - Daniel J Becker
- Department of Biology, University of Oklahoma, Norman, Oklahoma, USA
| |
|
21
|
Millán J, Becker DJ. Patterns of Exposure and Infection with Microparasites in Iberian Wild Carnivores: A Review and Meta-Analysis. Animals (Basel) 2021; 11:2708. [PMID: 34573674 PMCID: PMC8469010 DOI: 10.3390/ani11092708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Revised: 09/13/2021] [Accepted: 09/13/2021] [Indexed: 11/28/2022] Open
Abstract
We use a suite of meta-analytic and comparative methods to derive fundamental insights into how sampling effort, pathogen richness, infection prevalence, and seroprevalence vary across Carnivora taxa and Iberian geography. The red fox was the most studied species, the wolf and Iberian lynx were disproportionally studied, and the Arctoidea were understudied. Sampling effort was higher in Mediterranean areas, but central Spain showed higher pathogen richness. Excluding studies analyzing fecal samples, 53 different pathogens have been detected in Iberian carnivores, including 16 viruses, 27 bacteria, and 10 protozoa but no fungi. Sampling effort and pathogen diversity were generally more similar among closely related carnivore species. Seropositivity to viruses was lower and higher in the Mustelinae and the Canidae, respectively, and seropositivity to protozoa was higher in both taxa. Canine distemper virus exposure was greatest in canids and mustelids. Carnivore protoparvovirus-1 exposure was greatest in the Atlantic regions, and the Felidae and the Musteloidea had lower infection prevalence. A subclade of the Mustelidae had a greater prevalence of Leishmania infection. We observed no relationships between host phylogenetic distance and pathogen sharing among species. Lastly, we identify important research pitfalls and future directions to improve the study of infectious disease in Iberian wild carnivore communities.
Affiliation(s)
- Javier Millán
- Instituto Agroalimentario de Aragón-IA2, Universidad de Zaragoza-CITA, 50013 Zaragoza, Spain
- Fundación ARAID, Avda. Ranillas 1, 50018 Zaragoza, Spain
- Facultad de Ciencias de la Vida, Universidad Andres Bello, Santiago 8320000, Chile
| | - Daniel J. Becker
- Department of Biology, University of Oklahoma, Norman, OK 73019, USA
| |
|
22
|
Fodelianakis S, Washburne AD, Bourquin M, Pramateftaki P, Kohler TJ, Styllas M, Tolosano M, De Staercke V, Schön M, Busi SB, Brandani J, Wilmes P, Peter H, Battin TJ. Microdiversity characterizes prevalent phylogenetic clades in the glacier-fed stream microbiome. ISME JOURNAL 2021; 16:666-675. [PMID: 34522009 PMCID: PMC8857233 DOI: 10.1038/s41396-021-01106-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 12/21/2020] [Revised: 08/14/2021] [Accepted: 09/02/2021] [Indexed: 02/01/2023]
Abstract
Glacier-fed streams (GFSs) are extreme and rapidly vanishing ecosystems, and yet they harbor diverse microbial communities. Although our understanding of the GFS microbiome has recently increased, we do not know which microbial clades are ecologically successful in these ecosystems, nor do we understand potentially underlying mechanisms. Ecologically successful clades should be more prevalent across GFSs compared to other clades, which should be reflected as clade-wise distinctly low phylogenetic turnover. However, methods to assess such patterns are currently missing. Here we developed and applied a novel analytical framework, “phyloscore analysis”, to identify clades with lower spatial phylogenetic turnover than other clades in the sediment microbiome across twenty GFSs in New Zealand. These clades constituted up to 44% and 64% of community α-diversity and abundance, respectively. Furthermore, both their α-diversity and abundance increased as sediment chlorophyll a decreased, corroborating their ecological success in GFS habitats largely devoid of primary production. These clades also contained higher levels of putative microdiversity than other clades, which could potentially explain their high prevalence in GFSs. This hitherto unknown microdiversity may be threatened as glaciers shrink, urging further genomic and functional exploration of the GFS microbiome.
Affiliation(s)
- Stilianos Fodelianakis
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland.
| | | | - Massimo Bourquin
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Paraskevi Pramateftaki
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Tyler J Kohler
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Michail Styllas
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Matteo Tolosano
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Vincent De Staercke
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Martina Schön
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Susheel Bhanu Busi
- Systems Ecology Research Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Jade Brandani
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Paul Wilmes
- Systems Ecology Research Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Hannes Peter
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Tom J Battin
- Stream Biofilm & Ecosystem Research Lab, ENAC Division, Ecole Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland.
| |
|
23
|
Bien J, Yan X, Simpson L, Müller CL. Tree-aggregated predictive modeling of microbiome data. Sci Rep 2021; 11:14505. [PMID: 34267244 PMCID: PMC8282688 DOI: 10.1038/s41598-021-93645-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/15/2020] [Accepted: 06/22/2021] [Indexed: 01/05/2023] Open
Abstract
Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. At the most granular level, the primary data consist of sparse counts of amplicon sequence variants or operational taxonomic units that are associated with taxonomic and phylogenetic group information. In this contribution, we leverage the hierarchical structure of amplicon data and propose a data-driven and scalable tree-guided aggregation framework to associate microbial subcompositions with response variables of interest. The excess number of zero or low count measurements at the read level forces traditional microbiome data analysis workflows to remove rare sequencing variants or group them by a fixed taxonomic rank, such as genus or phylum, or by phylogenetic similarity. By contrast, our framework, which we call trac (tree-aggregation of compositional data), learns data-adaptive taxon aggregation levels for predictive modeling, greatly reducing the need for user-defined aggregation in preprocessing while simultaneously integrating seamlessly into the compositional data analysis framework. We illustrate the versatility of our framework in the context of large-scale regression problems in human gut, soil, and marine microbial ecosystems. We posit that the inferred aggregation levels provide highly interpretable taxon groupings that can help microbiome researchers gain insights into the structure and functioning of the underlying ecosystem of interest.
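The idea of aggregating amplicon counts along a taxonomic tree, as described in the abstract above, can be sketched in a few lines. This is not the trac implementation and does not learn aggregation levels from data. It only shows how leaf-level counts roll up to every ancestor, giving candidate features at several resolutions. The ASV names, lineages, and the helper `aggregate_up_tree` are hypothetical.

```python
from collections import defaultdict

def aggregate_up_tree(leaf_counts, lineage):
    """Sum each leaf's count into the leaf itself and every ancestor in its lineage."""
    totals = defaultdict(int)
    for leaf, count in leaf_counts.items():
        totals[leaf] += count
        for ancestor in lineage[leaf]:
            totals[ancestor] += count
    return dict(totals)

# Hypothetical sparse ASV counts for one sample (invented for this example).
leaf_counts = {"asv1": 5, "asv2": 0, "asv3": 12, "asv4": 3}

# Hypothetical lineage of each ASV, listed from genus up to phylum.
lineage = {
    "asv1": ["genus_A", "phylum_X"],
    "asv2": ["genus_A", "phylum_X"],
    "asv3": ["genus_B", "phylum_X"],
    "asv4": ["genus_C", "phylum_Y"],
}

node_totals = aggregate_up_tree(leaf_counts, lineage)
for node in sorted(node_totals):
    print(node, node_totals[node])
```

A data-adaptive method would then choose which of these nodes to use as predictors, rather than fixing a single rank such as genus or phylum in preprocessing.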
Affiliation(s)
- Jacob Bien
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA, USA
| | | | - Léo Simpson
- Technische Universität München, Munich, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
| | - Christian L Müller
- Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany.
- Department of Statistics, Ludwig-Maximilians-Universität München, Munich, Germany.
- Center for Computational Mathematics, Flatiron Institute, Simons Foundation, New York, NY, USA.
| |
|
24
|
Jeganathan P, Holmes SP. A Statistical Perspective on the Challenges in Molecular Microbial Biology. JOURNAL OF AGRICULTURAL, BIOLOGICAL, AND ENVIRONMENTAL STATISTICS 2021; 26:131-160. [PMID: 36398283 PMCID: PMC9667415 DOI: 10.1007/s13253-021-00447-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/16/2020] [Revised: 02/15/2021] [Accepted: 02/24/2021] [Indexed: 12/13/2022]
Abstract
High throughput sequencing (HTS)-based technology enables identifying and quantifying non-culturable microbial organisms in all environments. Microbial sequences have enhanced our understanding of the human microbiome, the soil and plant environment, and the marine environment. All molecular microbial data pose statistical challenges due to contamination sequences from reagents, batch effects, unequal sampling, and undetected taxa. Technical biases and heteroscedasticity have the strongest effects, but different strains across subjects and environments also make direct differential abundance testing unwieldy. We provide an introduction to a few statistical tools that can overcome some of these difficulties and demonstrate those tools on an example. We show how standard statistical methods, such as simple hierarchical mixture and topic models, can facilitate inferences on latent microbial communities. We also review some nonparametric Bayesian approaches that combine visualization and uncertainty quantification. The intersection of molecular microbial biology and statistics is an exciting new venue. Finally, we list some of the important open problems that would benefit from more careful statistical method development.
Affiliation(s)
- Pratheepa Jeganathan
- Department of Statistics, Stanford University, Sequoia Hall, 390 Jane Stanford Way, Stanford, CA 94305, USA
| | - Susan P Holmes
- Department of Statistics, Stanford University, Sequoia Hall, 390 Jane Stanford Way, Stanford, CA 94305, USA
| |
|
25
|
Huang R, Soneson C, Germain PL, Schmidt TSB, Mering CV, Robinson MD. treeclimbR pinpoints the data-dependent resolution of hierarchical hypotheses. Genome Biol 2021; 22:157. [PMID: 34001188 PMCID: PMC8127214 DOI: 10.1186/s13059-021-02368-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/29/2020] [Accepted: 04/28/2021] [Indexed: 12/13/2022] Open
Abstract
treeclimbR is for analyzing hierarchical trees of entities, such as phylogenies or cell types, at different resolutions. It proposes multiple candidates that capture the latent signal and pinpoints branches or leaves that contain features of interest, in a data-driven way. It outperforms currently available methods on synthetic data, and we highlight the approach on various applications, including microbiome and microRNA surveys as well as single-cell cytometry and RNA-seq datasets. With the emergence of various multi-resolution genomic datasets, treeclimbR provides a thorough inspection on entities across resolutions and gives additional flexibility to uncover biological associations.
Affiliation(s)
- Ruizhu Huang
- Department of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland
| | - Charlotte Soneson
- Department of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland
- Present Address: Friedrich Miescher Institute for Biomedical Research and SIB Swiss Institute of Bioinformatics, Basel, 4058, Switzerland
| | - Pierre-Luc Germain
- Department of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland
- D-HEST Institute for Neuroscience, Swiss Federal Institute of Technology, Zurich, 8057, Switzerland
| | - Thomas S B Schmidt
- Department of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland
- Present Address: European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, 69117, Germany
| | - Christian Von Mering
- Department of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland
| | - Mark D Robinson
- Department of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057, Switzerland.
| |
|
26
|
Dallas TA, Becker DJ. Taxonomic resolution affects host-parasite association model performance. Parasitology 2021; 148:584-590. [PMID: 33342442 PMCID: PMC10950372 DOI: 10.1017/s0031182020002371] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/19/2020] [Revised: 12/07/2020] [Accepted: 12/09/2020] [Indexed: 11/07/2022]
Abstract
Identifying the factors that structure host–parasite interactions is fundamental to understand the drivers of species distributions and to predict novel cross-species transmission events. More phylogenetically related host species tend to have more similar parasite associations, but parasite specificity may vary as a function of transmission mode, parasite taxonomy or life history. Accordingly, analyses that attempt to infer host–parasite associations using combined data on different parasite groups may perform quite differently relative to analyses on each parasite subset. In essence, are more data always better when predicting host–parasite associations, or does parasite taxonomic resolution matter? Here, we explore how taxonomic resolution affects predictive models of host–parasite associations using the London Natural History Museum's database of host–helminth interactions. Using boosted regression trees, we demonstrate that taxon-specific models (i.e. of Acanthocephalans, Nematodes and Platyhelminthes) consistently outperform full models in predicting mammal–helminth associations. At finer spatial resolutions, full and taxon-specific model performance does not vary, suggesting tradeoffs between phylogenetic and spatial scales of analysis. Although all models identify similar host and parasite covariates as important to such patterns, our results emphasize the importance of phylogenetic scale in the study of host–parasite interactions and suggest that using taxonomic subsets of data may improve predictions of parasite distributions and cross-species transmission. Predictive models of host–pathogen interactions should thus attempt to encompass the spatial resolution and phylogenetic scale desired for inference and prediction and potentially use model averaging or ensemble models to combine predictions from separately trained models.
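The closing recommendation to combine predictions from separately trained models can be illustrated with a simple weighted average. This is a minimal sketch under stated assumptions: the probabilities, model names, weights, and the helper `average_predictions` are hypothetical, and the sketch does not reproduce the boosted regression tree models used in the study.

```python
def average_predictions(predictions, weights=None):
    """Weighted mean of per-model probabilities for each host-parasite pair.

    predictions maps model name -> {pair: probability}; every model is
    assumed to score the same set of pairs.
    """
    models = list(predictions)
    if weights is None:
        weights = {model: 1.0 for model in models}
    total_weight = sum(weights[model] for model in models)
    pairs = predictions[models[0]].keys()
    return {
        pair: sum(weights[m] * predictions[m][pair] for m in models) / total_weight
        for pair in pairs
    }

# Hypothetical association probabilities from two separately trained models.
predictions = {
    "full_model": {("host_a", "nematode_1"): 0.25, ("host_b", "nematode_1"): 0.5},
    "taxon_specific_model": {("host_a", "nematode_1"): 0.75, ("host_b", "nematode_1"): 1.0},
}

equal = average_predictions(predictions)
# Weighting by, for example, held-out performance; these weights are invented.
weighted = average_predictions(
    predictions, weights={"full_model": 1.0, "taxon_specific_model": 3.0}
)
print(equal)
print(weighted)
```

Equal weights treat the models as equally trustworthy. Unequal weights let a better-performing taxon-specific model dominate while still drawing on the full model.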
Affiliation(s)
- Tad A. Dallas
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70802, USA
| | - Daniel J. Becker
- Department of Biology, University of Oklahoma, Norman, OK 73019, USA
| |
|
27
|
Becker DJ, Speer KA, Korstian JM, Volokhov DV, Droke HF, Brown AM, Baijnauth CL, Padgett-Stewart T, Broders HG, Plowright RK, Rainwater TR, Fenton MB, Simmons NB, Chumchal MM. Disentangling interactions among mercury, immunity and infection in a Neotropical bat community. J Appl Ecol 2021; 58:879-889. [PMID: 33911313 PMCID: PMC8078557 DOI: 10.1111/1365-2664.13809] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/12/2020] [Accepted: 12/01/2020] [Indexed: 12/13/2022]
Abstract
1. Contaminants such as mercury are pervasive and can have immunosuppressive effects on wildlife. Impaired immunity could be important for forecasting pathogen spillover, as many land-use changes that generate mercury contamination also bring wildlife into close contact with humans and domestic animals. However, the interactions among contaminants, immunity and infection are difficult to study in natural systems, and empirical tests of possible directional relationships remain rare. 2. We capitalized on extreme mercury variation in a diverse bat community in Belize to test association among contaminants, immunity and infection. By comparing a previous dataset of bats sampled in 2014 with new data from 2017, representing a period of rapid agricultural land conversion, we first confirmed bat species more reliant on aquatic prey had higher fur mercury. Bats in the agricultural habitat also had higher mercury in recent years. We then tested covariation between mercury and cellular immunity and determined if such relationships mediated associations between mercury and bacterial pathogens. As bat ecology can dictate exposure to mercury and pathogens, we also assessed species-specific patterns in mercury-infection relationships. 3. Across the bat community, individuals with higher mercury had fewer neutrophils but not lymphocytes, suggesting stronger associations with innate immunity. However, the odds of infection for haemoplasmas and Bartonella spp. were generally lowest in bats with high mercury, and relationships between mercury and immunity did not mediate infection patterns. Mercury also showed species- and clade-specific relationships with infection, being associated with especially low odds for haemoplasmas in Pteronotus mesoamericanus and Dermanura phaeotis. For Bartonella spp., mercury was associated with particularly low odds of infection in the genus Pteronotus but high odds in the subfamily Stenodermatinae. 4. Synthesis and application. 
Lower general infection risk in bats with high mercury despite weaker innate defense suggests contaminant-driven loss of pathogen habitat (i.e. anemia) or vector mortality as possible causes. Greater attention to these potential pathways could help disentangle relationships among contaminants, immunity and infection in anthropogenic habitats and help forecast disease risks. Our results also suggest that contaminants may increase infection risk in some taxa but not others, emphasizing the importance of considering surveillance and management at different phylogenetic scales.
Affiliation(s)
| | - Kelly A. Speer
- Richard Gilder Graduate School, American Museum of Natural History, New York, NY, USA
- Department of Invertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Center for Conservation Genomics, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC, USA
| | | | - Dmitriy V. Volokhov
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
| | - Hannah F. Droke
- Department of Global and Planetary Health, University of South Florida, Tampa, FL, USA
| | - Alexis M. Brown
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, USA
| | - Catherene L. Baijnauth
- Sackler Institute of Comparative Genomics, American Museum of Natural History, New York, NY, USA
| | - Ticha Padgett-Stewart
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
| | - Hugh G. Broders
- Department of Biology, University of Waterloo, Waterloo, ON, Canada
| | - Raina K. Plowright
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
| | - Thomas R. Rainwater
- Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, USA
- Baruch Institute of Coastal Ecology and Forest Science, Clemson University, Georgetown, SC, USA
- Tom Yawkey Wildlife Center, Georgetown, SC, USA
| | - M. Brock Fenton
- Department of Biology, Western University, London, ON, Canada
| | - Nancy B. Simmons
- Department of Mammalogy, Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, USA
| | | |
|
28
|
Moreno-Indias I, Lahti L, Nedyalkova M, Elbere I, Roshchupkin G, Adilovic M, Aydemir O, Bakir-Gungor B, Santa Pau ECD, D’Elia D, Desai MS, Falquet L, Gundogdu A, Hron K, Klammsteiner T, Lopes MB, Marcos-Zambrano LJ, Marques C, Mason M, May P, Pašić L, Pio G, Pongor S, Promponas VJ, Przymus P, Saez-Rodriguez J, Sampri A, Shigdel R, Stres B, Suharoschi R, Truu J, Truică CO, Vilne B, Vlachakis D, Yilmaz E, Zeller G, Zomer AL, Gómez-Cabrero D, Claesson MJ. Statistical and Machine Learning Techniques in Human Microbiome Studies: Contemporary Challenges and Solutions. Front Microbiol 2021; 12:635781. [PMID: 33692771 PMCID: PMC7937616 DOI: 10.3389/fmicb.2021.635781] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 11/30/2020] [Accepted: 01/28/2021] [Indexed: 12/23/2022] Open
Abstract
The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. While many of the challenges in microbiome research are similar to those in other high-throughput studies, the quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Nevertheless, although many statistics and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 "ML4Microbiome" that brings together microbiome researchers and machine learning experts to address current challenges such as standardization of analysis pipelines for reproducibility of data analysis results, benchmarking, improvement, or development of existing and new tools and ontologies.
Affiliation(s)
- Isabel Moreno-Indias
- Instituto de Investigación Biomédica de Málaga (IBIMA), Unidad de Gestión Clínica de Endocrinología y Nutrición, Hospital Clínico Universitario Virgen de la Victoria, Universidad de Málaga, Málaga, Spain
- Centro de Investigación Biomédica en Red de Fisiopatología de la Obesidad y la Nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
| | - Leo Lahti
- Department of Computing, University of Turku, Turku, Finland
| | - Miroslava Nedyalkova
- Human Genetics and Disease Mechanisms, Latvian Biomedical Research and Study Centre, Riga, Latvia
| | - Ilze Elbere
- Latvian Biomedical Research and Study Centre, Riga, Latvia
| | | | - Muhamed Adilovic
- Department of Genetics and Bioengineering, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina
| | - Onder Aydemir
- Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Turkey
| | - Burcu Bakir-Gungor
- Department of Computer Engineering, Abdullah Gul University, Kayseri, Turkey
| | | | - Domenica D’Elia
- Department for Biomedical Sciences, Institute for Biomedical Technologies, National Research Council, Bari, Italy
| | - Mahesh S. Desai
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Odense Research Center for Anaphylaxis, Department of Dermatology and Allergy Center, Odense University Hospital, University of Southern Denmark, Odense, Denmark
| | - Laurent Falquet
- Department of Biology, University of Fribourg, Fribourg, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Aycan Gundogdu
- Department of Microbiology and Clinical Microbiology, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Metagenomics Laboratory, Genome and Stem Cell Center (GenKök), Erciyes University, Kayseri, Turkey
| | - Karel Hron
- Department of Mathematical Analysis and Applications of Mathematics, Palacký University, Olomouc, Czechia
| | | | - Marta B. Lopes
- NOVA Laboratory for Computer Science and Informatics (NOVA LINCS), FCT, UNL, Caparica, Portugal
- Centro de Matemática e Aplicações (CMA), FCT, UNL, Caparica, Portugal
| | - Laura Judith Marcos-Zambrano
- Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
| | - Cláudia Marques
- CINTESIS, NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal
| | - Michael Mason
- Computational Oncology, Sage Bionetworks, Seattle, WA, United States
| | - Patrick May
- Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Lejla Pašić
- Sarajevo Medical School, University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
| | - Gianvito Pio
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
| | - Sándor Pongor
- Faculty of Information Technology and Bionics, Pázmány University, Budapest, Hungary
| | - Vasilis J. Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | - Piotr Przymus
- Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
| | - Julio Saez-Rodriguez
- Institute of Computational Biomedicine, Heidelberg University, Faculty of Medicine and Heidelberg University Hospital, Heidelberg, Germany
| | - Alexia Sampri
- Division of Informatics, Imaging and Data Sciences, School of Health Sciences, University of Manchester, Manchester, United Kingdom
| | - Rajesh Shigdel
- Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Blaz Stres
- Jozef Stefan Institute, Ljubljana, Slovenia
- Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Faculty of Civil and Geodetic Engineering, University of Ljubljana, Ljubljana, Slovenia
| | - Ramona Suharoschi
- Molecular Nutrition and Proteomics Lab, Faculty of the Food Science and Technology, Institute of Life Sciences, University of Agricultural Sciences and Veterinary Medicine of Cluj-Napoca, Cluj-Napoca, Romania
| | - Jaak Truu
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | - Ciprian-Octavian Truică
- Department of Computer Science and Engineering, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
| | - Baiba Vilne
- Bioinformatics Research Unit, Riga Stradins University, Riga, Latvia
| | - Dimitrios Vlachakis
- Laboratory of Genetics, Department of Biotechnology, School of Applied Biology and Biotechnology, Agricultural University of Athens, Athens, Greece
| | - Ercument Yilmaz
- Department of Computer Technologies, Karadeniz Technical University, Trabzon, Turkey
| | - Georg Zeller
- European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany
| | - Aldert L. Zomer
- Department of Infectious Diseases and Immunology, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands
| | - David Gómez-Cabrero
- Navarrabiomed, Complejo Hospitalario de Navarra (CHN), IdiSNA, Universidad Pública de Navarra (UPNA), Pamplona, Spain
| | - Marcus J. Claesson
- School of Microbiology and APC Microbiome Ireland, University College Cork, Cork, Ireland
| |
|
29
|
Krohn C, Jin J, Wood JL, Hayden HL, Kitching M, Ryan J, Fabijański P, Franks AE, Tang C. Highly decomposed organic carbon mediates the assembly of soil communities with traits for the biodegradation of chlorinated pollutants. J Hazard Mater 2021; 404:124077. [PMID: 33053475 DOI: 10.1016/j.jhazmat.2020.124077] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/14/2020] [Revised: 09/04/2020] [Accepted: 09/21/2020] [Indexed: 06/11/2023]
Abstract
To improve biodegradation strategies for chlorinated pollutants, the roles of soil organic matter and microbial function need to be clarified. It was hypothesised that microbial degradation of specific organic fractions in soils enhances the community's metabolic capability to degrade chlorinated pollutants. This field study used historic records of dieldrin concentrations since 1988 and established relationships between dieldrin dissipation and soil carbon fractions together with bacterial and fungal diversity in surface soils of Kurosol and Chromosol. Sparse partial least squares analysis linked dieldrin dissipation to metabolic activities associated with the highly decomposed carbon fraction. Dieldrin dissipation, after three decades of natural attenuation, was associated with increased bacterial species fitness for the decomposition of recalcitrant carbon substrates, including synthetic chlorinated pollutants. These metabolic capabilities were linked to the decomposed carbon fraction, an important driver of the microbial community and its function. Common bacterial traits among taxonomic groups enriched in samples with high dieldrin dissipation included slow growth, large genomes and complex metabolism, which supported the notion that metabolic strategies for dieldrin degradation evolved in an energy-low soil environment. The findings provide new perspectives for bioremediation strategies and suggest that soil management should aim at stimulating metabolism at the decomposed, fine carbon fraction.
Affiliation(s)
- Christian Krohn
- Department of Animal, Plant and Soil Sciences, Centre for AgriBioscience, La Trobe University, Melbourne Campus, Bundoora, Vic 3086, Australia
| | - Jian Jin
- Department of Animal, Plant and Soil Sciences, Centre for AgriBioscience, La Trobe University, Melbourne Campus, Bundoora, Vic 3086, Australia.
| | - Jennifer L Wood
- Department of Physiology, Anatomy and Microbiology, La Trobe University, Melbourne Campus, Bundoora, Vic 3086, Australia; Centre for Future Landscapes, La Trobe University, Melbourne Campus, Bundoora, Vic 3086, Australia
| | - Helen L Hayden
- Agriculture Victoria, Department of Jobs, Precincts and Regions, Centre for AgriBioScience, Bundoora, Vic 3083, Australia
| | - Matt Kitching
- Agriculture Victoria, Department of Jobs, Precincts and Regions, Macleod, Vic 3085, Australia
| | - John Ryan
- Agriculture Victoria, Department of Jobs, Precincts and Regions, Wangaratta, Vic 3677, Australia
| | - Piotr Fabijański
- Agriculture Victoria, Department of Jobs, Precincts and Regions, Ellinbank, Vic 3821, Australia
| | - Ashley E Franks
- Department of Physiology, Anatomy and Microbiology, La Trobe University, Melbourne Campus, Bundoora, Vic 3086, Australia; Centre for Future Landscapes, La Trobe University, Melbourne Campus, Bundoora, Vic 3086, Australia
| | - Caixian Tang
- Department of Animal, Plant and Soil Sciences, Centre for AgriBioscience, La Trobe University, Melbourne Campus, Bundoora, Vic 3086, Australia.
| |
|
30
|
Analysis of microbial compositions: a review of normalization and differential abundance analysis. NPJ Biofilms Microbiomes 2020; 6:60. [PMID: 33268781 PMCID: PMC7710733 DOI: 10.1038/s41522-020-00160-w] [Citation(s) in RCA: 157] [Impact Index Per Article: 31.4] [Received: 03/03/2020] [Accepted: 10/16/2020] [Indexed: 12/22/2022] Open
Abstract
Increasingly, researchers are discovering associations between the microbiome and a wide range of human diseases, such as obesity, inflammatory bowel diseases, and HIV. The first step towards microbiome-wide association studies is the characterization of the composition of the human microbiome under different conditions. Determination of differentially abundant microbes between two or more environments, known as differential abundance (DA) analysis, is a challenging and important problem that has received considerable interest during the past decade. It is well documented in the literature that the observed microbiome data (OTU/SV table) are relative abundances with an excess of zeros. Since relative abundances sum to a constant, these data are necessarily compositional. In this article we review some recent methods for DA analysis and describe their strengths and weaknesses.
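The point that relative abundances are compositional can be illustrated with a small sketch. It is not taken from the reviewed article: the sample counts, the pseudocount of 0.5 used to handle zeros, and the function names are all hypothetical choices made for illustration. The centered log-ratio (CLR) transform shown here is one common way to work with compositional data, not necessarily a method the review recommends.

```python
import math

def relative_abundance(counts):
    """Scale counts so they sum to 1 (the 'constant sum' constraint)."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample.

    The pseudocount is an illustrative assumption for handling zeros;
    other zero-replacement strategies exist.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical taxon counts for one sample, including a zero.
sample = [120, 30, 0, 850]
# Same community sequenced twice as deeply: absolute counts double.
doubled = [2 * c for c in sample]

# Relative abundances cannot distinguish the two samples.
print(relative_abundance(sample))
print(relative_abundance(doubled))

# CLR values are centered, so they sum to (approximately) zero.
print(clr(sample))
```

Because sequencing depth cancels out of the relative abundances, only ratios between taxa carry information, which is why DA methods have to account for compositionality.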
|
31
|
Albery GF, Becker DJ. Fast-lived Hosts and Zoonotic Risk. Trends Parasitol 2020; 37:117-129. [PMID: 33214097 DOI: 10.1016/j.pt.2020.10.012] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 06/25/2020] [Revised: 10/25/2020] [Accepted: 10/26/2020] [Indexed: 01/02/2023]
Abstract
Because most emerging human pathogens originate in mammals, many studies aim to identify host traits that determine the risk of sourcing zoonotic outbreaks. Studies regularly assert that 'fast-lived' mammal species exhibiting greater fecundity and shorter lifespans tend to host more zoonoses; however, the causes of this association remain poorly understood and they cover a range of immune and nonimmune mechanisms. We discuss these drivers in the context of evolutionary ecology and wildlife-human interactions. Ultimately, differentiating these mechanisms will require linking interspecific variation in life history with immunity, pathogen diversity, transmissibility, and zoonotic risk, and critical data gaps currently limit our ability to do so. We highlight sampling and analytical frameworks to address this gap and to better inform zoonotic reservoir prediction.
Affiliation(s)
- Gregory F Albery
- Department of Biology, Georgetown University, Washington, DC, USA.
| | - Daniel J Becker
- Department of Biology, University of Oklahoma, Norman, OK, USA.
| |
|
32
|
Identifying Suspect Bat Reservoirs of Emerging Infections. Vaccines (Basel) 2020; 8:vaccines8020228. [PMID: 32429501 PMCID: PMC7349958 DOI: 10.3390/vaccines8020228] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/23/2020] [Revised: 05/10/2020] [Accepted: 05/13/2020] [Indexed: 11/30/2022] Open
Abstract
Bats host a number of pathogens that cause severe disease and onward transmission in humans and domestic animals. Some of these pathogens, including henipaviruses and filoviruses, are considered a concern for future pandemics. There has been substantial effort to identify these viruses in bats. However, the reservoir hosts for Ebola virus are still unknown and henipaviruses are largely uncharacterized across their distribution. Identifying reservoir species is critical in understanding the viral ecology within these hosts and the conditions that lead to spillover. We collated surveillance data to identify taxonomic patterns in prevalence and seroprevalence and to assess sampling efforts across species. We systematically collected data on filovirus and henipavirus detections and used a machine-learning algorithm, phylofactorization, in order to search the bat phylogeny for cladistic patterns in filovirus and henipavirus infection, accounting for sampling efforts. Across sampled bat species, evidence for filovirus infection was widely dispersed across the sampled phylogeny. We found major gaps in filovirus sampling in bats, especially in Western Hemisphere species. Evidence for henipavirus infection was clustered within the Pteropodidae; however, no other clades have been as intensely sampled. The major predictor of filovirus and henipavirus exposure or infection was sampling effort. Based on these results, we recommend expanding surveillance for these pathogens across the bat phylogenetic tree.
|
33
|
Becker DJ, Speer KA, Brown AM, Fenton MB, Washburne AD, Altizer S, Streicker DG, Plowright RK, Chizhikov VE, Simmons NB, Volokhov DV. Ecological and evolutionary drivers of haemoplasma infection and bacterial genotype sharing in a Neotropical bat community. Mol Ecol 2020; 29:1534-1549. [PMID: 32243630 PMCID: PMC8299350 DOI: 10.1111/mec.15422] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 12/21/2019] [Revised: 03/16/2020] [Accepted: 03/23/2020] [Indexed: 12/21/2022]
Abstract
Most emerging pathogens can infect multiple species, underlining the importance of understanding the ecological and evolutionary factors that allow some hosts to harbour greater infection prevalence and share pathogens with other species. However, our understanding of pathogen jumps is based primarily around viruses, despite bacteria accounting for the greatest proportion of zoonoses. Because bacterial pathogens in bats (order Chiroptera) can have conservation and human health consequences, studies that examine the ecological and evolutionary drivers of bacterial prevalence and barriers to pathogen sharing are crucially needed. Here we studied haemotropic Mycoplasma spp. (i.e., haemoplasmas) across a species-rich bat community in Belize over two years. Across 469 bats spanning 33 species, half of individuals and two-thirds of species were haemoplasma positive. Infection prevalence was higher for males and for species with larger body mass and colony sizes. Haemoplasmas displayed high genetic diversity (21 novel genotypes) and strong host specificity. Evolutionary patterns supported codivergence of bats and bacterial genotypes alongside phylogenetically constrained host shifts. Bat species centrality to the network of shared haemoplasma genotypes was phylogenetically clustered and unrelated to prevalence, further suggesting rare, but detectable, bacterial sharing between species. Our study highlights the importance of using fine phylogenetic scales when assessing host specificity and suggests phylogenetic similarity may play a key role in host shifts not only for viruses but also for bacteria. Such work more broadly contributes to increasing efforts to understand cross-species transmission and the epidemiological consequences of bacterial pathogens.
Affiliation(s)
- Daniel J. Becker
- Department of Biology, Indiana University, Bloomington, IN, USA
- Center for the Ecology of Infectious Disease, University of Georgia, Athens, GA, USA
| | - Kelly A. Speer
- Richard Gilder Graduate School, American Museum of Natural History, New York, NY, USA
- Department of Invertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Center for Conservation Genomics, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC, USA
| | - Alexis M. Brown
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, USA
| | | | - Alex D. Washburne
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
| | - Sonia Altizer
- Center for the Ecology of Infectious Disease, University of Georgia, Athens, GA, USA
- Odum School of Ecology, University of Georgia, Athens, GA, USA
| | - Daniel G. Streicker
- Odum School of Ecology, University of Georgia, Athens, GA, USA
- MRC–University of Glasgow Centre for Virus Research, Glasgow, UK
- Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, UK
| | - Raina K. Plowright
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
| | - Vladimir E. Chizhikov
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
| | - Nancy B. Simmons
- Richard Gilder Graduate School, American Museum of Natural History, New York, NY, USA
- Department of Mammalogy, Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, USA
| | - Dmitriy V. Volokhov
- Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
| |
|
34
|
Lokmer A, Aflalo S, Amougou N, Lafosse S, Froment A, Tabe FE, Poyet M, Groussin M, Said-Mohamed R, Ségurel L. Response of the human gut and saliva microbiome to urbanization in Cameroon. Sci Rep 2020; 10:2856. [PMID: 32071424 PMCID: PMC7028744 DOI: 10.1038/s41598-020-59849-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 11/07/2019] [Accepted: 02/05/2020] [Indexed: 12/22/2022] Open
Abstract
Urban populations from highly industrialized countries are characterized by a lower gut bacterial diversity as well as by changes in composition compared to rural populations from less industrialized countries. To unveil the mechanisms and factors leading to this diversity loss, it is necessary to identify the factors associated with urbanization-induced shifts at a smaller geographical scale, especially in less industrialized countries. To do so, we investigated potential associations between a variety of dietary, medical, parasitological and socio-cultural factors and the gut and saliva microbiomes of 147 individuals from three populations along an urbanization gradient in Cameroon. We found that the presence of Entamoeba sp., a commensal gut protozoan, followed by stool consistency, were major determinants of the gut microbiome diversity and composition. Interestingly, urban individuals have retained most of their gut eukaryotic and bacterial diversity despite significant changes in diet compared to the rural areas, suggesting that the loss of bacterial microbiome diversity observed in industrialized areas is likely associated with medication. Finally, we observed a weak positive correlation between the gut and the saliva microbiome diversity and composition, even though the saliva microbiome is mainly shaped by habitat-related factors.
Affiliation(s)
- Ana Lokmer
- UMR7206 Eco-anthropologie, CNRS - MNHN - Université de Paris, Paris, France.
| | - Sophie Aflalo
- UMR7206 Eco-anthropologie, CNRS - MNHN - Université de Paris, Paris, France
| | - Norbert Amougou
- UMR7206 Eco-anthropologie, CNRS - MNHN - Université de Paris, Paris, France
| | - Sophie Lafosse
- UMR7206 Eco-anthropologie, CNRS - MNHN - Université de Paris, Paris, France
| | - Alain Froment
- UMR7206 Eco-anthropologie, CNRS - MNHN - Université de Paris, Paris, France
| | - Francis Ekwin Tabe
- Faculté de Médecine et des Sciences Biomédicales - Université Yaoundé 1, Yaoundé, Cameroun
| | - Mathilde Poyet
- Department of Biological Engineering/Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA, USA; The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Mathieu Groussin
- Department of Biological Engineering/Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA, USA; The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Rihlat Said-Mohamed
- SAMRC/WITS Developmental Pathways for Health Research Unit, Department of Paediatrics, School of Clinical Medicine, Faculty of Health Sciences, University of Witwatersrand, Johannesburg, South Africa
| | - Laure Ségurel
- UMR7206 Eco-anthropologie, CNRS - MNHN - Université de Paris, Paris, France.
| |
|
35
|
Rocca JD, Simonin M, Bernhardt ES, Washburne AD, Wright JP. Rare microbial taxa emerge when communities collide: freshwater and marine microbiome responses to experimental mixing. Ecology 2020; 101:e02956. [PMID: 31840237 DOI: 10.1002/ecy.2956] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 06/15/2019] [Revised: 10/01/2019] [Accepted: 11/12/2019] [Indexed: 01/06/2023]
Abstract
Whole microbial communities regularly merge with one another, often in tandem with their environments, in a process called community coalescence. Such events impose substantial changes: abiotic perturbation from environmental blending and biotic perturbation of community merging. We used an aquatic mixing experiment to unravel the effects of these perturbations on the whole microbiome response and on the success of individual taxa when distinct freshwater and marine communities coalesce. We found that an equal mix of freshwater and marine habitats and blended microbiomes resulted in strong convergence of the community structure toward that of the marine microbiome. The enzymatic potential of these blended microbiomes in mixed media also converged toward that of the marine, with strong correlations between the multivariate response patterns of the enzymes and of community structure. Exposing each endmember inoculum to an axenic equal mix of their freshwater and marine source waters led to a 96% loss of taxa from our freshwater microbiomes and a 66% loss from our marine microbiomes. When both inocula were added together to this mixed environment, interactions amongst the communities led to a further loss of 29% and 49% of freshwater and marine taxa, respectively. Under both the axenic and competitive scenarios, the diversity lost was somewhat counterbalanced by increased abundance of microbial taxa that were too rare to detect in the initial inocula. Our study emphasizes the importance of the rare biosphere as a critical component of microbial community responses to community coalescence.
Affiliation(s)
- Jennifer D Rocca
- Department of Biology, Duke University, Durham, North Carolina, 27708, USA
| | - Marie Simonin
- Department of Biology, Duke University, Durham, North Carolina, 27708, USA; IRD, Cirad, IPME, University of Montpellier, Montpellier, 34080, France
| | - Emily S Bernhardt
- Department of Biology, Duke University, Durham, North Carolina, 27708, USA; Nicholas School of the Environment, Duke University, Durham, North Carolina, 27708, USA
| | - Alex D Washburne
- Department of Microbiology & Immunology, Montana State University, Bozeman, Montana, 59717, USA
| | - Justin P Wright
- Department of Biology, Duke University, Durham, North Carolina, 27708, USA; Nicholas School of the Environment, Duke University, Durham, North Carolina, 27708, USA
| |
|
36
|
Qing X, Bik H, Yergaliyev TM, Gu J, Fonderie P, Brown-Miyara S, Szitenberg A, Bert W. Widespread prevalence but contrasting patterns of intragenomic rRNA polymorphisms in nematodes: Implications for phylogeny, species delimitation and life history inference. Mol Ecol Resour 2019; 20:318-332. [PMID: 31721426 DOI: 10.1111/1755-0998.13118] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/28/2019] [Revised: 10/26/2019] [Accepted: 11/07/2019] [Indexed: 01/15/2023]
Abstract
Ribosomal RNA genes have long been a favoured locus in phylogenetic and metabarcoding studies. Within a genome, rRNA loci are organized as tandem repeated arrays and the copies are homogenized through the process of concerted evolution. However, some level of rRNA variation (intragenomic polymorphism) is known to persist and be maintained in the genomes of many species. In nematode worms, the extent of rRNA polymorphism (RP) across species and the evolutionary and life history factors that contribute to the maintenance of intragenomic RP are largely unknown. Here, we present an extensive analysis across 30 terrestrial nematode species representing a range of free-living and parasitic taxa isolated worldwide. Our results indicate that RP is common and widespread, ribosome function appears to be maintained despite mutational changes, and intragenomic variants are stable in the genome and neutrally evolving. However, levels of variation varied widely across rRNA loci and species, with some taxa observed to lack RP entirely. Higher levels of RP were significantly correlated with shorter generation time and high reproductive rates, and population-level factors may play a role in the geographic and phylogenetic structuring of rRNA variants observed in genera such as Rotylenchulus and Pratylenchus. Although RP did not dramatically impact the clustering and recovery of taxa in mock metabarcoding analyses, the present study has significant implications for global biodiversity estimates of nematode species derived from environmental rRNA amplicon studies, as well as our understanding of the evolutionary and ecological factors shaping genetic diversity across the nematode Tree of Life.
Affiliation(s)
- Xue Qing
- Nematology Research Unit, Department of Biology, Ghent University, Ghent, Belgium; Department of Entomology, Nematology and Chemistry Units, Agricultural Research Organization (ARO), Volcani Center, Rishon LeZion, Israel
| | - Holly Bik
- Department of Nematology, University of California-Riverside, Riverside, CA, USA
| | - Timur M Yergaliyev
- Dead Sea and Arava Science Center, Dead Sea Branch, Masada National Park, Tamar Regional Council, Tel Aviv, Israel; A. Baitursynov Kostanay State University, Kostanay, Kazakhstan
| | - Jianfeng Gu
- Technical Center of Ningbo Customs (Ningbo Inspection and Quarantine Science Technology Academy), Ningbo, China
| | - Pamela Fonderie
- Nematology Research Unit, Department of Biology, Ghent University, Ghent, Belgium
| | - Sigal Brown-Miyara
- Department of Entomology, Nematology and Chemistry Units, Agricultural Research Organization (ARO), Volcani Center, Rishon LeZion, Israel
| | - Amir Szitenberg
- Dead Sea and Arava Science Center, Dead Sea Branch, Masada National Park, Tamar Regional Council, Tel Aviv, Israel
| | - Wim Bert
- Nematology Research Unit, Department of Biology, Ghent University, Ghent, Belgium
| |
|
37
|
Becker DJ, Washburne AD, Faust CL, Mordecai EA, Plowright RK. The problem of scale in the prediction and management of pathogen spillover. Philos Trans R Soc Lond B Biol Sci 2019; 374:20190224. [PMID: 31401958 PMCID: PMC6711304 DOI: 10.1098/rstb.2019.0224] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Accepted: 06/21/2019] [Indexed: 01/28/2023] Open
Abstract
Disease emergence events, epidemics and pandemics all underscore the need to predict zoonotic pathogen spillover. Because cross-species transmission is inherently hierarchical, involving processes that occur at varying levels of biological organization, such predictive efforts can be complicated by the many scales and vastness of data potentially required for forecasting. A wide range of approaches are currently used to forecast spillover risk (e.g. macroecology, pathogen discovery, surveillance of human populations, among others), each of which is bound within particular phylogenetic, spatial and temporal scales of prediction. Here, we contextualize these diverse approaches within their forecasting goals and resulting scales of prediction to illustrate critical areas of conceptual and pragmatic overlap. Specifically, we focus on an ecological perspective to envision a research pipeline that connects these different scales of data and predictions from the aims of discovery to intervention. Pathogen discovery and predictions focused at the phylogenetic scale can first provide coarse and pattern-based guidance for which reservoirs, vectors and pathogens are likely to be involved in spillover, thereby narrowing surveillance targets and where such efforts should be conducted. Next, these predictions can be followed with ecologically driven spatio-temporal studies of reservoirs and vectors to quantify spatio-temporal fluctuations in infection and to mechanistically understand how pathogens circulate and are transmitted to humans. This approach can also help identify general regions and periods for which spillover is most likely. We illustrate this point by highlighting several case studies where long-term, ecologically focused studies (e.g. Lyme disease in the northeast USA, Hendra virus in eastern Australia, Plasmodium knowlesi in Southeast Asia) have facilitated predicting spillover in space and time and facilitated the design of possible intervention strategies. 
Such studies can in turn help narrow human surveillance efforts and help refine and improve future large-scale, phylogenetic predictions. We conclude by discussing how greater integration and exchange between data and predictions generated across these varying scales could ultimately help generate more actionable forecasts and interventions. This article is part of the theme issue 'Dynamic and integrative approaches to understanding pathogen spillover'.
Affiliation(s)
- Daniel J. Becker
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
- Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, USA
- Department of Biology, Indiana University, Bloomington, IN, USA
| | - Alex D. Washburne
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
| | - Christina L. Faust
- Institute of Biodiversity Animal Health and Comparative Medicine, University of Glasgow, Glasgow, UK
| | | | - Raina K. Plowright
- Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
| |
|
38
|
Washburne AD, Crowley DE, Becker DJ, Manlove KR, Childs ML, Plowright RK. Percolation models of pathogen spillover. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180331. [PMID: 31401950 DOI: 10.1098/rstb.2018.0331] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/30/2022] Open
Abstract
Predicting pathogen spillover requires counting spillover events and aligning such counts with process-related covariates for each spillover event. How can we connect our analysis of spillover counts to simple, mechanistic models of pathogens jumping from reservoir hosts to recipient hosts? We illustrate how the pathways to pathogen spillover can be represented as a directed graph connecting reservoir hosts and recipient hosts and the number of spillover events modelled as a percolation of infectious units along that graph. Percolation models of pathogen spillover formalize popular intuition and management concepts for pathogen spillover, such as the inextricably multilevel nature of cross-species transmission, the impact of covariance between processes such as pathogen shedding and human susceptibility on spillover risk, and the assumptions under which the effect of a management intervention targeting one process, such as persistence of vectors, will translate to an equal effect on the overall spillover risk. Percolation models also link statistical analysis of spillover event datasets with a mechanistic model of spillover. Linear models that one might construct for process-specific parameters, such as the log-rate of shedding from one of several alternative reservoirs, yield a nonlinear model of the log-rate of spillover. The resulting nonlinearity is approximately piecewise linear, with major impacts on statistical inferences about the importance of process-specific covariates such as vector density. We recommend that statistical analysis of spillover datasets use piecewise linear models, such as generalized additive models, regression clustering or ensembles of linear models, to capture the piecewise linearity expected from percolation models. We discuss the implications of our findings for predictions of spillover risk beyond the range of observed covariates, a major challenge of forecasting spillover risk in the Anthropocene.
This article is part of the theme issue 'Dynamic and integrative approaches to understanding pathogen spillover'.
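The piecewise-linear behaviour described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the total spillover rate is the sum of the rates from several alternative reservoirs and that each reservoir's log-rate is linear in one covariate `x`. The function names and the `(intercept, slope)` values are hypothetical and are not taken from the paper.

```python
import math

def log_spillover_rate(x, pathways):
    """Log of the summed rate over pathways whose log-rates are linear in x.

    pathways: list of (intercept, slope) pairs, one per alternative
    reservoir. These are illustrative values, not estimates from the paper.
    """
    log_rates = [a + b * x for a, b in pathways]
    m = max(log_rates)
    # log-sum-exp, shifted by the maximum for numerical stability
    return m + math.log(sum(math.exp(r - m) for r in log_rates))

def dominant_log_rate(x, pathways):
    """Piecewise-linear approximation: the log-rate of the dominant pathway."""
    return max(a + b * x for a, b in pathways)

# Two hypothetical reservoirs whose linear log-rates cross at x = 8/3.
pathways = [(0.0, 2.0), (4.0, 0.5)]

for x in [-5.0, 0.0, 8.0 / 3.0, 5.0, 10.0]:
    exact = log_spillover_rate(x, pathways)
    approx = dominant_log_rate(x, pathways)
    print(f"x={x:6.2f}  exact={exact:8.4f}  piecewise={approx:8.4f}  gap={exact - approx:.4f}")
```

Under these assumptions the gap between the exact log-rate and the piecewise-linear approximation never exceeds the log of the number of pathways, and it is largest near the point where the dominant pathway switches. A single linear model fitted across that switch would misstate the slope on at least one side of it.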
Affiliation(s)
- Alex D Washburne: Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
- Daniel E Crowley: Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
- Daniel J Becker: Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA; Center for the Ecology of Infectious Diseases, University of Georgia, Athens, GA, USA; Department of Biology, Indiana University, Bloomington, IN, USA
- Kezia R Manlove: Veterinary Microbiology and Pathology, Washington State University College of Veterinary Medicine, Bozeman, MT, USA
- Marissa L Childs: Emmett Interdisciplinary Program in Environment and Resources, Stanford University, Stanford, CA, USA
- Raina K Plowright: Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
|
39
|
Czech L, Stamatakis A. Scalable methods for analyzing and visualizing phylogenetic placement of metagenomic samples. PLoS One 2019; 14:e0217050. [PMID: 31136592 PMCID: PMC6538146 DOI: 10.1371/journal.pone.0217050] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 06/25/2018] [Accepted: 05/05/2019] [Indexed: 01/09/2023]
Abstract
BACKGROUND: The exponential decrease in molecular sequencing cost generates unprecedented amounts of data. Hence, scalable methods to analyze these data are required. Phylogenetic (or Evolutionary) Placement methods identify the evolutionary provenance of anonymous sequences with respect to a given reference phylogeny. This increasingly popular method is deployed for scrutinizing metagenomic samples from environments such as water, soil, or the human gut.
NOVEL METHODS: Here, we present novel and, more importantly, highly scalable methods for analyzing phylogenetic placements of metagenomic samples. More specifically, we introduce methods for (a) visualizing differences between samples and their correlation with associated meta-data on the reference phylogeny, (b) clustering similar samples using a variant of the k-means method, and (c) finding phylogenetic factors using an adaptation of the Phylofactorization method. These methods enable to interpret metagenomic data in a phylogenetic context, to find patterns in the data, and to identify branches of the phylogeny that are driving these patterns.
RESULTS: To demonstrate the scalability and utility of our methods, as well as to provide exemplary interpretations of our methods, we applied them to 3 publicly available datasets comprising 9782 samples with a total of approximately 168 million sequences. The results indicate that new biological insights can be attained via our methods.
Affiliation(s)
- Lucas Czech: Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Alexandros Stamatakis: Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
|