1
|
Kumar B, Lorusso E, Fosso B, Pesole G. A comprehensive overview of microbiome data in the light of machine learning applications: categorization, accessibility, and future directions. Front Microbiol 2024; 15:1343572. [PMID: 38419630 PMCID: PMC10900530 DOI: 10.3389/fmicb.2024.1343572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 01/29/2024] [Indexed: 03/02/2024] Open
Abstract
Metagenomics, metabolomics, and metaproteomics have significantly advanced our knowledge of microbial communities by providing culture-independent insights into their composition and functional potential. However, a critical challenge in this field is the lack of standard and comprehensive metadata associated with raw data, hindering the ability to perform robust data stratifications and consider confounding factors. In this comprehensive review, we categorize publicly available microbiome data into five types: shotgun sequencing, amplicon sequencing, metatranscriptomic, metabolomic, and metaproteomic data. We explore the importance of metadata for data reuse and address the challenges in collecting standardized metadata. We also assess the limitations of metadata collection in existing public repositories that host metagenomic data. This review emphasizes the vital role of metadata in interpreting and comparing datasets and highlights the need for standardized metadata protocols to fully leverage metagenomic data's potential. Furthermore, we explore future directions for implementing machine learning (ML) in metadata retrieval, offering promising avenues for a deeper understanding of microbial communities and their ecological roles. Leveraging these tools will enhance our insights into microbial functional capabilities and ecological dynamics in diverse ecosystems. Finally, we emphasize the crucial role of metadata in the development of ML models.
Affiliation(s)
- Bablu Kumar
- Università degli Studi di Milano, Milan, Italy
- Department of Biosciences, Biotechnology and Environment, University of Bari A. Moro, Bari, Italy
| | - Erika Lorusso
- Department of Biosciences, Biotechnology and Environment, University of Bari A. Moro, Bari, Italy
- National Research Council, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
| | - Bruno Fosso
- Department of Biosciences, Biotechnology and Environment, University of Bari A. Moro, Bari, Italy
| | - Graziano Pesole
- Department of Biosciences, Biotechnology and Environment, University of Bari A. Moro, Bari, Italy
- National Research Council, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, Bari, Italy
|
2
|
Chaung K, Baharav TZ, Henderson G, Zheludev IN, Wang PL, Salzman J. SPLASH: A statistical, reference-free genomic algorithm unifies biological discovery. Cell 2023; 186:5440-5456.e26. [PMID: 38065078 PMCID: PMC10861363 DOI: 10.1016/j.cell.2023.10.028] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/16/2022] [Revised: 08/31/2023] [Accepted: 10/26/2023] [Indexed: 12/18/2023]
Abstract
Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a unifying paradigm, SPLASH (Statistically Primary aLignment Agnostic Sequence Homing), which directly analyzes raw sequencing data, using a statistical test to detect a signature of regulation: sample-specific sequence variation. SPLASH detects many types of variation and can be efficiently run at scale. We show that SPLASH identifies complex mutation patterns in SARS-CoV-2, discovers regulated RNA isoforms at the single-cell level, detects the vast sequence diversity of adaptive immune receptors, and uncovers biology in non-model organisms undocumented in their reference genomes: geographic and seasonal variation and diatom association in eelgrass, an oceanic plant impacted by climate change, and tissue-specific transcripts in octopus. SPLASH is a unifying approach to genomic analysis that enables expansive discovery without metadata or references.
Affiliation(s)
- Kaitlin Chaung
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Tavor Z Baharav
- Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
| | - George Henderson
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Ivan N Zheludev
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Peter L Wang
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Julia Salzman
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Biochemistry, Stanford University, Stanford, CA 94305, USA; Department of Statistics (by courtesy), Stanford University, Stanford, CA 94305, USA; Department of Biology (by courtesy), Stanford University, Stanford, CA 94305, USA.
|
3
|
Abante J, Wang PL, Salzman J. DIVE: a reference-free statistical approach to diversity-generating and mobile genetic element discovery. Genome Biol 2023; 24:240. [PMID: 37864197 PMCID: PMC10589994 DOI: 10.1186/s13059-023-03038-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/21/2023] [Accepted: 08/14/2023] [Indexed: 10/22/2023] Open
Abstract
Diversity-generating and mobile genetic elements are key to microbial and viral evolution and can result in evolutionary leaps. State-of-the-art algorithms to detect these elements have limitations. Here, we introduce DIVE, a new reference-free approach to overcome these limitations using information contained in sequencing reads alone. We show that DIVE has improved detection power compared to existing reference-based methods using simulations and real data. We use DIVE to rediscover and characterize the activity of known and novel elements and generate new biological hypotheses about the mobilome. Building on DIVE, we develop a reference-free framework capable of de novo discovery of mobile genetic elements.
Affiliation(s)
- Jordi Abante
- Biomedical Data Science, Stanford University, 1265 Welch Rd, Palo Alto, 94305, CA, USA
- Center for Computational, Evolutionary and Human Genomics, Stanford University, 327 Campus Drive, Stanford, 94305, CA, USA
- Current address: Department of Biomedical Sciences, Universitat de Barcelona, Casanova 143, Barcelona, 08036, Spain
| | - Peter L Wang
- Biomedical Data Science, Stanford University, 1265 Welch Rd, Palo Alto, 94305, CA, USA
- Department of Biochemistry, Stanford University, 279 Campus Drive, Stanford, 94305, CA, USA
| | - Julia Salzman
- Biomedical Data Science, Stanford University, 1265 Welch Rd, Palo Alto, 94305, CA, USA.
- Department of Biochemistry, Stanford University, 279 Campus Drive, Stanford, 94305, CA, USA.
- Department of Statistics, Stanford University, 390 Serra Mall, Stanford, 94305, CA, USA.
|
4
|
Chaung K, Baharav TZ, Henderson G, Zheludev IN, Wang PL, Salzman J. [WITHDRAWN] SPLASH: a statistical, reference-free genomic algorithm unifies biological discovery. bioRxiv: the preprint server for biology 2023:2023.07.17.549408. [PMID: 37503014 PMCID: PMC10370119 DOI: 10.1101/2023.07.17.549408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
The authors have withdrawn this manuscript due to a duplicate posting of manuscript number BIORXIV/2022/497555. Therefore, the authors do not wish this work to be cited as reference for the project. If you have any questions, please contact the corresponding author. The correct preprint can be found at doi: https://doi.org/10.1101/2022.06.24.497555.
|
5
|
Chaung K, Baharav TZ, Henderson G, Zheludev IN, Wang PL, Salzman J. SPLASH: a statistical, reference-free genomic algorithm unifies biological discovery. bioRxiv: the preprint server for biology 2023:2022.06.24.497555. [PMID: 35794890 PMCID: PMC9258296 DOI: 10.1101/2022.06.24.497555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/09/2022]
Abstract
Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a new unifying paradigm, SPLASH (Statistically Primary aLignment Agnostic Sequence Homing), an approach that directly analyzes raw sequencing data to detect a signature of regulation: sample-specific sequence variation. The approach, which includes a new statistical test, is computationally efficient and can be run at scale. SPLASH unifies detection of myriad forms of sequence variation. We demonstrate that SPLASH identifies complex mutation patterns in SARS-CoV-2 strains, discovers regulated RNA isoforms at the single cell level, documents the vast sequence diversity of adaptive immune receptors, and uncovers biology in non-model organisms undocumented in their reference genomes: geographic and seasonal variation and diatom association in eelgrass, an oceanic plant impacted by climate change, and tissue-specific transcripts in octopus. SPLASH is a new unifying approach to genomic analysis that enables an expansive scope of discovery without metadata or references.
Affiliation(s)
- Kaitlin Chaung
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, USA
- Department of Biochemistry, Stanford University, Stanford, 94305, USA
| | - Tavor Z. Baharav
- Department of Electrical Engineering, Stanford University, Stanford, 94305, USA
| | - George Henderson
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, USA
- Department of Biochemistry, Stanford University, Stanford, 94305, USA
| | - Ivan N. Zheludev
- Department of Biochemistry, Stanford University, Stanford, 94305, USA
| | - Peter L. Wang
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, USA
- Department of Biochemistry, Stanford University, Stanford, 94305, USA
| | - Julia Salzman
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, USA
- Department of Biochemistry, Stanford University, Stanford, 94305, USA
- Department of Statistics (by courtesy), Stanford University, Stanford, 94305, USA
|
6
|
Shen K, Din AU, Sinha B, Zhou Y, Qian F, Shen B. Translational informatics for human microbiota: data resources, models and applications. Brief Bioinform 2023; 24:7152256. [PMID: 37141135 DOI: 10.1093/bib/bbad168] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/26/2022] [Revised: 04/07/2023] [Accepted: 04/11/2023] [Indexed: 05/05/2023] Open
Abstract
With the rapid development of human intestinal microbiology and diverse microbiome-related studies and investigations, a large amount of data have been generated and accumulated. Meanwhile, different computational and bioinformatics models have been developed for pattern recognition and knowledge discovery using these data. Given the heterogeneity of these resources and models, we aimed to provide a landscape of the data resources, a comparison of the computational models and a summary of the translational informatics applied to microbiota data. We first review the existing databases, knowledge bases, knowledge graphs and standardizations of microbiome data. Then, the high-throughput sequencing techniques for the microbiome and the informatics tools for their analyses are compared. Finally, translational informatics for the microbiome, including biomarker discovery, personalized treatment and smart healthcare for complex diseases, are discussed.
Affiliation(s)
- Ke Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Ahmad Ud Din
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Baivab Sinha
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Yi Zhou
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
| | - Fuliang Qian
- Center for Systems Biology, Suzhou Medical College of Soochow University, Suzhou 215123, China
- Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Suzhou 215123, China
| | - Bairong Shen
- Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China
|
7
|
Borodušķe A, Ķibilds J, Fridmanis D, Gudrā D, Ustinova M, Seņkovs M, Nikolajeva V. Does peptide-nucleic acid (PNA) clamping of host plant DNA benefit ITS1 amplicon-based characterization of the fungal endophyte community? Fungal Ecol 2023. [DOI: 10.1016/j.funeco.2022.101181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
8
|
Abdelsalam NA, Elshora H, El-Hadidi M. Interactive Web-Based Services for Metagenomic Data Analysis and Comparisons. Methods Mol Biol 2023; 2649:133-174. [PMID: 37258861 DOI: 10.1007/978-1-0716-3072-3_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Recently, sequencing technologies have become readily available, and scientists are more motivated to conduct metagenomic research to unveil the potential of a myriad of ecosystems and biomes. Metagenomics studies the composition and functions of microbial communities and paves the way to multiple applications in medicine, industry, and ecology. Nonetheless, the immense amount of sequencing data produced by metagenomics research and the scarcity of user-friendly analysis tools and pipelines pose a new challenge for data analysis. Web-based bioinformatics tools are now being developed to facilitate the analysis of complex metagenomic data without prior knowledge of any programming languages or special installation. Specialized web tools help answer researchers' main questions on the taxonomic classification, functional capabilities, discrepancies between two ecosystems, and the probable functional correlations between the members of a specific microbial community. With an Internet connection and a few clicks, researchers can conveniently and efficiently analyze the metagenomic datasets, summarize results, and visualize key information on the composition and the functional potential of metagenomic samples under study. This chapter provides a simple guide to a few of the fundamental web-based services used for metagenomic data analyses, such as BV-BRC, RDP, MG-RAST, MicrobiomeAnalyst, METAGENassist, and MGnify.
Affiliation(s)
- Nehal Adel Abdelsalam
- University of Science and Technology, Zewail City, Giza, Egypt
- Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, Egypt
| | - Hajar Elshora
- Bioinformatics Group, Center for Informatics Sciences (CIS), Nile University, Giza, Egypt
- Biomedical Informatics Program, School of Information Technology and Computer Science, Nile University, Giza, Egypt
| | - Mohamed El-Hadidi
- Bioinformatics Group, Center for Informatics Sciences (CIS), Nile University, Giza, Egypt.
|
9
|
Marzano M, Calasso M, Caponio GR, Celano G, Fosso B, De Palma D, Vacca M, Notario E, Pesole G, De Leo F, De Angelis M. Extension of the shelf-life of fresh pasta using modified atmosphere packaging and bioprotective cultures. Front Microbiol 2022; 13:1003437. [PMID: 36406432 PMCID: PMC9666361 DOI: 10.3389/fmicb.2022.1003437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Accepted: 10/05/2022] [Indexed: 01/25/2023] Open
Abstract
Microbial stability of fresh pasta depends on heat treatment, storage temperature, proper preservatives, and atmosphere packaging. This study aimed at improving the microbial quality, safety, and shelf life of fresh pasta using modified atmosphere composition and packaging with or without the addition of bioprotective cultures (Lactobacillus acidophilus, Lactobacillus casei, Bifidobacterium spp., and Bacillus coagulans) into semolina. Three fresh pasta variants were made using (i) the traditional protocol (control), MAP (20:80 CO2:N2), and barrier packaging, (ii) the experimental MAP (40:60 CO2:N2) and barrier packaging, and (iii) the experimental MAP, barrier packaging, and bioprotective cultures. Their effects on physicochemical properties (i.e., content on macro elements, water activity, headspace O2, CO2 concentrations, and mycotoxins), microbiological patterns, protein, and volatile organic compounds (VOC) were investigated at the beginning and the end of the actual or extended shelf-life through traditional and multi-omics approaches. We showed that the gas composition and properties of the packaging material tested in the experimental MAP system, with or without bioprotective cultures, positively affect features of fresh pasta avoiding changes in their main chemical properties, allowing for a storage longer than 120 days under refrigerated conditions. These results support that, although bioprotective cultures were not all able to grow in tested conditions, they can control the spoilage and the associated food-borne microbiota in fresh pasta during storage by their antimicrobials and/or fermentation products synergically. The VOC profiling, based on gas-chromatography mass-spectrometry (GC-MS), highlighted significant differences affected by the different manufacturing and packaging of samples. 
Therefore, the use of the proposed MAP system and the addition of bioprotective cultures can be considered an industrial helpful strategy to reduce the quality loss during refrigerated storage and to increase the shelf life of fresh pasta for additional 30 days by allowing the economic and environmental benefits spurring innovation in existing production models.
Affiliation(s)
- Marinella Marzano
- Istituto di Biomembrane, Bioenergetica e Biotecnologie Molecolari, Consiglio Nazionale delle Ricerche, Bari, Italy
| | - Maria Calasso
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
| | - Giusy Rita Caponio
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
| | - Giuseppe Celano
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
| | - Bruno Fosso
- Istituto di Biomembrane, Bioenergetica e Biotecnologie Molecolari, Consiglio Nazionale delle Ricerche, Bari, Italy
- Dipartimento di Bioscienze, Biotecnologie e Biofarmaceutica, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
| | | | - Mirco Vacca
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
| | - Elisabetta Notario
- Dipartimento di Bioscienze, Biotecnologie e Biofarmaceutica, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
| | - Graziano Pesole
- Istituto di Biomembrane, Bioenergetica e Biotecnologie Molecolari, Consiglio Nazionale delle Ricerche, Bari, Italy
- Dipartimento di Bioscienze, Biotecnologie e Biofarmaceutica, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
| | - Francesca De Leo
- Istituto di Biomembrane, Bioenergetica e Biotecnologie Molecolari, Consiglio Nazionale delle Ricerche, Bari, Italy
| | - Maria De Angelis
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
|
10
|
Nwachukwu BC, Babalola OO. Metagenomics: A Tool for Exploring Key Microbiome With the Potentials for Improving Sustainable Agriculture. Frontiers in Sustainable Food Systems 2022. [DOI: 10.3389/fsufs.2022.886987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Microorganisms are immense in nature and exist in every imaginable ecological niche, performing a wide range of metabolic processes. Unfortunately, using traditional microbiological methods, most microorganisms remain unculturable. The emergence of metagenomics has resolved the challenge of capturing the entire microbial community in an environmental sample by enabling the analysis of whole genomes without requiring culturing. Metagenomics as a non-culture approach encompasses a greater amount of genetic information than traditional approaches. The plant root-associated microbial community is essential for plant growth and development, hence the interactions between microorganisms, soil, and plants are essential to understanding and improving crop yields in rural and urban agriculture. Although some of these microorganisms are currently unculturable in the laboratory, metagenomic techniques may nevertheless be used to identify the microorganisms and their functional traits. A detailed understanding of these organisms and their interactions should facilitate an improvement of plant growth and sustainable crop production in soil and soilless agriculture. Therefore, the objective of this review is to provide insights into metagenomic techniques to study plant root-associated microbiota and microbial ecology. In addition, the different DNA-based techniques and their role in elaborating plant microbiomes are discussed. As an understanding of these microorganisms and their biotechnological potential is unlocked through metagenomics, they can be used to develop new, useful and unique bio-fertilizers and bio-pesticides that are not harmful to the environment.
|
11
|
Pérez-Losada M, Narayanan DB, Kolbe AR, Ramos-Tapia I, Castro-Nallar E, Crandall KA, Domínguez J. Comparative Analysis of Metagenomics and Metataxonomics for the Characterization of Vermicompost Microbiomes. Front Microbiol 2022; 13:854423. [PMID: 35620097 PMCID: PMC9127802 DOI: 10.3389/fmicb.2022.854423] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/20/2022] [Accepted: 04/21/2022] [Indexed: 11/21/2022] Open
Abstract
The study of microbial communities or microbiotas in animals and environments is important because of their impact in a broad range of industrial applications, diseases and ecological roles. High throughput sequencing (HTS) is the best strategy to characterize microbial composition and function. Microbial profiles can be obtained either by shotgun sequencing of genomes, or through amplicon sequencing of target genes (e.g., 16S rRNA for bacteria and ITS for fungi). Here, we compared both HTS approaches at assessing taxonomic and functional diversity of bacterial and fungal communities during vermicomposting of white grape marc. We applied specific HTS workflows to the same 12 microcosms, with and without earthworms, sampled at two distinct phases of the vermicomposting process occurring at 21 and 63 days. Metataxonomic profiles were inferred in DADA2, with bacterial metabolic pathways predicted via PICRUSt2. Metagenomic taxonomic profiles were inferred in PathoScope, while bacterial functional profiles were inferred in Humann2. Microbial profiles inferred by metagenomics and metataxonomics showed similarities and differences in composition, structure, and metabolic function at different taxonomic levels. Microbial composition and abundance estimated by both HTS approaches agreed reasonably well at the phylum level, but larger discrepancies were observed at lower taxonomic ranks. Shotgun HTS identified ~1.8 times more bacterial genera than 16S rRNA HTS, while ITS HTS identified two times more fungal genera than shotgun HTS. This is mainly a consequence of the difference in resolution and reference richness between amplicon and genome sequencing approaches and databases, respectively. Our study also revealed great differences and even opposite trends in alpha- and beta-diversity between amplicon and shotgun HTS. Interestingly, amplicon PICRUSt2-imputed functional repertoires overlapped ~50% with shotgun Humann2 profiles. 
Finally, both approaches indicated that although bacteria and fungi are the main drivers of biochemical decomposition, earthworms also play a key role in plant vermicomposting. In summary, our study highlights the strengths and weaknesses of metagenomics and metataxonomics and provides new insights on the vermicomposting of white grape marc. Since both approaches may target different biological aspects of the communities, combining them will provide a better understanding of the microbiotas under study.
Affiliation(s)
- Marcos Pérez-Losada
- Computational Biology Institute, The George Washington University, Washington, DC, United States
- Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
- CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Vairão, Portugal
| | - Dhatri Badri Narayanan
- Computational Biology Institute, The George Washington University, Washington, DC, United States
- Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
| | - Allison R. Kolbe
- Computational Biology Institute, The George Washington University, Washington, DC, United States
- Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
| | - Ignacio Ramos-Tapia
- Instituto de Investigación Interdisciplinaria (I3), Universidad de Talca, Talca, Chile
| | - Eduardo Castro-Nallar
- CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Vairão, Portugal
- Instituto de Investigación Interdisciplinaria (I3), Universidad de Talca, Talca, Chile
- Departamento de Microbiología, Facultad de Ciencias de la Salud, Universidad de Talca, Talca, Chile
| | - Keith A. Crandall
- Computational Biology Institute, The George Washington University, Washington, DC, United States
- Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
| | - Jorge Domínguez
- Grupo de Ecoloxía Animal (GEA), Universidade de Vigo, Vigo, Spain
|
12
|
Waterhouse RM, Adam-Blondon AF, Agosti D, Baldrian P, Balech B, Corre E, Davey RP, Lantz H, Pesole G, Quast C, Glöckner FO, Raes N, Sandionigi A, Santamaria M, Addink W, Vohradsky J, Nunes-Jorge A, Willassen NP, Lanfear J. Recommendations for connecting molecular sequence and biodiversity research infrastructures through ELIXIR. F1000Res 2021; 10:ELIXIR-1238. [PMID: 35999898 PMCID: PMC9360911 DOI: 10.12688/f1000research.73825.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/27/2022] [Indexed: 12/03/2022] Open
Abstract
Threats to global biodiversity are increasingly recognised by scientists and the public as a critical challenge. Molecular sequencing technologies offer means to catalogue, explore, and monitor the richness and biogeography of life on Earth. However, exploiting their full potential requires tools that connect biodiversity infrastructures and resources. As a research infrastructure developing services and technical solutions that help integrate and coordinate life science resources across Europe, ELIXIR is a key player. To identify opportunities, highlight priorities, and aid strategic thinking, here we survey approaches by which molecular technologies help inform understanding of biodiversity. We detail example use cases to highlight how DNA sequencing is: resolving taxonomic issues; increasing knowledge of marine biodiversity; helping understand how agriculture and biodiversity are critically linked; and playing an essential role in ecological studies. Together with examples of national biodiversity programmes, the use cases show where progress is being made but also highlight common challenges and opportunities for future enhancement of underlying technologies and services that connect molecular and wider biodiversity domains. Based on emerging themes, we propose key recommendations to guide future funding for biodiversity research: biodiversity and bioinformatic infrastructures need to collaborate closely and strategically; taxonomic efforts need to be aligned and harmonised across domains; metadata needs to be standardised and common data management approaches widely adopted; current approaches need to be scaled up dramatically to address the anticipated explosion of molecular data; bioinformatics support for biodiversity research needs to be enabled and sustained; training for end users of biodiversity research infrastructures needs to be prioritised; and community initiatives need to be proactive and focused on enabling solutions. 
For sequencing data to deliver their full potential they must be connected to knowledge: together, molecular sequence data collection initiatives and biodiversity research infrastructures can advance global efforts to prevent further decline of Earth's biodiversity.
Affiliation(s)
- Robert M. Waterhouse
- Department of Ecology and Evolution and Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Vaud, 1015, Switzerland
| | | | | | - Petr Baldrian
- Institute of Microbiology of the Czech Academy of Sciences, Praha, 142 20, Czech Republic
| | - Bachir Balech
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, CNR, Bari, 70126, Italy
| | - Erwan Corre
- CNRS/Sorbonne Université, Station Biologique de Roscoff, Roscoff, 29680, France
| | | | - Henrik Lantz
- Department of Medical Biochemistry and Microbiology/NBIS, Uppsala University, Uppsala, Sweden
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, CNR, Bari, 70126, Italy
- Department of Biosciences. Biotechnology and Biopharmaceutics, University of Bari “A. Moro”, Bari, 70126, Italy
| | - Christian Quast
- Life Sciences & Chemistry, Jacobs University Bremen gGmbH, Bremen, Germany
| | - Frank Oliver Glöckner
- MARUM - Center for Marine Environmental Sciences, University of Bremen, Bremerhaven, 27570, Germany
- Alfred Wegener Institute, Helmholtz Center for Polar- and Marine Research, Bremerhaven, 27570, Germany
| | - Niels Raes
- NLBIF - Netherlands Biodiversity Information Facility, Naturalis Biodiversity Center, Leiden, 2300 RA, The Netherlands
| | | | - Monica Santamaria
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, CNR, Bari, 70126, Italy
| | - Wouter Addink
- DiSSCo - Distributed System of Scientific Collections, Naturalis Biodiversity Center, Leiden, 2300 RA, The Netherlands
| | - Jiri Vohradsky
- Laboratory of Bioinformatics, Institute of Microbiology, Prague, 142 20, Czech Republic
| | | | | | - Jerry Lanfear
- ELIXIR Hub, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| |
Collapse
|
13
|
Waterhouse RM, Adam-Blondon AF, Agosti D, Baldrian P, Balech B, Corre E, Davey RP, Lantz H, Pesole G, Quast C, Glöckner FO, Raes N, Sandionigi A, Santamaria M, Addink W, Vohradsky J, Nunes-Jorge A, Willassen NP, Lanfear J. Recommendations for connecting molecular sequence and biodiversity research infrastructures through ELIXIR. F1000Res 2021; 10:ELIXIR-1238. [PMID: 35999898 PMCID: PMC9360911 DOI: 10.12688/f1000research.73825.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 10/12/2021] [Indexed: 09/03/2024] Open
|
14
|
Sahin E. Putative Group I Introns in the Nuclear Internal Transcribed Spacer of the Basidiomycete Fungus Gautieria Vittad. CYTOL GENET+ 2021. [DOI: 10.3103/s009545272105011x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
15
|
Abstract
Candida auris is a human fungal pathogen classified as an urgent threat to the delivery of health care due to its extensive antimicrobial resistance and the high mortality rates associated with invasive infections. Global outbreaks have occurred in health care facilities, particularly long-term care hospitals and nursing homes. Skin is the primary site of colonization for C. auris. To accelerate research studies, we developed microbiome sequencing protocols, including amplicon and metagenomic sequencing, directly from patient samples at health care facilities with ongoing C. auris outbreaks. We characterized the skin mycobiome with a database optimized to classify Candida species and C. auris to the clade level. While Malassezia species were the predominant skin-associated fungi, nursing home residents also harbored Candida species, including C. albicans and C. parapsilosis. Amplicon sequencing was concordant with culturing studies to identify C. auris-colonized patients and provided further resolution that distinct clades of C. auris are colonizing facilities in New York and Illinois. Shotgun metagenomic sequencing from a clinical sample with a high fungal bioburden generated a skin-associated profile of the C. auris genome. Future larger scale clinical studies are warranted to more systematically investigate the effects of commensal microbes and patient risk factors on the colonization and transmission of C. auris. IMPORTANCE: Candida auris is a human pathogen of high concern due to its extensive antifungal drug resistance and high mortality rates associated with invasive infections. Candida auris skin colonization and persistence on environmental surfaces make this pathogen difficult to control once it enters a health care facility. Residents in long-term care hospitals and nursing homes are especially vulnerable.
In this study, we developed microbiome sequencing protocols directly from surveillance samples, including amplicon and metagenomic sequencing, demonstrating concordance between sequencing results and culturing.
|
16
|
Tangaro M, Defazio G, Fosso B, Licciulli VF, Grillo G, Donvito G, Lavezzo E, Baruzzo G, Pesole G, Santamaria M. ITSoneWB: profiling global taxonomic diversity of eukaryotic communities on Galaxy. Bioinformatics 2021; 37:4253-4254. [PMID: 34117876 PMCID: PMC9502156 DOI: 10.1093/bioinformatics/btab431] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/25/2021] [Revised: 06/03/2021] [Accepted: 06/11/2021] [Indexed: 12/05/2022] Open
Abstract
Summary: ITSoneWB (ITSone WorkBench) is a Galaxy-based bioinformatic environment where comprehensive and high-quality reference data are connected with established pipelines and new tools in an automated and easy-to-use service targeted at global taxonomic analysis of eukaryotic communities based on high-throughput sequencing of Internal Transcribed Spacer 1 variants. Availability and implementation: ITSoneWB has been deployed on the INFN-Bari ReCaS cloud facility and is freely available on the web at http://itsonewb.cloud.ba.infn.it/galaxy. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marco Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy
- Giuseppe Defazio
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari 'A. Moro', Bari 70126, Italy
- Bruno Fosso
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy
- Vito Flavio Licciulli
- Institute of Biomedical Technologies, National Research Council, Bari Unit, 70126 Bari, Italy
- Giorgio Grillo
- Institute of Biomedical Technologies, National Research Council, Bari Unit, 70126 Bari, Italy
- Giacinto Donvito
- National Institute for Nuclear Physics (INFN), Section of Bari, Bari 70126, Italy
- Enrico Lavezzo
- Department of Molecular Medicine, University of Padova, Padova 35131, Italy
- Giacomo Baruzzo
- Department of Information Engineering, University of Padova, Padova, 35131, Italy
- Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari 'A. Moro', Bari 70126, Italy
- Monica Santamaria
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy
|
17
|
Gao B, Chi L, Zhu Y, Shi X, Tu P, Li B, Yin J, Gao N, Shen W, Schnabl B. An Introduction to Next Generation Sequencing Bioinformatic Analysis in Gut Microbiome Studies. Biomolecules 2021; 11:530. [PMID: 33918473 PMCID: PMC8066849 DOI: 10.3390/biom11040530] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 01/27/2021] [Revised: 03/28/2021] [Accepted: 03/29/2021] [Indexed: 12/12/2022] Open
Abstract
The gut microbiome is a microbial ecosystem which expresses 100 times more genes than the human host and plays an essential role in human health and disease pathogenesis. Since most intestinal microbial species are difficult to culture, next generation sequencing technologies have been widely applied to study the gut microbiome, including 16S rRNA, 18S rRNA, internal transcribed spacer (ITS) sequencing, shotgun metagenomic sequencing, metatranscriptomic sequencing and viromic sequencing. Various software tools were developed to analyze different sequencing data. In this review, we summarize commonly used computational tools for gut microbiome data analysis, which extended our understanding of the gut microbiome in health and diseases.
Affiliation(s)
- Bei Gao
- Department of Marine Science, School of Marine Sciences, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Liang Chi
- Metaorganism Immunity Section, Laboratory of Immune Systems Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Yixin Zhu
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Xiaochun Shi
- Department of Environmental Ecological Engineering, School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Pengcheng Tu
- Department of Food Science and Nutrition, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, China
- Bing Li
- Suzhou Industrial Park Environmental Law Enforcement Brigade (Environmental Monitoring Station), Suzhou 215021, China
- Jun Yin
- Department of Hydrometeorology, School of Hydrology and Water Resources, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Nan Gao
- Department of Biotechnology, School of Biological and Pharmaceutical Engineering, Nanjing Tech University, Nanjing 211816, China
- Weishou Shen
- Department of Environmental Ecological Engineering, School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Jiangsu Key Laboratory of Atmospheric Environment Monitoring and Pollution Control, Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing 210044, China
- Bernd Schnabl
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Department of Medicine, VA San Diego Healthcare System, San Diego, CA 92161, USA
|
18
|
Vasar M, Davison J, Neuenkamp L, Sepp SK, Young JPW, Moora M, Öpik M. User-friendly bioinformatics pipeline gDAT (graphical downstream analysis tool) for analysing rDNA sequences. Mol Ecol Resour 2021; 21:1380-1392. [PMID: 33527735 DOI: 10.1111/1755-0998.13340] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 10/22/2020] [Revised: 01/19/2021] [Accepted: 01/21/2021] [Indexed: 01/04/2023]
Abstract
High-throughput sequencing (HTS) of multiple organisms in parallel (metabarcoding) has become a routine and cost-effective method for the analysis of microbial communities in environmental samples. However, careful data treatment is required to identify potential errors in HTS data, and the large volume of data generated by HTS requires in-house experience with command line tools for downstream analysis. This paper introduces gDAT, a pipeline that incorporates the most common command line tools into an easy-to-use graphical interface. By using the Python scripting language, the pipeline is compatible with the latest Windows, macOS and Linux operating systems. The pipeline supports analysis of Sanger, 454, IonTorrent, Illumina and PacBio sequences, allows custom modification of quality filtering steps, and implements both open and closed-reference operational taxonomic unit-picking for sequence identification. Predefined parameters are optimized for analysis of small subunit (SSU) rRNA gene amplicons from arbuscular mycorrhizal fungi, but the pipeline is widely applicable to metabarcoding studies targeting a broad range of organisms. The pipeline was additionally tested with data using general eukaryotic primers from the SSU gene region and fungal primers from the internal transcribed spacer (ITS) marker region. We describe the pipeline design and evaluate its performance and speed by conducting analysis of example data sets using different marker regions sequenced on Illumina platforms. The graphical interface, with the option to use the command line if needed, provides an accessible tool for rapid data analysis with repeatability and logging capabilities. Keeping the software open-source maximizes code accessibility, allowing scrutiny and bug fixes by the community.
Affiliation(s)
- Martti Vasar
- Department of Botany, University of Tartu, Tartu, Estonia
- John Davison
- Department of Botany, University of Tartu, Tartu, Estonia
- Lena Neuenkamp
- Institute of Plant Sciences, University of Bern, Bern, Switzerland
- Mari Moora
- Department of Botany, University of Tartu, Tartu, Estonia
- Maarja Öpik
- Department of Botany, University of Tartu, Tartu, Estonia
|
19
|
Francioli D, Lentendu G, Lewin S, Kolb S. DNA Metabarcoding for the Characterization of Terrestrial Microbiota-Pitfalls and Solutions. Microorganisms 2021; 9:361. [PMID: 33673098 PMCID: PMC7918050 DOI: 10.3390/microorganisms9020361] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 11/23/2020] [Revised: 02/04/2021] [Accepted: 02/09/2021] [Indexed: 02/06/2023] Open
Abstract
Soil-borne microbes are major ecological players in terrestrial environments since they cycle organic matter, channel nutrients across trophic levels and influence plant growth and health. Therefore, the identification, taxonomic characterization and determination of the ecological role of members of soil microbial communities have become major topics of interest. The development and continuous improvement of high-throughput sequencing platforms have further stimulated the study of complex microbiota in soils and plants. The most frequently used approach to study microbiota composition, diversity and dynamics is polymerase chain reaction (PCR), amplifying specific taxonomically informative gene markers with the subsequent sequencing of the amplicons. This methodological approach is called DNA metabarcoding. Over the last decade, DNA metabarcoding has rapidly emerged as a powerful and cost-effective method for the description of microbiota in environmental samples. However, this approach involves several processing steps, each of which might introduce significant biases that can considerably compromise the reliability of the metabarcoding output. The aim of this review is to provide state-of-the-art background knowledge needed to make appropriate decisions at each step of a DNA metabarcoding workflow, highlighting crucial steps that, if considered, ensure an accurate and standardized characterization of microbiota in environmental studies.
Affiliation(s)
- Davide Francioli
- Microbial Biogeochemistry, Research Area Landscape Functioning, Leibniz Centre for Agricultural Landscape Research (ZALF), Eberswalder Str. 84, 15374 Müncheberg, Germany
- Guillaume Lentendu
- Laboratory of Soil Biodiversity, University of Neuchâtel, Rue Emile-Argand 11, 2000 Neuchâtel, Switzerland
- Simon Lewin
- Microbial Biogeochemistry, Research Area Landscape Functioning, Leibniz Centre for Agricultural Landscape Research (ZALF), Eberswalder Str. 84, 15374 Müncheberg, Germany
- Steffen Kolb
- Microbial Biogeochemistry, Research Area Landscape Functioning, Leibniz Centre for Agricultural Landscape Research (ZALF), Eberswalder Str. 84, 15374 Müncheberg, Germany
|
20
|
Fryssouli V, Zervakis GI, Polemis E, Typas MA. A global meta-analysis of ITS rDNA sequences from material belonging to the genus Ganoderma (Basidiomycota, Polyporales) including new data from selected taxa. MycoKeys 2020; 75:71-143. [PMID: 33304123 PMCID: PMC7723883 DOI: 10.3897/mycokeys.75.59872] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/20/2020] [Accepted: 10/26/2020] [Indexed: 01/16/2023] Open
Abstract
Ganoderma P. Karst. is a cosmopolitan genus of white-rot fungi which comprises species with highly-prized pharmaceutical properties, valuable biotechnological applications and of significant phytopathological interest. However, the status of the taxonomy within the genus is still highly controversial and ambiguous despite the progress made through molecular approaches. A metadata analysis of 3908 nuclear ribosomal internal transcribed spacer (ITS) rDNA sequences obtained from GenBank/ENA/DDBJ and UNITE was performed by targeting sequences annotated as Ganoderma, but also sequences from environmental samples and from material examined for the first time. Ganoderma taxa segregated into five main lineages (Clades A to E). Clade A corresponds to the core of laccate species and includes G. shanxiense and three major well-supported clusters: Cluster A.1 ('G. lucidum sensu lato') consists of taxa from Eurasia and North America, Cluster A.2 of material with worldwide occurrence including G. resinaceum and Cluster A.3 is composed of species originating from all continents except Europe and comprises G. lingzhi. Clade B includes G. applanatum and allied species with a Holarctic distribution. Clade C comprises taxa from Asia and Africa only. Clade D consists of laccate taxa with tropical/subtropical occurrence, while clade E harbours the highest number of non-laccate species with a cosmopolitan distribution. The 92 Ganoderma-associated names, initially used for sequence labelling, correspond to at least 80 taxa. Amongst them, 21 constitute putatively new phylospecies after our application of criteria relevant to the robustness/support of the terminal clades, intra- and interspecific genetic divergence and available biogeographic data. Moreover, several other groups or individual sequences seem to represent distinct taxonomic entities and merit further investigation.
A particularly large number of the public sequences was revealed to be insufficiently and/or incorrectly identified, for example, 87% and 78% of entries labelled as G. australe and G. lucidum, respectively. In general, ITS demonstrated high efficacy in resolving relationships amongst most of the Ganoderma taxa; however, it was not equally useful at elucidating species barriers across the entire genus and such cases are outlined. Furthermore, we draw conclusions on biogeography by evaluating species occurrence on a global scale in conjunction with phylogenetic structure/patterns. The sequence variability assessed in ITS spacers could be further exploited for diagnostic purposes.
Affiliation(s)
- Vassiliki Fryssouli
- Agricultural University of Athens, Laboratory of General and Agricultural Microbiology, Iera Odos 75, 11855 Athens, Greece
- Georgios I. Zervakis
- Agricultural University of Athens, Laboratory of General and Agricultural Microbiology, Iera Odos 75, 11855 Athens, Greece
- Elias Polemis
- Agricultural University of Athens, Laboratory of General and Agricultural Microbiology, Iera Odos 75, 11855 Athens, Greece
- Milton A. Typas
- National and Kapodistrian University of Athens, Department of Genetics and Biotechnology, Faculty of Biology, Panepistemiopolis, Athens 15701, Greece
|
21
|
Banchi E, Ametrano CG, Greco S, Stanković D, Muggia L, Pallavicini A. PLANiTS: a curated sequence reference dataset for plant ITS DNA metabarcoding. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2020; 2020:5722079. [PMID: 32016319 PMCID: PMC6997939 DOI: 10.1093/database/baz155] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 09/05/2019] [Revised: 12/11/2019] [Accepted: 12/23/2019] [Indexed: 01/02/2023]
Abstract
DNA metabarcoding combines DNA barcoding with high-throughput sequencing to identify different taxa within environmental communities. The ITS has already been proposed and widely used as a universal barcode marker for plants, but a comprehensive, updated and accurate reference dataset of plant ITS sequences has not been available so far. Here, we constructed reference datasets of Viridiplantae ITS1, ITS2 and entire ITS sequences including both Chlorophyta and Streptophyta. The sequences were retrieved from NCBI, and the ITS region was extracted. The sequences underwent an identity check to remove misidentified records and were clustered at 99% identity to reduce redundancy and computational effort. For this step, we developed a script called 'better clustering for QIIME' (bc4q) to ensure that the representative sequences are chosen according to the composition of the cluster at a different taxonomic level. The three datasets obtained with the bc4q script are PLANiTS1 (100 224 sequences), PLANiTS2 (96 771 sequences) and PLANiTS (97 550 sequences), and all are pre-formatted for QIIME, as this is the most widely used bioinformatic pipeline for metabarcoding analysis. Being curated and updated reference databases, PLANiTS1, PLANiTS2 and PLANiTS are proposed as a reliable, pivotal first step for a general standardization of plant DNA metabarcoding studies. The bc4q script is presented as a new tool useful in any research dealing with sequence clustering. Database URL: https://github.com/apallavicini/bc4q; https://github.com/apallavicini/PLANiTS.
Affiliation(s)
- Elisa Banchi
- Department of Life Sciences, University of Trieste, via Giorgieri 5, 34127, Trieste, Italy
- Division of Oceanography, National Institute of Oceanography and Applied Geophysics, via Piccard 54, 34151, Trieste, Italy
- Claudio G Ametrano
- Department of Life Sciences, University of Trieste, via Giorgieri 5, 34127, Trieste, Italy
- Samuele Greco
- Department of Life Sciences, University of Trieste, via Giorgieri 5, 34127, Trieste, Italy
- David Stanković
- Department of Life Sciences, University of Trieste, via Giorgieri 5, 34127, Trieste, Italy
- Marine Biology Station, National Institute of Biology, Fornače 41, Piran, Slovenia
- Lucia Muggia
- Department of Life Sciences, University of Trieste, via Giorgieri 5, 34127, Trieste, Italy
- Alberto Pallavicini
- Department of Life Sciences, University of Trieste, via Giorgieri 5, 34127, Trieste, Italy
- Division of Oceanography, National Institute of Oceanography and Applied Geophysics, via Piccard 54, 34151, Trieste, Italy
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy
|
22
|
Fawley MW, Fawley KP. Identification of Eukaryotic Microalgal Strains. JOURNAL OF APPLIED PHYCOLOGY 2020; 32:2699-2709. [PMID: 33542589 PMCID: PMC7853647 DOI: 10.1007/s10811-020-02190-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 05/13/2023]
Abstract
Proper identification and documentation of microalgae is often lacking in publications of applied phycology, algal physiology and biochemistry. Identification of many eukaryotic microalgae can be very daunting to the non-specialist. We present a systematic process for identifying eukaryotic microalgae using morphological evidence and DNA sequence analysis. Our intent was to provide an identification method that could be used by non-taxonomists, but which is grounded in the current techniques used by algal taxonomists. Central to the identification is database searches with DNA sequences of appropriate loci. We provide usable criteria for identification at the genus or species level, depending on the availability of sequence data in curated databases and repositories. Particular attention is paid to dealing with possible misidentifications in DNA databases and utilizing current taxonomy.
Affiliation(s)
- Marvin W Fawley
- Division of Natural Sciences and Mathematics, University of the Ozarks, Clarksville, AR 72830, USA
- Karen P Fawley
- Division of Natural Sciences and Mathematics, University of the Ozarks, Clarksville, AR 72830, USA
|
23
|
Abstract
BACKGROUND: During the past decade, breakthroughs in sequencing technology and computational biology have provided the basis for studies of the myriad ways in which microbial communities ("microbiota") in and on the human body influence human health and disease. In almost every medical specialty, there is now a growing interest in accurate and replicable profiling of the microbiota for use in diagnostic and therapeutic applications. CONTENT: This review provides an overview of approaches, challenges, and considerations for diagnostic applications borrowing from other areas of molecular diagnostics, including clinical metagenomics. Methodological considerations and evolving approaches for microbiota profiling, from 16S rRNA gene-based amplicon sequencing to metagenomics and metatranscriptomics, are discussed. To improve replicability, at least the most vulnerable steps in testing workflows will need to be standardized, and continuous efforts will be needed to define QC standards. Challenges such as purity of reagents and consumables, improvement of reference databases, and availability of diagnostic-grade data analysis solutions will require joint efforts across disciplines and with manufacturers. SUMMARY: The body of literature supporting important links between the microbiota at different anatomic sites and human health and disease is expanding rapidly, and therapeutic manipulation of the intestinal microbiota is becoming routine. The next decade will likely see implementation of microbiome diagnostics in diagnostic laboratories to fully capitalize on technological and scientific advances and apply them in routine medical practice.
Affiliation(s)
- Robert Schlaberg
- Department of Pathology, University of Utah, Salt Lake City, UT
- ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT
- IDbyDNA Inc., San Francisco, CA
|
24
|
Zamora J, Ekman S. Phylogeny and character evolution in the Dacrymycetes, and systematics of Unilacrymaceae and Dacryonaemataceae fam. nov. PERSOONIA 2020; 44:161-205. [PMID: 33116340 PMCID: PMC7567964 DOI: 10.3767/persoonia.2020.44.07] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/22/2019] [Accepted: 10/24/2019] [Indexed: 02/07/2023]
Abstract
We present a multilocus phylogeny of the class Dacrymycetes, based on data from the 18S, ITS, 28S, RPB1, RPB2, TEF-1α, 12S, and ATP6 DNA regions, with c. 90 species including the types of most currently accepted genera. A variety of methodological approaches was used to infer phylogenetic relationships among the Dacrymycetes, from a supermatrix strategy using maximum likelihood and Bayesian inference on a concatenated dataset, to coalescence-based calculations, such as quartet-based summary methods of independent single-locus trees, and Bayesian integration of single-locus trees into a species tree under the multispecies coalescent. We evaluate for the first time the taxonomic usefulness of some cytological phenotypic characters, i.e., vacuolar contents (vacuolar bodies and lipid bodies), number of nuclei of recently discharged basidiospores, and pigments, with especial emphasis on carotenoids. These characters, along with several others traditionally used for the taxonomy of this group (basidium shape, presence and morphology of clamp connections, morphology of the terminal cells of cortical/marginal hyphae, presence and degree of ramification of the hyphidia), are mapped on the resulting phylogenies and their evolution through the class Dacrymycetes discussed. Our analyses reveal five lineages that putatively represent five different families, four of which are accepted and named. Three out of these four lineages correspond to previously circumscribed and published families (Cerinomycetaceae, Dacrymycetaceae, and Unilacrymaceae), and one is proposed as the new family Dacryonaemataceae. Provisionally, only a single order, Dacrymycetales, is accepted within the class. Furthermore, the systematics of the two smallest families, Dacryonaemataceae and Unilacrymaceae, are investigated to the species level, using coalescence-based species delimitation on multilocus DNA data, and a detailed morphological study including morphometric analyses of the basidiospores. 
Three species are accepted in Dacryonaema, the type, Da. rufum, the newly combined Da. macnabbii (basionym Dacrymyces macnabbii), and a new species named Da. macrosporum. Two species are accepted in Unilacryma, the new U. bispora, and the type, U. unispora, the latter treated in a broad sense pending improved sampling across the Holarctic.
Affiliation(s)
- J.C. Zamora
  - Museum of Evolution, Uppsala University, Norbyvägen 16, SE-75236 Uppsala, Sweden
  - Departamento de Biología Vegetal II, Facultad de Farmacia, Universidad Complutense de Madrid, Ciudad Universitaria, plaza de Ramón y Cajal s/n, E-28040, Madrid, Spain
- S. Ekman
  - Museum of Evolution, Uppsala University, Norbyvägen 16, SE-75236 Uppsala, Sweden
|
25
|
Mitchell AL, Almeida A, Beracochea M, Boland M, Burgin J, Cochrane G, Crusoe MR, Kale V, Potter SC, Richardson LJ, Sakharova E, Scheremetjew M, Korobeynikov A, Shlemov A, Kunyavskaya O, Lapidus A, Finn RD. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res 2020; 48:D570-D578. [PMID: 31696235 PMCID: PMC7145632 DOI: 10.1093/nar/gkz1035] [Citation(s) in RCA: 228] [Impact Index Per Article: 45.6] [Received: 09/30/2019] [Accepted: 10/23/2019] [Indexed: 12/16/2022] Open
Abstract
MGnify (http://www.ebi.ac.uk/metagenomics) provides a free-to-use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored according to the input data, and that are formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations.
Affiliation(s)
- Alex L Mitchell
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexandre Almeida
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Martin Beracochea
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Miguel Boland
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Michael R Crusoe
  - Common Workflow Language, a project of the Software Freedom Conservancy, Inc. 137 Montague Street, Suite 380, Brooklyn, NY 11201-3548, USA
- Varsha Kale
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon C Potter
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Lorna J Richardson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ekaterina Sakharova
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Maxim Scheremetjew
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anton Korobeynikov
  - Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
- Alex Shlemov
  - Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
- Olga Kunyavskaya
  - Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
- Alla Lapidus
  - Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
- Robert D Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
26
|
Amid C, Alako BTF, Balavenkataraman Kadhirvelu V, Burdett T, Burgin J, Fan J, Harrison PW, Holt S, Hussein A, Ivanov E, Jayathilaka S, Kay S, Keane T, Leinonen R, Liu X, Martinez-Villacorta J, Milano A, Pakseresht A, Rahman N, Rajan J, Reddy K, Richards E, Smirnov D, Sokolov A, Vijayaraja S, Cochrane G. The European Nucleotide Archive in 2019. Nucleic Acids Res 2020; 48:D70-D76. [PMID: 31722421 PMCID: PMC7145635 DOI: 10.1093/nar/gkz1063] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 09/04/2019] [Revised: 10/25/2019] [Accepted: 11/07/2019] [Indexed: 11/12/2022] Open
Abstract
The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.
Affiliation(s)
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jun Fan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Abdulrahman Hussein
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eugene Ivanov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thomas Keane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josue Martinez-Villacorta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Annalisa Milano
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amir Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kethi Reddy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Edward Richards
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dmitriy Smirnov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexey Sokolov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
27
|
Corsaro D, Venditti D. Putative group I introns in the eukaryote nuclear internal transcribed spacers. Curr Genet 2019; 66:373-384. [PMID: 31463775 DOI: 10.1007/s00294-019-01027-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/20/2019] [Revised: 08/05/2019] [Accepted: 08/17/2019] [Indexed: 11/28/2022]
Abstract
Group I introns are mobile genetic elements that interrupt genes encoding proteins and RNAs. In the rRNA operon, introns can insert in the small subunit (SSU) and large subunit (LSU) of a wide variety of protists and various prokaryotes, but they were never found in the ITS region. In this study, unusually long ITS regions of fungi and closely related unicellular organisms (Polychytrium aggregatum, Mitosporidium daphniae, Amoeboaphelidium occidentale and Nuclearia simplex) were analysed. While the insertion of repeats is responsible for long ITS in other eukaryotes, the increased size of the sequences analysed herein seems rather due to the presence of introns in ITS-1 or ITS-2. The identified insertions can be folded in secondary structures according to group I intron models, and they cluster within introns in conserved core-based phylogeny. In addition, for Mitosporidium, Amoeboaphelidium and Nuclearia, more conventional ITS-2 structures can be deduced once spacer introns are removed. Sequences of five shark species were also analysed for their structure and included in phylogeny because of unpublished work reporting introns in their ITS, obtaining congruent results. Overall, the data presented herein indicate that spacer regions may contain introns.
Affiliation(s)
- Daniele Corsaro
  - CHLAREAS, 12 rue du Maconnais, Vandoeuvre-lès-Nancy, 54500, France
|
28
|
Stefanini I, Cavalieri D. Metagenomic Approaches to Investigate the Contribution of the Vineyard Environment to the Quality of Wine Fermentation: Potentials and Difficulties. Front Microbiol 2018; 9:991. [PMID: 29867889 PMCID: PMC5964215 DOI: 10.3389/fmicb.2018.00991] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Received: 12/30/2017] [Accepted: 04/27/2018] [Indexed: 01/08/2023] Open
Abstract
Winemaking is a complex process that begins in the vineyard and ends at the moment of consumption. Recent reports have shown the relevance of microbial populations in defining the regional organoleptic and sensory characteristics of a wine. Metagenomic approaches, which allow the exhaustive identification of microorganisms present in complex samples, have recently played a fundamental role in dissecting the contribution of the vineyard environment to wine fermentation. Systematic approaches have explored the impact of agronomical techniques, vineyard topologies, and climatic changes on the bacterial and fungal populations found in the vineyard and in fermentations, and have also tried to predict or extrapolate the effects on the sensory characteristics of the resulting wine. This review highlights the major technical and experimental challenges in dissecting the contribution of the microbiota of the vineyard and native environments to the wine fermentation process, and shows how metagenomic approaches can help in understanding microbial fluxes and selections across the environments and specimens related to wine fermentation.
Affiliation(s)
- Irene Stefanini
  - Division of Biomedical Sciences, University of Warwick, Coventry, United Kingdom
|