1
Mankoti M, Pandit NK, Meena SS, Mohanty A. Investigating the genomic and metabolic abilities of PGPR Pseudomonas fluorescens in promoting plant growth and fire blight management. Mol Genet Genomics 2024; 299:110. [PMID: 39601883 DOI: 10.1007/s00438-024-02198-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2024] [Accepted: 10/26/2024] [Indexed: 11/29/2024]
Abstract
Pseudomonas fluorescens is commonly found in diverse environments and is well known for its metabolic and antagonistic properties. Despite its remarkable attributes, its potential role in promoting plant growth remains unexplored. This study examines these traits across 14 strains residing in diverse rhizosphere environments through pangenome and comparative genome analysis, alongside molecular docking studies against Erwinia amylovora to combat fire blight. Whole-genome analysis revealed circular chromosomes (6.01-7.07 Mb) with GC content ranging from 59.95% to 63.39%. Predicted genes included protein-coding genes (4435 to 6393 per genome) and 16S rRNA genes (1527 to 1541 bp in length). Pangenome analysis unveiled an open pangenome, shedding light on genetic factors influencing plant growth promotion and biocontrol, including nitrogen fixation, phosphorus solubilization, siderophore production, stress tolerance, flagella biosynthesis, and induced systemic resistance. Furthermore, pyrrolnitrin, phenazine-1-carboxylic acid, pyoluteorin, lokisin, 2,4-diacetylphloroglucinol, and pseudomonic acid were identified. Molecular docking against key proteins of E. amylovora highlighted the high binding affinities of 2,4-diacetylphloroglucinol, pseudomonic acid, and lokisin. These findings underscore the multifaceted role of P. fluorescens in plant growth promotion and biocontrol, with key biomolecules showing promising applications in plant growth and defense against pathogens.
Affiliation(s)
- Megha Mankoti
  - Department of Biotechnology, Dr B R Ambedkar National Institute of Technology Jalandhar, Punjab, India
- Nisha Kumari Pandit
  - Department of Biotechnology, Dr B R Ambedkar National Institute of Technology Jalandhar, Punjab, India
- Sumer Singh Meena
  - Department of Biotechnology, Dr B R Ambedkar National Institute of Technology Jalandhar, Punjab, India
- Anee Mohanty
  - Department of Biotechnology, Dr B R Ambedkar National Institute of Technology Jalandhar, Punjab, India
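The Mankoti et al. abstract above reports an open pangenome but does not say how openness was assessed. One common convention, assumed here, fits a power law to the number of new gene families found as each genome is added, n_new(N) = k * N^(-alpha), and calls the pangenome open when alpha is at most 1. The sketch below illustrates that idea only: the function names and the toy counts are hypothetical, and a real analysis would average the counts over many random genome orderings.

```python
import math


def new_genes_per_genome(genomes):
    """Count gene families first seen as each genome is added, in the given order.

    genomes: list of sets of gene-family IDs (hypothetical input format).
    """
    seen, new_counts = set(), []
    for families in genomes:
        new_counts.append(len(families - seen))
        seen |= families
    return new_counts


def fit_decay_exponent(new_counts):
    """Least-squares fit of log(n_new) = log(k) - alpha * log(N); returns alpha.

    The first genome (N = 1) and zero counts are skipped, since every family
    is new at N = 1 and log(0) is undefined.
    """
    points = [(math.log(n), math.log(c))
              for n, c in enumerate(new_counts, start=1)
              if n >= 2 and c > 0]
    if len(points) < 2:
        raise ValueError("need at least two usable points to fit a slope")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx


# Toy gene-family sets, invented for illustration only.
toy_genomes = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "e"}]
toy_new = new_genes_per_genome(toy_genomes)

# Invented new-gene counts that decay roughly as N^(-0.5).
toy_counts = [200, 100, 82, 71, 63, 58]
alpha = fit_decay_exponent(toy_counts)
is_open = alpha <= 1.0  # assumed convention: slow decay means an open pangenome
```

With these invented counts the fitted exponent is close to 0.5, so the toy pangenome would be classed as open under the assumed convention.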
2
Matthews CA, Watson-Haigh NS, Burton RA, Sheppard AE. A gentle introduction to pangenomics. Brief Bioinform 2024; 25:bbae588. [PMID: 39552065 PMCID: PMC11570541 DOI: 10.1093/bib/bbae588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 09/12/2024] [Accepted: 11/01/2024] [Indexed: 11/19/2024]
Abstract
Pangenomes have emerged in response to limitations associated with traditional linear reference genomes. In contrast to a traditional reference that is (usually) assembled from a single individual, pangenomes aim to represent all of the genomic variation found in a group of organisms. The term 'pangenome' is currently used to describe multiple different types of genomic information, and limited language is available to differentiate between them. This is frustrating for researchers working in the field and confusing for researchers new to the field. Here, we provide an introduction to pangenomics relevant to both prokaryotic and eukaryotic organisms and propose a formalization of the language used to describe pangenomes (see the Glossary) to improve the specificity of discussion in the field.
Affiliation(s)
- Chelsea A Matthews
  - School of Agriculture, Food and Wine, Waite Campus, University of Adelaide, Urrbrae, South Australia 5064, Australia
- Nathan S Watson-Haigh
  - Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, Victoria 3000, Australia
  - South Australian Genomics Centre, SAHMRI, North Terrace, Adelaide, South Australia 5000, Australia
  - Alkahest Inc., San Carlos, CA 94070, United States
- Rachel A Burton
  - School of Agriculture, Food and Wine, Waite Campus, University of Adelaide, Urrbrae, South Australia 5064, Australia
- Anna E Sheppard
  - School of Biological Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
3
Bonnici V, Chicco D. Seven quick tips for gene-focused computational pangenomic analysis. BioData Min 2024; 17:28. [PMID: 39227987 PMCID: PMC11370085 DOI: 10.1186/s13040-024-00380-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 08/12/2024] [Indexed: 09/05/2024]
Abstract
Pangenomics is a relatively new scientific field which investigates the union of all the genomes of a clade. The word pan means everything in ancient Greek; the term pangenomics originally regarded genomes of bacteria and was later intended to refer to human genomes as well. Modern bioinformatics offers several tools to analyze pangenomics data, paving the way to an emerging field that we can call computational pangenomics. Current computational power available for the bioinformatics community has made computational pangenomic analyses easy to perform, but this higher accessibility to pangenomics analysis also increases the chances to make mistakes and to produce misleading or inflated results, especially by beginners. To handle this problem, we present here a few quick tips for efficient and correct computational pangenomic analyses with a focus on bacterial pangenomics, by describing common mistakes to avoid and experienced best practices to follow in this field. We believe our recommendations can help the readers perform more robust and sound pangenomic analyses and to generate more reliable results.
Affiliation(s)
- Vincenzo Bonnici
  - Dipartimento di Scienze Matematiche Fisiche e Informatiche, Università di Parma, Parma, Italy
- Davide Chicco
  - Dipartimento di Informatica Sistemistica e Comunicazione, Università di Milano-Bicocca, Milan, Italy
  - Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
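The Bonnici and Chicco abstract above describes a pangenome as the union of all genomes of a clade. In gene-focused analyses this union is usually split into families shared by every genome (core), families found in only one genome (unique), and everything in between (accessory). The sketch below shows that partition on invented data. The function name, the input format, and the toy strains are assumptions for illustration, not taken from the paper.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene families into pan, core, accessory, and unique sets.

    genomes: dict mapping a genome name to the set of gene-family IDs it
    carries (hypothetical input format).
    """
    n = len(genomes)
    # Number of genomes in which each family occurs.
    counts = Counter(fam for fams in genomes.values() for fam in set(fams))
    pan = set(counts)                                     # union of all families
    core = {f for f, c in counts.items() if c == n}       # present in every genome
    unique = {f for f, c in counts.items() if c == 1}     # present in exactly one
    accessory = pan - core - unique                       # everything in between
    return pan, core, accessory, unique


# Toy strains with invented gene content, for illustration only.
toy = {
    "strain_A": {"gyrB", "recA", "rpoB", "phzA"},
    "strain_B": {"gyrB", "recA", "rpoB", "phlD"},
    "strain_C": {"gyrB", "recA", "phzA", "pltA"},
}
pan, core, accessory, unique = partition_pangenome(toy)
```

Here the strict definition of core (present in all genomes) is used. Relaxed thresholds, such as presence in at least 95% of genomes, are a common alternative when assemblies are incomplete.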
4
Cannon SB, Lee HO, Weeks NT, Berendzen J. Pandagma: a tool for identifying pan-gene sets and gene families at desired evolutionary depths and accommodating whole-genome duplications. Bioinformatics 2024; 40:btae526. [PMID: 39180716 PMCID: PMC11377846 DOI: 10.1093/bioinformatics/btae526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 08/14/2024] [Accepted: 08/22/2024] [Indexed: 08/26/2024]
Abstract
SUMMARY: Identification of allelic or corresponding genes (pan-genes) within a species or genus is important for discovery of biologically significant genetic conservation and variation. Similarly, identification of orthologs (gene families) across wider evolutionary distances is important for understanding the genetic basis for similar or differing traits. Especially in plants, several complications make identification of pan-genes and gene families challenging, including whole-genome duplications, evolutionary rate differences among lineages, and varying qualities of assemblies and annotations. Here, we document and distribute a set of workflows that we have used to address these problems. RESULTS: Pandagma is a set of configurable workflows for identifying and comparing pan-gene sets and gene families for annotation sets from eukaryotic genomes, using a combination of homology, synteny, and expected rates of synonymous change in coding sequence. AVAILABILITY AND IMPLEMENTATION: The Pandagma workflows, example configurations, implementation details, and scripts for retrieving public datasets are available at https://github.com/legumeinfo/pandagma.
Affiliation(s)
- Steven B Cannon
  - USDA-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, IA 50011, United States
- Hyun-Oh Lee
  - ORISE Fellow, USDA-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, IA 50011, United States
- Nathan T Weeks
  - USDA-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, IA 50011, United States
- Joel Berendzen
  - GenerisBio, 4327 Lost Feather Ln, Santa Fe, NM 87507, United States
5
Lamkiewicz K, Barf LM, Sachse K, Hölzer M. RIBAP: a comprehensive bacterial core genome annotation pipeline for pangenome calculation beyond the species level. Genome Biol 2024; 25:170. [PMID: 38951884 PMCID: PMC11218241 DOI: 10.1186/s13059-024-03312-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 06/14/2024] [Indexed: 07/03/2024]
Abstract
Microbial pangenome analysis identifies present or absent genes in prokaryotic genomes. However, current tools are limited when analyzing species with higher sequence diversity or higher taxonomic orders such as genera or families. The Roary ILP Bacterial core Annotation Pipeline (RIBAP) uses an integer linear programming approach to refine gene clusters predicted by Roary for identifying core genes. RIBAP successfully handles the complexity and diversity of Chlamydia, Klebsiella, Brucella, and Enterococcus genomes, outperforming other established and recent pangenome tools for identifying all-encompassing core genes at the genus level. RIBAP is a freely available Nextflow pipeline at github.com/hoelzer-lab/ribap and zenodo.org/doi/10.5281/zenodo.10890871.
Affiliation(s)
- Kevin Lamkiewicz
  - RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, Jena, 07743, Germany
- Lisa-Marie Barf
  - RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, Jena, 07743, Germany
- Konrad Sachse
  - RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, Jena, 07743, Germany
- Martin Hölzer
  - Genome Competence Center (MF1), Robert Koch Institute, Berlin, 13353, Germany
6
Bonnici V, Mengoni C, Mangoni M, Franco G, Giugno R. PanDelos-frags: A methodology for discovering pangenomic content of incomplete microbial assemblies. J Biomed Inform 2023; 148:104552. [PMID: 37995844 DOI: 10.1016/j.jbi.2023.104552] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/28/2023] [Revised: 09/06/2023] [Accepted: 11/19/2023] [Indexed: 11/25/2023]
Abstract
Pangenomics was originally defined as the problem of comparing the composition of genes into gene families within a set of bacterial isolates belonging to the same species. The problem requires the calculation of sequence homology among such genes. When combined with metagenomics, namely for human microbiome composition analysis, gene-oriented pangenome detection becomes a promising method to decipher ecosystem functions and population-level evolution. Established computational tools are able to investigate the genetic content of isolates for which a complete genomic sequence is available. However, there is a plethora of incomplete genomes that are available on public resources, which only a few tools may analyze. Incomplete means that the process for reconstructing their genomic sequence is not complete, and only fragments of their sequence are currently available. However, the information contained in these fragments may play an essential role in the analyses. Here, we present PanDelos-frags, a computational tool which exploits and extends previous results in analyzing complete genomes. It provides a new methodology for inferring missing genetic information and thus for managing incomplete genomes. PanDelos-frags outperforms state-of-the-art approaches in reconstructing gene families in synthetic benchmarks and in a real use case of metagenomics. PanDelos-frags is publicly available at https://github.com/InfOmics/PanDelos-frags.
Affiliation(s)
- Vincenzo Bonnici
  - Department of Mathematical, Physical and Computer Sciences, University of Parma, Parco Area delle Scienze 53/a (Campus), Parma, 43124, PR, Italy
- Claudia Mengoni
  - Department of Computer Science, University of Verona, Strada le Grazie, 15, Verona, 37134, VR, Italy
- Manuel Mangoni
  - Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), 71013, Italy; Department of Experimental Medicine, Sapienza University of Rome, Rome (RM), Italy
- Giuditta Franco
  - Department of Computer Science, University of Verona, Strada le Grazie, 15, Verona, 37134, VR, Italy
- Rosalba Giugno
  - Department of Computer Science, University of Verona, Strada le Grazie, 15, Verona, 37134, VR, Italy
7
Li T, Yin Y. Critical assessment of pan-genomic analysis of metagenome-assembled genomes. Brief Bioinform 2022; 23:6702672. [PMID: 36124775 PMCID: PMC9677465 DOI: 10.1093/bib/bbac413] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/12/2022] [Revised: 08/23/2022] [Accepted: 08/26/2022] [Indexed: 12/30/2022]
Abstract
Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. We found that incompleteness led to significant core gene (CG) loss. The CG loss remained when using different pan-genome analysis tools (Roary, BPGA, Anvi'o) and when using a mixture of MAGs and complete genomes. Contamination had little effect on core genome size (except for Roary, due to its gene clustering issue) but had a major influence on accessory genomes. Importantly, the CG loss was partially alleviated by lowering the CG threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The CG loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Our main findings were supported by a study of real MAG-isolate genome data. We conclude that lowering the CG threshold and predicting genes in metagenome mode (as Anvi'o does with Prodigal) are necessary in pan-genome analysis of MAGs. Development of new pan-genome analysis tools specifically for MAGs is needed in future studies.
Affiliation(s)
- Tang Li
  - Nebraska Food for Health Center, Department of Food Science and Technology, University of Nebraska - Lincoln, Lincoln, NE, 68508, USA
- Yanbin Yin (corresponding author)
  - Nebraska Food for Health Center, Department of Food Science and Technology, University of Nebraska - Lincoln, Lincoln, NE 68508, USA. Tel.: +1-402-472-4303
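The Li and Yin abstract above reports that incomplete genomes cause core gene loss and that lowering the core gene threshold partially recovers it. The sketch below illustrates that effect on an invented presence/absence table: a family missing from a single incomplete MAG drops out of the strict core but returns under a relaxed threshold. The function name, the threshold values, and the toy genomes are assumptions for illustration and do not reproduce any of the tools named in the abstract.

```python
def core_genes(presence, threshold=1.0):
    """Return gene families present in at least `threshold` fraction of genomes.

    presence: dict mapping a genome name to the set of gene-family IDs found
    in it (hypothetical input format).
    """
    n = len(presence)
    families = set().union(*presence.values())
    return {
        fam for fam in families
        if sum(fam in genes for genes in presence.values()) / n >= threshold
    }


# Four complete genomes plus one incomplete MAG that lost family "c".
# All names and gene content are invented for illustration.
toy_presence = {
    "isolate_1": {"a", "b", "c", "d"},
    "isolate_2": {"a", "b", "c"},
    "isolate_3": {"a", "b", "c"},
    "isolate_4": {"a", "b", "c"},
    "mag_5": {"a", "b"},
}
strict_core = core_genes(toy_presence, threshold=1.0)
relaxed_core = core_genes(toy_presence, threshold=0.8)
```

In this toy case the strict core contains only "a" and "b", while the relaxed core also recovers "c". Family "d" stays out of both because it occurs in a single genome.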
8
Bonnici V, Giugno R. PANPROVA: PANgenomic PROkaryotic eVolution of full Assemblies. Bioinformatics 2022; 38:2631-2632. [PMID: 35289871 DOI: 10.1093/bioinformatics/btac158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Revised: 03/05/2022] [Accepted: 03/14/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Computational tools for pangenomic analysis have gained increasing interest over the past two decades in various applications such as evolutionary studies and vaccine development. Synthetic benchmarks are essential for the systematic evaluation of their performance. Currently, benchmarking tools represent a genome as a set of genetic sequences and fail to simulate the complete information of the genomes, which is essential for evaluating pangenomic detection between fragmented genomes. RESULTS: We present PANPROVA, a benchmark tool to simulate prokaryotic pangenomic evolution by evolving the complete genomic sequence of an ancestral isolate. In this way the possibility of operating in the pre-assembly phase is enabled. Gene set variations, sequence variation and horizontal acquisition from a pool of external genomes are the evolutionary features of the tool. AVAILABILITY AND IMPLEMENTATION: PANPROVA is publicly available at https://github.com/InfOmics/PANPROVA.
Affiliation(s)
- Vincenzo Bonnici
  - Department of Mathematical, Physical and Computer Sciences, University of Parma, Parma, 43124, Italy
- Rosalba Giugno
  - Department of Computer Science, University of Verona, Verona, 37134, Italy