1
Ieger-Raittz R, De Pierri CR, Perico CP, Costa FDF, Bana EG, Vicenzi L, Machado DDJS, Marchaukoski JN, Raittz RT. What are we learning with Yoga? Mapping the scientific literature on Yoga using a vector-text-mining approach. PLoS One 2025; 20:e0322791. [PMID: 40440353 PMCID: PMC12121831 DOI: 10.1371/journal.pone.0322791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Accepted: 03/27/2025] [Indexed: 06/02/2025]
Abstract
The techniques used in yoga have roots in traditions that precede modern science. Research shows that yoga enhances quality of life and well-being, positively impacting physical and mental health. As yoga gains acceptance in Western countries, scientific studies on the subject increase exponentially. However, many of these studies are considered inconsistent due to the diverse methodologies and focuses in the field, which creates challenges for researchers and hampers progress. This study aims to develop a comprehensive framework for existing literature on yoga, facilitating multidisciplinary collaboration and bringing new light to relevant aspects. Given the complexity of the subject, advanced modeling techniques are necessary. Contemporary artificial intelligence methods have advanced Bioinformatics, including text mining (TM), allowing us to employ vector representations of texts to derive semantic insights and organize literature effectively. Based on TM resources, we provided a better general understanding of yoga and highlighted the relationships between yoga practice and various domains, including biochemical parameters and neuroscience. It also reveals that practitioners can learn to engage with their bodies and environments actively, enhancing their quality of life. However, there is a lack of research exploring the mechanisms behind this learning and its potential for further enhancement. Vector TM has made it possible to bolster and improve human analysis. The set of resources developed allowed us to determine the mapping of the literature, the analysis of which revealed 4 dimensions (exercise, physiology, theory and therapeutic) divided into 9 cohesive groups, representing the trends in the literature. The resulting platforms are available to Yoga researchers to evaluate our findings and make their forays into the existing literature.
Affiliation(s)
- Rosangela Ieger-Raittz
- Graduate Program in Physical Exercise Medicine in Health Promotion, Health Sciences Sector, Federal University of Paraná, Curitiba, Paraná, Brazil
- Camilla Reginatto De Pierri
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Department of Biochemistry and Molecular Biology, Federal University of Paraná, Curitiba, Paraná, Brazil
- Camila Pereira Perico
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Associate Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Flavia de Fatima Costa
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Associate Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Elisa Garbin Bana
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Associate Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Leonardo Vicenzi
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Associate Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Diogo de Jesus Soares Machado
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Associate Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Jeroniza Nunes Marchaukoski
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Associate Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Roberto Tadeu Raittz
- Laboratory of Artificial Intelligence Applied to Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
- Associate Graduate Program in Bioinformatics, SEPT, Federal University of Paraná, Curitiba, Paraná, Brazil
2
Pimenta-Zanon MH, Kashiwabara AY, Vanzela ALL, Lopes FM. GRAMEP: an alignment-free method based on the maximum entropy principle for identifying SNPs. BMC Bioinformatics 2025; 26:66. [PMID: 40000933 PMCID: PMC11863517 DOI: 10.1186/s12859-025-06037-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Accepted: 01/06/2025] [Indexed: 02/27/2025]
Abstract
BACKGROUND Advances in high throughput sequencing technologies provide a huge number of genomes to be analyzed. Thus, computational methods play a crucial role in analyzing and extracting knowledge from the data generated. Investigating genomic mutations is critical because of their impact on chromosomal evolution, genetic disorders, and diseases. It is common to adopt aligning sequences for analyzing genomic variations. However, this approach can be computationally expensive and restrictive in scenarios with large datasets. RESULTS We present a novel method for identifying single nucleotide polymorphisms (SNPs) in DNA sequences from assembled genomes. This study proposes GRAMEP, an alignment-free approach that adopts the principle of maximum entropy to discover the most informative k-mers specific to a genome or set of sequences under investigation. The informative k-mers enable the detection of variant-specific mutations in comparison to a reference genome or other set of sequences. In addition, our method offers the possibility of classifying novel sequences with no need for organism-specific information. GRAMEP demonstrated high accuracy in both in silico simulations and analyses of viral genomes, including Dengue, HIV, and SARS-CoV-2. Our approach maintained accurate SARS-CoV-2 variant identification while demonstrating a lower computational cost compared to methods with the same purpose. CONCLUSIONS GRAMEP is an open and user-friendly software based on maximum entropy that provides an efficient alignment-free approach to identifying and classifying unique genomic subsequences and SNPs with high accuracy, offering advantages over comparative methods. The instructions for use, applicability, and usability of GRAMEP are open access at https://github.com/omatheuspimenta/GRAMEP .
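The alignment-free idea in this abstract, comparing k-mer content rather than aligned positions, can be illustrated with a toy sketch. This is not GRAMEP's algorithm: the maximum-entropy selection step is not reproduced here, and the function names (`count_kmers`, `shannon_entropy`, `exclusive_kmers`) and the `min_count` parameter are hypothetical. The sketch only shows how k-mers found in a variant set but absent from a reference cluster around a substitution.

```python
from collections import Counter
import math


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a k-mer frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def exclusive_kmers(variant_seqs, reference_seqs, k, min_count=1):
    """Return k-mers seen at least min_count times in the variant set
    and never in the reference set (a simplified stand-in for an
    informative-k-mer filter; assumption, not the published method)."""
    variant = count_kmers(variant_seqs, k)
    reference = count_kmers(reference_seqs, k)
    return sorted(
        kmer for kmer, c in variant.items()
        if c >= min_count and kmer not in reference
    )


# Toy data: the variant differs from the reference by one substitution
# (A -> T at 0-based position 4).
reference = ["ACGTACGTAC"]
variant = ["ACGTTCGTAC"]
print(exclusive_kmers(variant, reference, k=4))
```

With k = 4, every exclusive k-mer in this toy case overlaps the substituted position, which is why such k-mers can point to a SNP without an alignment.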
Affiliation(s)
- Matheus Henrique Pimenta-Zanon
- Computer Science Department, Universidade Tecnológica Federal do Paraná (UTFPR), Alberto Carazzai, 1640, Cornélio Procópio, Paraná, 86300-000, Brazil
- André Yoshiaki Kashiwabara
- Computer Science Department, Universidade Tecnológica Federal do Paraná (UTFPR), Alberto Carazzai, 1640, Cornélio Procópio, Paraná, 86300-000, Brazil
- André Luís Laforga Vanzela
- Laboratory of Cytogenetics and Plant Diversity, Department of General Biology, Universidade Estadual de Londrina (UEL), Rodovia Celso Garcia Cid, PR-445, Km 380, Londrina, Paraná, 86057-970, Brazil
- Fabricio Martins Lopes
- Computer Science Department, Universidade Tecnológica Federal do Paraná (UTFPR), Alberto Carazzai, 1640, Cornélio Procópio, Paraná, 86300-000, Brazil
3
Nichio BTDL, Chaves RBR, Pedrosa FDO, Raittz RT. Exploring diazotrophic diversity: unveiling Nif core distribution and evolutionary patterns in nitrogen-fixing organisms. BMC Genomics 2025; 26:81. [PMID: 39871141 PMCID: PMC11773926 DOI: 10.1186/s12864-024-10994-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Accepted: 11/05/2024] [Indexed: 01/29/2025]
Abstract
BACKGROUND Diazotrophs carry out biological nitrogen fixation (BNF) using the nitrogenase enzyme complex (NEC), which relies on nitrogenase encoded by nif genes. Horizontal gene transfer (HGT) and gene duplications have created significant diversity among these genes, making it challenging to identify potential diazotrophs. Previous studies have established a minimal set of Nif proteins, known as the Nif core, which includes NifH, NifD, NifK, NifE, NifN, and NifB. This study aimed to identify potential diazotroph groups based on the Nif core and to analyze the inheritance patterns of accessory Nif proteins related to Mo-nitrogenase, along with their impact on N2 fixation maintenance. RESULTS In a systematic study, 118 diazotrophs were identified, resulting in a database of 2,156 Nif protein sequences obtained with RAFTS³G. Using this Nif database and a data mining strategy, we extended our analysis to 711 species and found that 544 contain the Nif core. A partial Nif core set was observed in eight species in this study. Finally, we cataloged 662 species with Nif core, of which 52 were novel. Our analysis generated 10,076 Nif proteins from these species and revealed some Nif core duplications. Additionally, we determined the optimal cluster value (k = 10) for analyzing diazotrophic diversity. Combining synteny and phylogenetic analyses revealed distinct syntenies in the nif gene composition across ten groups. CONCLUSIONS This study advances our understanding of the distribution of nif genes, aiding in the prediction and classification of N₂-fixing organisms. Furthermore, we present a comprehensive overview of the diversity, distribution, and evolutionary relationships among diazotrophic organisms associated with the Nif core. 
The analysis revealed the phylogenetic and functional organization of different groups, identifying synteny patterns and new nif gene arrangements across various bacterial and archaeal species. The identified groups serve as a valuable framework for further exploration of the molecular mechanisms underlying biological nitrogen fixation and its evolutionary significance across different bacterial lineages.
Affiliation(s)
- Bruno Thiago de Lima Nichio
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technical Education Sector - SEPT, UFPR, Curitiba, Paraná, Brazil
- Department of Biochemistry, Biological Sciences Sector, Federal University of Paraná (UFPR), Curitiba, Paraná, Brazil
- Roxana Beatriz Ribeiro Chaves
- Department of Biochemistry, Biological Sciences Sector, Federal University of Paraná (UFPR), Curitiba, Paraná, Brazil
- Fábio de Oliveira Pedrosa
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technical Education Sector - SEPT, UFPR, Curitiba, Paraná, Brazil
- Department of Biochemistry, Biological Sciences Sector, Federal University of Paraná (UFPR), Curitiba, Paraná, Brazil
- Roberto Tadeu Raittz
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technical Education Sector - SEPT, UFPR, Curitiba, Paraná, Brazil
- Department of Biochemistry, Biological Sciences Sector, Federal University of Paraná (UFPR), Curitiba, Paraná, Brazil
4
Perico CP, De Pierri CR, Neto GP, Fernandes DR, Pedrosa FO, de Souza EM, Raittz RT. Genomic landscape of the SARS-CoV-2 pandemic in Brazil suggests an external P.1 variant origin. Front Microbiol 2022; 13:1037455. [PMID: 36620039 PMCID: PMC9814972 DOI: 10.3389/fmicb.2022.1037455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022]
Abstract
Brazil was the epicenter of the worldwide pandemic at the peak of its second wave. The genomic/proteomic perspective of the COVID-19 pandemic in Brazil could provide insights into the behavior of the global pandemic. In this study, we track SARS-CoV-2 molecular information in Brazil using real-time bioinformatics and data science strategies to provide a comparative and evolutionary panorama of the lineages in the country. SWeeP vectors represented the Brazilian and worldwide genomic/proteomic data from the Global Initiative on Sharing Avian Influenza Data (GISAID) between February 2020 and August 2021. Clusters were analyzed and compared with PANGO lineages. Hierarchical clustering provided phylogenetic and evolutionary analyses of the lineages, and we tracked the P.1 (Gamma) variant origin. The genomic diversity based on Chao's estimation allowed us to compare richness and coverage among Brazilian states and other representative countries. We found that the epidemic in Brazil occurred in two phases with different genetic profiles. The P.1 lineages emerged in the second wave, which was more aggressive. We could not trace the origin of P.1 from the variants present in Brazil. Instead, we found evidence pointing to its external source and a possible recombinant event that may relate P.1 to a B.1.1.28 variant subset. We discussed the potential application of the pipeline for detecting emerging variants and the stability of PANGO terminology over time. The diversity analysis showed that the low coverage and unbalanced sequencing among states in Brazil could have allowed the silent entry and dissemination of P.1 and other dangerous variants. This study may help to understand the development and consequences of the entry of variants of concern (VOCs).
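The richness and coverage comparison mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which of Chao's estimators was used, so this assumes the classic Chao1 richness estimator (with the bias-corrected form when there are no doubletons) and Good's sample-coverage estimate; the function names and the toy lineage counts below are hypothetical and are not data from the study.

```python
from collections import Counter


def chao1(labels):
    """Chao1 richness estimate from a list of observed lineage labels.

    S_chao1 = S_obs + F1^2 / (2 * F2), where F1 and F2 are the numbers
    of lineages seen exactly once and exactly twice.
    """
    abundance = Counter(labels)
    s_obs = len(abundance)
    f1 = sum(1 for c in abundance.values() if c == 1)  # singletons
    f2 = sum(1 for c in abundance.values() if c == 2)  # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form F1 * (F1 - 1) / (2 * (F2 + 1)) with F2 = 0.
    return s_obs + f1 * (f1 - 1) / 2


def good_coverage(labels):
    """Good's estimate of sample coverage: 1 - F1 / n."""
    n = len(labels)
    if n == 0:
        raise ValueError("coverage is undefined for an empty sample")
    f1 = sum(1 for c in Counter(labels).values() if c == 1)
    return 1 - f1 / n


# Toy sample of sequenced genomes labelled by lineage (made-up counts).
sample = ["P.1"] * 5 + ["B.1.1.28"] * 2 + ["B.1.1.33"] * 2 + ["P.2", "B.1"]
print(chao1(sample), good_coverage(sample))
```

A low coverage value means many lineages were seen only once, so unsampled lineages are likely, which is the kind of gap the abstract links to the silent entry of variants.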
Affiliation(s)
- Camila P Perico
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Graduate Program in Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Camilla R De Pierri
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Department of Biochemistry and Molecular Biology, Federal University of Paraná, Curitiba, Brazil
- Giuseppe Pasqualato Neto
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Danrley R Fernandes
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Graduate Program in Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Fabio O Pedrosa
- Graduate Program in Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Department of Biochemistry and Molecular Biology, Federal University of Paraná, Curitiba, Brazil
- Emanuel M de Souza
- Graduate Program in Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Department of Biochemistry and Molecular Biology, Federal University of Paraná, Curitiba, Brazil
- Roberto T Raittz
- Laboratory of Artificial Intelligence Applied to Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
- Graduate Program in Bioinformatics, Professional and Technological Education Sector (SEPT), Federal University of Paraná, Curitiba, Brazil
5
da Silva Filho AC, Marchaukoski JN, Raittz RT, De Pierri CR, de Jesus Soares Machado D, Fadel-Picheth CMT, Picheth G. Prediction and Analysis in silico of Genomic Islands in Aeromonas hydrophila. Front Microbiol 2021; 12:769380. [PMID: 34912316 PMCID: PMC8667584 DOI: 10.3389/fmicb.2021.769380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2021] [Accepted: 11/01/2021] [Indexed: 11/13/2022]
Abstract
Aeromonas are Gram-negative rods widely distributed in the environment. They can cause severe infections in fish related to financial losses in the fish industry, and are considered opportunistic pathogens of humans causing infections ranging from diarrhea to septicemia. The objective of this study was to determine in silico the contribution of genomic islands to A. hydrophila. The complete genomes of 17 A. hydrophila isolates, which were separated into two phylogenetic groups, were analyzed using a genomic island (GI) predictor. The number of predicted GIs and their characteristics varied among strains. Strains from group 1, which contains mainly fish pathogens, generally have a higher number of predicted GIs, and with larger size, than strains from group 2 constituted by strains recovered from distinct sources. Only a few predicted GIs were shared among them and contained mostly genes from the core genome. Features related to virulence, metabolism, and resistance were found in the predicted GIs, but strains varied in relation to their gene content. In strains from group 1, O Ag biosynthesis clusters OX1 and OX6 were identified, while strains from group 2 each had unique clusters. Metabolic pathways for myo-inositol, L-fucose, sialic acid, and a cluster encoding QueDEC, tgtA5, and proteins related to DNA metabolism were identified in strains of group 1, which share a high number of predicted GIs. No distinctive features of group 2 strains were identified in their predicted GIs, which are more diverse and possibly better represent GIs in this species. However, some strains have several resistance attributes encoded by their predicted GIs. Several predicted GIs encode hypothetical proteins and phage proteins whose functions have not been identified but may contribute to Aeromonas fitness. In summary, features with functions identified on predicted GIs may confer advantages to host colonization and competitiveness in the environment.
Affiliation(s)
- Jeroniza Nunes Marchaukoski
- Department of Bioinformatics, Professional and Technical Education Sector, Federal University of Parana, Curitiba, Brazil
- Roberto Tadeu Raittz
- Department of Bioinformatics, Professional and Technical Education Sector, Federal University of Parana, Curitiba, Brazil
- Diogo de Jesus Soares Machado
- Department of Bioinformatics, Professional and Technical Education Sector, Federal University of Parana, Curitiba, Brazil
- Geraldo Picheth
- Department of Clinical Analysis, Federal University of Parana, Curitiba, Brazil
6
Vieira AZ, Raittz RT, Faoro H. Origin and evolution of nonulosonic acid synthases and their relationship with bacterial pathogenicity revealed by a large-scale phylogenetic analysis. Microb Genom 2021; 7:000563. [PMID: 33848237 PMCID: PMC8208679 DOI: 10.1099/mgen.0.000563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2020] [Accepted: 03/16/2021] [Indexed: 12/28/2022]
Abstract
Nonulosonic acids (NulOs) are a group of nine-carbon monosaccharides with different functions in nature. N-acetylneuraminic acid (Neu5Ac) is the most common NulO. It covers the membrane surface of all human cells and is a central molecule in the process of self-recognition via SIGLECS receptors. Some pathogenic bacteria escape the immune system by copying the sialylation of the host cell membrane. Neu5Ac production in these bacteria is catalysed by the enzyme NeuB. Some bacteria can also produce other NulOs named pseudaminic and legionaminic acids, through the NeuB homologues PseI and LegI, respectively. In Opisthokonta eukaryotes, the biosynthesis of Neu5Ac is catalysed by the enzyme NanS. In this study, we used publicly available data of sequences of NulOs synthases to investigate its distribution within the three domains of life and its relationship with pathogenic bacteria. We mined the KEGG database and found 425 NeuB sequences. Most NeuB sequences (58.74 %) from the KEGG orthology database were classified as from environmental bacteria; however, sequences from pathogenic bacteria showed higher conservation and prevalence of a specific domain named SAF. Using the HMM profile we identified 13 941 NulO synthase sequences in UniProt. Phylogenetic analysis of these sequences showed that the synthases were divided into three main groups that can be related to the lifestyle of these bacteria: (I) predominantly environmental, (II) intermediate and (III) predominantly pathogenic. NeuB was widely distributed in the groups. However, LegI and PseI were more concentrated in groups II and III, respectively. We also found that PseI appeared later in the evolutionary process, derived from NeuB. We use this same methodology to retrieve sialic acid synthase sequences from Archaea and Eukarya. A large-scale phylogenetic analysis showed that while the Archaea sequences are spread across the tree, the eukaryotic NanS sequences were grouped in a specific branch in group II. 
None of the bacterial NanS sequences grouped with the eukaryotic branch. The analysis of conserved residues showed that the synthases of Archaea and Eukarya present a mutation in one of the three catalytic residues, an E134D change, related to a Neisseria meningitidis reference sequence. We also found that the conservation profile is higher between NeuB of pathogenic bacteria and NanS of eukaryotes than between NeuB of environmental bacteria and NanS of eukaryotes. Our large-scale analysis brings new perspectives on the evolution of NulOs synthases, suggesting their presence in the last common universal ancestor.
Affiliation(s)
- Alexandre Zanatta Vieira
- Laboratory for Applied Science and Technology in Health, Carlos Chagas Institute, Fiocruz-PR, Algacyr Munhoz Mader street, 3775, Curitiba, Paraná, Brazil
- Graduation Program on Bioinformatics – Universidade Federal do Paraná, Alcides Viera Arcoverde street 1225, Curitiba, Paraná, Brazil
- Roberto Tadeu Raittz
- Graduation Program on Bioinformatics – Universidade Federal do Paraná, Alcides Viera Arcoverde street 1225, Curitiba, Paraná, Brazil
- Helisson Faoro
- Laboratory for Applied Science and Technology in Health, Carlos Chagas Institute, Fiocruz-PR, Algacyr Munhoz Mader street, 3775, Curitiba, Paraná, Brazil
- Graduation Program on Bioinformatics – Universidade Federal do Paraná, Alcides Viera Arcoverde street 1225, Curitiba, Paraná, Brazil
7
Comparative Genomics Provides Insights into the Taxonomy of Azoarcus and Reveals Separate Origins of Nif Genes in the Proposed Azoarcus and Aromatoleum Genera. Genes (Basel) 2021; 12:genes12010071. [PMID: 33430351 PMCID: PMC7825797 DOI: 10.3390/genes12010071] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/20/2020] [Revised: 12/30/2020] [Accepted: 01/05/2021] [Indexed: 01/19/2023]
Abstract
Among other attributes, the Betaproteobacterial genus Azoarcus has biotechnological importance for plant growth-promotion and remediation of petroleum waste-polluted water and soils. It comprises at least two phylogenetically distinct groups. The "plant-associated" group includes strains that are isolated from the rhizosphere or root interior of the C4 plant Kallar Grass, but also strains from soil and/or water; all are considered to be obligate aerobes and all are diazotrophic. The other group (now partly incorporated into the new genus Aromatoleum) comprises a diverse range of species and strains that live in water or soil that is contaminated with petroleum and/or aromatic compounds; all are facultative or obligate anaerobes. Some are diazotrophs. A comparative genome analysis of 32 genomes from 30 Azoarcus-Aromatoleum strains was performed in order to delineate generic boundaries more precisely than the single gene, 16S rRNA, that has been commonly used in bacterial taxonomy. The origin of diazotrophy in Azoarcus-Aromatoleum was also investigated by comparing full-length sequences of nif genes, and by physiological measurements of nitrogenase activity using the acetylene reduction assay. Based on average nucleotide identity (ANI) and whole genome analyses, three major groups could be discerned: (i) Azoarcus comprising Az. communis, Az. indigens and Az. olearius, and two unnamed species complexes, (ii) Aromatoleum Group 1 comprising Ar. anaerobium, Ar. aromaticum, Ar. bremense, and Ar. buckelii, and (iii) Aromatoleum Group 2 comprising Ar. diolicum, Ar. evansii, Ar. petrolei, Ar. toluclasticum, Ar. tolulyticum, Ar. toluolicum, and Ar. toluvorans. Single strain lineages such as Azoarcus sp. KH32C, Az. pumilus, and Az. taiwanensis were also revealed. 
Full length sequences of nif-cluster genes revealed two groups of diazotrophs in Azoarcus-Aromatoleum with nif being derived from Dechloromonas in Azoarcus sensu stricto (and two Thauera strains) and from Azospira in Aromatoleum Group 2. Diazotrophy was confirmed in several strains, and for the first time in Az. communis LMG5514, Azoarcus sp. TTM-91 and Ar. toluolicum TT. In terms of ecology, with the exception of a few plant-associated strains in Azoarcus (s.s.), across the group, most strains/species are found in soil and water (often contaminated with petroleum or related aromatic compounds), sewage sludge, and seawater. The possession of nar, nap, nir, nor, and nos genes by most Azoarcus-Aromatoleum strains suggests that they have the potential to derive energy through anaerobic nitrate respiration, so this ability cannot be usefully used as a phenotypic marker to distinguish genera. However, the possession of bzd genes indicating the ability to degrade benzoate anaerobically plus the type of diazotrophy (aerobic vs. anaerobic) could, after confirmation of their functionality, be considered as distinguishing phenotypes in any new generic delineations. The taxonomy of the Azoarcus-Aromatoleum group should be revisited; retaining the generic name Azoarcus for its entirety, or creating additional genera are both possible outcomes.