1
De R, Whiteley M, Azad RK. A gene network-driven approach to infer novel pathogenicity-associated genes: application to Pseudomonas aeruginosa PAO1. mSystems 2023; 8:e0047323. [PMID: 37921470 PMCID: PMC10734507 DOI: 10.1128/msystems.00473-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Accepted: 10/04/2023] [Indexed: 11/04/2023] Open
Abstract
IMPORTANCE We present here a new systems-level approach to decipher genetic factors and biological pathways associated with virulence and/or antibiotic treatment of bacterial pathogens. The power of this approach was demonstrated by application to a well-studied pathogen, Pseudomonas aeruginosa PAO1. Our gene co-expression network-based approach unraveled known and unknown genes and their networks associated with pathogenicity in P. aeruginosa PAO1. The systems-level investigation of P. aeruginosa PAO1 helped identify putative pathogenicity- and resistance-associated genetic factors that could not otherwise be detected by conventional differential gene expression analysis. The network-based analysis uncovered modules that harbor genes not previously reported by several original studies on P. aeruginosa virulence and resistance. These could potentially act as molecular determinants of P. aeruginosa PAO1 pathogenicity and responses to antibiotics.
Affiliation(s)
- Ronika De
  - Department of Biological Sciences, University of North Texas, Denton, Texas, USA
  - BioDiscovery Institute, University of North Texas, Denton, Texas, USA
- Marvin Whiteley
  - Center for Microbial Dynamics and Infection, School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
  - Emory-Children’s Cystic Fibrosis Center, Atlanta, Georgia, USA
- Rajeev K. Azad
  - Department of Biological Sciences, University of North Texas, Denton, Texas, USA
  - BioDiscovery Institute, University of North Texas, Denton, Texas, USA
  - Department of Mathematics, University of North Texas, Denton, Texas, USA
2
Subramanian D, Natarajan J. Leveraging big data bioinformatics approaches to extract knowledge from Staphylococcus aureus public omics data. Crit Rev Microbiol 2022; 49:391-413. [PMID: 35468027 DOI: 10.1080/1040841x.2022.2065905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Staphylococcus aureus is a notorious pathogen that poses challenges in the medical industry due to drug resistance and biofilm formation. The horizon of knowledge on S. aureus pathogenesis has expanded with the advancement of data-driven bioinformatics techniques. Mining information from sequenced genomes and their expression data is an economical approach that reduces wasted resources and redundant experiments. The current review covers how big data bioinformatics has been used to analyze publicly available S. aureus omics data to uncover mechanisms of infection and inhibition. In particular, advances over the past two decades in biomarker discovery, host responses, phenotype identification, consolidation of information, and drug development are discussed, highlighting the challenges and shortcomings. Overall, the review summarizes the diverse aspects of scrupulous re-analysis of S. aureus proteomic and transcriptomic expression datasets retrieved from public repositories in terms of the efforts taken, benefits offered, and follow-up actions. The detailed review thus serves as a reference and aid for (i) computational biologists, by outlining the approaches used for bacterial omics re-analysis of S. aureus, and (ii) experimental biologists, by elucidating the potential of bioinformatics in biological research to generate reliable postulates in a prompt and economical manner.
Affiliation(s)
- Devika Subramanian
  - Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore, India
- Jeyakumar Natarajan
  - Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore, India
3
Rajput A, Tsunemoto H, Sastry AV, Szubin R, Rychel K, Sugie J, Pogliano J, Palsson BO. Machine learning from Pseudomonas aeruginosa transcriptomes identifies independently modulated sets of genes associated with known transcriptional regulators. Nucleic Acids Res 2022; 50:3658-3672. [PMID: 35357493 PMCID: PMC9023270 DOI: 10.1093/nar/gkac187] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 10/12/2021] [Revised: 02/28/2022] [Accepted: 03/29/2022] [Indexed: 12/16/2022] Open
Abstract
The transcriptional regulatory network (TRN) of Pseudomonas aeruginosa coordinates cellular processes in response to stimuli. We used 364 transcriptomes (281 publicly available + 83 in-house generated) to reconstruct the TRN of P. aeruginosa using independent component analysis. We identified 104 independently modulated sets of genes (iModulons), among which 81 reflect the effects of known transcriptional regulators. We identified iModulons that (i) play an important role in defining the genomic boundaries of biosynthetic gene clusters (BGCs), (ii) show increased expression of the BGCs and associated secretion systems in nutrient conditions that are important in cystic fibrosis, (iii) show the presence of a novel ribosomally synthesized and post-translationally modified peptide (RiPP) BGC which might have a role in P. aeruginosa virulence, (iv) exhibit interplay of amino acid metabolism regulation and central metabolism across different carbon sources and (v) cluster according to their activity changes to define iron and sulfur stimulons. Finally, we compared the identified iModulons of P. aeruginosa with those previously described in Escherichia coli to observe conserved regulons across two Gram-negative species. This comprehensive TRN framework encompasses the majority of the transcriptional regulatory machinery in P. aeruginosa, and thus should prove foundational for future research into its physiological functions.
Affiliation(s)
- Akanksha Rajput
  - Department of Bioengineering, University of California, San Diego, La Jolla, USA
- Hannah Tsunemoto
  - Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Anand V Sastry
  - Department of Bioengineering, University of California, San Diego, La Jolla, USA
- Richard Szubin
  - Department of Bioengineering, University of California, San Diego, La Jolla, USA
- Kevin Rychel
  - Department of Bioengineering, University of California, San Diego, La Jolla, USA
- Joseph Sugie
  - Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Joe Pogliano
  - Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Bernhard O Palsson
  - Department of Bioengineering, University of California, San Diego, La Jolla, USA
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, CA 92093, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800 Kongens, Lyngby, Denmark
4
Guo L, Mao L, Lu W, Yang J. Identification of breast cancer prognostic modules via differential module selection based on weighted gene Co-expression network analysis. Biosystems 2020; 199:104317. [PMID: 33279569 DOI: 10.1016/j.biosystems.2020.104317] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/29/2020] [Revised: 11/30/2020] [Accepted: 11/30/2020] [Indexed: 02/06/2023]
Abstract
Breast cancer is a complex cancer that includes many different subtypes. Identifying prognostic modules, i.e., functionally related gene networks that play crucial roles in cancer development, is essential in breast cancer research. Different subtypes of breast cancer correspond to different treatment methods. The purpose of this study is to use a new method to identify prognostic modules in breast cancer, so as to provide a scientific basis for improving clinical management. The method is based on comparing similarities between modules detected from different weighted gene co-expression networks. It was applied to genomic data on breast cancer from The Cancer Genome Atlas database to select differential modules between two groups of patients with significant differences in survival times, and it was compared with a previously proposed module selection method. The results show that our method outperforms the previously proposed one. Moreover, of the two differential modules identified, the first is highly enriched with genes involved in hormone response, and the second is highly related to biological processes engaged in M-phase. The two modules were further validated by log-rank test in the validation dataset GSE3494. Both modules show significant differences, with p-values less than 0.02. The two identified modules confirm previous findings, including the importance of biological networks involved in hormone response and M-phase in breast cancer. Of the top twenty hub genes in the two modules, fifteen were previously shown to be prognostic markers for breast cancer.
Affiliation(s)
- Ling Guo
  - Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China
  - College of Electrical Engineering, Northwest Minzu University, Lanzhou, China
- Leer Mao
  - Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, China
- WenTing Lu
  - College of Electrical Engineering, Northwest Minzu University, Lanzhou, China
- Jun Yang
  - College of Electrical Engineering, Northwest Minzu University, Lanzhou, China
5
Abstract
Research into the evolution and pathogenesis of Vibrio cholerae has benefited greatly from the generation of high-throughput sequencing data to drive molecular analyses. The steady accumulation of these data sets now provides a unique opportunity for in silico hypothesis generation via coexpression analysis. Here, we leverage all published V. cholerae RNA sequencing data, in combination with select data from other platforms, to generate a gene coexpression network that validates known gene interactions and identifies novel genetic partners across the entire V. cholerae genome. This network provides direct insights into genes influencing pathogenicity, metabolism, and transcriptional regulation, further clarifies results from previous sequencing experiments in V. cholerae (e.g., transposon insertion sequencing [Tn-seq] and chromatin immunoprecipitation sequencing [ChIP-seq]), and expands upon microarray-based findings in related Gram-negative bacteria. IMPORTANCE Cholera is a devastating illness that kills tens of thousands of people annually. Vibrio cholerae, the causative agent of cholera, is an important model organism to investigate both bacterial pathogenesis and the impact of horizontal gene transfer on the emergence and dissemination of new virulent strains. Despite the importance of this pathogen, roughly one-third of V. cholerae genes are functionally unannotated, leaving large gaps in our understanding of this microbe. Through coexpression network analysis of existing RNA sequencing data, this work develops an approach to uncover novel gene-gene relationships and contextualize genes with no known function, which will advance our understanding of V. cholerae virulence and evolution.
6
Joshi SR, Jagtap S, Basu B, Deobagkar DD, Ghosh P. Construction, analysis and validation of co-expression network to understand stress adaptation in Deinococcus radiodurans R1. PLoS One 2020; 15:e0234721. [PMID: 32579573 PMCID: PMC7314050 DOI: 10.1371/journal.pone.0234721] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/04/2019] [Accepted: 06/02/2020] [Indexed: 01/12/2023] Open
Abstract
Systems biology-based approaches have been effectively utilized to mine high-throughput data. In the current study, we performed a system-level analysis of Deinococcus radiodurans R1 by constructing a gene co-expression network based on several microarray datasets available in the public domain. This condition-independent network was constructed by Weighted Gene Co-expression Network Analysis (WGCNA) with 61 microarray samples from 9 different experimental conditions. We identified 13 co-expressed modules, of which 11 showed functional enrichment of one or more pathways or biological processes. Comparative analysis of differentially expressed genes and proteins from radiation and desiccation stress studies with our co-expressed modules revealed the association of the cyan module with radiation response. Interestingly, two modules, viz. darkgreen and tan, were associated with radiation as well as desiccation stress responses. The functional analysis of these modules showed enrichment of pathways important for adaptation to radiation or desiccation stress. To decipher the regulatory roles of these stress-responsive modules, we identified transcription factors (TFs) and then calculated the biweight midcorrelation between each module's hub gene and the identified TFs. We obtained 7 TFs for radiation- and desiccation-responsive modules. The expression of 3 TFs was validated in response to gamma radiation using qRT-PCR. Along with the TFs, selected close neighbor genes of two important TFs, viz., DR_0997 (CRP) and DR_2287 (AsnC family transcriptional regulator), in the darkgreen module were also validated. In our network, among 13 hub genes associated with 13 modules, the functionality of 5 hub genes that are annotated as hypothetical proteins (hypothetical hub genes) in the D. radiodurans genome has been revealed. Overall, the study provided better insight into the pathways and regulators associated with the relevant DNA-damaging stress responses in D. radiodurans.
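The biweight midcorrelation named in the abstract above can be illustrated with a short sketch. This is a minimal pure-Python version of the standard definition (median-centered values, down-weighted by their distance from the median in units of 9 × MAD), not the authors' code; the function names and the toy expression vectors are hypothetical.

```python
from math import sqrt
from statistics import median


def _biweight_normalize(values):
    """Center on the median, apply biweight weights, and scale to unit length."""
    med = median(values)
    # Raw (unscaled) median absolute deviation.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; bicor is undefined for this vector")
    weighted = []
    for v in values:
        u = (v - med) / (9 * mad)
        # Points with |u| >= 1 are treated as outliers and get zero weight.
        w = (1 - u * u) ** 2 if abs(u) < 1 else 0.0
        weighted.append((v - med) * w)
    norm = sqrt(sum(a * a for a in weighted))
    return [a / norm for a in weighted]


def bicor(x, y):
    """Biweight midcorrelation between two equal-length expression vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xs = _biweight_normalize(x)
    ys = _biweight_normalize(y)
    return sum(a * b for a, b in zip(xs, ys))


# Hypothetical expression profiles of a module hub gene and a candidate TF.
hub_gene = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate_tf = [3.0, 5.0, 7.0, 9.0, 11.0]
print(bicor(hub_gene, candidate_tf))
```

Because extreme samples receive zero weight, this measure is less sensitive to outliers than a Pearson correlation, which is presumably why it suits small microarray compendia.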
Affiliation(s)
- Suraj R. Joshi
  - Bioinformatics Centre, Savitribai Phule Pune University, Pune, India
  - Molecular Biology Research Laboratory, Department of Zoology, Savitribai Phule Pune University, Pune, India
  - Molecular Biology Division, Bhabha Atomic Research Centre, Mumbai, India
- Surabhi Jagtap
  - Bioinformatics Centre, Savitribai Phule Pune University, Pune, India
- Bhakti Basu
  - Molecular Biology Division, Bhabha Atomic Research Centre, Mumbai, India
- Deepti D. Deobagkar
  - Molecular Biology Research Laboratory, Department of Zoology, Savitribai Phule Pune University, Pune, India
- Payel Ghosh
  - Bioinformatics Centre, Savitribai Phule Pune University, Pune, India
7
Galán-Vásquez E, Perez-Rueda E. Identification of Modules With Similar Gene Regulation and Metabolic Functions Based on Co-expression Data. Front Mol Biosci 2019; 6:139. [PMID: 31921888 PMCID: PMC6929668 DOI: 10.3389/fmolb.2019.00139] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/28/2019] [Accepted: 11/18/2019] [Indexed: 12/16/2022] Open
Abstract
Biological systems respond to environmental perturbations and to a large diversity of compounds through gene interactions, and these genetic factors comprise complex networks. In particular, a wide variety of gene co-expression networks have been constructed in recent years thanks to the dramatic increase of experimental information obtained with techniques, such as microarrays and RNA sequencing. These networks allow the identification of groups of co-expressed genes that can function in the same process and, in turn, these networks may be related to biological functions of industrial, medical and academic interest. In this study, gene co-expression networks for 17 bacterial organisms from the COLOMBOS database were analyzed via weighted gene co-expression network analysis and clustered into modules of genes with similar expression patterns for each species. These networks were analyzed to determine relevant modules through a hypergeometric approach based on a set of transcription factors and enzymes for each genome. The richest modules were characterized using PFAM families and KEGG metabolic maps. Additionally, we conducted a Gene Ontology analysis for enrichment of biological functions. Finally, we identified modules that shared similarity through all the studied organisms by using comparative genomics.
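The hypergeometric approach mentioned in the abstract above can be sketched as an over-representation test: given how many transcription factors (or enzymes) a genome contains, how surprising is the number found in one module? The sketch below is illustrative only and is not the authors' code; the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(genome_size, annotated_in_genome,
                                module_size, annotated_in_module):
    """Upper-tail hypergeometric p-value.

    Probability of seeing at least `annotated_in_module` annotated genes
    when `module_size` genes are drawn without replacement from a genome
    of `genome_size` genes, `annotated_in_genome` of which are annotated.
    """
    max_hits = min(annotated_in_genome, module_size)
    total = comb(genome_size, module_size)
    tail = sum(
        comb(annotated_in_genome, k)
        * comb(genome_size - annotated_in_genome, module_size - k)
        for k in range(annotated_in_module, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: a 4000-gene genome with 200 transcription factors,
# and a 50-gene module that contains 10 of them.
p_value = hypergeom_enrichment_pvalue(4000, 200, 50, 10)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice one such test is run per module and per annotation set, so the resulting p-values would normally be corrected for multiple testing before modules are ranked.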
Affiliation(s)
- Edgardo Galán-Vásquez
  - Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Ciudad Universitaria, Universidad Nacional Autónoma de México, Ciudad de México, Mexico
- Ernesto Perez-Rueda
  - Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Unidad Académica Yucatán, Mérida, Mexico
  - Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile