1
Agrawal S, Buyan A, Severin J, Koido M, Alam T, Abugessaisa I, Chang HY, Dostie J, Itoh M, Kere J, Kondo N, Li Y, Makeev VJ, Mendez M, Okazaki Y, Ramilowski JA, Sigorskikh AI, Strug LJ, Yagi K, Yasuzawa K, Yip CW, Hon CC, Hoffman MM, Terao C, Kulakovskiy IV, Kasukawa T, Shin JW, Carninci P, de Hoon MJL. Annotation of nuclear lncRNAs based on chromatin interactions. PLoS One 2024; 19:e0295971. [PMID: 38709794 PMCID: PMC11073715 DOI: 10.1371/journal.pone.0295971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 12/02/2023] [Indexed: 05/08/2024] Open
Abstract
The human genome is pervasively transcribed and produces a wide variety of long non-coding RNAs (lncRNAs), constituting the majority of transcripts across human cell types. Some specific nuclear lncRNAs have been shown to be important regulatory components acting locally. As RNA-chromatin interaction and Hi-C chromatin conformation data showed that chromatin interactions of nuclear lncRNAs are determined by the local chromatin 3D conformation, we used Hi-C data to identify potential target genes of lncRNAs. RNA-protein interaction data suggested that nuclear lncRNAs act as scaffolds to recruit regulatory proteins to target promoters and enhancers. Nuclear lncRNAs may therefore play a role in directing regulatory factors to locations spatially close to the lncRNA gene. We provide the analysis results through an interactive visualization web portal at https://fantom.gsc.riken.jp/zenbu/reports/#F6_3D_lncRNA.
Affiliation(s)
- Saumya Agrawal
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Andrey Buyan
  - Autosome.org, Russia
  - FANTOM Consortium, Dolgoprudny, Russia
- Jessica Severin
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Masaru Koido
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Howard Y. Chang
  - Center for Personal Dynamic Regulome, Stanford University, Stanford, California, United States of America
- Josée Dostie
  - Department of Biochemistry, Rosalind and Morris Goodman Cancer Research Center, McGill University, Montréal, Québec, Canada
- Masayoshi Itoh
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Japan
- Juha Kere
  - Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
  - Stem Cells and Metabolism Research Program, University of Helsinki and Folkhälsan Research Center, Helsinki, Finland
- Naoto Kondo
  - RIKEN Center for Life Science Technologies, Yokohama, Japan
- Yunjing Li
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Mickaël Mendez
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Yasushi Okazaki
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Jordan A. Ramilowski
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Advanced Medical Research Center, Yokohama City University, Yokohama, Japan
- Lisa J. Strug
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Department of Statistical Sciences, University of Toronto, Ontario, Canada
  - The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Ken Yagi
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kayoko Yasuzawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Chi Wai Yip
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Chung Chau Hon
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Michael M. Hoffman
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Princess Margaret Cancer Centre, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
  - Vector Institute, Toronto, Ontario, Canada
- Chikashi Terao
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Takeya Kasukawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Jay W. Shin
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Piero Carninci
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Human Technopole, Milan, Italy
2
Bai W, Li C, Li W, Wang H, Han X, Wang P, Wang L. Machine learning assists prediction of genes responsible for plant specialized metabolite biosynthesis by integrating multi-omics data. BMC Genomics 2024; 25:418. [PMID: 38679745 PMCID: PMC11057162 DOI: 10.1186/s12864-024-10258-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 03/26/2024] [Indexed: 05/01/2024] Open
Abstract
BACKGROUND Plant specialized (or secondary) metabolites (PSMs), also known as phytochemicals, natural products, or plant constituents, play essential roles in interactions between plants and their environment. Although many research efforts have focused on discovering novel metabolites and their biosynthetic genes, the resolution of metabolic pathways and identified biosynthetic genes has been limited by rudimentary analysis approaches and the enormous number of candidate genes. RESULTS Here we integrated the state-of-the-art automated machine learning (ML) framework AutoGluon-Tabular with multi-omics data from Arabidopsis to predict genes encoding enzymes involved in the biosynthesis of PSMs, focusing on the three main PSM categories: terpenoids, alkaloids, and phenolics. We found that genomics- and proteomics-related features were the two feature categories contributing most to model performance. Using only these key features, we built a new model in Arabidopsis, which performed better than models built with more features, including those related to transcriptomics and epigenomics. Finally, the built models were validated in maize and tomato, and models tested for maize and trained with data from two other species exhibited either equivalent or superior performance to intraspecies predictions. CONCLUSIONS Our external validation results in grape and poppy on the one hand implied the applicability of our model to other species, and on the other hand showed enormous potential to improve the prediction of enzymes synthesizing PSMs with the inclusion of valid data from a wider range of species.
Affiliation(s)
- Wenhui Bai
  - College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
- Cheng Li
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
- Wei Li
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
- Hai Wang
  - National Maize Improvement Center, Key Laboratory of Crop Heterosis and Utilization, Joint Laboratory for International Cooperation in Crop Molecular Breeding, China Agricultural University, Beijing, 100193, China
- Xiaohong Han
  - College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China
- Peipei Wang
  - Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518124, China
- Li Wang
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, China, 518000, Shenzhen
3
Orduña L, Santiago A, Navarro-Payá D, Zhang C, Wong DCJ, Matus JT. Aggregated gene co-expression networks predict transcription factor regulatory landscapes in grapevine. Journal of Experimental Botany 2023; 74:6522-6540. [PMID: 37668374 DOI: 10.1093/jxb/erad344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 08/30/2023] [Indexed: 09/06/2023]
Abstract
Gene co-expression networks (GCNs) have not been extensively studied in non-model plants. However, the rapid accumulation of transcriptome datasets in certain species represents an opportunity to explore underutilized network aggregation approaches. In fact, aggregated GCNs (aggGCNs) highlight robust co-expression interactions and improve functional connectivity. We applied and evaluated two different aggregation methods on public grapevine RNA-Seq datasets from three different tissues (leaf, berry, and 'all organs'). Our results show that co-occurrence-based aggregation generally yielded the best-performing networks. We applied aggGCNs to study several transcription factor gene families, showing their capacity for detecting both already-described and novel regulatory relationships between R2R3-MYBs, bHLH/MYC, and multiple specialized metabolic pathways. Specifically, transcription factor gene- and pathway-centered network analyses successfully ascertained the previously established role of VviMYBPA1 in controlling the accumulation of proanthocyanidins while providing insights into its novel role as a regulator of p-coumaroyl-CoA biosynthesis as well as the shikimate and aromatic amino acid pathways. This network was validated using DNA affinity purification sequencing data, demonstrating that co-expression networks of transcriptional activators can serve as a proxy of gene regulatory networks. This study presents an open repository to reproduce networks in other crops and a GCN application within the Vitviz platform, a user-friendly tool for exploring co-expression relationships.
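As an illustration of what co-occurrence-based aggregation could look like, the sketch below builds a top-k co-expression edge list for each dataset and keeps the edges that recur in several datasets. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the use of Pearson correlation, and the parameters `k` and `min_datasets` are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def top_k_edges(expression, k):
    """Link every gene to its k most correlated partners within one dataset.

    expression: dict mapping gene name -> list of expression values.
    Returns a set of undirected edges, each stored as a frozenset of two genes.
    """
    genes = sorted(expression)
    edges = set()
    for gene in genes:
        scored = sorted(
            ((pearson(expression[gene], expression[other]), other)
             for other in genes if other != gene),
            reverse=True,
        )
        for _, partner in scored[:k]:
            edges.add(frozenset((gene, partner)))
    return edges


def aggregate_by_cooccurrence(datasets, k=1, min_datasets=2):
    """Count in how many datasets each edge appears; keep the recurrent ones."""
    counts = Counter()
    for expression in datasets:
        counts.update(top_k_edges(expression, k))
    return {edge: n for edge, n in counts.items() if n >= min_datasets}


# Toy data (hypothetical genes A, B, C measured in two small datasets).
dataset_1 = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, 3, 2, 4]}
dataset_2 = {"A": [5, 1, 3, 2], "B": [10, 3, 6, 4], "C": [1, 1, 2, 9]}
aggregated = aggregate_by_cooccurrence([dataset_1, dataset_2])
print(aggregated)
```

Edges supported by a single dataset are dropped, which is one simple way to favour robust co-expression relationships; the published method may weight or rank edges differently.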
Affiliation(s)
- Luis Orduña
  - Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, 46908, Valencia, Spain
- Antonio Santiago
  - Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, 46908, Valencia, Spain
- David Navarro-Payá
  - Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, 46908, Valencia, Spain
- Chen Zhang
  - Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, 46908, Valencia, Spain
- Darren C J Wong
  - Ecology and Evolution, Research School of Biology, The Australian National University, Acton, Australia
- José Tomás Matus
  - Institute for Integrative Systems Biology (I2SysBio), Universitat de València-CSIC, Paterna, 46908, Valencia, Spain
4
Trasierras AM, Luna JM, Ventura S. A contrast set mining based approach for cancer subtype analysis. Artif Intell Med 2023; 143:102590. [PMID: 37673572 DOI: 10.1016/j.artmed.2023.102590] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/10/2022] [Revised: 05/24/2023] [Accepted: 05/30/2023] [Indexed: 09/08/2023]
Abstract
The task of detecting common and unique characteristics among different cancer subtypes is an important focus of research that aims to improve personalized therapies. Unlike current approaches mainly based on predictive techniques, our study aims to improve the knowledge about the molecular mechanisms that descriptively led to cancer, thus not requiring previous knowledge to be validated. Here, we propose an approach based on contrast set mining to capture high-order relationships in cancer transcriptomic data. In this way, we were able to extract valuable insights from several cancer subtypes in the form of highly specific genetic relationships related to functional pathways affected by the disease. To this end, we divided several cancer gene expression databases by the subtype associated with each sample to detect which gene groups are related to each cancer subtype. To demonstrate the potential and usefulness of the proposed approach, we extensively analysed RNA-Seq gene expression data from breast, kidney, and colon cancer subtypes. The possible role of the obtained genetic relationships was further evaluated through extensive literature research, while their prognostic value was assessed via survival analysis, finding gene expression patterns related to survival in various cancer subtypes. Some gene associations were described in the literature as potential cancer biomarkers, while other results have not yet been described and could be a starting point for future research.
Affiliation(s)
- A M Trasierras
  - Department of Computer Science and Numerical Analysis, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), Spain; Maimonides Biomedical Research Institute of Cordoba, IMIBIC, University of Cordoba, Córdoba, 14071, Spain; Phytoplant Research S.L.U, Departamento Tecnología y Control, Rabanales 21-Parque Científico Tecnológico de Córdoba, Calle Astrónoma Cecilia Payne, Córdoba, Spain
- J M Luna
  - Department of Computer Science and Numerical Analysis, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), Spain; Maimonides Biomedical Research Institute of Cordoba, IMIBIC, University of Cordoba, Córdoba, 14071, Spain
- S Ventura
  - Department of Computer Science and Numerical Analysis, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), Spain; Maimonides Biomedical Research Institute of Cordoba, IMIBIC, University of Cordoba, Córdoba, 14071, Spain
5
Chandra O, Sharma M, Pandey N, Jha IP, Mishra S, Kong SL, Kumar V. Patterns of transcription factor binding and epigenome at promoters allow interpretable predictability of multiple functions of non-coding and coding genes. Comput Struct Biotechnol J 2023; 21:3590-3603. [PMID: 37520281 PMCID: PMC10371796 DOI: 10.1016/j.csbj.2023.07.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 07/05/2023] [Accepted: 07/11/2023] [Indexed: 08/01/2023] Open
Abstract
Understanding the biological roles of all genes only through experimental methods is challenging. A computational approach with reliable interpretability is needed to infer the function of genes, particularly for non-coding RNAs. We have analyzed genomic features that are present across both coding and non-coding genes, such as transcription factor (TF) and cofactor ChIP-seq (n = 823), histone modification ChIP-seq (n = 621), cap analysis gene expression (CAGE) tags (n = 255), and DNase hypersensitivity profiles (n = 255), to predict ontology-based functions of genes. Our approach for gene function prediction was reliable (>90% balanced accuracy) for 486 gene-sets. PubMed abstract mining and CRISPR screens supported the inferred association of genes with biological functions, for which our method had high accuracy. Further analysis revealed that TF-binding patterns at promoters have high predictive strength for multiple functions. TF-binding patterns at the promoter add an unexplored dimension of explainable regulatory aspects of genes and their functions. Therefore, we performed a comprehensive analysis of the functional specificity of TF-binding patterns at promoters and used them for clustering functions to reveal many latent groups of gene-sets involved in common major cellular processes. We also showed how our approach could be used to infer the functions of non-coding genes using the CRISPR screens of coding genes, which were validated using a long non-coding RNA CRISPR screen. Thus our results demonstrated the generality of our approach by using gene-sets from CRISPR screens. Overall, our approach opens an avenue for predicting the involvement of non-coding genes in various functions.
Affiliation(s)
- Omkar Chandra
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Ph-III, New Delhi, India
- Madhu Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Ph-III, New Delhi, India
- Neetesh Pandey
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Ph-III, New Delhi, India
- Indra Prakash Jha
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Ph-III, New Delhi, India
- Shreya Mishra
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Ph-III, New Delhi, India
- Say Li Kong
  - Genome Institute of Singapore, Agency for Science Technology and Research, Singapore, Singapore
- Vibhor Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Ph-III, New Delhi, India
6
Sarmah DT, Gujjar S, Mathapati S, Bairagi N, Chatterjee S. Identification of critical autophagy-related proteins in diabetic retinopathy: A multi-dimensional computational study. Gene 2023; 866:147339. [PMID: 36882123 DOI: 10.1016/j.gene.2023.147339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2022] [Revised: 02/24/2023] [Accepted: 03/01/2023] [Indexed: 03/07/2023]
Abstract
Diabetic retinopathy (DR) is a common consequence of diabetes mellitus and a primary cause of visual impairment in middle-aged and elderly individuals. DR is susceptible to cellular degradation facilitated by autophagy. In this study, we employed a multi-layer relatedness (MLR) approach to uncover novel autophagy-related proteins involved in DR. The objective of MLR is to determine the relatedness of autophagic and DR proteins by incorporating both expression and prior-knowledge-based similarities. We constructed a prior-knowledge-based network and identified the topologically significant novel disease-related candidate autophagic proteins (CAPs). Then, we evaluated their significance in a gene co-expression network and a differentially expressed gene (DEG) network. Finally, we investigated the proximity of CAPs to the known disease-related proteins. Leveraging this methodology, we identified three crucial autophagy-related proteins, TP53, HSP90AA1, and PIK3R1, which can influence the DR interactome in various layers of heterogeneity of clinical manifestations. They are strongly related to multiple detrimental characteristics of DR, such as pericyte loss, angiogenesis, apoptosis, and endothelial cell migration, and hence may be used to prevent or delay the progression and development of DR. We evaluated one of the identified targets, TP53, in a cell-based model and found that its inhibition resulted in reduced angiogenesis under high-glucose conditions, as required to control DR.
Affiliation(s)
- Dipanka Tanu Sarmah
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Sunil Gujjar
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Santosh Mathapati
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Nandadulal Bairagi
  - Centre for Mathematical Biology and Ecology, Department of Mathematics, Jadavpur University, Kolkata 700032, India
- Samrat Chatterjee
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
7
Woodward E, Schlingmann K, Tobias J, Turner R. Characterisation of the testicular transcriptome in stallions with age-related testicular degeneration. Equine Vet J 2023; 55:239-252. [PMID: 35569039 DOI: 10.1111/evj.13588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2021] [Accepted: 04/20/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND Age-related testicular degeneration can be defined as the progressive deterioration of the testis that typically occurs in middle-aged or older males and that leads to diminished testicular function and subfertility. In the equine breeding industry, genetically valuable males maintain their value as breeding animals well into old age. Because testicular degeneration is common in middle-aged and older stallions, the disease often has a significant negative impact on a stallion's breeding career and leads to economic losses in the horse breeding industry. OBJECTIVE Because testicular degeneration is a tissue autologous disease in the horse, the objective of this study was to use whole-transcriptome sequencing to compare the testicular transcriptomes of normal, fertile stallions to those of stallions affected by age-related testicular degeneration in order to better understand the pathophysiology of the disease. STUDY DESIGN Cross sectional. METHODS Testicular tissue samples from clinical castrations or euthanasia were collected from normal healthy (n = 3) or older subfertile (n = 4) stallions. Samples were processed and sequenced on an Illumina HiSeq™ 2000 Sequencing System. Bioinformatic analysis of the data was performed in R/RStudio, and the transcriptomes were compared between the two groups. Genes were considered to be differentially expressed between healthy and diseased tissue if they demonstrated at least a ±1.5× fold change difference and had a false discovery rate-adjusted P value <0.05. Gene ontology analysis was performed using Ingenuity® IPA. RESULTS Analyses of differential expression of individual genes, as well as computer-based gene ontology analysis, identified upregulation of cytokine-mediated inflammatory pathways in testes from stallions affected with testicular degeneration. 
This upregulation of inflammation was associated with upregulation of cell survival pathways, inhibition of apoptotic pathways and increases in collagen formation. MAIN LIMITATIONS There are unavoidable confounding factors (e.g. differences in breed, management, environment, age) that could create non disease-related genetic variation between our normal and affected samples. In addition, there are practical limitations to applying computer-based gene ontology analysis to equine samples. Gene ontology software relies on published information (mostly non-equine), and some biological processes (e.g. apoptosis and inflammation) are more commonly studied than others and so are over-represented in the literature and therefore more likely to be identified by computer algorithms. Caution should be taken when interpreting the data, as alterations in gene expression can be the cause of disease processes or can be the result of disease processes. CONCLUSIONS These results suggest that chronic, low-grade inflammation may be involved in the pathophysiology of age-related testicular degeneration in stallions.
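The differential-expression criterion stated in the METHODS (at least a ±1.5× fold change and a false discovery rate-adjusted P value <0.05) can be sketched as follows. This is only an illustration under assumptions: the study ran its analysis in R/RStudio, whereas this sketch uses Python, assumes the Benjamini-Hochberg procedure for the FDR adjustment (the abstract does not name the method), and uses hypothetical gene names and made-up numbers.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_expressed(genes, fold_cutoff=1.5, fdr_cutoff=0.05):
    """Select genes passing both the fold-change and the FDR threshold.

    genes: list of (name, mean_normal, mean_affected, raw_p) tuples,
    with strictly positive mean expression values (an assumption here).
    """
    adjusted = benjamini_hochberg([g[3] for g in genes])
    hits = []
    for (name, normal, affected, _), q in zip(genes, adjusted):
        ratio = affected / normal
        # "+/-1.5x" is read as: up at least 1.5-fold or down at least 1.5-fold.
        changed = ratio >= fold_cutoff or ratio <= 1.0 / fold_cutoff
        if changed and q < fdr_cutoff:
            hits.append(name)
    return hits


# Hypothetical example values, not data from the study.
example = [
    ("gene_a", 10.0, 30.0, 0.001),
    ("gene_b", 20.0, 35.0, 0.004),
    ("gene_c", 100.0, 105.0, 0.6),
    ("gene_d", 50.0, 20.0, 0.02),
    ("gene_e", 80.0, 82.0, 0.9),
]
print(differentially_expressed(example))
```

Requiring both thresholds at once filters out genes with large but statistically unsupported changes as well as genes with significant but very small changes.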
Affiliation(s)
- Elizabeth Woodward
  - Department of Biomedical Sciences, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Karen Schlingmann
  - Department of Clinical Studies, New Bolton Center, School of Veterinary Medicine, University of Pennsylvania, Kennett Square, Pennsylvania, USA
- John Tobias
  - Penn Genome Analysis Core, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Regina Turner
  - Department of Clinical Studies, New Bolton Center, School of Veterinary Medicine, University of Pennsylvania, Kennett Square, Pennsylvania, USA
8
Zhang Y, Huynh-Dam KT, Ding X, Sikirzhytski V, Lim CU, Broude E, Kiaris H. RASSF1 is identified by transcriptome coordination analysis as a target of ATF4. FEBS Open Bio 2023; 13:556-569. [PMID: 36723232 PMCID: PMC9989924 DOI: 10.1002/2211-5463.13569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Revised: 01/14/2023] [Accepted: 01/31/2023] [Indexed: 02/02/2023] Open
Abstract
Evaluation of gene co-regulation is a powerful approach for revealing regulatory associations between genes and predicting biological function, especially in genetically diverse samples. Here, we applied this strategy to identify transcripts that are co-regulated with unfolded protein response (UPR) genes in cultured fibroblasts from outbred deer mice. Our analyses showed that the transcriptome associated with RASSF1, a tumor suppressor involved in cell cycle regulation and not previously linked to UPR, is highly correlated with the transcriptome of several UPR-related genes, such as BiP/GRP78, DNAJB9, GRP94, ATF4, DNAJC3, and CHOP/DDIT3. Conversely, gene ontology analyses for genes co-regulated with RASSF1 predicted a previously unreported involvement in UPR-associated apoptosis. Bioinformatic analyses indicated the presence of ATF4-binding sites in the RASSF1 promoter, which were shown to be operational using chromatin immunoprecipitation. Reporter assays revealed that the RASSF1 promoter is responsive to ATF4, while ablation of RASSF1 mitigated the expression of the ATF4 effector BBC3 and abrogated tunicamycin-induced apoptosis. Collectively, these results implicate RASSF1 in the regulation of endoplasmic reticulum stress-associated apoptosis downstream of ATF4. They also illustrate the power of gene coordination analysis in predicting biological functions and revealing regulatory associations between genes.
Affiliation(s)
- Youwen Zhang
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
- Kim-Tuyen Huynh-Dam
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
- Xiaokai Ding
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
- Vitali Sikirzhytski
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
- Chang-Uk Lim
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
- Eugenia Broude
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
- Hippokratis Kiaris
  - Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
  - Peromyscus Genetic Stock Center, University of South Carolina, Columbia, SC, USA
9
Xu Y, Chen J, Lyu A, Cheung WK, Zhang L. dynDeepDRIM: a dynamic deep learning model to infer direct regulatory interactions using time-course single-cell gene expression data. Brief Bioinform 2022; 23:6720420. [PMID: 36168811 DOI: 10.1093/bib/bbac424] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/23/2022] [Revised: 08/02/2022] [Accepted: 09/01/2022] [Indexed: 12/14/2022] Open
Abstract
Time-course single-cell RNA sequencing (scRNA-seq) data have been widely used to explore dynamic changes in gene expression of transcription factors (TFs) and their target genes. This information is useful to reconstruct cell-type-specific gene regulatory networks (GRNs). However, the existing tools are commonly designed to analyze either time-course bulk gene expression data or static scRNA-seq data via pseudo-time cell ordering. A few methods successfully utilize the information from multiple time points while also considering the characteristics of scRNA-seq data. We propose dynDeepDRIM, a novel deep learning model to reconstruct GRNs using time-course scRNA-seq data. It represents the joint expression of a gene pair as an image and utilizes the image of the target TF-gene pair and those of the potential neighbors to reconstruct GRNs from time-course scRNA-seq data. dynDeepDRIM can effectively remove transitive TF-gene interactions by considering neighborhood context and can model gene expression dynamics using high-dimensional tensors. We compared dynDeepDRIM with six GRN reconstruction methods on both simulated data and four real time-course scRNA-seq datasets. dynDeepDRIM achieved substantially better performance than the other methods in inferring TF-gene interactions and eliminated false positives effectively. We also applied dynDeepDRIM to annotate gene functions and found that it achieved evidently better performance than the other tools because it considers the neighbor genes.
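The idea of representing the joint expression of a gene pair as an image can be illustrated with a two-dimensional histogram: each cell contributes one count to the pixel given by its binned TF expression and binned target expression, and one such image per time point can be stacked into a tensor. The sketch below is an assumption-laden illustration, not the dynDeepDRIM implementation: the function names, the equal-width binning over each time point's own range, and the bin count are all hypothetical.

```python
def joint_expression_image(expr_tf, expr_target, bins=8):
    """Bin paired expression values of two genes into a bins x bins count grid.

    expr_tf and expr_target hold one value per cell, in the same cell order.
    Row index = TF expression bin, column index = target expression bin.
    """
    if len(expr_tf) != len(expr_target):
        raise ValueError("both genes must be measured in the same cells")
    if not expr_tf:
        raise ValueError("at least one cell is required")

    def to_bin(value, lo, hi):
        if hi == lo:  # constant expression: everything falls in bin 0
            return 0
        idx = int((value - lo) / (hi - lo) * bins)
        return min(idx, bins - 1)  # the maximum value belongs to the last bin

    lo_x, hi_x = min(expr_tf), max(expr_tf)
    lo_y, hi_y = min(expr_target), max(expr_target)
    image = [[0] * bins for _ in range(bins)]
    for x, y in zip(expr_tf, expr_target):
        image[to_bin(x, lo_x, hi_x)][to_bin(y, lo_y, hi_y)] += 1
    return image


def time_course_tensor(pairs_by_timepoint, bins=8):
    """Stack one image per time point into a T x bins x bins nested list."""
    return [joint_expression_image(tf, target, bins)
            for tf, target in pairs_by_timepoint]


# Hypothetical expression values for one TF-gene pair at two time points.
timepoints = [
    ([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]),
    ([0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0]),
]
tensor = time_course_tensor(timepoints, bins=2)
print(tensor)
```

A model would then take such tensors for the target pair and for its neighbor pairs as input; how the images are normalized and combined in the published method is not specified in the abstract.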
Affiliation(s)
- Yu Xu
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
- Jiaxing Chen
  - Computer Science and Technology, Division of Science and Technology, BNU-HKBU United International College, Jintong Road, 519087, Zhuhai, China
- Aiping Lyu
  - School of Chinese Medicine, Hong Kong Baptist University, Kowloon Tong, Hong Kong
- William K Cheung
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
- Lu Zhang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
10
Singh KS, van der Hooft JJJ, van Wees SCM, Medema MH. Integrative omics approaches for biosynthetic pathway discovery in plants. Nat Prod Rep 2022; 39:1876-1896. [PMID: 35997060 PMCID: PMC9491492 DOI: 10.1039/d2np00032f] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/26/2022] [Indexed: 12/13/2022]
Abstract
Covering: up to 2022
With the emergence of large amounts of omics data, computational approaches for the identification of plant natural product biosynthetic pathways and their genetic regulation have become increasingly important. While genomes provide clues regarding functional associations between genes based on gene clustering, metabolome mining provides a foundational technology to chart natural product structural diversity in plants, and transcriptomics has been successfully used to identify new members of their biosynthetic pathways based on coexpression. Thus far, most approaches utilizing transcriptomics and metabolomics have been targeted towards specific pathways and use one type of omics data at a time. Recent technological advances now provide new opportunities for integration of multiple omics types and untargeted pathway discovery. Here, we review advances in plant biosynthetic pathway discovery using genomics, transcriptomics, and metabolomics, as well as recent efforts towards omics integration. We highlight how transcriptomics and metabolomics provide complementary information to link genes to metabolites, by associating temporal and spatial gene expression levels with metabolite abundance levels across samples, and by matching mass-spectral features to enzyme families. Furthermore, we suggest that elucidation of gene regulatory networks using time-series data may prove useful for efforts to unwire the complexities of biosynthetic pathway components based on regulatory interactions and events.
Affiliation(s)
- Kumar Saurabh Singh
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
- Plant-Microbe Interactions, Institute of Environmental Biology, Utrecht University, The Netherlands.
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
- Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
| | - Saskia C M van Wees
- Plant-Microbe Interactions, Institute of Environmental Biology, Utrecht University, The Netherlands.
| | - Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
| |
|
11
|
Thanamit K, Hoerhold F, Oswald M, Koenig R. Linear programming based gene expression model (LPM-GEM) predicts the carbon source for Bacillus subtilis. BMC Bioinformatics 2022; 23:226. [PMID: 35689204 PMCID: PMC9188260 DOI: 10.1186/s12859-022-04742-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2021] [Accepted: 05/23/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Elucidating cellular metabolism led to many breakthroughs in biotechnology, synthetic biology, and health sciences. To date, deriving metabolic fluxes by 13C tracer experiments is the most prominent approach for studying metabolic fluxes quantitatively, often with high accuracy and precision. However, the technique has a high demand for experimental resources. Alternatively, flux balance analysis (FBA) has been employed to estimate metabolic fluxes without labeling experiments. It is less informative but benefits from low costs and low experimental effort, and it can provide flux estimates in experimentally difficult conditions. Methods to integrate relevant experimental data have emerged to improve FBA flux estimations. Data from transcription profiling is often selected since it is easy to generate at the genome scale, typically embedded by a discretization of differential and non-differential expressed genes coding for the respective enzymes. RESULT We established the novel method Linear Programming based Gene Expression Model (LPM-GEM). LPM-GEM linearly embeds gene expression into FBA constraints. We implemented three strategies to reduce thermodynamically infeasible loops, which is a necessary prerequisite for such an omics-based model building. As a case study, we built a model of B. subtilis grown in eight different carbon sources. We obtained good flux predictions based on the respective transcription profiles when validating with 13C tracer based metabolic flux data of the same conditions. We could well predict the specific carbon sources. When testing the model on another, unseen dataset that was not used during training, good prediction performance was also observed. Furthermore, LPM-GEM outperformed a well-established model building method. CONCLUSION Employing LPM-GEM integrates gene expression data efficiently. The method supports gene expression-based FBA models and can be applied as an alternative to estimate metabolic fluxes when tracer experiments are inappropriate.
Affiliation(s)
- Kulwadee Thanamit
- Systems Biology Research Group, Institute for Infectious Diseases and Infection Control (IIMK), Jena University Hospital, Kollegiengasse 10, 07743, Jena, Germany
| | - Franziska Hoerhold
- Systems Biology Research Group, Institute for Infectious Diseases and Infection Control (IIMK), Jena University Hospital, Kollegiengasse 10, 07743, Jena, Germany
| | - Marcus Oswald
- Systems Biology Research Group, Institute for Infectious Diseases and Infection Control (IIMK), Jena University Hospital, Kollegiengasse 10, 07743, Jena, Germany
| | - Rainer Koenig
- Systems Biology Research Group, Institute for Infectious Diseases and Infection Control (IIMK), Jena University Hospital, Kollegiengasse 10, 07743, Jena, Germany.
| |
|
12
|
Wang P, Schumacher AM, Shiu SH. Computational prediction of plant metabolic pathways. Curr Opin Plant Biol 2022; 66:102171. [PMID: 35078130 DOI: 10.1016/j.pbi.2021.102171] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 12/07/2021] [Accepted: 12/18/2021] [Indexed: 06/14/2023]
Abstract
Uncovering genes encoding enzymes responsible for the biosynthesis of diverse plant metabolites is essential for metabolic engineering and production of plant metabolite-derived medicine. With the availability of multi-omics data for an ever-increasing number of plant species and the development of computational approaches, the metabolic pathways of many important plant compounds can be predicted, complementing a more traditional genetic and/or biochemical approach. Here, we summarize recent progress in predicting plant metabolic pathways using genome, transcriptome, proteome, interactome, and/or metabolome data, and the utility of integrating these data with machine learning to further improve metabolic pathway predictions.
Affiliation(s)
- Peipei Wang
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA.
| | - Ally M Schumacher
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
| |
|
13
|
Vuong P, Wise MJ, Whiteley AS, Kaur P. Small investments with big returns: environmental genomic bioprospecting of microbial life. Crit Rev Microbiol 2022; 48:641-655. [PMID: 35100064 DOI: 10.1080/1040841x.2021.2011833] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
Microorganisms and their natural products are major drivers of ecological processes and industrial applications. Microbial bioprospecting has been critical for the advancement in various fields such as pharmaceuticals, sustainable industries, food security and bioremediation. Next generation sequencing has been paramount in the exploration of diverse environmental microbiomes. It presents a culture-independent approach to investigating hitherto uncultured taxa, resulting in the creation of massive sequence databases, which are available in the public domain. Genome mining searches available (meta)genomic data for target biosynthetic genes, and combined with the large-scale public data, this in-silico bioprospecting method presents an efficient and extensive way to uncover microbial bioproducts. Bioinformatic tools have progressed to a stage where we can recover genomes from the environment; these metagenome-assembled genomes present a way to understand the metabolic capacity of microorganisms in a physiological and ecological context. Environmental sampling has been extensive across various ecological settings, including microbiomes with unique physicochemical properties that could influence the discovery of novel functions and metabolic pathways. Although in-silico methods cannot completely substitute for in-vitro studies, the contextual information they provide is invaluable for understanding the ecological and taxonomic distribution of microbial genotypes and for forming effective strategies for future microbial bioprospecting efforts.
Affiliation(s)
- Paton Vuong
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
| | - Michael J Wise
- School of Physics, Mathematics and Computing, University of Western Australia, Perth, Australia
| | - Andrew S Whiteley
- Centre for Environment & Life Sciences, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Floreat, Australia
| | - Parwinder Kaur
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
| |
|
14
|
Wang P, Moore BM, Uygun S, Lehti-Shiu MD, Barry CS, Shiu SH. Optimising the use of gene expression data to predict plant metabolic pathway memberships. New Phytol 2021; 231:475-489. [PMID: 33749860 DOI: 10.1111/nph.17355] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/01/2021] [Accepted: 03/13/2021] [Indexed: 06/12/2023]
Abstract
Plant metabolites from diverse pathways are important for plant survival, human nutrition and medicine. The pathway memberships of most plant enzyme genes are unknown. While co-expression is useful for assigning genes to pathways, expression correlation may exist only under specific spatiotemporal and conditional contexts. Utilising > 600 tomato (Solanum lycopersicum) expression data combinations, three strategies for predicting memberships in 85 pathways were explored. Optimal predictions for different pathways require distinct data combinations indicative of pathway functions. Naive prediction (i.e. identifying pathways with the most similarly expressed genes) is error prone. In 52 pathways, unsupervised learning performed better than supervised approaches, possibly due to limited training data availability. Using gene-to-pathway expression similarities led to prediction models that outperformed those based simply on expression levels. Using 36 experimentally validated genes, the pathway-best model prediction accuracy is 58.3%, significantly better than that for predicting annotated genes without experimental evidence (37.0%) or random guess (1.2%), demonstrating the importance of data quality. Our study highlights the need to extensively explore expression-based features and prediction strategies to maximise the accuracy of metabolic pathway membership assignment. The prediction framework outlined here can be applied to other species and serves as a baseline model for future comparisons.
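The "naive prediction" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that expression similarity is measured by Pearson correlation and that a gene is assigned to the pathway whose members are, on average, most correlated with it; the function names, the averaging rule and the toy profiles are hypothetical and are not taken from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def naive_pathway(query, pathways):
    """Pick the pathway whose member genes are on average most correlated with `query`."""
    scores = {
        name: sum(pearson(query, member) for member in members) / len(members)
        for name, members in pathways.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy profiles: each pathway holds the expression of two member genes
# measured across the same four samples.
pathways = {
    "pathway_A": [[1.0, 2.0, 3.0, 4.0], [2.0, 4.1, 5.9, 8.0]],
    "pathway_B": [[4.0, 3.0, 2.0, 1.0], [8.0, 6.2, 3.9, 2.0]],
}
query_gene = [0.5, 1.1, 1.4, 2.1]

best, scores = naive_pathway(query_gene, pathways)
print(best, scores)
```

Because the rule relies on a single similarity score per pathway, it ignores the context-specific expression correlation that the abstract identifies as a source of error.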
Affiliation(s)
- Peipei Wang
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Bethany M Moore
- Department of Botany, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | | | - Melissa D Lehti-Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Cornelius S Barry
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
| | - Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, 48824, USA
| |
|
15
|
Colinas M, Pollier J, Vaneechoutte D, Malat DG, Schweizer F, De Milde L, De Clercq R, Guedes JG, Martínez-Cortés T, Molina-Hidalgo FJ, Sottomayor M, Vandepoele K, Goossens A. Subfunctionalization of Paralog Transcription Factors Contributes to Regulation of Alkaloid Pathway Branch Choice in Catharanthus roseus. Front Plant Sci 2021; 12:687406. [PMID: 34113373 PMCID: PMC8186833 DOI: 10.3389/fpls.2021.687406] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/29/2021] [Accepted: 04/27/2021] [Indexed: 06/12/2023]
Abstract
Catharanthus roseus produces a diverse range of specialized metabolites of the monoterpenoid indole alkaloid (MIA) class in a heavily branched pathway. Recent progress in the identification of MIA biosynthesis genes has revealed that the different pathway branch genes are expressed in a highly cell type- and organ-specific and stress-dependent manner. This implies a complex control by specific transcription factors (TFs), only partly revealed today. We generated and mined a comprehensive compendium of publicly available C. roseus transcriptome data for MIA pathway branch-specific TFs. Functional analysis was performed through extensive comparative gene expression analysis and profiling of over 40 MIA metabolites in the C. roseus flower petal expression system. We identified additional members of the known BIS and ORCA regulators. Further detailed study of the ORCA TFs suggests subfunctionalization of ORCA paralogs in terms of target gene-specific regulation and synergistic activity with the central jasmonate response regulator MYC2. Moreover, we identified specific amino acid residues within the ORCA DNA-binding domains that contribute to the differential regulation of some MIA pathway branches. Our results advance our understanding of TF paralog specificity for which, despite the common occurrence of closely related paralogs in many species, comparative studies are scarce.
Affiliation(s)
- Maite Colinas
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Jacob Pollier
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Metabolomics Core, Ghent, Belgium
| | - Dries Vaneechoutte
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Deniz G. Malat
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Fabian Schweizer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Liesbeth De Milde
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Rebecca De Clercq
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Joana G. Guedes
- CIBIO/InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Vairão, Portugal
- I3S-Instituto de Investigação e Inovação em Saúde, IBMC-Instituto de Biologia Molecular e Celular, Universidade do Porto, Porto, Portugal
- ICBAS–Instituto de Ciências Biomédicas Abel Salazar, Universidade do Porto, Porto, Portugal
| | - Teresa Martínez-Cortés
- CIBIO/InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Vairão, Portugal
| | - Francisco J. Molina-Hidalgo
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Mariana Sottomayor
- CIBIO/InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Vairão, Portugal
- Faculdade de Ciências, Universidade do Porto, Porto, Portugal
| | - Klaas Vandepoele
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Alain Goossens
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| |
|
16
|
Bassignana G, Fransson J, Henry V, Colliot O, Zujovic V, De Vico Fallani F. Stepwise target controllability identifies dysregulations of macrophage networks in multiple sclerosis. Netw Neurosci 2021; 5:337-357. [PMID: 34189368 PMCID: PMC8233109 DOI: 10.1162/netn_a_00180] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/29/2020] [Accepted: 12/14/2020] [Indexed: 12/27/2022] Open
Abstract
Identifying the nodes able to drive the state of a network is crucial to understand, and eventually control, biological systems. Despite recent advances, such identification remains difficult because of the huge number of equivalent controllable configurations, even in relatively simple networks. Based on the evidence that in many applications it is essential to test the ability of individual nodes to control a specific target subset, we develop a fast and principled method to identify controllable driver-target configurations in sparse and directed networks. We demonstrate our approach on simulated networks and experimental gene networks to characterize macrophage dysregulation in human subjects with multiple sclerosis.
Affiliation(s)
- Giulia Bassignana
- Sorbonne University, UPMC Univ Paris 06, Inserm U-1127, CNRS UMR-7225, Institut du Cerveau et de la Moelle Epinière, Hopital Pitié-Salpêtrière, Paris, France
- Inria Paris, Aramis Project Team, Paris, France
| | - Jennifer Fransson
- Sorbonne University, UPMC Univ Paris 06, Inserm U-1127, CNRS UMR-7225, Institut du Cerveau et de la Moelle Epinière, Hopital Pitié-Salpêtrière, Paris, France
| | - Vincent Henry
- Sorbonne University, UPMC Univ Paris 06, Inserm U-1127, CNRS UMR-7225, Institut du Cerveau et de la Moelle Epinière, Hopital Pitié-Salpêtrière, Paris, France
- Inria Paris, Aramis Project Team, Paris, France
| | - Olivier Colliot
- Sorbonne University, UPMC Univ Paris 06, Inserm U-1127, CNRS UMR-7225, Institut du Cerveau et de la Moelle Epinière, Hopital Pitié-Salpêtrière, Paris, France
- Inria Paris, Aramis Project Team, Paris, France
| | - Violetta Zujovic
- Sorbonne University, UPMC Univ Paris 06, Inserm U-1127, CNRS UMR-7225, Institut du Cerveau et de la Moelle Epinière, Hopital Pitié-Salpêtrière, Paris, France
| | - Fabrizio De Vico Fallani
- Sorbonne University, UPMC Univ Paris 06, Inserm U-1127, CNRS UMR-7225, Institut du Cerveau et de la Moelle Epinière, Hopital Pitié-Salpêtrière, Paris, France
- Inria Paris, Aramis Project Team, Paris, France
| |
|
17
|
Lu J, Dumitrascu B, McDowell IC, Jo B, Barrera A, Hong LK, Leichter SM, Reddy TE, Engelhardt BE. Causal network inference from gene transcriptional time-series response to glucocorticoids. PLoS Comput Biol 2021; 17:e1008223. [PMID: 33513136 PMCID: PMC7875426 DOI: 10.1371/journal.pcbi.1008223] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/04/2019] [Revised: 02/10/2021] [Accepted: 08/07/2020] [Indexed: 11/19/2022] Open
Abstract
Gene regulatory network inference is essential to uncover complex relationships among gene pathways and inform downstream experiments, ultimately enabling regulatory network re-engineering. Network inference from transcriptional time-series data requires accurate, interpretable, and efficient determination of causal relationships among thousands of genes. Here, we develop Bootstrap Elastic net regression from Time Series (BETS), a statistical framework based on Granger causality for the recovery of a directed gene network from transcriptional time-series data. BETS uses elastic net regression and stability selection from bootstrapped samples to infer causal relationships among genes. BETS is highly parallelized, enabling efficient analysis of large transcriptional data sets. We show competitive accuracy on a community benchmark, the DREAM4 100-gene network inference challenge, where BETS is one of the fastest among methods of similar performance and additionally infers whether causal effects are activating or inhibitory. We apply BETS to transcriptional time-series data of differentially-expressed genes from A549 cells exposed to glucocorticoids over a period of 12 hours. We identify a network of 2768 genes and 31,945 directed edges (FDR ≤ 0.2). We validate inferred causal network edges using two external data sources: Overexpression experiments on the same glucocorticoid system, and genetic variants associated with inferred edges in primary lung tissue in the Genotype-Tissue Expression (GTEx) v6 project. BETS is available as an open source software package at https://github.com/lujonathanh/BETS.
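The Granger-causality idea underlying the abstract above can be sketched as a comparison of two lagged regressions: one predicting a gene from its own past, and one that also uses the past of a candidate regulator. The sketch below is not BETS. It swaps in ordinary least squares with a single lag and no intercept in place of elastic net regression, bootstrapping and stability selection, and the function names and toy series are hypothetical.

```python
def _dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def rss_restricted(target, own_lag):
    """Residual sum of squares for target[t] ~ b * own_lag[t] (least squares, no intercept)."""
    b = _dot(target, own_lag) / _dot(own_lag, own_lag)
    return sum((t - b * o) ** 2 for t, o in zip(target, own_lag))

def rss_full(target, own_lag, other_lag):
    """Residual sum of squares for target[t] ~ b1 * own_lag[t] + b2 * other_lag[t]."""
    saa, scc, sac = _dot(own_lag, own_lag), _dot(other_lag, other_lag), _dot(own_lag, other_lag)
    say, scy = _dot(target, own_lag), _dot(target, other_lag)
    det = saa * scc - sac * sac          # 2x2 normal equations solved by Cramer's rule
    b1 = (say * scc - scy * sac) / det
    b2 = (scy * saa - say * sac) / det
    return sum((t - b1 * o - b2 * c) ** 2 for t, o, c in zip(target, own_lag, other_lag))

def granger_score(cause, effect):
    """Fraction of the restricted model's residual error removed by adding the lagged cause."""
    target, own_lag, other_lag = effect[1:], effect[:-1], cause[:-1]
    return 1.0 - rss_full(target, own_lag, other_lag) / rss_restricted(target, own_lag)

# Hypothetical toy series: y follows x with a delay of one time step.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 1.0, 6.0, 2.0, 3.0, 5.0, 1.0, 4.0]
y = [0.0] + [0.8 * v for v in x[:-1]]

forward = granger_score(x, y)   # does the past of x help predict y?
reverse = granger_score(y, x)   # does the past of y help predict x?
print(forward, reverse)
```

A directed edge from x to y would be proposed when the forward score is high and the reverse score is not; a real analysis would need a significance test or, as in the abstract, a stability criterion with FDR control.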
Affiliation(s)
- Jonathan Lu
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
| | - Bianca Dumitrascu
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
| | - Ian C. McDowell
- Element Genomics, A UCB Company, Durham, North Carolina, United States of America
| | - Brian Jo
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
| | - Alejandro Barrera
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina, United States of America
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina, United States of America
| | - Linda K. Hong
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina, United States of America
| | - Sarah M. Leichter
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina, United States of America
| | - Timothy E. Reddy
- Department of Genome Sciences, Duke University, Durham, North Carolina, United States of America
| | - Barbara E. Engelhardt
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
- Center for Statistics and Machine Learning, Princeton University, Princeton, New Jersey, United States of America
| |
|
18
|
Iliopoulos A, Beis G, Apostolou P, Papasotiriou I. Complex Networks, Gene Expression and Cancer Complexity: A Brief Review of Methodology and Applications. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017093504] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022]
Abstract
In this brief survey, various aspects of cancer complexity and how this complexity can be confronted using modern complex networks’ theory and gene expression datasets, are described. In particular, the causes and the basic features of cancer complexity, as well as the challenges it brought are underlined, while the importance of gene expression data in cancer research and in reverse engineering of gene co-expression networks is highlighted. In addition, an introduction to the corresponding theoretical and mathematical framework of graph theory and complex networks is provided. The basics of network reconstruction along with the limitations of gene network inference, the enrichment and survival analysis, evolution, robustness-resilience and cascades in complex networks, are described. Finally, an indicative and suggestive example of a cancer gene co-expression network inference and analysis is given.
Affiliation(s)
- A.C. Iliopoulos
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
| | - G. Beis
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
| | - P. Apostolou
- Research and Development Department, Research Genetic Cancer Centre S.A., Florina, Greece
| | - I. Papasotiriou
- Research Genetic Cancer Centre International GmbH, Zug, Switzerland
| |
|
19
|
Network Analysis Prioritizes DEWAX and ICE1 as the Candidate Genes for Major eQTL Hotspots in Seed Germination of Arabidopsis thaliana. G3 (Bethesda) 2020; 10:4215-4226. [PMID: 32963085 PMCID: PMC7642920 DOI: 10.1534/g3.120.401477] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
Abstract
Seed germination is characterized by a constant change of gene expression across different time points. These changes are related to specific processes, which eventually determine the onset of seed germination. To get a better understanding of the regulation of gene expression during seed germination, we performed a quantitative trait locus mapping of gene expression (eQTL) at four important seed germination stages (primary dormant, after-ripened, six hours after imbibition, and radicle protrusion stage) using Arabidopsis thaliana Bay x Sha recombinant inbred lines (RILs). The mapping displayed the distinctness of the eQTL landscape for each stage. We found several eQTL hotspots across stages associated with the regulation of expression of a large number of genes. Interestingly, an eQTL hotspot on chromosome five collocates with hotspots for phenotypic and metabolic QTL in the same population. Finally, we constructed a gene co-expression network to prioritize the regulatory genes for two major eQTL hotspots. The network analysis prioritizes transcription factors DEWAX and ICE1 as the most likely regulatory genes for the hotspot. Together, we have revealed that the genetic regulation of gene expression is dynamic along the course of seed germination.
|
20
|
Chavez B, Farmaki E, Zhang Y, Altomare D, Hao J, Soltnamohammadi E, Shtutman M, Chatzistamou I, Kiaris H. A strategy for the identification of paracrine regulators of cancer cell migration. Clin Exp Pharmacol Physiol 2020; 47:1758-1763. [PMID: 32585033 DOI: 10.1111/1440-1681.13366] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/08/2020] [Revised: 06/13/2020] [Accepted: 06/16/2020] [Indexed: 12/14/2022]
Abstract
We hypothesized that the correlation of the whole transcriptome with quantifiable phenotypes may unveil genes contributing to the regulation of the corresponding response. We tested this hypothesis in cultured fibroblasts exposed to diverse pharmacological and biological agents, to identify genes influencing chemoattraction of breast cancer cells. Our analyses revealed several genes that correlated, either positively or negatively with cell migration, suggesting that they may operate as activators or inhibitors of this process. Survey of the scientific literature showed that genes exhibiting positive or negative association with cell migration had frequently been linked to cancer and metastasis before, while those with minimal association were not. The current methodology may formulate the basis for the development of novel strategies linking genes to quantifiable phenotypes.
Affiliation(s)
- Bernardo Chavez
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
| | - Elena Farmaki
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
| | - Youwen Zhang
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
| | - Diego Altomare
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
| | - Ji Hao
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
| | - Elham Soltnamohammadi
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
| | - Michael Shtutman
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
| | - Ioulia Chatzistamou
- Department of Pathology, Microbiology and Immunology, School of Medicine, University of South Carolina, Columbia, SC, USA
| | - Hippokratis Kiaris
- Department of Drug Discovery and Biomedical Sciences, College of Pharmacy, University of South Carolina, Columbia, SC, USA
- Peromyscus Genetic Stock Center, University of South Carolina, Columbia, SC, USA
| |
|
21
|
|
22
|
Schwarz B, Azodi CB, Shiu SH, Bauer P. Putative cis-Regulatory Elements Predict Iron Deficiency Responses in Arabidopsis Roots. Plant Physiol 2020; 182:1420-1439. [PMID: 31937681 PMCID: PMC7054882 DOI: 10.1104/pp.19.00760] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/19/2019] [Accepted: 12/22/2019] [Indexed: 05/03/2023]
Abstract
Plant iron deficiency (-Fe) activates a complex regulatory network that coordinates root Fe uptake and distribution to sink tissues. In Arabidopsis (Arabidopsis thaliana), FER-LIKE FE DEFICIENCY-INDUCED TRANSCRIPTION FACTOR (FIT), a basic helix-loop-helix (bHLH) transcription factor (TF), regulates root Fe acquisition genes. Many other -Fe-induced genes are FIT independent, and instead regulated by other bHLH TFs and by yet unknown TFs. The cis-regulatory code, that is, the cis-regulatory elements (CREs) and their combinations that regulate plant -Fe-responses, remains largely elusive. Using Arabidopsis root transcriptome data and coexpression clustering, we identified over 100 putative CREs (pCREs) that predicted -Fe-induced gene expression in computational models. To assess pCRE properties and possible functions, we used large-scale in vitro TF binding data, positional bias, and evolutionary conservation. As one example, our approach uncovered pCREs resembling IDE1 (iron deficiency-responsive element 1), a known grass -Fe response CRE. Arabidopsis IDE1-likes were associated with FIT-dependent gene expression, more specifically with biosynthesis of Fe-chelating compounds. Thus, IDE1 seems to be conserved in grass and nongrass species. Our pCREs matched among others in vitro binding sites of B3, NAC, bZIP, and TCP TFs, which might be regulators of -Fe responses. Altogether, our findings provide a comprehensive source of cis-regulatory information for -Fe-responsive genes that advance our mechanistic understanding and inform future efforts in engineering plants with more efficient Fe uptake or transport systems.
Affiliation(s)
- Birte Schwarz
- Institute of Botany, Heinrich Heine University, Düsseldorf 40225 Germany
| | - Christina B Azodi
- Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824
- DOE-Great Lake Bioenergy Research Center, Michigan State University, East Lansing, Michigan 48824
| | - Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824
- DOE-Great Lake Bioenergy Research Center, Michigan State University, East Lansing, Michigan 48824
- Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, Michigan 48824
| | - Petra Bauer
- Institute of Botany, Heinrich Heine University, Düsseldorf 40225 Germany
- Cluster of Excellence on Plant Science (CEPLAS), Heinrich Heine University, Düsseldorf 40225 Germany
| |
|
23
|
Li HWR, Resche-Rigon M, Bagchi IC, Gemzell-Danielsson K, Glasier A. Does ulipristal acetate emergency contraception (ella®) interfere with implantation? Contraception 2019; 100:386-390. [DOI: 10.1016/j.contraception.2019.07.140] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/14/2019] [Revised: 07/15/2019] [Accepted: 07/18/2019] [Indexed: 01/04/2023]
|
24
|
Rao X, Dixon RA. Co-expression networks for plant biology: why and how. Acta Biochim Biophys Sin (Shanghai) 2019; 51:981-988. [PMID: 31436787 DOI: 10.1093/abbs/gmz080] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 04/08/2019] [Revised: 06/20/2019] [Accepted: 07/01/2019] [Indexed: 12/29/2022] Open
Abstract
Co-expression network analysis is one of the most powerful approaches for interpretation of large transcriptomic datasets. It enables characterization of modules of co-expressed genes that may share biological functional linkages. Such networks provide an initial way to explore functional associations from gene expression profiling and can be applied to various aspects of plant biology. This review presents the applications of co-expression network analysis in plant biology and addresses optimized strategies from the recent literature for performing co-expression analysis on plant biological systems. Additionally, we describe the combined interpretation of co-expression analysis with other genomic data to enhance the generation of biologically relevant information.
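The module idea described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that two genes are linked when the Pearson correlation of their expression profiles reaches a fixed threshold, and that a module is a connected component of the resulting graph. The function names, the threshold of 0.9, and the toy expression values are hypothetical and are not taken from the review.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_modules(expr, threshold=0.9):
    """Return modules as connected components of the thresholded network.

    expr maps gene name -> list of expression values across samples.
    The threshold is an illustrative assumption, not a recommended value.
    """
    genes = list(expr)
    neighbors = {g: set() for g in genes}
    for a, b in combinations(genes, 2):
        if pearson(expr[a], expr[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first traversal collects every gene reachable from `gene`.
        stack, component = [gene], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Toy profiles (hypothetical): A and B rise together, C and D fall together.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.2],
}
modules = coexpression_modules(expr)
print(modules)
```

Published workflows typically use more elaborate module detection (for example, weighted networks or graph clustering); connected components are used here only because they keep the sketch short.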
Affiliation(s)
- Xiaolan Rao
- BioDiscovery Institute and Department of Biological Sciences, University of North Texas, Denton, TX 76203, USA
| | - Richard A Dixon
- BioDiscovery Institute and Department of Biological Sciences, University of North Texas, Denton, TX 76203, USA
| |
|
25
|
Yang W, Petersen C, Pees B, Zimmermann J, Waschina S, Dirksen P, Rosenstiel P, Tholey A, Leippe M, Dierking K, Kaleta C, Schulenburg H. The Inducible Response of the Nematode Caenorhabditis elegans to Members of Its Natural Microbiota Across Development and Adult Life. Front Microbiol 2019; 10:1793. [PMID: 31440221 PMCID: PMC6693516 DOI: 10.3389/fmicb.2019.01793] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/21/2019] [Accepted: 07/22/2019] [Indexed: 12/11/2022] Open
Abstract
The biology of all organisms is influenced by the associated community of microorganisms. In spite of its importance, it is usually not well understood how exactly this microbiota affects host functions and what are the underlying molecular processes. To rectify this knowledge gap, we took advantage of the nematode Caenorhabditis elegans as a tractable, experimental model system and assessed the inducible transcriptome response after colonization with members of its native microbiota. For this study, we focused on two isolates of the genus Ochrobactrum. These bacteria are known to be abundant in the nematode’s microbiota and are capable of colonizing and persisting in the nematode gut, even under stressful conditions. The transcriptome response was assessed across development and three time points of adult life, using general and C. elegans-specific enrichment analyses to identify affected functions. Our assessment revealed an influence of the microbiota members on the nematode’s dietary response, development, fertility, immunity, and energy metabolism. This response is mainly regulated by a GATA transcription factor, most likely ELT-2, as indicated by the enrichment of (i) the GATA motif in the promoter regions of inducible genes and (ii) of ELT-2 targets among the differentially expressed genes. We compared our transcriptome results with a corresponding previously characterized proteome data set, highlighting a significant overlap in the differentially expressed genes, the affected functions, and ELT-2 target genes. Our analysis further identified a core set of 86 genes that consistently responded to the microbiota members across development and adult life, including several C-type lectin-like genes and genes known to be involved in energy metabolism or fertility. We additionally assessed the consequences of induced gene expression with the help of metabolic network model analysis, using a previously established metabolic network for C. elegans. 
This analysis complemented the enrichment analyses by revealing an influence of the Ochrobactrum isolates on C. elegans energy metabolism and furthermore metabolism of specific amino acids, fatty acids, and also folate biosynthesis. Our findings highlight the multifaceted impact of naturally colonizing microbiota isolates on C. elegans life history and thereby provide a framework for further analysis of microbiota-mediated host functions.
Affiliation(s)
- Wentao Yang
- Research Group Evolutionary Ecology and Genetics, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Carola Petersen
- Research Group Evolutionary Ecology and Genetics, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany; Research Group Comparative Immunobiology, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Barbara Pees
- Research Group Comparative Immunobiology, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Johannes Zimmermann
- Research Group Medical Systems Biology, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Silvio Waschina
- Research Group Medical Systems Biology, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Philipp Dirksen
- Research Group Evolutionary Ecology and Genetics, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Philip Rosenstiel
- Institute for Clinical Molecular Biology, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Andreas Tholey
- Research Group Proteomics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Matthias Leippe
- Research Group Comparative Immunobiology, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Katja Dierking
- Research Group Evolutionary Ecology and Genetics, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Christoph Kaleta
- Research Group Medical Systems Biology, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
| | - Hinrich Schulenburg
- Research Group Evolutionary Ecology and Genetics, Zoological Institute, Christian-Albrechts-Universität zu Kiel, Kiel, Germany; Max Planck Institute for Evolutionary Biology, Plön, Germany
| |
|
26
|
Saha S, Murmu KC, Biswas M, Chakraborty S, Basu J, Madhulika S, Kolapalli SP, Chauhan S, Sengupta A, Prasad P. Transcriptomic Analysis Identifies RNA Binding Proteins as Putative Regulators of Myelopoiesis and Leukemia. Front Oncol 2019; 9:692. [PMID: 31448224 PMCID: PMC6691814 DOI: 10.3389/fonc.2019.00692] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/26/2019] [Accepted: 07/12/2019] [Indexed: 12/26/2022] Open
Abstract
Acute myeloid leukemia (AML) is a common and aggressive hematological malignancy. Acquisition of heterogeneous genetic aberrations and epigenetic dysregulation lead to the transformation of hematopoietic stem cells (HSC) into leukemic stem cells (LSC), which subsequently gives rise to immature blast cells and a leukemic phenotype. LSCs are responsible for disease relapse as current chemotherapeutic regimens are not able to completely eradicate these cellular sub-populations. Therefore, it is critical to improve upon the existing knowledge of LSC specific markers, which would allow for specific targeting of these cells more effectively allowing for their sustained eradication from the cellular milieu. Although significant milestones in decoding the aberrant transcriptional network of various cancers, including leukemia, have been achieved, studies on the involvement of post-transcriptional gene regulation (PTGR) in disease progression are beginning to unfold. RNA binding proteins (RBPs) are key players in mediating PTGR and they regulate the intracellular fate of individual transcripts, from their biogenesis to RNA metabolism, via interactions with RNA binding domains (RBDs). In this study, we have used an integrative approach to systematically profile RBP expression and identify key regulatory RBPs involved in normal myeloid development and AML. We have analyzed RNA-seq datasets (GSE74246) of HSCs, common myeloid progenitors (CMPs), granulocyte-macrophage progenitors (GMPs), monocytes, LSCs, and blasts. We observed that normal and leukemic cells can be distinguished on the basis of RBP expression, which is indicative of their ability to define cellular identity, similar to transcription factors. We identified that distinctly co-expressing modules of RBPs and their subclasses were enriched in hematopoietic stem/progenitor (HSPCs) and differentiated monocytes. 
We detected expression of DZIP3, an E3 ubiquitin ligase, in HSPCs, knockdown of which promotes monocytic differentiation in a cell line model. We identified co-expression modules of RBP genes in LSCs and, among these, distinct modules of RBP genes with high and low expression. The expression of several AML-specific RBPs was also validated by quantitative polymerase chain reaction. Network analysis identified densely connected hubs of ribosomal RBP genes (rRBPs) with low expression in LSCs, suggesting the dependency of LSCs on altered ribosome dynamics. In conclusion, our systematic analysis elucidates the RBP transcriptomic landscape in normal and malignant myelopoiesis, and highlights the functional consequences that may result from perturbation of RBP gene expression in these cellular landscapes.
Affiliation(s)
- Subha Saha
- Epigenetic and Chromatin Biology Unit, Institute of Life Sciences, Bhubaneswar, India
| | - Krushna Chandra Murmu
- Epigenetic and Chromatin Biology Unit, Institute of Life Sciences, Bhubaneswar, India
| | - Mayukh Biswas
- Translational Research Unit of Excellence (TRUE), Stem Cell and Leukemia Laboratory, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Sohini Chakraborty
- Department of Pathology, New York University School of Medicine, New York, NY, United States
| | - Jhinuk Basu
- Epigenetic and Chromatin Biology Unit, Institute of Life Sciences, Bhubaneswar, India
| | - Swati Madhulika
- Epigenetic and Chromatin Biology Unit, Institute of Life Sciences, Bhubaneswar, India
| | | | - Santosh Chauhan
- Cell Biology and Infectious Disease Unit, Institute of Life Sciences, Bhubaneswar, India
| | - Amitava Sengupta
- Translational Research Unit of Excellence (TRUE), Stem Cell and Leukemia Laboratory, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Punit Prasad
- Epigenetic and Chromatin Biology Unit, Institute of Life Sciences, Bhubaneswar, India
| |
|
27
|
Gupta C, Pereira A. Recent advances in gene function prediction using context-specific coexpression networks in plants. F1000Res 2019; 8:F1000 Faculty Rev-153. [PMID: 30800290 PMCID: PMC6364378 DOI: 10.12688/f1000research.17207.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Accepted: 01/30/2019] [Indexed: 12/11/2022] Open
Abstract
Predicting gene functions from genome sequence alone has been difficult, and the functions of a large fraction of plant genes remain unknown. However, leveraging the vast amount of currently available gene expression data has the potential to facilitate our understanding of plant gene functions, especially in determining complex traits. Gene coexpression networks, created by integrating multiple expression datasets, connect genes with similar patterns of expression across multiple conditions. Dense gene communities in such networks, commonly referred to as modules, often indicate that the member genes are functionally related. As such, these modules serve as tools for generating new testable hypotheses, including the prediction of gene function and importance. Recently, we have seen a paradigm shift from the traditional "global" to more defined, context-specific coexpression networks. Such coexpression networks imply genetic correlations in specific biological contexts such as during development or in response to a stress. In this short review, we highlight a few recent studies that attempt to fill the large gaps in our knowledge about cellular functions of plant genes using context-specific coexpression networks.
Affiliation(s)
- Chirag Gupta
- Crop, Soil and Environmental Sciences, University of Arkansas, Fayetteville, AR, USA
| | - Andy Pereira
- Crop, Soil and Environmental Sciences, University of Arkansas, Fayetteville, AR, USA
| |
|
28
|
Franke R, Hinkelmann B, Fetz V, Stradal T, Sasse F, Klawonn F, Brönstrup M. xCELLanalyzer: A Framework for the Analysis of Cellular Impedance Measurements for Mode of Action Discovery. SLAS DISCOVERY 2019; 24:213-223. [PMID: 30681906 DOI: 10.1177/2472555218819459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/17/2022]
Abstract
Mode of action (MoA) identification of bioactive compounds is very often a challenging and time-consuming task. We used a label-free kinetic profiling method based on an impedance readout to monitor the time-dependent cellular response profiles for the interaction of bioactive natural products and other small molecules with mammalian cells. Such approaches have been rarely used so far due to the lack of data mining tools to properly capture the characteristics of the impedance curves. We developed a data analysis pipeline for the xCELLigence Real-Time Cell Analysis detection platform to process the data, assess and score their reproducibility, and provide rank-based MoA predictions for a reference set of 60 bioactive compounds. The method can reveal additional, previously unknown targets, as exemplified by the identification of tubulin-destabilizing activities of the RNA synthesis inhibitor actinomycin D and the effects on DNA replication of vioprolide A. The data analysis pipeline is based on the statistical programming language R and is available to the scientific community through a GitHub repository.
Affiliation(s)
- Raimo Franke
- Department of Chemical Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Bettina Hinkelmann
- Department of Chemical Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Verena Fetz
- Department of Chemical Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Theresia Stradal
- Department of Cell Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Florenz Sasse
- Department of Chemical Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany
| | - Frank Klawonn
- Biostatistics Group, Helmholtz Centre for Infection Research, Braunschweig, Germany; Department of Computer Science, Ostfalia University, Wolfenbuettel, Germany
| | - Mark Brönstrup
- Department of Chemical Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany; Center of Biomolecular Drug Research (BMWZ), Institute of Organic Chemistry, Leibniz Universität, Hannover, Germany
| |
|
29
|
Abstract
Specialized metabolites are critical for plant–environment interactions, e.g., attracting pollinators or defending against herbivores, and are important sources of plant-based pharmaceuticals. However, it is unclear what proportion of enzyme-encoding genes play a role in specialized metabolism (SM) as opposed to general metabolism (GM) in any plant species. This is because of the diversity of specialized metabolites and the considerable number of incompletely characterized pathways responsible for their production. In addition, SM gene ancestors frequently played roles in GM. We evaluate features distinguishing SM and GM genes and build a computational model that accurately predicts SM genes. Our predictions provide candidates for experimental studies, and our modeling approach can be applied to other species that produce medicinally or industrially useful compounds. Plant specialized metabolism (SM) enzymes produce lineage-specific metabolites with important ecological, evolutionary, and biotechnological implications. Using Arabidopsis thaliana as a model, we identified distinguishing characteristics of SM and GM (general metabolism, traditionally referred to as primary metabolism) genes through a detailed study of features including duplication pattern, sequence conservation, transcription, protein domain content, and gene network properties. Analysis of multiple sets of benchmark genes revealed that SM genes tend to be tandemly duplicated, coexpressed with their paralogs, narrowly expressed at lower levels, less conserved, and less well connected in gene networks relative to GM genes. Although the values of each of these features significantly differed between SM and GM genes, any single feature was ineffective at predicting SM from GM genes. Using machine learning methods to integrate all features, a prediction model was established with a true positive rate of 87% and a true negative rate of 71%. 
In addition, 86% of known SM genes not used to create the machine learning model were predicted. We also demonstrated that the model could be further improved when we distinguished between SM, GM, and junction genes responsible for reactions shared by SM and GM pathways, indicating that topological considerations may further improve the SM prediction model. Application of the prediction model led to the identification of 1,220 A. thaliana genes with previously unknown functions, each assigned a confidence measure called an SM score, providing a global estimate of SM gene content in a plant genome.
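The two rates quoted in this abstract follow the usual confusion-matrix definitions: the true positive rate is TP / (TP + FN) and the true negative rate is TN / (TN + FP). The sketch below works through that arithmetic. The counts are hypothetical, chosen only so that the rates come out at 87% and 71%; they are not the benchmark sizes used in the study.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (true positive rate, true negative rate) from raw counts."""
    tpr = tp / (tp + fn)  # share of actual positives predicted as positive
    tnr = tn / (tn + fp)  # share of actual negatives predicted as negative
    return tpr, tnr


# Hypothetical counts: 100 positive (SM) and 100 negative (GM) genes.
tpr, tnr = classification_rates(tp=87, fn=13, tn=71, fp=29)
balanced_accuracy = (tpr + tnr) / 2
print(f"TPR={tpr:.2f} TNR={tnr:.2f} balanced accuracy={balanced_accuracy:.2f}")
```

Reporting both rates matters because a classifier can reach a high true positive rate simply by labeling most genes as SM; the true negative rate shows how often GM genes are correctly left out.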
|
30
|
Fornito A, Arnatkevičiūtė A, Fulcher BD. Bridging the Gap between Connectome and Transcriptome. Trends Cogn Sci 2019; 23:34-50. [DOI: 10.1016/j.tics.2018.10.005] [Citation(s) in RCA: 156] [Impact Index Per Article: 31.2] [Received: 08/22/2018] [Revised: 10/10/2018] [Accepted: 10/23/2018] [Indexed: 11/24/2022]
|
31
|
Kolberg L, Kuzmin I, Adler P, Vilo J, Peterson H. funcExplorer: a tool for fast data-driven functional characterisation of high-throughput expression data. BMC Genomics 2018; 19:817. [PMID: 30428831 PMCID: PMC6236982 DOI: 10.1186/s12864-018-5176-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/02/2018] [Accepted: 10/16/2018] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND A widely applied approach to extract knowledge from high-throughput genomic data is clustering of gene expression profiles followed by functional enrichment analysis. This type of analysis, when done manually, is highly subjective and has limited reproducibility. Moreover, this pipeline can be very time-consuming and resource-demanding as enrichment analysis is done for tens to hundreds of clusters at a time. Thus, the task often needs programming skills to form a pipeline of different software tools or R packages to enable an automated approach. Furthermore, visualising the results can be challenging. RESULTS We developed a web tool, funcExplorer, which automatically combines hierarchical clustering and enrichment analysis to detect functionally related gene clusters. The functional characterisation is achieved using structured knowledge from data sources such as Gene Ontology, KEGG and Reactome pathways, Human Protein Atlas, and Human Phenotype Ontology. funcExplorer includes various measures for finding biologically meaningful clusters, provides a modern graphical user interface, and has wide-ranging data export and sharing options as well as software transparency by open-source code. The results are presented in a visually compact and interactive format, enabling users to explore the biological essence of the data. We compared our results with previously published gene clusters to demonstrate that funcExplorer can perform the data characterisation equally well, but without requiring labour-intensive manual interference. CONCLUSIONS The open-source web tool funcExplorer enables scientists with high-throughput genomic data to obtain a preliminary interactive overview of the expression patterns, gene names, and shared functionalities in their dataset in a visually pleasing format. funcExplorer is publicly available at https://biit.cs.ut.ee/funcexplorer.
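The enrichment step that follows clustering is commonly scored with a one-sided hypergeometric test, and the sketch below shows that calculation. It is an illustration under that assumption, not funcExplorer's code: the abstract does not state which statistic the tool uses, and the function name and example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population   -- number of genes in the background set
    annotated    -- background genes carrying the functional term
    cluster_size -- number of genes in the cluster being tested
    overlap      -- cluster genes that carry the term
    """
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20 background genes, 5 annotated with a term,
# and a cluster of 4 genes of which 3 carry the term.
p = enrichment_p_value(population=20, annotated=5, cluster_size=4, overlap=3)
print(round(p, 4))
```

When tens to hundreds of clusters are tested against many terms, as the abstract describes, the resulting p-values need a multiple-testing correction before they are interpreted.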
Affiliation(s)
- Liis Kolberg
- Institute of Computer Science, University of Tartu, Juhan Liivi 2, Tartu, Estonia
| | - Ivan Kuzmin
- Institute of Computer Science, University of Tartu, Juhan Liivi 2, Tartu, Estonia
| | - Priit Adler
- Institute of Computer Science, University of Tartu, Juhan Liivi 2, Tartu, Estonia
- Quretec Ltd, Ülikooli 6a, Tartu, Estonia
| | - Jaak Vilo
- Institute of Computer Science, University of Tartu, Juhan Liivi 2, Tartu, Estonia
- Quretec Ltd, Ülikooli 6a, Tartu, Estonia
| | - Hedi Peterson
- Institute of Computer Science, University of Tartu, Juhan Liivi 2, Tartu, Estonia
- Quretec Ltd, Ülikooli 6a, Tartu, Estonia
| |
|
32
|
Functional Annotation of Caenorhabditis elegans Genes by Analysis of Gene Co-Expression Networks. Biomolecules 2018; 8:biom8030070. [PMID: 30081521 PMCID: PMC6163173 DOI: 10.3390/biom8030070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/03/2018] [Revised: 07/30/2018] [Accepted: 08/01/2018] [Indexed: 12/20/2022] Open
Abstract
Caenorhabditis elegans (C. elegans) is a well-characterized metazoan, whose transcriptome has been profiled in different tissues, development stages, or other conditions. Large-scale transcriptomes can be reused for gene function annotation through systematic analysis of gene co-expression relationships. We collected 2101 microarray data from National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO), and identified 48 modules of co-expressed genes that correspond to tissues, development stages, and other experimental conditions. These modules provide an overview of the transcriptional organizations that may work under different conditions. By analyzing higher-order module networks, we found that nucleus and plasma membrane modules are more connected than other intracellular modules. Module-based gene function annotation may help to extend the candidate cuticle gene list. A comparison with other published data validates the credibility of our result. Our findings provide a new source for future gene discovery in C. elegans.
|
33
|
Ranking genome-wide correlation measurements improves microarray and RNA-seq based global and targeted co-expression networks. Sci Rep 2018; 8:10885. [PMID: 30022075 PMCID: PMC6052111 DOI: 10.1038/s41598-018-29077-3] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 03/23/2018] [Accepted: 06/27/2018] [Indexed: 02/06/2023] Open
Abstract
Co-expression networks are essential tools to infer biological associations between gene products and predict gene annotation. Global networks can be analyzed at the transcriptome-wide scale or after querying them with a set of guide genes to capture the transcriptional landscape of a given pathway in a process named Pathway Level Coexpression (PLC). A critical step in network construction remains the definition of gene co-expression. In the present work, we compared how Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), their respective ranked values (Highest Reciprocal Rank (HRR)), Mutual Information (MI) and Partial Correlations (PC) performed on global networks and PLCs. This evaluation was conducted on the model plant Arabidopsis thaliana using microarray and differently pre-processed RNA-seq datasets. We particularly evaluated how dataset × distance measurement combinations performed in 5 PLCs corresponding to 4 well described plant metabolic pathways (phenylpropanoid, carbohydrate, fatty acid and terpene metabolisms) and the cytokinin signaling pathway. Our present work highlights how PCC ranked with HRR is better suited for global network construction and PLC with microarray and RNA-seq data than other distance methods, especially to cluster genes in partitions similar to biological subpathways.
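The rank transformation named in this abstract can be sketched as follows. The sketch assumes the commonly used definition of Highest Reciprocal Rank, HRR(A, B) = max(rank of B among A's partners, rank of A among B's partners), with rank 1 for the most strongly correlated partner. The abstract itself does not spell out the formula, and the function names and toy correlation values are hypothetical.

```python
from itertools import combinations


def rank_partners(corr):
    """For each gene, rank its partners by descending correlation (1 = best)."""
    ranks = {}
    for gene, partners in corr.items():
        ordered = sorted(partners, key=partners.get, reverse=True)
        ranks[gene] = {partner: i for i, partner in enumerate(ordered, start=1)}
    return ranks


def highest_reciprocal_rank(corr):
    """Return {(a, b): HRR} for every gene pair; lower means stronger."""
    ranks = rank_partners(corr)
    return {
        (a, b): max(ranks[a][b], ranks[b][a])
        for a, b in combinations(sorted(corr), 2)
    }


# Toy symmetric correlation table (hypothetical values).
corr = {
    "g1": {"g2": 0.90, "g3": 0.50, "g4": 0.10},
    "g2": {"g1": 0.90, "g3": 0.80, "g4": 0.95},
    "g3": {"g1": 0.50, "g2": 0.80, "g4": 0.20},
    "g4": {"g1": 0.10, "g2": 0.95, "g3": 0.20},
}
hrr = highest_reciprocal_rank(corr)
print(hrr)
```

Taking the worse of the two ranks means a pair scores well only when each gene is near the top of the other's list, so a hub gene that correlates moderately with everything does not dominate the network.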
|
34
|
Xing A, Last RL. A Regulatory Hierarchy of the Arabidopsis Branched-Chain Amino Acid Metabolic Network. THE PLANT CELL 2017; 29:1480-1499. [PMID: 28522547 PMCID: PMC5502462 DOI: 10.1105/tpc.17.00186] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 03/06/2017] [Revised: 04/12/2017] [Accepted: 05/11/2017] [Indexed: 05/18/2023]
Abstract
The branched-chain amino acids (BCAAs) Ile, Val, and Leu are essential nutrients that humans and other animals obtain from plants. However, total and relative amounts of plant BCAAs rarely match animal nutritional needs, and improvement requires a better understanding of the mechanistic basis for BCAA homeostasis. We present an in vivo regulatory model of BCAA homeostasis derived from analysis of feedback-resistant Arabidopsis thaliana mutants for the three allosteric committed enzymes in the biosynthetic network: threonine deaminase (also named l-O-methylthreonine resistant 1 [OMR1]), acetohydroxyacid synthase small subunit 2 (AHASS2), and isopropylmalate synthase 1 (IPMS1). In this model, OMR1 exerts primary control on Ile accumulation and functions independently of AHAS and IPMS. AHAS and IPMS regulate Val and Leu homeostasis, where AHAS affects total Val+Leu and IPMS controls partitioning between these amino acids. In addition, analysis of feedback-resistant and loss-of-function single and double mutants revealed that each AHAS and IPMS isoenzyme contributes to homeostasis rather than being functionally redundant. The characterized feedback resistance mutations caused increased free BCAA levels in both seedlings and seeds. These results add to our understanding of the basis of in vivo BCAA homeostasis and inform approaches to improve the amount and balance of these essential nutrients in crops.
Affiliation(s)
- Anqi Xing
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319
| | - Robert L Last
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319
- Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824-1319
| |
|
35
|
Wisecaver JH, Borowsky AT, Tzin V, Jander G, Kliebenstein DJ, Rokas A. A Global Coexpression Network Approach for Connecting Genes to Specialized Metabolic Pathways in Plants. THE PLANT CELL 2017; 29:944-959. [PMID: 28408660 PMCID: PMC5466033 DOI: 10.1105/tpc.17.00009] [Citation(s) in RCA: 137] [Impact Index Per Article: 19.6] [Received: 01/06/2017] [Revised: 03/12/2017] [Accepted: 04/09/2017] [Indexed: 05/20/2023]
Abstract
Plants produce diverse specialized metabolites (SMs), but the genes responsible for their production and regulation remain largely unknown, hindering efforts to tap plant pharmacopeia. Given that genes comprising SM pathways exhibit environmentally dependent coregulation, we hypothesized that genes within a SM pathway would form tight associations (modules) with each other in coexpression networks, facilitating their identification. To evaluate this hypothesis, we used 10 global coexpression data sets, each a meta-analysis of hundreds to thousands of experiments, across eight plant species to identify hundreds of coexpressed gene modules per data set. In support of our hypothesis, 15.3 to 52.6% of modules contained two or more known SM biosynthetic genes, and module genes were enriched in SM functions. Moreover, modules recovered many experimentally validated SM pathways, including all six known to form biosynthetic gene clusters (BGCs). In contrast, bioinformatically predicted BGCs (i.e., those lacking an associated metabolite) were no more coexpressed than the null distribution for neighboring genes. These results suggest that most predicted plant BGCs are not genuine SM pathways and argue that BGCs are not a hallmark of plant specialized metabolism. We submit that global gene coexpression is a rich, largely untapped resource for discovering the genetic basis and architecture of plant natural products.
Affiliation(s)
- Jennifer H Wisecaver
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
| | - Alexander T Borowsky
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
| | - Vered Tzin
- French Associates Institute for Agriculture and Biotechnology of Drylands, Jacob Blaustein Institute for Desert Research, Ben Gurion University, Sede-Boqer Campus 84990, Israel
| | - Georg Jander
- Boyce Thompson Institute for Plant Research, Tower Road, Ithaca, New York 14853
| | - Daniel J Kliebenstein
- Department of Plant Sciences, University of California-Davis, Davis, California 95616
| | - Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
| |
|