1
Piya S, Pantalone V, Zadegan SB, Shipp S, Lakhssassi N, Knizia D, Krishnan HB, Meksem K, Hewezi T. Soybean gene co-expression network analysis identifies two co-regulated gene modules associated with nodule formation and development. Molecular Plant Pathology 2023; 24:628-636. [PMID: 36975024 DOI: 10.1111/mpp.13327] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/31/2023] [Revised: 03/03/2023] [Accepted: 03/06/2023] [Indexed: 05/18/2023]
Abstract
Gene co-expression network analysis is an efficient systems biology approach for the discovery of novel gene functions and trait-associated gene modules. To identify clusters of functionally related genes involved in soybean nodule formation and development, we performed a weighted gene co-expression network analysis. Two nodule-specific modules (NSM-1 and NSM-2, containing 304 and 203 genes, respectively) were identified. The NSM-1 gene promoters were significantly enriched in cis-binding elements for ERF, MYB, and C2H2-type zinc finger transcription factors, whereas NSM-2 gene promoters were enriched in cis-binding elements for TCP, bZIP, and bHLH transcription factors, suggesting a role of these regulatory factors in the transcriptional activation of nodule co-expressed genes. The co-expressed gene modules included genes with potential novel roles in nodulation, including those involved in xylem development, transmembrane transport, the ethylene signalling pathway, cytoskeleton organization, cytokinesis and regulation of the cell cycle, regulation of meristem initiation and growth, transcriptional regulation, DNA methylation, and histone modifications. Functional analysis of two co-expressed genes using TILLING mutants provided novel insight into the involvement of unsaturated fatty acid biosynthesis and folate metabolism in nodule formation and development. The identified gene co-expression modules provide valuable resources for further functional genomics studies to dissect the genetic basis of nodule formation and development in soybean.
Affiliation(s)
- Sarbottam Piya
  - Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, 37996, USA
- Vince Pantalone
  - Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, 37996, USA
- Sarah Shipp
  - Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, 37996, USA
- Naoufal Lakhssassi
  - Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, Illinois, 62901, USA
- Dounya Knizia
  - Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, Illinois, 62901, USA
- Hari B Krishnan
  - Plant Science Division, University of Missouri, Columbia, Missouri, USA
  - Plant Genetics Research, USDA Agricultural Research Service, Columbia, Missouri, USA
- Khalid Meksem
  - Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, Illinois, 62901, USA
- Tarek Hewezi
  - Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, 37996, USA
2
Approaches in Gene Coexpression Analysis in Eukaryotes. Biology 2022; 11:biology11071019. [PMID: 36101400 PMCID: PMC9312353 DOI: 10.3390/biology11071019] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/06/2022] [Revised: 06/28/2022] [Accepted: 07/04/2022] [Indexed: 11/22/2022]
Abstract
Simple Summary: Genes whose expression levels rise and fall together across a large set of samples may be considered coexpressed. Gene coexpression analysis refers to the en masse discovery of coexpressed genes from a large variety of transcriptomic experiments. Gene Coexpression Networks, the type of biological network used to study gene coexpression, consist of an undirected graph depicting genes and their coexpression relationships. Coexpressed genes are clustered in smaller subnetworks, the predominant biological roles of which can be determined through enrichment analysis. By studying well-annotated gene partners, new roles can be attributed to genes of unknown function, or their participation in common metabolic pathways can be inferred, through a guilt-by-association approach. In this review, we present key issues in gene coexpression analysis, as well as the most popular tools that perform it.
Abstract: Gene coexpression analysis constitutes a widely used practice for gene partner identification and gene function prediction, consisting of many intricate procedures. The analysis begins with the collection of primary transcriptomic data and their preprocessing, continues with the calculation of the similarity between genes based on their expression values in the selected sample dataset, and results in the construction and visualisation of a gene coexpression network (GCN) and its evaluation using biological term enrichment analysis. As gene coexpression analysis has been studied extensively, we present most parts of the methodology in a clear manner, along with the reasoning behind the selection of some of the techniques. In this review, we offer a comprehensive and comprehensible account of the steps required for performing a complete gene coexpression analysis in eukaryotic organisms. We comment on the use of RNA-Seq vs. microarrays, as well as the best practices for GCN construction. Furthermore, we recount the most popular webtools and standalone applications performing gene coexpression analysis, with details on their methods, features and outputs.
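The similarity and network-construction steps described in this abstract can be sketched as follows. This is a minimal illustration rather than the method of any tool covered by the review: it assumes Pearson correlation as the similarity measure and a hard cut-off of 0.9 for drawing edges, and the gene names, expression values, and function names (`pearson`, `coexpression_edges`) are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / math.sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return undirected edges (gene1, gene2, r) with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is an assumed, illustrative cut-off.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical toy data: three genes measured in four samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
network = coexpression_edges(toy_expression)
print(network)
```

In this toy data only geneA and geneB rise and fall together, so they form the single edge of the network. Real analyses typically add preprocessing (normalisation, filtering of low-expression genes) before this step.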
3
Zainal-Abidin RA, Harun S, Vengatharajuloo V, Tamizi AA, Samsulrizal NH. Gene Co-Expression Network Tools and Databases for Crop Improvement. Plants (Basel, Switzerland) 2022; 11:1625. [PMID: 35807577 PMCID: PMC9269215 DOI: 10.3390/plants11131625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/26/2022] [Revised: 06/05/2022] [Accepted: 06/05/2022] [Indexed: 06/15/2023]
Abstract
Transcriptomics has grown significantly as a functional genomics tool for understanding the expression of biological systems. The generated transcriptomics data can be utilised to produce a gene co-expression network, which is one of the essential downstream omics data analyses. To date, several gene co-expression network databases that store correlation values, expression profiles, gene names and gene descriptions have been developed. Although these resources remain scattered across the Internet, such databases complement each other and support efficient growth in the functional genomics area. This review presents the most recent gene co-expression network databases for crops and their features, and summarises the present status of the tools that are widely used for constructing gene co-expression networks. The highlights of gene co-expression network databases and the tools presented here will pave the way for a robust interpretation of biologically relevant information. With this effort, researchers will be able to explore and utilise gene co-expression network databases for crop improvement.
Affiliation(s)
- Rabiatul-Adawiah Zainal-Abidin
  - Biotechnology and Nanotechnology Research Centre, Malaysian Agricultural Research and Development Institute (MARDI), Serdang 43400, Selangor, Malaysia; (R.-A.Z.-A.); (A.-A.T.)
- Sarahani Harun
  - Centre for Bioinformatics Research, Institute of Systems Biology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Vinothienii Vengatharajuloo
  - Centre for Bioinformatics Research, Institute of Systems Biology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia
- Amin-Asyraf Tamizi
  - Biotechnology and Nanotechnology Research Centre, Malaysian Agricultural Research and Development Institute (MARDI), Serdang 43400, Selangor, Malaysia; (R.-A.Z.-A.); (A.-A.T.)
  - Department of Plant Science, Kulliyyah of Science, International Islamic Universiti Malaysia (IIUM), Jalan Sultan Ahmad Shah, Bandar Indera Mahkota, Kuantan 25200, Pahang, Malaysia
- Nurul Hidayah Samsulrizal
  - Department of Plant Science, Kulliyyah of Science, International Islamic Universiti Malaysia (IIUM), Jalan Sultan Ahmad Shah, Bandar Indera Mahkota, Kuantan 25200, Pahang, Malaysia
4
Fabris F, Palmer D, de Magalhães JP, Freitas AA. Comparing enrichment analysis and machine learning for identifying gene properties that discriminate between gene classes. Brief Bioinform 2021; 21:803-814. [PMID: 30895300 DOI: 10.1093/bib/bbz028] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/15/2018] [Revised: 02/18/2019] [Accepted: 02/19/2019] [Indexed: 01/08/2023]
Abstract
Biologists very often use enrichment methods based on statistical hypothesis tests to identify gene properties that are significantly over-represented in a given set of genes of interest, by comparison with a 'background' set of genes. These enrichment methods, although based on rigorous statistical foundations, are not always the best single option to identify patterns in biological data. In many cases, one can also use classification algorithms from the machine-learning field. Unlike enrichment methods, classification algorithms are designed to maximize measures of predictive performance and are capable of analysing combinations of gene properties, instead of one property at a time. In practice, however, the majority of studies use either enrichment or classification methods (rather than both), and there is a lack of literature discussing the pros and cons of both types of method. The goal of this paper is to compare and contrast enrichment and classification methods, offering two contributions. First, we discuss the (to some extent complementary) advantages and disadvantages of both types of methods for identifying gene properties that discriminate between gene classes. Second, we provide a set of high-level recommendations for using enrichment and classification methods. Overall, by highlighting the strengths and the weaknesses of both types of methods we argue that both should be used in bioinformatics analyses.
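The kind of enrichment test this abstract refers to can be illustrated with a one-sided hypergeometric test, a common choice for over-representation analysis. The sketch below is an assumption about one typical test, not the specific procedure evaluated in the paper; the function name, parameter names, and the gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background_size, annotated_in_background,
                           set_size, annotated_in_set):
    """One-sided hypergeometric p-value for over-representation.

    Probability of seeing at least `annotated_in_set` genes with a property
    among `set_size` genes drawn without replacement from a background of
    `background_size` genes, of which `annotated_in_background` carry it.
    """
    total = comb(background_size, set_size)
    upper = min(annotated_in_background, set_size)
    tail = sum(
        comb(annotated_in_background, k)
        * comb(background_size - annotated_in_background, set_size - k)
        for k in range(annotated_in_set, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 of them annotated with the
# property; 3 of the 4 genes of interest carry the annotation.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(p_value)
```

Note that the test scores one gene property at a time against the background set, which is the limitation the abstract contrasts with classification algorithms that can analyse combinations of properties. In practice a multiple-testing correction would follow when many properties are tested.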
Affiliation(s)
- Fabio Fabris
  - School of Computing, University of Kent, Kent, CT2 7NF, UK
- Daniel Palmer
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, UK
- João Pedro de Magalhães
  - Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, UK
- Alex A Freitas
  - School of Computing, University of Kent, Kent, CT2 7NF, UK
5
Wang Y, Zhang R, Liang Z, Li S. Grape-RNA: A Database for the Collection, Evaluation, Treatment, and Data Sharing of Grape RNA-Seq Datasets. Genes (Basel) 2020; 11:genes11030315. [PMID: 32188014 PMCID: PMC7140798 DOI: 10.3390/genes11030315] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 02/11/2020] [Revised: 03/09/2020] [Accepted: 03/12/2020] [Indexed: 01/08/2023]
Abstract
Since its inception, RNA sequencing (RNA-seq) has become the most effective way to study gene expression. After more than a decade of development, numerous RNA-seq datasets have been created, and the full utilization of these datasets has emerged as a major issue. In this study, we built a comprehensive database named Grape-RNA, which is focused on the collection, evaluation, treatment, and data sharing of grape RNA-seq datasets. This database contains 1529 RNA-seq samples, 112 microRNA samples from the public platform, and 485 RNA-seq in-house datasets sequenced by our lab. We classified these data into 25 conditions and provide the sample information, cleaned raw data, expression levels, assembled unigenes, useful tools, and other relevant information to the users. Thus, this study provides data and tools that should benefit researchers by allowing them to easily use RNA-seq data. The provided information can greatly contribute to grape breeding and to genomic and biological research. This study may improve the usage of RNA-seq.
Affiliation(s)
- Yi Wang
  - Beijing Key Laboratory of Grape Science and Enology, and CAS Key Laboratory of Plant Resources, Institute of Botany, the Innovative Academy of Seed Design, the Chinese Academy of Science, Beijing 100093, China; (Y.W.); (S.L.)
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Rui Zhang
  - College of Plant Protection, Shandong Agricultural University, Taian 271018, China
- Zhenchang Liang
  - Beijing Key Laboratory of Grape Science and Enology, and CAS Key Laboratory of Plant Resources, Institute of Botany, the Innovative Academy of Seed Design, the Chinese Academy of Science, Beijing 100093, China; (Y.W.); (S.L.)
  - Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan 430074, China
  - Correspondence: Tel./Fax: 86-010-62836064
- Shaohua Li
  - Beijing Key Laboratory of Grape Science and Enology, and CAS Key Laboratory of Plant Resources, Institute of Botany, the Innovative Academy of Seed Design, the Chinese Academy of Science, Beijing 100093, China; (Y.W.); (S.L.)
  - University of Chinese Academy of Sciences, Beijing 100049, China
6
Chaudhary J, Khatri P, Singla P, Kumawat S, Kumari A, R V, Vikram A, Jindal SK, Kardile H, Kumar R, Sonah H, Deshmukh R. Advances in Omics Approaches for Abiotic Stress Tolerance in Tomato. Biology 2019; 8:biology8040090. [PMID: 31775241 PMCID: PMC6956103 DOI: 10.3390/biology8040090] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 06/30/2019] [Revised: 11/11/2019] [Accepted: 11/19/2019] [Indexed: 12/21/2022]
Abstract
Tomato, one of the most important crops worldwide, is in high demand in the fresh fruit market and processed food industries. Despite its considerably high productivity, continuous supply in line with market demand is hard to achieve, mostly because of periodic losses caused by biotic as well as abiotic stresses. Although tomato is a temperate crop, it is grown in almost all climatic zones because of widespread demand, which makes it challenging to adapt to diverse conditions. Development of tomato cultivars with enhanced abiotic stress tolerance is one of the most sustainable approaches for its successful production. In this regard, efforts are being made to understand the stress tolerance mechanism, gene discovery, and the interaction of genetic and environmental factors. Several omics approaches, tools, and resources have already been developed for tomato growing. Modern sequencing technologies have greatly accelerated genomics and transcriptomics studies in tomato. These advancements facilitate quantitative trait loci (QTL) mapping, genome-wide association studies (GWAS), and genomic selection (GS). However, limited efforts have been made in other omics branches such as proteomics, metabolomics, and ionomics. The extensive cataloging of omics resources made here highlights the need for integration of omics approaches for efficient utilization of resources and a better understanding of the molecular mechanism. The information provided here will be helpful for understanding plant responses and the genetic regulatory networks involved in abiotic stress tolerance, and for the efficient utilization of omics resources for tomato crop improvement.
Affiliation(s)
- Juhi Chaudhary
  - Department of Biology, Oberlin College, Oberlin, OH 44074, USA
- Praveen Khatri
  - National Agri-Food Biotechnology Institute (NABI), Mohali, Punjab 140306, India; (P.K.); (P.S.); (S.K.); (A.K.)
- Pankaj Singla
  - National Agri-Food Biotechnology Institute (NABI), Mohali, Punjab 140306, India; (P.K.); (P.S.); (S.K.); (A.K.)
- Surbhi Kumawat
  - National Agri-Food Biotechnology Institute (NABI), Mohali, Punjab 140306, India; (P.K.); (P.S.); (S.K.); (A.K.)
- Anu Kumari
  - National Agri-Food Biotechnology Institute (NABI), Mohali, Punjab 140306, India; (P.K.); (P.S.); (S.K.); (A.K.)
- Vinaykumar R
  - Department of Vegetable Science, Dr. Yashwant Singh Parmar University of Horticulture and Forestry, Solan, Himachal Pradesh 173230, India; (V.R.); (A.V.)
- Amit Vikram
  - Department of Vegetable Science, Dr. Yashwant Singh Parmar University of Horticulture and Forestry, Solan, Himachal Pradesh 173230, India; (V.R.); (A.V.)
- Salesh Kumar Jindal
  - Department of Vegetable Science, Punjab Agricultural University, Ludhiana, Punjab 141004, India
- Hemant Kardile
  - Division of Crop Improvement, ICAR-Central Potato Research Institute (CPRI), Shimla, Himachal Pradesh 171001, India
- Rahul Kumar
  - Department of Plant Science, University of Hyderabad, Hyderabad 500046, India
- Humira Sonah
  - National Agri-Food Biotechnology Institute (NABI), Mohali, Punjab 140306, India; (P.K.); (P.S.); (S.K.); (A.K.)
  - Correspondence: (H.S.); (R.D.)
- Rupesh Deshmukh
  - National Agri-Food Biotechnology Institute (NABI), Mohali, Punjab 140306, India; (P.K.); (P.S.); (S.K.); (A.K.)
  - Correspondence: (H.S.); (R.D.)
7
Narise T, Sakurai N, Obayashi T, Ohta H, Shibata D. Co-expressed Pathways DataBase for Tomato: a database to predict pathways relevant to a query gene. BMC Genomics 2017; 18:437. [PMID: 28583129 PMCID: PMC5460524 DOI: 10.1186/s12864-017-3786-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/10/2017] [Accepted: 05/10/2017] [Indexed: 11/17/2022]
Abstract
BACKGROUND: Gene co-expression, the similarity of gene expression profiles under various experimental conditions, has been used as an indicator of functional relationships between genes, and many co-expression databases have been developed for predicting gene functions. These databases usually provide users with a co-expression network and a list of strongly co-expressed genes for a query gene. Several of these databases also provide functional information on a set of strongly co-expressed genes (i.e., provide biological processes and pathways that are enriched in these strongly co-expressed genes), which is generally analyzed via over-representation analysis (ORA). A limitation of this approach may be that users can predict gene functions only based on the strongly co-expressed genes.
RESULTS: In this study, we developed a new co-expression database that enables users to predict the function of tomato genes from the results of functional enrichment analyses of co-expressed genes while considering the genes that are not strongly co-expressed. To achieve this, we used the ORA approach with several thresholds to select co-expressed genes, and performed gene set enrichment analysis (GSEA) applied to a ranked list of genes ordered by the co-expression degree. We found that internal correlation in pathways affected the significance levels of the enrichment analyses. Therefore, we introduced a new measure for evaluating the relationship between the gene and pathway, termed the percentile (p)-score, which enables users to predict functionally relevant pathways without being affected by the internal correlation in pathways. In addition, we evaluated our approaches using receiver operating characteristic curves, which concluded that the p-score could improve the performance of the ORA.
CONCLUSIONS: We developed a new database, named Co-expressed Pathways DataBase for Tomato, which is available at http://cox-path-db.kazusa.or.jp/tomato. The database allows users to predict pathways that are relevant to a query gene, which would help to infer gene functions.
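The idea of applying GSEA to a list of genes ranked by co-expression degree can be sketched with a simplified running-sum statistic. This is an illustration only: it uses an unweighted (Kolmogorov-Smirnov style) enrichment score rather than the weighted score of standard GSEA, it does not implement the p-score (which the abstract does not define), and the function name, gene identifiers, and pathway membership are hypothetical.

```python
def running_sum_enrichment(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a ranked gene list.

    Walks down `ranked_genes` (most strongly co-expressed first, assumed to
    contain no duplicates), stepping up at pathway members and down at
    non-members, and returns the running sum with the largest magnitude.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        if gene in members:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking of six genes by co-expression with a query gene,
# and a hypothetical pathway containing three of them.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
pathway = {"g1", "g2", "g5"}
score = running_sum_enrichment(ranked, pathway)
print(score)
```

Because the statistic uses the whole ranking, pathway members that are only moderately co-expressed still contribute, which is the property the abstract contrasts with ORA on a thresholded gene list. Significance would normally be assessed by permutation, which is omitted here.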
Affiliation(s)
- Takafumi Narise
  - Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba, 292-0818 Japan
- Nozomu Sakurai
  - Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba, 292-0818 Japan
- Takeshi Obayashi
  - Graduate School of Information Sciences, Tohoku University, 6-3-09 Aramaki-Aza-Aoba, Aoba-ku, Sendai, Miyagi, 980-8579 Japan
- Hiroyuki Ohta
  - Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259-B-65 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa, 226-8501 Japan
- Daisuke Shibata
  - Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba, 292-0818 Japan