1
Razavi H, Katanforosh A. Identification of novel key regulatory lncRNAs in gastric adenocarcinoma. BMC Genomics 2022; 23:352. [PMID: 35525925 PMCID: PMC9080188 DOI: 10.1186/s12864-022-08578-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/11/2022] [Accepted: 04/22/2022] [Indexed: 12/02/2022]
Abstract
Background Stomach adenocarcinoma (STAD) is one of the most common and deadly cancers worldwide. Recent evidence has demonstrated that dysregulation of long noncoding RNAs (lncRNAs) is associated with different hallmarks of cancer. lncRNAs have also been suggested as promising novel biomarkers for cancer diagnosis and prognosis. Despite these previous investigations, the expression pattern, diagnostic role, and hallmark association of lncRNAs in STAD remain unclear. Results In this study, the STAD lncRNA-mRNA network was constructed from RNAs that were differentially expressed between tumor and normal samples and had a strong expression correlation with other RNAs. The high-degree nodes of the network were associated with overall survival. In addition, we found that the hubs’ regulatory roles have previously been confirmed in different types of cancer in the literature. For example, the HCG22 hub inhibited cell proliferation and invasion and induced apoptosis in oral squamous cell carcinoma (OSCC) cells. The levels of PCNA, Vimentin, and Bcl2 were decreased and E-cadherin and Bax expression was elevated in OSCC cells after HCG22 overexpression. Additionally, HCG22 overexpression inhibited the Akt, mTOR, and Wnt/β-catenin pathways. The lncRNAs were then mapped to their related GO terms and cancer hallmarks. Based on these mappings, we predicted the hallmarks that might be associated with each lncRNA, and a literature review confirmed our predictions. Among the 20 lncRNAs of the STAD network, 11 lncRNAs (LINC02560, SOX21-AS1, C5orf66-AS1, HCG22, PGM5-AS1, NALT1, ENSG00000241224.2, TINCR, MIR205HG, HNF4A-AS1, ENSG00000262756) demonstrated expression correlation with overall survival (OS). Based on expression analysis, survival analysis, hallmark associations, and literature review, LINC02560, SOX21-AS1, C5orf66-AS1, HCG22, PGM5-AS1, NALT1, ENSG00000241224.2, TINCR, MIR205HG, and HNF4A-AS1 play a regulatory role in STAD. For example, our prediction of an association between C5orf66-AS1 expression dysregulation and “sustaining proliferative signal” and “Activating invasion and metastasis” has been confirmed in STAD, OSCC and cervical cancer. Finally, we developed a lncRNA signature with SOX21-AS1 and LINC02560, which classified patients into high- and low-risk subgroups with significantly different survival outcomes. The mortality rate of the high-risk patients was significantly higher than that of the low-risk patients (28/1% vs 60.13). Conclusion These findings help in designing more precise and detailed experimental studies to find STAD biomarkers and therapeutic targets. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08578-6.
Affiliation(s)
- Houri Razavi
  - Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
- Ali Katanforosh
  - Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
2
Pettersen JP, Almaas E. csdR, an R package for differential co-expression analysis. BMC Bioinformatics 2022; 23:79. [PMID: 35183100 PMCID: PMC8858518 DOI: 10.1186/s12859-022-04605-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Accepted: 02/07/2022] [Indexed: 11/10/2022]
Abstract
Background
Differential co-expression network analysis has become an important tool to gain understanding of biological phenotypes and diseases. The CSD algorithm is a method to generate differential co-expression networks by comparing gene co-expressions from two different conditions. Each of the gene pairs is assigned conserved (C), specific (S) and differentiated (D) scores based on the co-expression of the gene pair between the two conditions. The result of the procedure is a network where the nodes are genes and the links are the gene pairs with the highest C-, S-, and D-scores. However, the existing CSD-implementations suffer from poor computational performance, difficult user procedures and lack of documentation.
Results
We created the R package csdR, aimed at reaching good performance together with ease of use, sufficient documentation, and the ability to play well with other tools for data analysis. csdR was benchmarked on a realistic dataset with 20,645 genes. After verifying that the chosen number of iterations gave sufficient robustness, we tested the performance against the two existing CSD implementations. csdR was superior in performance to one of the implementations, whereas the other did not run. Our implementation can utilize multiple processing cores. However, we were unable to achieve more than ∼2.7 parallel speedup, with saturation reached at about 10 cores.
Conclusion
The results suggest that csdR is a useful tool for differential co-expression analysis and is able to generate robust results within a workday on datasets of realistic sizes when run on a workstation or compute server.
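The conserved (C), specific (S) and differentiated (D) scores described in the Background can be illustrated with a short sketch. The abstract does not state the formulas; the version below assumes the definitions commonly given for the CSD method, where `rho1` and `rho2` are the co-expression values of one gene pair in the two conditions and `var1`, `var2` are their variances (estimated, for example, by resampling). The function name `csd_scores` is hypothetical and is not part of the csdR API.

```python
import math


def csd_scores(rho1, rho2, var1, var2):
    """Return (C, S, D) scores for one gene pair.

    Assumed definitions (not stated in the abstract):
      C = |rho1 + rho2| / sqrt(var1 + var2)
      S = ||rho1| - |rho2|| / sqrt(var1 + var2)
      D = (|rho1| + |rho2| - |rho1 + rho2|) / sqrt(var1 + var2)
    """
    scale = math.sqrt(var1 + var2)
    if scale == 0:
        raise ValueError("variances must not both be zero")
    conserved = abs(rho1 + rho2) / scale
    specific = abs(abs(rho1) - abs(rho2)) / scale
    differentiated = (abs(rho1) + abs(rho2) - abs(rho1 + rho2)) / scale
    return conserved, specific, differentiated


# Same strong correlation in both conditions: only C is large.
print(csd_scores(0.8, 0.8, 0.01, 0.01))
# Strong correlation in one condition only: C and S are equal, D is zero.
print(csd_scores(0.8, 0.0, 0.01, 0.01))
# Opposite sign between the conditions: only D is large.
print(csd_scores(0.8, -0.8, 0.01, 0.01))
```

Under these assumed definitions, the network links would be the gene pairs whose C-, S- or D-score ranks highest, as the Background describes.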
3
Burns JJR, Shealy BT, Greer MS, Hadish JA, McGowan MT, Biggs T, Smith MC, Feltus FA, Ficklin SP. Addressing noise in co-expression network construction. Brief Bioinform 2021; 23:6446269. [PMID: 34850822 PMCID: PMC8769892 DOI: 10.1093/bib/bbab495] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/08/2021] [Revised: 10/25/2021] [Accepted: 10/28/2021] [Indexed: 11/13/2022]
Abstract
Gene co-expression networks (GCNs) provide multiple benefits to molecular research including hypothesis generation and biomarker discovery. Transcriptome profiles serve as input for GCN construction and are derived from increasingly larger studies with samples across multiple experimental conditions, treatments, time points, genotypes, etc. Such experiments with larger numbers of variables confound discovery of true network edges, exclude edges and inhibit discovery of context (or condition) specific network edges. To demonstrate this problem, a 475-sample dataset is used to show that up to 97% of GCN edges can be misleading because correlations are false or incorrect. False and incorrect correlations can occur when tests are applied without ensuring assumptions are met, and pairwise gene expression may not meet test assumptions if the expression of at least one gene in the pairwise comparison is a function of multiple confounding variables. The ‘one-size-fits-all’ approach to GCN construction is therefore problematic for large, multivariable datasets. Recently, the Knowledge Independent Network Construction toolkit has been used in multiple studies to provide a dynamic approach to GCN construction that ensures statistical tests meet assumptions and confounding variables are addressed. Additionally, it can associate experimental context for each edge of the network resulting in context-specific GCNs (csGCNs). To help researchers recognize such challenges in GCN construction, and the creation of csGCNs, we provide a review of the workflow.
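The point that a confounding variable can produce a misleading correlation can be made concrete with a toy example. The sketch below is not the KINC workflow; it only shows, on made-up expression values, how pooling samples from two conditions yields a strong positive correlation between two genes that are negatively correlated within each condition. All names and numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression values for two genes under two conditions.
# Both genes are shifted upward by the treatment (the confounder).
gene_a = {"control": [1.0, 2.0, 3.0], "treated": [11.0, 12.0, 13.0]}
gene_b = {"control": [3.0, 2.0, 1.0], "treated": [13.0, 12.0, 11.0]}

# Correlation computed on all samples at once, ignoring the condition.
pooled = pearson(gene_a["control"] + gene_a["treated"],
                 gene_b["control"] + gene_b["treated"])
# Correlation computed separately within each condition.
within = {cond: pearson(gene_a[cond], gene_b[cond]) for cond in gene_a}

print(f"pooled: {pooled:.3f}")
for cond, r in within.items():
    print(f"{cond}: {r:.3f}")
```

A single pooled test would report an edge between the two genes with the wrong sign, which is the kind of error a condition-aware construction aims to avoid.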
Affiliation(s)
- Joshua J R Burns
  - Department of Horticulture, 149 Johnson Hall, Washington State University, Pullman, WA 99164, USA
- Benjamin T Shealy
  - Department of Electrical & Computer Engineering, 105 Riggs Hall, Clemson University, Clemson, SC 29631, USA
- Mitchell S Greer
  - School of Electrical Engineering and Computer Science, EME 102, Washington State University, Pullman, WA 99164, USA
- John A Hadish
  - Molecular Plant Sciences Program, French Ad 324g, Washington State University, Pullman, WA 99164, USA
- Matthew T McGowan
  - Molecular Plant Sciences Program, French Ad 324g, Washington State University, Pullman, WA 99164, USA
- Tyler Biggs
  - Department of Horticulture, 149 Johnson Hall, Washington State University, Pullman, WA 99164, USA
- Melissa C Smith
  - Department of Electrical & Computer Engineering, 105 Riggs Hall, Clemson University, Clemson, SC 29631, USA
- F Alex Feltus
  - Department of Genetics and Biochemistry, 130 McGinty Court, Clemson University, Clemson, SC 29634, USA
  - Biomedical Data Science & Informatics Program, 100 McAdams Hall, Clemson University, Clemson, SC 29634, USA
  - Clemson Center for Human Genetics, 114 Gregor Mendel Circle, Greenwood, SC 29646, USA
- Stephen P Ficklin
  - Department of Horticulture, 149 Johnson Hall, Washington State University, Pullman, WA 99164, USA
  - School of Electrical Engineering and Computer Science, EME 102, Washington State University, Pullman, WA 99164, USA
4
Supranutritional Maternal Organic Selenium Supplementation during Different Trimesters of Pregnancy Affects the Muscle Gene Transcriptome of Newborn Beef Calves in a Time-Dependent Manner. Genes (Basel) 2021; 12:genes12121884. [PMID: 34946830 PMCID: PMC8701265 DOI: 10.3390/genes12121884] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/22/2021] [Revised: 11/23/2021] [Accepted: 11/23/2021] [Indexed: 12/13/2022]
Abstract
Selenium (Se) is an essential micronutrient for growth and immune function in beef cattle. We previously showed that supranutritional maternal organic Se supplementation during late pregnancy improves immune function in newborn calves; however, the effects of maternal organic Se supplementation on fetal programming during different pregnancy stages have yet to be elucidated. Herein, we investigated the effects of supranutritional maternal organic Se supplementation in different pregnancy trimesters on the genome-wide transcriptome profiles of the resulting beef calves. Within 12 to 48 h of birth, whole blood and Longissimus dorsi (LD) muscle biopsies were collected from calves born to 40 crossbred Angus cows that received, except for the control group (CTR), Se-yeast boluses (105 mg of Se/wk) during the first (TR1), second (TR2), or third (TR3) trimester of gestation. Whole-blood Se concentrations of newborn calves increased from CTR, TR1, TR2 to TR3, whereas muscle Se concentrations of newborn calves were increased only in the TR3 group. We identified 3048 unique differentially expressed genes (DEGs) across all group comparisons (FDR ≤ 0.05 and |log2FC| ≥ 1.5). Furthermore, we predicted 237 unique transcription factors that putatively regulate the DEGs. Independent of supplementation trimester, supranutritional maternal organic Se supplementation downregulated genes involved in adaptive immunity in all trimesters. Dependent on supplementation trimester, genes involved in muscle development were upregulated by TR3 Se supplementation and downregulated by TR1 Se supplementation, and genes involved in collagen formation were downregulated by TR2 Se supplementation. Supranutritional maternal organic Se supplementation in the last trimester of pregnancy resulted in upregulation of myosin and actin filament associated genes, potentially allowing for optimal muscle function and contraction. Our findings suggest a beneficial effect of supranutritional maternal organic Se supplementation during late gestation on Se status and on muscle development and function of newborn calves.
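The DEG criterion quoted in the abstract (FDR ≤ 0.05 and |log2FC| ≥ 1.5) amounts to a two-condition filter. The sketch below applies it to made-up rows; the names `GeneResult` and `select_degs` and all values are hypothetical, and the FDR values are assumed to have been computed upstream by whatever multiple-testing procedure the study used.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fold_change: float
    fdr: float


def select_degs(results, fdr_cutoff=0.05, lfc_cutoff=1.5):
    """Keep genes with FDR <= fdr_cutoff and |log2FC| >= lfc_cutoff."""
    return [
        r for r in results
        if r.fdr <= fdr_cutoff and abs(r.log2_fold_change) >= lfc_cutoff
    ]


# Made-up rows for illustration only; they are not data from the study.
results = [
    GeneResult("geneA", 2.1, 0.001),   # passes both cutoffs (upregulated)
    GeneResult("geneB", -1.8, 0.030),  # passes both cutoffs (downregulated)
    GeneResult("geneC", 0.9, 0.001),   # significant, but effect too small
    GeneResult("geneD", 3.0, 0.200),   # large effect, but not significant
]

degs = select_degs(results)
print([r.gene for r in degs])
```

Taking the absolute value of the fold change keeps both upregulated and downregulated genes, which matches the abstract's reporting of DEGs in both directions.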
5
Sabatini S, Gastaldelli A. Disparity-filtered differential correlation network analysis: a case study on CRC metabolomics. J Integr Bioinform 2021; 18:jib-2021-0030. [PMID: 34792303 PMCID: PMC8709737 DOI: 10.1515/jib-2021-0030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/31/2021] [Accepted: 10/18/2021] [Indexed: 11/15/2022]
Abstract
Differential network analysis has become a widely used technique to investigate changes of interactions among different conditions. Although the relationship between observed interactions and biochemical mechanisms is hard to establish, differential network analysis can provide useful insights about dysregulated pathways and candidate biomarkers. The available methods to detect differential interactions are heterogeneous and often rely on assumptions that are unrealistic in many applications. To address these issues, we develop a novel method for differential network analysis, using the so-called disparity filter as network reduction technique. In addition, we propose a classification model based on the inferred network interactions. The main novelty of this work lies in its ability to preserve connections that are statistically significant with respect to a null model without favouring any resolution scale, as a hard threshold would do, and without Gaussian assumptions. The method was tested using a published metabolomic dataset on colorectal cancer (CRC). Detected hub metabolites were consistent with recent literature and the classifier was able to distinguish CRC from polyp and healthy subjects with great accuracy. In conclusion, the proposed method provides a new simple and effective framework for the identification of differential interaction patterns and improves the biological interpretation of metabolomics data.
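The disparity filter named in the abstract can be sketched in its standard form for weighted undirected networks. For a node with degree k and total weight (strength) s, an edge of weight w is compared against a null model in which the node's strength is split at random among its k edges, giving the p-value (1 - w/s)^(k-1). How the authors adapt this to differential correlation networks is not described in the abstract, so the code below is a generic illustration; the function name, the edge representation and the toy weights are assumptions.

```python
def disparity_filter(edges, alpha=0.05):
    """Return the undirected edges kept by the disparity filter.

    edges: dict mapping frozenset({u, v}) -> positive weight.
    For a node of degree k and strength s, an edge of weight w gets
    p = (1 - w / s) ** (k - 1); the edge is kept if p < alpha for at
    least one of its two endpoints. Degree-1 nodes never pass the test
    on their own (an assumption made here for simplicity).
    """
    degree, strength = {}, {}
    for pair, w in edges.items():
        for node in pair:
            degree[node] = degree.get(node, 0) + 1
            strength[node] = strength.get(node, 0.0) + w

    kept = set()
    for pair, w in edges.items():
        for node in pair:
            k = degree[node]
            if k > 1 and (1 - w / strength[node]) ** (k - 1) < alpha:
                kept.add(pair)
                break
    return kept


# Toy network: node "a" has one heavy edge and three light ones.
edges = {
    frozenset(("a", "b")): 10.0,
    frozenset(("a", "c")): 1.0,
    frozenset(("a", "d")): 1.0,
    frozenset(("a", "e")): 1.0,
}
print(disparity_filter(edges))
```

Because each edge is judged relative to the local strength of its endpoints, no single global weight threshold is imposed, which is the property the abstract highlights.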
Affiliation(s)
- Silvia Sabatini
  - Institute of Clinical Physiology, CNR-Pisa, Via Moruzzi 1, Pisa, Italy
  - University of Siena, Siena, Italy
6
Oliveira de Biagi CA, Nociti RP, Brotto DB, Funicheli BO, Cássia Ruy PD, Bianchi Ximenez JP, Alves Figueiredo DL, Araújo Silva W. CeTF: an R/Bioconductor package for transcription factor co-expression networks using regulatory impact factors (RIF) and partial correlation and information (PCIT) analysis. BMC Genomics 2021; 22:624. [PMID: 34416858 PMCID: PMC8379792 DOI: 10.1186/s12864-021-07918-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/05/2021] [Accepted: 07/30/2021] [Indexed: 01/27/2023]
Abstract
BACKGROUND Finding meaningful gene-gene interactions and the main transcription factors (TFs) in co-expression networks is one of the most important challenges in gene expression data mining. RESULTS Here, we developed the R package "CeTF", which integrates the Partial Correlation with Information Theory (PCIT) and Regulatory Impact Factors (RIF) algorithms applied to gene expression data from microarray, RNA-seq, or single-cell RNA-seq platforms. This approach allows identifying the transcription factors most likely to regulate a given network in different biological systems, for example, regulation of gene pathways in tumor stromal cells and tumor cells of the same tumor. This pipeline can be easily integrated into high-throughput analysis. To demonstrate the application of the CeTF package, we analyzed gastric cancer RNA-seq data obtained from TCGA (The Cancer Genome Atlas) and found the HOXB3 gene to be the second most relevant TF with a high regulatory impact (TFs-HRi), regulating gene pathways in the cell cycle. CONCLUSION This preliminary finding shows the potential of CeTF to list master regulators of gene networks. CeTF was designed as a user-friendly tool that provides many highly automated functions without requiring the user to perform many complicated processes. It is available on Bioconductor ( http://bioconductor.org/packages/CeTF ) and GitHub ( http://github.com/cbiagii/CeTF ).
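As a rough illustration of the partial-correlation idea behind PCIT, the sketch below computes the standard first-order partial correlation between two genes x and y given a third variable z. This is only one building block: the information-theoretic tolerance step of PCIT and the RIF metric are not shown, and the function name is hypothetical rather than part of the CeTF API.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical pairwise correlations among two genes (x, y) and a third gene z.
# Part of the x-y association is explained by z, so the value shrinks.
print(partial_correlation(0.5, 0.5, 0.5))
# z is unrelated to both genes, so the x-y correlation is unchanged.
print(partial_correlation(0.5, 0.0, 0.0))
# The x-y correlation is fully explained by z, so the result is close to zero.
print(partial_correlation(0.64, 0.8, 0.8))
```

An edge whose correlation collapses once a third gene is accounted for is a candidate indirect interaction, which is the kind of edge a PCIT-style analysis aims to discard.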
Affiliation(s)
- Carlos Alberto Oliveira de Biagi
  - Department of Genetics at Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
  - Center for Cell-Based Therapy (CEPID/FAPESP), National Institute of Science and Technology in Stem Cell and Cell Therapy (INCTC/CNPq), Regional Blood Center of Ribeirão Preto, Ribeirão Preto, Brazil
  - Institute for Cancer Research, IPEC, Guarapuava, Brazil
- Ricardo Perecin Nociti
  - Center for Cell-Based Therapy (CEPID/FAPESP), National Institute of Science and Technology in Stem Cell and Cell Therapy (INCTC/CNPq), Regional Blood Center of Ribeirão Preto, Ribeirão Preto, Brazil
  - Laboratory of Molecular Morphophysiology and Development, Department of Veterinary Medicine, Faculty of Animal Science and Food Engineering, University of São Paulo, Pirassununga, Brazil
- Danielle Barbosa Brotto
  - Department of Genetics at Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
  - Center for Cell-Based Therapy (CEPID/FAPESP), National Institute of Science and Technology in Stem Cell and Cell Therapy (INCTC/CNPq), Regional Blood Center of Ribeirão Preto, Ribeirão Preto, Brazil
- Breno Osvaldo Funicheli
  - Center for Cell-Based Therapy (CEPID/FAPESP), National Institute of Science and Technology in Stem Cell and Cell Therapy (INCTC/CNPq), Regional Blood Center of Ribeirão Preto, Ribeirão Preto, Brazil
- Patrícia de Cássia Ruy
  - Center for Cell-Based Therapy (CEPID/FAPESP), National Institute of Science and Technology in Stem Cell and Cell Therapy (INCTC/CNPq), Regional Blood Center of Ribeirão Preto, Ribeirão Preto, Brazil
  - Center for Medical Genomics, HCFMRP/USP, Ribeirão Preto, Brazil
- João Paulo Bianchi Ximenez
  - Center for Cell-Based Therapy (CEPID/FAPESP), National Institute of Science and Technology in Stem Cell and Cell Therapy (INCTC/CNPq), Regional Blood Center of Ribeirão Preto, Ribeirão Preto, Brazil
- David Livingstone Alves Figueiredo
  - Institute for Cancer Research, IPEC, Guarapuava, Brazil
  - Department of Medicine, Midwest State University of Paraná-UNICENTRO, Guarapuava, Brazil
- Wilson Araújo Silva
  - Department of Genetics at Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
  - Center for Cell-Based Therapy (CEPID/FAPESP), National Institute of Science and Technology in Stem Cell and Cell Therapy (INCTC/CNPq), Regional Blood Center of Ribeirão Preto, Ribeirão Preto, Brazil
  - Institute for Cancer Research, IPEC, Guarapuava, Brazil
  - Center for Integrative Systems Biology (CISBi) - NAP/USP, University of São Paulo, Ribeirão Preto, Brazil
7
Lau LY, Nguyen LT, Reverter A, Moore SS, Lynn A, McBride‐Kelly L, Phillips‐Rose L, Plath M, Macfarlane R, Vasudivan V, Morton L, Ardley R, Ye Y, Fortes MRS. Gene regulation could be attributed to TCF3 and other key transcription factors in the muscle of pubertal heifers. Vet Med Sci 2020; 6:695-710. [PMID: 32432381 PMCID: PMC7738712 DOI: 10.1002/vms3.278] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/26/2019] [Revised: 03/13/2020] [Accepted: 04/09/2020] [Indexed: 01/17/2023]
Abstract
Puberty is a whole-body event, driven by the hypothalamic integration of peripheral signals such as leptin or IGF-1. During puberty, reproductive development occurs simultaneously with growth, including muscle growth. To enhance our understanding of muscle function related to puberty, we performed transcriptome analyses of muscle samples from six pre- and six post-pubertal Brahman heifers (Bos indicus). Our aims were to perform differential expression analyses and co-expression analyses to derive a regulatory gene network associated with puberty. As a result, we identified 431 differentially expressed (DEx) transcripts (genes and non-coding RNAs) when comparing pre- to post-pubertal average gene expression. The DEx transcripts were compared with all expressed transcripts in our samples (over 14,000 transcripts) for functional enrichment analyses. The DEx transcripts were associated with "extracellular region," "inflammatory response" and "hormone activity" (adjusted p < .05). Inflammatory response for muscle regeneration is a necessary aspect of muscle growth, which is accelerated during puberty. The term "hormone activity" may signal genes that respond to progesterone signalling in the muscle, as the presence of this hormone is an important difference between pre- and post-pubertal heifers in our experimental design. The DEx transcript with the highest average expression difference was a mitochondrial gene, ENSBTAG00000043574, which might be another important link between energy metabolism and puberty. In the derived co-expression gene network, we identified six hub genes: CDC5L, MYC, TCF3, RUNX2, ATF2 and CREB1. In the same network, 48 key regulators of DEx transcripts were identified, using a regulatory impact factor metric. The hub gene TCF3 was also a key regulator. The majority of the key regulators (22 genes) are members of the zinc finger family, which has been implicated in bovine puberty in other tissues. In conclusion, we described how puberty may affect muscle gene expression in cattle.
Affiliation(s)
- Li Yieng Lau
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Loan T. Nguyen
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Antonio Reverter
  - CSIRO Agriculture and Food, Queensland Biosciences Precinct, Brisbane, QLD, Australia
- Stephen S. Moore
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Aaron Lynn
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Liam McBride‐Kelly
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Louis Phillips‐Rose
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Mackenzie Plath
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Rhys Macfarlane
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Vanisha Vasudivan
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Lachlan Morton
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Ryan Ardley
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Yunan Ye
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
- Marina R. S. Fortes
  - School of Chemistry and Molecular Biology, The University of Queensland, Brisbane, QLD, Australia
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
8
Chowdhury HA, Bhattacharyya DK, Kalita JK. (Differential) Co-Expression Analysis of Gene Expression: A Survey of Best Practices. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1154-1173. [PMID: 30668502 DOI: 10.1109/tcbb.2019.2893170] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/09/2023]
Abstract
Analysis of gene expression data is widely used in transcriptomic studies to understand functions of molecules inside a cell and interactions among molecules. Differential co-expression analysis studies diseases and phenotypic variations by finding modules of genes whose co-expression patterns vary across conditions. We review the best practices in gene expression data analysis in terms of analysis of (differential) co-expression, co-expression network, differential networking, and differential connectivity, considering both microarray and RNA-seq data along with comparisons. We highlight hurdles in RNA-seq data analysis using methods developed for microarrays. We include a discussion of necessary tools for gene expression analysis throughout the paper. In addition, we shed light on scRNA-seq data analysis by including preprocessing and scRNA-seq in co-expression analysis along with useful tools specific to scRNA-seq. To support biological insight, biological interpretation and functional profiling are included. Finally, we provide guidelines for the analyst, along with research issues and challenges which should be addressed.
9
Class CA, Ha MJ, Baladandayuthapani V, Do KA. iDINGO-integrative differential network analysis in genomics with Shiny application. Bioinformatics 2018; 34:1243-1245. [PMID: 29194470 PMCID: PMC6030922 DOI: 10.1093/bioinformatics/btx750] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/31/2017] [Accepted: 11/28/2017] [Indexed: 11/27/2022]
Abstract
Motivation Differential network analysis is an important way to understand network rewiring involved in disease progression and development. Building differential networks from multiple ‘omics data provides insight into the holistic differences of the interactive system under different patient-specific groups. DINGO was developed to infer group-specific dependencies and build differential networks. However, DINGO and other existing tools are limited to analyzing data arising from a single platform, and modeling each of the multiple ‘omics data independently does not account for the hierarchical structure of the data. Results We developed the iDINGO R package to estimate group-specific dependencies and make inferences on the integrative differential networks, considering the biological hierarchy among the platforms. A Shiny application has also been developed to facilitate easier analysis and visualization of results, including integrative differential networks and hub gene identification across platforms. Availability and implementation The R package is available on CRAN (https://cran.r-project.org/web/packages/iDINGO) and the Shiny application at https://github.com/MinJinHa/iDINGO. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Caleb A Class
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Min Jin Ha
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kim-Anh Do
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
10
Ji J, He D, Feng Y, He Y, Xue F, Xie L. JDINAC: joint density-based non-parametric differential interaction network analysis and classification using high-dimensional sparse omics data. Bioinformatics 2018; 33:3080-3087. [PMID: 28582486 DOI: 10.1093/bioinformatics/btx360] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 09/14/2016] [Accepted: 06/01/2017] [Indexed: 12/26/2022]
Abstract
Motivation A complex disease is usually driven by a number of genes interwoven into networks, rather than a single gene product. Network comparison or differential network analysis has become an important means of revealing the underlying mechanism of pathogenesis and identifying clinical biomarkers for disease classification. Most studies, however, are limited to network correlations that mainly capture the linear relationship among genes, or rely on the assumption of a parametric probability distribution of gene measurements. Such assumptions are restrictive in real applications. Results We propose a new Joint density based non-parametric Differential Interaction Network Analysis and Classification (JDINAC) method to identify differential interaction patterns of network activation between two groups. At the same time, JDINAC uses the network biomarkers to build a classification model. The novelty of JDINAC lies in its potential to capture non-linear relations between molecular interactions using high-dimensional sparse data as well as to adjust for confounding factors, without needing to assume a parametric probability distribution of gene measurements. Simulation studies demonstrate that JDINAC provides more accurate differential network estimation and lower classification error than other state-of-the-art methods. We apply JDINAC to a Breast Invasive Carcinoma dataset, which includes 114 patients who have both tumor and matched normal samples. The hub genes and differential interaction patterns identified were consistent with existing experimental studies. Furthermore, JDINAC discriminated the tumor and normal samples with high accuracy by virtue of the identified biomarkers. JDINAC provides a general framework for feature selection and classification using high-dimensional sparse omics data. Availability and implementation R scripts available at https://github.com/jijiadong/JDINAC. Contact lxie@iscb.org. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiadong Ji
  - Department of Mathematical Statistics, School of Statistics, Shandong University of Finance and Economics, Jinan 250014, China
- Di He
  - Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, NY 10016, USA
- Yang Feng
  - Department of Statistics, Columbia University, New York, NY 10027, USA
- Yong He
  - Department of Mathematical Statistics, School of Statistics, Shandong University of Finance and Economics, Jinan 250014, China
- Fuzhong Xue
  - Department of Biostatistics, School of Public Health, Shandong University, Jinan 250012, China
- Lei Xie
  - Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, NY 10016, USA
  - Department of Computer Science, Hunter College, The City University of New York, NY 10065, USA
11
Li C, Liu L, Dinu V. Pathways of topological rank analysis (PoTRA): a novel method to detect pathways involved in hepatocellular carcinoma. PeerJ 2018; 6:e4571. [PMID: 29666752 PMCID: PMC5896492 DOI: 10.7717/peerj.4571] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/03/2017] [Accepted: 03/14/2018] [Indexed: 01/01/2023]
Abstract
Complex diseases such as cancer usually result from a combination of environmental factors and one or several biological pathways, each consisting of a set of genes. Each biological pathway exerts its function by transmitting signals through the gene network. In theory, a pathway has a robust topological structure under normal physiological conditions, but that structure can be altered under pathological conditions. It is well known that a normal biological network includes a small number of well-connected hub nodes and a large number of non-hub nodes. In addition, loss of connectivity has been reported as a common topological trait of cancer networks, and our method rests on this assumption. Hence, in the transition from normal to cancer, the loss of connectivity may disrupt the structure of the network: the number of hub genes in cancer may differ from that in normal tissue, or the distribution of the topological ranks of genes may be altered. Based on this, we propose a new PageRank-based method called Pathways of Topological Rank Analysis (PoTRA) to detect pathways involved in cancer. We use PageRank to measure the relative topological ranks of genes in each biological pathway, select hub genes for each pathway, and use Fisher's exact test to test whether the number of hub genes in each pathway is altered from normal to cancer. Alternatively, if the distribution of topological ranks of genes in a pathway is altered between normal and cancer, the pathway might also be involved in cancer. Hence, we use the Kolmogorov-Smirnov test to detect pathways that have an altered distribution of topological ranks of genes between two phenotypes. We apply PoTRA to study hepatocellular carcinoma (HCC) and several subtypes of HCC. We find that all significant pathways in HCC are associated with cancer generally, while several significant pathways in HCC subtypes are associated specifically with those subtypes. In conclusion, PoTRA is a new approach to explore and discover pathways involved in cancer. PoTRA can complement existing methods to broaden our understanding of the biological mechanisms behind cancer at the system level.
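The steps named in this abstract (rank genes with PageRank, pick hub genes, compare hub counts with Fisher's exact test) can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the toy networks, the `factor=1.5` hub cutoff, and every function name are assumptions introduced here, and the Kolmogorov-Smirnov step is left out.

```python
from math import comb

def pagerank(edges, nodes, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected graph given as an edge list."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Rank held by isolated nodes is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not neighbors[n])
        new = {}
        for n in nodes:
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new[n] = (1 - damping) / len(nodes) + damping * (
                incoming + dangling / len(nodes)
            )
        rank = new
    return rank

def count_hubs(rank, factor=1.5):
    """Hypothetical hub rule: PageRank above `factor` times the pathway mean."""
    mean = sum(rank.values()) / len(rank)
    return sum(1 for r in rank.values() if r > factor * mean)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Toy pathway (hypothetical genes A-F) in two phenotypes.
nodes = list("ABCDEF")
normal_edges = [("A", g) for g in "BCDEF"]            # star: A is a clear hub
cancer_edges = [("A", "B"), ("C", "D"), ("E", "F")]   # connectivity is lost

hubs_normal = count_hubs(pagerank(normal_edges, nodes))
hubs_cancer = count_hubs(pagerank(cancer_edges, nodes))
p_value = fisher_exact_two_sided(
    hubs_normal, len(nodes) - hubs_normal,
    hubs_cancer, len(nodes) - hubs_cancer,
)
print(hubs_normal, hubs_cancer, p_value)
```

With only six genes the test cannot reach significance; the example is meant to show the data flow from networks to hub counts to a contingency table, not a realistic effect size.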
Affiliation(s)
- Chaoxing Li
  - School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Li Liu
  - Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, United States of America
- Valentin Dinu
  - Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ, United States of America
|
12
|
Vassy Z, Kósa I, Vassányi I. Correlation Clustering of Stable Angina Clinical Care Patterns for 506 Thousand Patients. JOURNAL OF HEALTHCARE ENGINEERING 2017; 2017:6937194. [PMID: 29348908 PMCID: PMC5734000 DOI: 10.1155/2017/6937194] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/15/2017] [Accepted: 10/17/2017] [Indexed: 11/17/2022]
Abstract
Objectives Our goal was to apply statistical and network science techniques to show how the clinical pathways of patients can be used to characterize the practices of care providers. Methods We included the data of 506,087 patients who underwent procedures related to ischemic heart disease. Patients were assigned to one of 136 primary health-care centers using a voting scheme based on their residence. The clinical pathways were classified, and the spectrum of pathway types was computed for each center. A network was then built with the centers as nodes and spectrum correlations as edge weights, and Louvain clustering was used to group centers with similar pathway spectra. Results We identified 3 clusters with distinct characteristics that occupy compact spatial areas, even though no geographical information was used in clustering. Network analysis and hierarchical clustering show the dominance of medical university clinics in each cluster. Conclusion Though clinical guidelines provide uniform regulation for medical decisions, doctors have great freedom in daily clinical practice. This freedom leads to regional preferences for certain clinical pathways, shaped by intercenter professional links and geographical locality, and is coupled with quantifiable consequences in terms of care costs and the periprocedural risk of patients.
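The network construction described in the Methods (a pathway spectrum per center, then spectrum correlations as edge weights) can be sketched as follows. This is a minimal illustration under stated assumptions: the center names, the three pathway types, and the patient labels are invented for the example, Pearson correlation is assumed as the correlation measure, and the Louvain clustering step is not implemented here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def pathway_spectrum(pathway_labels, pathway_types):
    """Fraction of a center's patients that followed each pathway type."""
    total = len(pathway_labels)
    return [pathway_labels.count(t) / total for t in pathway_types]

def correlation_network(spectra):
    """Weighted edges between all center pairs: weight = spectrum correlation."""
    centers = sorted(spectra)
    return {
        (a, b): pearson(spectra[a], spectra[b])
        for i, a in enumerate(centers)
        for b in centers[i + 1:]
    }

# Hypothetical pathway types and per-patient pathway labels for three centers.
types = ["drug", "stent", "bypass"]
patients = {
    "center_1": ["drug", "drug", "stent", "bypass"],
    "center_2": ["drug", "drug", "drug", "stent", "bypass"],
    "center_3": ["drug", "stent", "bypass", "bypass", "bypass"],
}
spectra = {c: pathway_spectrum(labels, types) for c, labels in patients.items()}
network = correlation_network(spectra)
for pair, weight in sorted(network.items()):
    print(pair, round(weight, 3))
```

A community detection method such as Louvain would then take `network` as its weighted edge list; negative weights would need a handling rule, which the abstract does not specify.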
Affiliation(s)
- Zsolt Vassy
  - Medical Informatics Research and Development Centre, University of Pannonia, Veszprém, Egyetem u. 10 8200, Hungary
- István Kósa
  - Medical Informatics Research and Development Centre, University of Pannonia, Veszprém, Egyetem u. 10 8200, Hungary
  - Department of Medical Rehabilitation and Physical Medicine, University of Szeged, Szeged, Korányi fasor 8-10 6720, Hungary
- István Vassányi
  - Medical Informatics Research and Development Centre, University of Pannonia, Veszprém, Egyetem u. 10 8200, Hungary
|
13
|
Meng S, Liu G, Su L, Sun L, Wu D, Wang L, Zheng Z. Functional clusters analysis and research based on differential coexpression networks. BIOTECHNOL BIOTEC EQ 2017. [DOI: 10.1080/13102818.2017.1358669] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/18/2022] Open
Affiliation(s)
- Shuai Meng
  - College of Computer Science and Technology, Jilin University, Changchun, PR China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, PR China
- Guixia Liu
  - College of Computer Science and Technology, Jilin University, Changchun, PR China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, PR China
- Lingtao Su
  - College of Computer Science and Technology, Jilin University, Changchun, PR China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, PR China
- Liyan Sun
  - College of Computer Science and Technology, Jilin University, Changchun, PR China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, PR China
- Di Wu
  - College of Computer Science and Technology, Jilin University, Changchun, PR China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, PR China
- Lingwei Wang
  - College of Computer Science and Technology, Jilin University, Changchun, PR China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, PR China
- Zhao Zheng
  - College of Computer Science and Technology, Jilin University, Changchun, PR China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, PR China
|
14
|
Voigt A, Nowick K, Almaas E. A composite network of conserved and tissue specific gene interactions reveals possible genetic interactions in glioma. PLoS Comput Biol 2017; 13:e1005739. [PMID: 28957313 PMCID: PMC5634634 DOI: 10.1371/journal.pcbi.1005739] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/22/2016] [Revised: 10/10/2017] [Accepted: 08/24/2017] [Indexed: 02/08/2023] Open
Abstract
Differential co-expression network analyses have recently become an important step in the investigation of cellular differentiation and dysfunctional gene-regulation in cell and tissue disease-states. The resulting networks have been analyzed to identify and understand pathways associated with disorders, or to infer molecular interactions. However, existing methods for differential co-expression network analysis are unable to distinguish between various forms of differential co-expression. To close this gap, here we define the three different kinds (conserved, specific, and differentiated) of differential co-expression and present a systematic framework, CSD, for differential co-expression network analysis that incorporates these interactions on an equal footing. In addition, our method includes a subsampling strategy to estimate the variance of co-expressions. Our framework is applicable to a wide variety of cases, such as the study of differential co-expression networks between healthy and disease states, before and after treatments, or between species. Applying the CSD approach to a published gene-expression data set of cerebral cortex and basal ganglia samples from healthy individuals, we find that the resulting CSD network is enriched in genes associated with cognitive function, signaling pathways involving compounds with well-known roles in the central nervous system, as well as certain neurological diseases. From the CSD analysis, we identify a set of prominent hubs of differential co-expression, whose neighborhood contains a substantial number of genes associated with glioblastoma. The resulting gene-sets identified by our CSD analysis also contain many genes that so far have not been recognized as having a role in glioblastoma, but are good candidates for further studies. CSD may thus aid in hypothesis-generation for functional disease-associations.
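To make the three kinds of differential co-expression concrete, the sketch below scores one gene pair from its correlation in each condition and a variance estimate for each correlation. The abstract does not give the score formulas, so the expressions used here are an assumption chosen to match the verbal description (same sign and strength in both conditions, strong in one condition only, strong in both with opposite sign); the function name and the example numbers are also hypothetical. The variances are taken as inputs and are assumed to come from a subsampling procedure like the one the abstract mentions.

```python
from math import sqrt

def csd_scores(rho1, var1, rho2, var2):
    """Return (conserved, specific, differentiated) scores for one gene pair.

    rho1, rho2: co-expression (correlation) of the pair in conditions 1 and 2.
    var1, var2: variance estimates of those correlations (assumed inputs).
    The formulas are illustrative assumptions, not taken from the abstract.
    """
    scale = sqrt(var1 + var2)
    conserved = abs(rho1 + rho2) / scale
    specific = abs(abs(rho1) - abs(rho2)) / scale
    differentiated = abs(abs(rho1) + abs(rho2) - abs(rho1 + rho2)) / scale
    return conserved, specific, differentiated

# Hypothetical gene pairs with the same variance estimate in both conditions.
same_sign = csd_scores(0.8, 0.01, 0.8, 0.01)       # co-expressed in both
one_condition = csd_scores(0.8, 0.01, 0.0, 0.01)   # co-expressed in one only
sign_flip = csd_scores(0.8, 0.01, -0.8, 0.01)      # opposite sign in the two

for name, scores in [("same sign", same_sign),
                     ("one condition", one_condition),
                     ("sign flip", sign_flip)]:
    print(name, [round(s, 2) for s in scores])
```

Dividing by the pooled standard deviation puts the three scores on a comparable scale, so that a noisy correlation estimate contributes less than a stable one.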
Affiliation(s)
- André Voigt
  - Network Systems Biology Group, Department of Biotechnology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Katja Nowick
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  - Bioinformatics, Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
  - Human Biology, Institute for Biology, Free University Berlin, Berlin, Germany
- Eivind Almaas
  - Network Systems Biology Group, Department of Biotechnology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and General Practice, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
|
15
|
Will T, Helms V. Rewiring of the inferred protein interactome during blood development studied with the tool PPICompare. BMC SYSTEMS BIOLOGY 2017; 11:44. [PMID: 28376810 PMCID: PMC5379774 DOI: 10.1186/s12918-017-0400-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/05/2016] [Accepted: 01/26/2017] [Indexed: 12/24/2022]
Abstract
BACKGROUND Differential analysis of cellular conditions is a key approach to understanding the consequences and driving causes of biological processes such as developmental transitions or diseases. Progress in whole-genome expression profiling has made it convenient to capture the state of a cell's transcriptome and to detect the characteristic features that distinguish cells in specific conditions. In contrast, mapping the physical protein interactome for many samples is currently experimentally infeasible. For understanding the whole system, however, it is equally important to know how the interactions of proteins are rewired between cellular states. To overcome this limitation, we recently showed how condition-specific protein interaction networks that even consider alternative splicing can be inferred from transcript expression data. Here, we present the differential network analysis tool PPICompare, which was specifically designed for isoform-sensitive protein interaction networks. RESULTS Besides detecting significant rewiring events between the interactomes of grouped samples, PPICompare infers which alterations to the transcriptome caused each rewiring event and the minimal set of alterations necessary to explain all between-group changes. When applied to the development of blood cells, we verified that the tool reported a reasonable number of rewiring events and found that differential gene expression was the major determinant of cellular adjustments to the interactome. Alternative splicing events were consistently necessary in each developmental step to explain all significant alterations and were especially important for rewiring in the context of transcriptional control. CONCLUSIONS Applying PPICompare enabled us to investigate the dynamics of the human protein interactome during developmental transitions. A platform-independent implementation of the tool PPICompare is available at https://sourceforge.net/projects/ppicompare/.
Affiliation(s)
- Thorsten Will
  - Center for Bioinformatics, Saarland University, Campus E2.1, Saarbrücken, 66123 Germany
  - Graduate School of Computer Science, Saarland University, Campus E1.3, Saarbrücken, 66123 Germany
- Volkhard Helms
  - Center for Bioinformatics, Saarland University, Campus E2.1, Saarbrücken, 66123 Germany
|
16
|
Zuo Y, Cui Y, Di Poto C, Varghese RS, Yu G, Li R, Ressom HW. INDEED: Integrated differential expression and differential network analysis of omic data for biomarker discovery. Methods 2016; 111:12-20. [PMID: 27592383 DOI: 10.1016/j.ymeth.2016.08.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 05/02/2016] [Revised: 08/25/2016] [Accepted: 08/30/2016] [Indexed: 01/03/2023] Open
Abstract
Differential expression (DE) analysis is commonly used to identify biomarker candidates that have significant changes in their expression levels between distinct biological groups. One drawback of DE analysis is that it only considers changes at the level of single biomolecules. Recently, differential network (DN) analysis has become popular because it can measure changes at the level of biomolecular pairs. In DN analysis, the network is typically built based on correlation, and biomarker candidates are selected by investigating the network topology. However, correlation tends to generate over-complicated networks, and selecting biomarker candidates purely on the basis of network topology ignores changes at the single-biomolecule level. In this paper, we propose a novel approach, INDEED, that builds a sparse differential network based on partial correlation and integrates DE and DN analyses for biomarker discovery. We applied this approach to real proteomic and glycomic data generated by liquid chromatography coupled with mass spectrometry in a hepatocellular carcinoma (HCC) biomarker discovery study. For each omic data type, we used one dataset to select biomarker candidates and build a disease classifier, and evaluated the performance of the classifier on an independent dataset. The biomarker candidates selected by INDEED were more reproducible across independent datasets and led to higher classification accuracy in distinguishing HCC cases from cirrhotic controls than those selected by separate DE and DN analyses. INDEED also identified some candidates previously reported to be relevant to HCC, such as intercellular adhesion molecule 2 (ICAM2) and C4b-binding protein alpha chain (C4BPA), which were missed by both DE and DN analyses. In addition, we applied INDEED to survival time prediction based on transcriptomic data from samples of breast cancer patients. We selected biomarker candidates and built a regression model for survival time prediction based on a gene expression dataset and patients' survival records, and evaluated the performance of the regression model on an independent dataset. Compared with the biomarker candidates selected by DE and DN analyses, those selected through INDEED led to more accurate survival time prediction.
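The claim that plain correlation produces over-complicated networks while partial correlation yields sparser ones can be illustrated with three variables. In the sketch below, two hypothetical biomolecules `x` and `y` both track a third one, `z`; their plain correlation is high, but once `z` is controlled for, the association largely disappears. The data and function names are invented for illustration, and the first-order partial correlation formula used here is the standard textbook one, not necessarily how INDEED estimates its network.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical abundances: x and y are each "z plus independent noise".
z = [1, 2, 3, 4, 5, 6, 7, 8]
noise_x = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
noise_y = [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5]
x = [a + e for a, e in zip(z, noise_x)]
y = [a + e for a, e in zip(z, noise_y)]

plain = pearson(x, y)
partial = partial_corr(x, y, z)
print(round(plain, 3), round(partial, 3))
```

A correlation network would draw an x-y edge here, whereas a partial-correlation network would keep only the x-z and y-z edges, which is the sense in which the latter is sparser.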
Affiliation(s)
- Yiming Zuo
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA; Department of Radiation Oncology, Stanford University, Palo Alto, CA 94304, USA; Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20007, USA.
- Yi Cui
  - Department of Radiation Oncology, Stanford University, Palo Alto, CA 94304, USA.
- Cristina Di Poto
  - Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20007, USA.
- Rency S Varghese
  - Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20007, USA.
- Guoqiang Yu
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
- Ruijiang Li
  - Department of Radiation Oncology, Stanford University, Palo Alto, CA 94304, USA.
- Habtom W Ressom
  - Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20007, USA.
|
17
|
Ji J, Yuan Z, Zhang X, Xue F. A powerful score-based statistical test for group difference in weighted biological networks. BMC Bioinformatics 2016; 17:86. [PMID: 26867929 PMCID: PMC4751708 DOI: 10.1186/s12859-016-0916-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/11/2015] [Accepted: 01/29/2016] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Complex disease is largely determined by a number of biomolecules interwoven into networks, rather than by a single biomolecule. A key but inadequately addressed issue is how to test for possible differences in networks between two groups. Group-level comparison of network properties may shed light on underlying disease mechanisms and benefit the design of drug targets for complex diseases. We therefore proposed a powerful score-based statistic to detect group differences in weighted networks, which simultaneously captures vertex changes and edge changes. RESULTS Simulation studies indicated that the proposed network difference measure (NetDifM) was stable and outperformed other existing methods under various sample sizes and network topologies. An application to real GWAS data on leprosy successfully identified the specific gene interaction network contributing to leprosy. For additional gene expression data on ovarian cancer, two candidate subnetworks, the PI3K-AKT and Notch signaling pathways, were considered and identified respectively. CONCLUSIONS The proposed method, which accounts for vertex changes and edge changes simultaneously, is valid and powerful for capturing group differences in biological networks.
Affiliation(s)
- Jiadong Ji
  - Department of Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, Shandong, China.
- Zhongshang Yuan
  - Department of Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, Shandong, China.
- Xiaoshuai Zhang
  - Department of Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, Shandong, China.
- Fuzhong Xue
  - Department of Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, Shandong, China.
|
18
|
Functional Analysis and Characterization of Differential Coexpression Networks. Sci Rep 2015; 5:13295. [PMID: 26282208 PMCID: PMC4539605 DOI: 10.1038/srep13295] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 02/27/2015] [Accepted: 07/27/2015] [Indexed: 01/10/2023] Open
Abstract
Differential coexpression analysis is emerging as a complement to conventional differential gene expression analysis. The identified differential coexpression links can be assembled into a differential coexpression network (DCEN) in response to environmental stresses or genetic changes. Differential coexpression analyses have been successfully used to identify condition-specific modules; however, the structural properties and biological significance of general DCENs have not been well investigated. Here, we analyzed two independent Saccharomyces cerevisiae DCENs constructed from large-scale time-course gene expression profiles in response to different situations. Topological analyses show that DCENs are tree-like networks possessing scale-free characteristics, but not small-world. Functional analyses indicate that differentially coexpressed gene pairs in DCEN tend to link different biological processes, achieving complementary or synergistic effects. Furthermore, the gene pairs lacking common transcription factors are sensitive to perturbation and hence lead to differential coexpression. Based on these observations, we integrated transcriptional regulatory information into DCEN and identified transcription factors that might cause differential coexpression by gain or loss of activation in response to different situations. Collectively, our results not only uncover the unique structural characteristics of DCEN but also provide new insights into interpretation of DCEN to reveal its biological significance and infer the underlying gene regulatory dynamics.
|
19
|
Module Based Differential Coexpression Analysis Method for Type 2 Diabetes. BIOMED RESEARCH INTERNATIONAL 2015; 2015:836929. [PMID: 26339648 PMCID: PMC4538423 DOI: 10.1155/2015/836929] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/04/2014] [Accepted: 12/29/2014] [Indexed: 11/24/2022]
Abstract
A growing number of studies have shown that many complex diseases result from the joint alteration of numerous genes. Genes often act together as a functional biological pathway or network and are highly correlated. Differential coexpression analysis, a more comprehensive technique than differential expression analysis, was introduced to study gene regulatory networks and the biological pathways underlying phenotypic changes by measuring changes in gene correlation between disease and normal conditions. In this paper, we propose a gene differential coexpression analysis algorithm at the level of gene sets and apply it to a publicly available type 2 diabetes (T2D) expression dataset. First, we calculate biweight midcorrelation coefficients between all gene pairs. Then, we select informative correlation pairs using the “differential coexpression threshold” strategy. Finally, we identify differential coexpression gene modules using the maximum clique concept and the k-clique algorithm. We apply the proposed method to simulated data and T2D data. Two differential coexpression gene modules related to T2D were detected, which should be useful for exploring the biological function of the related genes.
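The last two steps of the pipeline (keep gene pairs whose correlation changes between conditions, then look for cliques among them) can be sketched as below. Several things here are assumptions made for illustration: the thresholding rule is a plain absolute difference in correlation rather than the paper's "differential coexpression threshold" strategy, the clique search is a basic Bron-Kerbosch enumeration of maximal cliques rather than the k-clique algorithm, and the gene names and correlation values are invented.

```python
def differential_edges(corr_disease, corr_normal, threshold):
    """Gene pairs whose correlation differs by at least `threshold`."""
    return {
        pair
        for pair in corr_disease
        if abs(corr_disease[pair] - corr_normal[pair]) >= threshold
    }

def maximal_cliques(edges):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))  # nothing can extend this clique
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adjacency[v], excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return sorted(cliques)

# Hypothetical pairwise correlations in disease and normal samples.
corr_disease = {("g1", "g2"): 0.90, ("g1", "g3"): 0.80, ("g2", "g3"): 0.85,
                ("g3", "g4"): -0.70, ("g4", "g5"): 0.60}
corr_normal = {("g1", "g2"): 0.10, ("g1", "g3"): 0.00, ("g2", "g3"): 0.20,
               ("g3", "g4"): 0.10, ("g4", "g5"): 0.55}

edges = differential_edges(corr_disease, corr_normal, threshold=0.5)
modules = maximal_cliques(edges)
print(modules)
```

Each maximal clique is a candidate module: a set of genes in which every pair changed its coexpression between the two conditions.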
|
20
|
Ha MJ, Baladandayuthapani V, Do KA. DINGO: differential network analysis in genomics. Bioinformatics 2015; 31:3413-20. [PMID: 26148744 DOI: 10.1093/bioinformatics/btv406] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Received: 07/21/2014] [Accepted: 06/26/2015] [Indexed: 12/22/2022] Open
Abstract
MOTIVATION Cancer progression and development are initiated by aberrations in various molecular networks through coordinated changes across multiple genes and pathways. It is important to understand how these networks change under different stress conditions and/or patient-specific groups to infer differential patterns of activation and inhibition. Existing methods are limited to correlation networks that are independently estimated from separate group-specific data and without due consideration of relationships that are conserved across multiple groups. METHOD We propose a pathway-based differential network analysis in genomics (DINGO) model for estimating group-specific networks and making inference on the differential networks. DINGO jointly estimates the group-specific conditional dependencies by decomposing them into global and group-specific components. The delineation of these components allows for a more refined picture of the major driver and passenger events in the elucidation of cancer progression and development. RESULTS Simulation studies demonstrate that DINGO provides more accurate group-specific conditional dependencies than achieved by using separate estimation approaches. We apply DINGO to key signaling pathways in glioblastoma to build differential networks for long-term survivors and short-term survivors in The Cancer Genome Atlas. The hub genes found by mRNA expression, DNA copy number, methylation and microRNA expression reveal several important roles in glioblastoma progression. AVAILABILITY AND IMPLEMENTATION R Package at: odin.mdacc.tmc.edu/~vbaladan. CONTACT veera@mdanderson.org SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Min Jin Ha
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kim-Anh Do
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
|
21
|
Guo B, Greenwood PL, Cafe LM, Zhou G, Zhang W, Dalrymple BP. Transcriptome analysis of cattle muscle identifies potential markers for skeletal muscle growth rate and major cell types. BMC Genomics 2015; 16:177. [PMID: 25887672 PMCID: PMC4364331 DOI: 10.1186/s12864-015-1403-x] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 10/17/2014] [Accepted: 02/24/2015] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND This study aimed to identify markers for muscle growth rate and the different cellular contributors to cattle muscle and to link the muscle growth rate markers to specific cell types. RESULTS The expression of two groups of genes in the longissimus muscle (LM) of 48 Brahman steers of similar age, significantly enriched for "cell cycle" and "ECM (extracellular matrix) organization" Gene Ontology (GO) terms was correlated with average daily gain/kg liveweight (ADG/kg) of the animals. However, expression of the same genes was only partly related to growth rate across a time course of postnatal LM development in two cattle genotypes, Piedmontese x Hereford (high muscling) and Wagyu x Hereford (high marbling). The deposition of intramuscular fat (IMF) altered the relationship between the expression of these genes and growth rate. K-means clustering across the development time course with a large set of genes (5,596) with similar expression profiles to the ECM genes was undertaken. The locations in the clusters of published markers of different cell types in muscle were identified and used to link clusters of genes to the cell type most likely to be expressing them. Overall correspondence between published cell type expression of markers and predicted major cell types of expression in cattle LM was high. However, some exceptions were identified: expression of SOX8 previously attributed to muscle satellite cells was correlated with angiogenesis. Analysis of the clusters and cell types suggested that the "cell cycle" and "ECM" signals were from the fibro/adipogenic lineage. Significant contributions to these signals from the muscle satellite cells, angiogenic cells and adipocytes themselves were not as strongly supported. Based on the clusters and cell type markers, sets of five genes predicted to be representative of fibro/adipogenic precursors (FAPs) and endothelial cells, and/or ECM remodelling and angiogenesis were identified. 
CONCLUSIONS Gene sets and gene markers for the analysis of many of the major processes/cell populations contributing to muscle composition and growth have been proposed, enabling a consistent interpretation of gene expression datasets from cattle LM. The same gene sets are likely to be applicable in other cattle muscles and in other species.
Affiliation(s)
- Bing Guo
  - Key Laboratory of Meat Processing and Quality Control, Synergetic Innovation Centre of Food Safety and Nutrition, College of Food Science and Technology, Nanjing Agriculture University, Nanjing, 210095, P. R. China.
  - CSIRO Agriculture Flagship, St. Lucia, QLD, 4067, Australia.
- Paul L Greenwood
  - CSIRO Agriculture Flagship, Armidale, NSW, 2350, Australia.
  - NSW Department of Primary Industries, University of New England, Armidale, NSW, 2351, Australia.
- Linda M Cafe
  - NSW Department of Primary Industries, University of New England, Armidale, NSW, 2351, Australia.
- Guanghong Zhou
  - Key Laboratory of Meat Processing and Quality Control, Synergetic Innovation Centre of Food Safety and Nutrition, College of Food Science and Technology, Nanjing Agriculture University, Nanjing, 210095, P. R. China.
- Wangang Zhang
  - Key Laboratory of Meat Processing and Quality Control, Synergetic Innovation Centre of Food Safety and Nutrition, College of Food Science and Technology, Nanjing Agriculture University, Nanjing, 210095, P. R. China.
|
22
|
Gene differential coexpression analysis based on biweight correlation and maximum clique. BMC Bioinformatics 2014; 15 Suppl 15:S3. [PMID: 25474074 PMCID: PMC4271563 DOI: 10.1186/1471-2105-15-s15-s3] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 01/08/2023] Open
Abstract
Differential coexpression analysis usually requires the definition of 'distance' or 'similarity' between measured datasets. Until now, the most common choice has been the Pearson correlation coefficient. However, the Pearson correlation coefficient is sensitive to outliers. Biweight midcorrelation is considered a good alternative to Pearson correlation because it is more robust to outliers. In this paper, we use biweight midcorrelation to measure 'similarity' between gene expression profiles and provide a new approach for gene differential coexpression analysis. First, we calculate the biweight midcorrelation coefficients between all gene pairs. Then, we filter out non-informative correlation pairs using the 'half-thresholding' strategy and calculate the differential coexpression value of each gene. The experimental results on simulated data show that the new approach performed better than three previously published differential coexpression analysis (DCEA) methods. Moreover, we apply maximum clique analysis to a gene subset comprising genes identified by our approach and previously reported T2D-related genes; many additional discoveries can be made through our method.
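The robustness argument can be checked on a small example. The sketch below implements biweight midcorrelation in its usual textbook form (deviations from the median, scaled by 9 times the median absolute deviation, with weights that drop to zero for distant points) and compares it with Pearson correlation on two profiles that agree except for one outlier. The function names and the toy profiles are assumptions made for illustration; the paper's own implementation is not shown in the abstract.

```python
from math import sqrt
from statistics import median

def _biweight_terms(values):
    """Median-centered, biweight-weighted, unit-norm version of a profile."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; bicor is undefined for this profile")
    terms = []
    for v in values:
        u = (v - med) / (9 * mad)
        # Points with |u| >= 1 are treated as outliers and get zero weight.
        weight = (1 - u * u) ** 2 if abs(u) < 1 else 0.0
        terms.append((v - med) * weight)
    norm = sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]

def bicor(x, y):
    """Biweight midcorrelation of two equal-length profiles."""
    return sum(a * b for a, b in zip(_biweight_terms(x), _biweight_terms(y)))

def pearson(x, y):
    """Pearson correlation coefficient, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical expression profiles: identical except for one outlying sample.
x = list(range(1, 11))
y = x[:-1] + [100]

print(round(pearson(x, y), 3), round(bicor(x, y), 3))
```

The single outlier pulls the Pearson coefficient down sharply, while the biweight weights remove that sample from the calculation and leave a value much closer to the agreement seen in the other nine samples.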
|
23
|
Reverter A, Henshall JM, McCulloch R, Sasazaki S, Hawken R, Lehnert SA. Numerical analysis of intensity signals resulting from genotyping pooled DNA samples in beef cattle and broiler chicken. J Anim Sci 2014; 92:1874-85. [PMID: 24663186 DOI: 10.2527/jas.2013-7133] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/12/2023] Open
Abstract
Pooled genomic DNA has been proposed as a cost-effective approach in genomewide association studies (GWAS). However, algorithms for genotype calling of biallelic SNP are not adequate with pooled DNA samples because they assume the presence of 2 fluorescent signals, 1 for each allele, and operate under the expectation that at most 2 copies of the variant allele can be found for any given SNP and DNA sample. We adapt analytical methodology from 2-channel gene expression microarray technology to SNP genotyping of pooled DNA samples. Using 5 datasets from beef cattle and broiler chicken of varying degrees of complexity in terms of design and phenotype (continuous and dichotomous), we show that both differential hybridization (M = green minus red intensity signal) and abundance (A = average of red and green intensities) provide useful information in the prediction of SNP allele frequencies. This is predominantly true when making inference about extreme SNP that are either nearly fixed or highly polymorphic. We propose the use of model-based clustering via mixtures of bivariate normal distributions as an optimal framework to capture the relationship between hybridization intensity and allele frequency from pooled DNA samples. The ranges of M and A values observed here are in agreement with those reported within the context of gene expression microarray and also with those from SNP array data within the context of analytical methodology for the identification of copy number variants. In particular, we confirm that highly polymorphic SNP yield a strong signal from both channels (red and green) while lowly or nonpolymorphic SNP yield a strong signal from 1 channel only. We further confirm that when the SNP allele frequencies are known, because either the individuals in the pools or individuals from a closely related population are themselves genotyped, a multiple regression model with linear and quadratic components can be developed with high prediction accuracy. We conclude that when these approaches are applied to the estimation of allele frequencies, the resulting estimates allow for the development of cost-effective and reliable GWAS.
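The two quantities defined in this abstract, M (green minus red) and A (average of red and green), are simple to compute per SNP. The sketch below assumes the intensities are log2-transformed first, as is common practice for two-channel microarray data; the abstract itself does not state the scale, and the SNP names and intensity values are invented for illustration.

```python
from math import log2

def m_and_a(green, red):
    """Return (M, A) for one SNP from raw green and red intensities.

    Assumption: intensities are log2-transformed before taking the
    difference (M) and the average (A); the abstract does not say so.
    """
    g, r = log2(green), log2(red)
    return g - r, (g + r) / 2

# Hypothetical raw channel intensities for three SNPs in one DNA pool.
pool = {
    "snp_balanced": (512, 512),    # similar signal on both channels
    "snp_green": (1024, 256),      # signal mostly on the green channel
    "snp_red": (256, 1024),        # signal mostly on the red channel
}
ma_values = {snp: m_and_a(green, red) for snp, (green, red) in pool.items()}
for snp, (m, a) in ma_values.items():
    print(snp, m, a)
```

Under this convention a SNP with a strong signal on both channels has M near zero, and a SNP with signal on one channel only has a large positive or negative M, which matches the qualitative pattern the abstract reports for highly polymorphic versus nearly fixed SNP.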
Affiliation(s)
- A Reverter
- CSIRO Food Futures Flagship and CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St. Lucia, Brisbane, Queensland 4067, Australia
|
24
|
Ingham AB, Osborne SA, Menzies M, Briscoe S, Chen W, Kongsuwan K, Reverter A, Jeanes A, Dalrymple BP, Wijffels G, Seymour R, Hudson NJ. RNF14 is a regulator of mitochondrial and immune function in muscle. BMC Syst Biol 2014; 8:10. [PMID: 24472305 PMCID: PMC3906743 DOI: 10.1186/1752-0509-8-10] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/30/2013] [Accepted: 01/21/2014] [Indexed: 12/21/2022]
Abstract
BACKGROUND Muscle development and remodelling, mitochondrial physiology and inflammation are thought to be inter-related and to have implications for metabolism in both health and disease. However, our understanding of their molecular control is incomplete. RESULTS In this study we have confirmed that the ring finger 14 protein (RNF14), a poorly understood transcriptional regulator, influences the expression of both mitochondrial and immune-related genes. The prediction was based on a combination of network connectivity and differential connectivity in cattle (a non-model organism) and mouse data sets, with a focus on skeletal muscle. These analyses assigned similar probability to mammalian RNF14 playing a regulatory role in mitochondrial and immune gene expression. To resolve this apparent ambiguity we performed a genome-wide microarray expression analysis on mouse C2C12 myoblasts transiently transfected with two Rnf14 transcript variants that encode two naturally occurring but different RNF14 protein isoforms. The effect of both constructs was significantly different to the control samples (untransfected cells and cells transfected with an empty vector). Cluster analyses revealed that transfection with the two Rnf14 constructs yielded expression signatures that were discrete from each other, but in both cases a substantial set of genes annotated as encoding proteins related to immune function was perturbed. These included cytokines and interferon regulatory factors. Additionally, transfection of the longer transcript variant 1 coordinately increased the expression of 12 (of the total 13) mitochondrial proteins encoded by the mitochondrial genome, 3 of which were significant in isolated pair-wise comparisons (Mt-coxII, Mt-nd2 and mt-nd4l). This apparent additional mitochondrial function may be attributable to the RWD protein domain that is present only in the longer RNF14 isoform.
CONCLUSIONS RNF14 influences the expression of both mitochondrial and immune related genes in a skeletal muscle context, and has likely implications for the inter-relationship between bioenergetic status and inflammation.
Affiliation(s)
- Nicholas J Hudson
- CSIRO Animal, Food and Health Sciences, 306 Carmody Road, St. Lucia, Queensland, Australia.
|
25
|
Yang J, Yu H, Liu BH, Zhao Z, Liu L, Ma LX, Li YX, Li YY. DCGL v2.0: an R package for unveiling differential regulation from differential co-expression. PLoS One 2013; 8:e79729. [PMID: 24278165 PMCID: PMC3835854 DOI: 10.1371/journal.pone.0079729] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Received: 05/19/2013] [Accepted: 10/03/2013] [Indexed: 12/25/2022] Open
Abstract
Motivation Differential co-expression analysis (DCEA) has emerged in recent years as a novel, systematic investigation into gene expression data. While most DCEA studies or tools focus on the co-expression relationships among genes, some are developing a potentially more promising research domain, differential regulation analysis (DRA). In our previously proposed R package DCGL v1.0, we provided functions to facilitate basic differential co-expression analyses; however, the output from DCGL v1.0 could not be translated into differential regulation mechanisms in a straightforward manner. Results To advance from DCEA to DRA, we upgraded the DCGL package from v1.0 to v2.0. A new module named “Differential Regulation Analysis” (DRA) was designed, which consists of three major functions: DRsort, DRplot, and DRrank. DRsort selects differentially regulated genes (DRGs) and differentially regulated links (DRLs) according to the transcription factor (TF)-to-target information. DRplot graphically visualizes differentially co-expressed links (DCLs) and/or TF-to-target links in a network context. DRrank prioritizes the TFs in terms of their potential relevance to the phenotype of interest. In addition to this new module, we streamlined the code from v1.0. The evaluation results showed that our differential regulation analysis is able to capture the regulators relevant to the biological subject. Conclusions With ample functions to facilitate differential regulation analysis, DCGL v2.0 was upgraded from a DCEA tool to a DRA tool, which may unveil the underlying differential regulation from the observed differential co-expression. DCGL v2.0 can be applied to a wide range of gene expression data in order to systematically identify novel regulators that have not yet been documented as critical. Availability DCGL v2.0 package is available at http://cran.r-project.org/web/packages/DCGL/index.html or at our project home page http://lifecenter.sgst.cn/main/en/dcgl.jsp.
Affiliation(s)
- Jing Yang
- School of Biotechnology, East China University of Science and Technology, Shanghai, P. R. China
- Bioinformatics Center, Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Hui Yu
- Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, P. R. China
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Bao-Hong Liu
- Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, P. R. China
- Zhongming Zhao
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Lei Liu
- Bioinformatics Center, Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Liang-Xiao Ma
- Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, P. R. China
- Yi-Xue Li
- School of Biotechnology, East China University of Science and Technology, Shanghai, P. R. China
- Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, P. R. China
- * E-mail: (YYL); (YXL)
- Yuan-Yuan Li
- Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, P. R. China
- * E-mail: (YYL); (YXL)
|
26
|
Ficklin SP, Feltus FA. A systems-genetics approach and data mining tool to assist in the discovery of genes underlying complex traits in Oryza sativa. PLoS One 2013; 8:e68551. [PMID: 23874666 PMCID: PMC3713027 DOI: 10.1371/journal.pone.0068551] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 02/11/2013] [Accepted: 05/30/2013] [Indexed: 12/13/2022] Open
Abstract
Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent, systems genetics approach to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value ≤ 0.001) from genome-wide association studies, both covering over 230 different rice traits, were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait, which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serves as a potential set of testable biomarkers for genes in modules that overlap with genetic traits.
Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance.
Affiliation(s)
- Stephen P Ficklin
- Plant and Environmental Sciences, Clemson University, Clemson, South Carolina, United States of America
|
27
|
Gambardella G, Moretti MN, de Cegli R, Cardone L, Peron A, di Bernardo D. Differential network analysis for the identification of condition-specific pathway activity and regulation. Bioinformatics 2013; 29:1776-85. [PMID: 23749957 PMCID: PMC3702259 DOI: 10.1093/bioinformatics/btt290] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Indexed: 01/14/2023]
Abstract
MOTIVATION Identification of differentially expressed genes has led to countless new discoveries. However, differentially expressed genes are only a proxy for finding dysregulated pathways. The problem is to identify how the network of regulatory and physical interactions rewires in different conditions or in disease. RESULTS We developed a procedure named DINA (DIfferential Network Analysis), which is able to identify sets of genes whose co-regulation is condition-specific, starting from a collection of condition-specific gene expression profiles. DINA is also able to predict which transcription factors (TFs) may be responsible for the condition-specific co-regulation of a pathway. We derived 30 tissue-specific gene networks in human and identified several metabolic pathways as the most differentially regulated across the tissues. We correctly identified TFs such as Nuclear Receptors as their main regulators and demonstrated that a gene with unknown function (YEATS2) acts as a negative regulator of hepatocyte metabolism. Finally, we showed that DINA can be used to make hypotheses on dysregulated pathways during disease progression. By analyzing gene expression profiles across primary and transformed hepatocytes, DINA identified hepatocarcinoma-specific metabolic and transcriptional pathway dysregulation. AVAILABILITY We implemented an on-line web-tool http://dina.tigem.it enabling the user to apply DINA to identify tissue-specific pathways or gene signatures. CONTACT dibernardo@tigem.it SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
28
|
Feltus FA, Ficklin SP, Gibson SM, Smith MC. Maximizing capture of gene co-expression relationships through pre-clustering of input expression samples: an Arabidopsis case study. BMC Syst Biol 2013; 7:44. [PMID: 23738693 PMCID: PMC3679940 DOI: 10.1186/1752-0509-7-44] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 10/18/2012] [Accepted: 05/14/2013] [Indexed: 12/11/2022]
Abstract
Background In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied, also limiting the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increased sensitivity, thus maximizing the total co-expression relationships in the final co-expression network compendium. Results A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships in the global network.
Conclusions Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired.
Affiliation(s)
- F Alex Feltus
- Department of Genetics & Biochemistry, Clemson University, 105 Collings Street, Clemson, SC 29634, USA.
|
29
|
Amar D, Safer H, Shamir R. Dissection of regulatory networks that are altered in disease via differential co-expression. PLoS Comput Biol 2013; 9:e1002955. [PMID: 23505361 PMCID: PMC3591264 DOI: 10.1371/journal.pcbi.1002955] [Citation(s) in RCA: 112] [Impact Index Per Article: 9.3] [Received: 08/26/2012] [Accepted: 01/14/2013] [Indexed: 12/26/2022] Open
Abstract
Comparing the gene-expression profiles of sick and healthy individuals can help in understanding disease. Such differential expression analysis is a well-established way to find gene sets whose expression is altered in the disease. Recent approaches to gene-expression analysis go a step further and seek differential co-expression patterns, wherein the level of co-expression of a set of genes differs markedly between disease and control samples. Such patterns can arise from a disease-related change in the regulatory mechanism governing that set of genes, and pinpoint dysfunctional regulatory networks. Here we present DICER, a new method for detecting differentially co-expressed gene sets using a novel probabilistic score for differential correlation. DICER goes beyond standard differential co-expression and detects pairs of modules showing differential co-expression. The expression profiles of genes within each module of the pair are correlated across all samples. The correlation between the two modules, however, differs markedly between the disease and normal samples. We show that DICER outperforms the state of the art in terms of significance and interpretability of the detected gene sets. Moreover, the gene sets discovered by DICER manifest regulation by disease-specific microRNA families. In a case study on Alzheimer's disease, DICER dissected biological processes and protein complexes into functional subunits that are differentially co-expressed, thereby revealing inner structures in disease regulatory networks. The most fundamental and popular gene-expression experiments measure genome-wide transcription levels in two populations: perturbed and wild type, or cases and controls. The genes that show significantly different expression between the two populations (the differentially expressed genes) are useful for understanding the biology underlying the phenotype difference, and can sometimes also serve as biomarkers for classification. 
In contrast, genes that have similar expression to each other across all profiles (co-expressed genes) can yield clues about the functional commonality of the two populations. Differential co-expression has recently been proposed as a way to combine the benefits of these two approaches: it seeks gene groups that are co-expressed in one phenotype much more than in the other. Here we develop a new method for detecting differential co-expression and test it on case-control expression profiles of several diseases. Our algorithm improves upon the state of the art in the strength of the detected patterns and in agreement with current biological knowledge. We show that our method can predict gene regulators that are associated with the disease of interest and demonstrate that it can dissect known biological pathways into subcomponents that are not detected using standard analyses.
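The general idea of differential co-expression described in this abstract can be sketched for a single gene pair. This is a plain difference of Pearson correlations between cases and controls, not the probabilistic score that DICER uses; the function names and the toy expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_change(gene_a, gene_b, is_case):
    """Correlation of a gene pair in cases minus its correlation in controls."""
    case_a = [v for v, c in zip(gene_a, is_case) if c]
    case_b = [v for v, c in zip(gene_b, is_case) if c]
    ctrl_a = [v for v, c in zip(gene_a, is_case) if not c]
    ctrl_b = [v for v, c in zip(gene_b, is_case) if not c]
    return pearson(case_a, case_b) - pearson(ctrl_a, ctrl_b)

# Toy data: the pair is positively correlated in controls and
# negatively correlated in cases (a correlation reversal).
gene_a = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 8.0, 6.0, 4.0, 2.0]
is_case = [False, False, False, False, True, True, True, True]
delta = correlation_change(gene_a, gene_b, is_case)
```

In this toy example the mean expression of each gene is the same in both groups, so a standard differential expression test would see nothing, while the correlation change is large.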
Affiliation(s)
- David Amar
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Hershel Safer
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Ron Shamir
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
|
30
|
Gibson SM, Ficklin SP, Isaacson S, Luo F, Feltus FA, Smith MC. Massive-scale gene co-expression network construction and robustness testing using random matrix theory. PLoS One 2013; 8:e55871. [PMID: 23409071 PMCID: PMC3567026 DOI: 10.1371/journal.pone.0055871] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 09/12/2012] [Accepted: 01/03/2013] [Indexed: 11/18/2022] Open
Abstract
The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. Such a test requires hundreds of networks, and hence a tool to construct them rapidly. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae) networks. We demonstrate a dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.
Affiliation(s)
- Scott M. Gibson
- Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, South Carolina, United States of America
- Stephen P. Ficklin
- Plant and Environmental Sciences, Clemson University, Clemson, South Carolina, United States of America
- Sven Isaacson
- Department of Computer Science, Wittenberg University, Springfield, Ohio, United States of America
- Feng Luo
- School of Computing, Clemson University, Clemson, South Carolina, United States of America
- Frank A. Feltus
- Plant and Environmental Sciences, Clemson University, Clemson, South Carolina, United States of America
- Department of Genetics & Biochemistry, Clemson University, Clemson, South Carolina, United States of America
- Melissa C. Smith
- Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, South Carolina, United States of America
|
31
|
Odibat O, Reddy CK. Ranking differential hubs in gene co-expression networks. J Bioinform Comput Biol 2012; 10:1240002. [PMID: 22809303 DOI: 10.1142/s0219720012400021] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
Identifying the genes that change their expressions between two conditions (such as normal versus cancer) is a crucial task that can help in understanding the causes of diseases. Differential networking has emerged as a powerful approach to detect the changes in network structures and to identify the differentially connected genes among two networks. However, existing differential network-based methods primarily depend on pairwise comparisons of the genes based on their connectivity. Therefore, these methods cannot capture the essential topological changes in the network structures. In this paper, we propose a novel algorithm, DiffRank, which ranks the genes based on their contribution to the differences between the two networks. To achieve this goal, we define two novel structural scoring measures: a local structure measure (differential connectivity) and a global structure measure (differential betweenness centrality). These measures are optimized by propagating the scores through the network structure and then ranking the genes based on these propagated scores. We demonstrate the effectiveness of DiffRank on synthetic and real datasets. For the synthetic datasets, we developed a simulator for generating synthetic differential scale-free networks, and we compared our method with existing methods. The comparisons show that our algorithm outperforms these existing methods. For the real datasets, we apply the proposed algorithm on several gene expression datasets and demonstrate that the proposed method provides biologically interesting results.
Affiliation(s)
- Omar Odibat
- Department of Computer Science, Wayne State University, Detroit, MI 48228, USA.
|
32
|
Vaquero AR, Ferreira NE, Omae SV, Rodrigues MV, Teixeira SK, Krieger JE, Pereira AC. Using gene-network landscape to dissect genotype effects of TCF7L2 genetic variant on diabetes and cardiovascular risk. Physiol Genomics 2012; 44:903-14. [PMID: 22872755 DOI: 10.1152/physiolgenomics.00030.2012] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022] Open
Abstract
The single nucleotide polymorphism (SNP) within the TCF7L2 gene, rs7903146, is, to date, the most significant genetic marker associated with Type 2 diabetes mellitus (T2DM) risk. Nonetheless, its functional role in disease pathology is poorly understood. The aim of the present study was to investigate, in vascular smooth muscle cells from 92 patients undergoing aortocoronary bypass surgery, the contribution of this SNP in T2DM using expression levels and expression correlation comparison approaches, which were visually represented as gene interaction networks. Initially, the expression levels of 41 genes (seven TCF7L2 splice forms and 40 other T2DM relevant genes) were compared between rs7903146 wild-type (CC) and T2DM-risk (CT + TT) genotype groups. Next, we compared the expression correlation patterns of these 41 genes between groups to observe if the relationships between genes were different. Five TCF7L2 splice forms and nine genes showed significant expression differences between groups. RXRα gene was pinpointed as showing the most different expression correlation pattern with other genes. Therefore, T2DM risk alleles appear to be influencing TCF7L2 splice form's expression in vascular smooth muscle cells, and RXRα gene is pointed out as a treatment target candidate for risk reduction in individuals with high risk of developing T2DM, especially individuals harboring TCF7L2 risk genotypes.
Affiliation(s)
- Andre R Vaquero
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, University of Sao Paulo Medical School, Sao Paulo, Brazil
|
33
|
Gaiteri C, Sibille E. Differentially expressed genes in major depression reside on the periphery of resilient gene coexpression networks. Front Neurosci 2011; 5:95. [PMID: 21922000 PMCID: PMC3166821 DOI: 10.3389/fnins.2011.00095] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 05/11/2011] [Accepted: 07/15/2011] [Indexed: 02/03/2023] Open
Abstract
The structure of gene coexpression networks reflects the activation and interaction of multiple cellular systems. Since the pathology of neuropsychiatric disorders is influenced by diverse cellular systems and pathways, we investigated gene coexpression networks in major depression, and searched for putative unifying themes in network connectivity across neuropsychiatric disorders. Specifically, based on the prevalence of the lethality–centrality relationship in disease-related networks, we hypothesized that network changes between control and major depression-related networks would be centered around coexpression hubs, and secondly, that differentially expressed (DE) genes would have a characteristic position and connectivity level in those networks. Mathematically, the first hypothesis tests the relationship of differential coexpression to network connectivity, while the second “hybrid” expression-and-network hypothesis tests the relationship of differential expression to network connectivity. To answer these questions about the potential interaction of coexpression network structure with differential expression, we utilized all available human post-mortem depression-related datasets appropriate for coexpression analysis, which spanned different microarray platforms, cohorts, and brain regions. Similar studies were also performed in an animal model of depression and in schizophrenia and bipolar disorder microarray datasets. We now provide results which consistently support (1) that genes assemble into small-world and scale-free networks in control subjects, (2) that this efficient network topology is largely resilient to changes in depressed subjects, and (3) that DE genes are positioned on the periphery of coexpression networks. Similar results were observed in a mouse model of depression, and in selected bipolar- and schizophrenia-related networks. 
Finally, we show that baseline expression variability contributes to the propensity of genes to be network hubs and/or to be DE in disease. In summary, our results suggest that the small-world and scale-free properties of gene networks are resilient to pathological changes in major depression, and that the network structure may constrain the extent to which a gene may be DE in the illness, hence informing further gene-network-based mechanistic studies of neuropsychiatric disorders.
Affiliation(s)
- Chris Gaiteri
- Department of Psychiatry, Center for Neuroscience, University of Pittsburgh, Pittsburgh, PA, USA
|
34
|
Yu H, Liu BH, Ye ZQ, Li C, Li YX, Li YY. Link-based quantitative methods to identify differentially coexpressed genes and gene pairs. BMC Bioinformatics 2011; 12:315. [PMID: 21806838 PMCID: PMC3199761 DOI: 10.1186/1471-2105-12-315] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 11/23/2010] [Accepted: 08/02/2011] [Indexed: 01/01/2023] Open
Abstract
Background Differential coexpression analysis (DCEA) is increasingly used for investigating the global transcriptional mechanisms underlying phenotypic changes. Current DCEA methods mostly adopt a gene connectivity-based strategy to estimate differential coexpression, which is characterized by comparing the numbers of gene neighbors in different coexpression networks. Although it simplifies the calculation, this strategy mixes up the identities of different coexpression neighbors of a gene, and fails to differentiate significant differential coexpression changes from trivial ones. In particular, correlation reversal is easily missed although it probably indicates remarkable biological significance. Results We developed two link-based quantitative methods, DCp and DCe, to identify differentially coexpressed genes and gene pairs (links). Because they uniquely exploit the quantitative coexpression change of each gene pair in the coexpression networks, both methods proved to be superior to currently popular methods in simulation studies. Re-mining of a publicly available type 2 diabetes (T2D) expression dataset from the perspective of differential coexpression analysis led to discoveries beyond those from differential expression analysis. Conclusions This work pointed out the critical weakness of current popular DCEA methods, and proposed two link-based DCEA algorithms that will contribute to the development of DCEA and help extend it to a broader spectrum.
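The weakness of connectivity-based comparison noted in this abstract can be shown with a small sketch. It contrasts a neighbor-count difference with a count of changed links for one gene; it does not implement DCp or DCe, and the network contents and function names are hypothetical.

```python
def degree_change(net_a, net_b, gene):
    """Connectivity-based view: difference in the number of neighbors."""
    return abs(len(net_a[gene]) - len(net_b[gene]))

def link_change(net_a, net_b, gene):
    """Link-based view: number of neighbors gained or lost between networks."""
    return len(net_a[gene] ^ net_b[gene])

# Hypothetical coexpression neighbors of gene "g1" in two conditions.
# The gene keeps two neighbors, but they are entirely different genes.
net_normal = {"g1": {"g2", "g3"}}
net_disease = {"g1": {"g4", "g5"}}

deg = degree_change(net_normal, net_disease, "g1")
links = link_change(net_normal, net_disease, "g1")
```

Here the neighbor count is unchanged even though every link of the gene has been rewired, which is the situation a link-based comparison is meant to detect.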
Affiliation(s)
- Hui Yu
- Bioinformatics Center, Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, P.R. China
|
35
|
Chu JH, Lazarus R, Carey VJ, Raby BA. Quantifying differential gene connectivity between disease states for objective identification of disease-relevant genes. BMC Syst Biol 2011; 5:89. [PMID: 21627793 PMCID: PMC3128864 DOI: 10.1186/1752-0509-5-89] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 11/24/2010] [Accepted: 05/31/2011] [Indexed: 02/16/2023]
Abstract
Background Network modeling of whole transcriptome expression data enables characterization of complex epistatic (gene-gene) interactions that underlie cellular functions. Though numerous methods have been proposed and successfully implemented to develop these networks, there are no formal methods for comparing differences in network connectivity patterns as a function of a phenotypic trait. Results Here we describe a novel approach for quantifying the differences in gene-gene connectivity patterns across disease states based on Graphical Gaussian Models (GGMs). We compare the posterior probabilities of connectivity for each gene pair across two disease states, expressed as a posterior odds-ratio (postOR) for each pair, which can be used to identify network components most relevant to disease status. The method can also be generalized to model differential gene connectivity patterns within previously defined gene sets, gene networks and pathways. We demonstrate that the GGM method reliably detects differences in network connectivity patterns in datasets of varying sample size. Applying this method to two independent breast cancer expression data sets, we identified numerous reproducible differences in network connectivity across histological grades of breast cancer, including several published gene sets and pathways. Most notably, our model identified two gene hubs (MMP12 and CXCL13) that each exhibited differential connectivity to more than 30 transcripts in both datasets. Both genes have been previously implicated in breast cancer pathobiology, but are not themselves differentially expressed by histologic grade in either dataset, and would thus not have been identified using traditional differential gene expression testing approaches. In addition, 16 curated gene sets demonstrated significant differential connectivity in both data sets, including the matrix metalloproteinases, PPAR alpha sequence targets, and the PUFA synthesis pathway.
Conclusions Our results suggest that GGM can be used to formally evaluate differences in global interactome connectivity across disease states, and can serve as a powerful tool for exploring the molecular events that contribute to disease at a systems level.
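As a rough illustration of what a Graphical Gaussian Model contrasts, the sketch below computes full-order partial correlations from the inverse covariance matrix in each condition and takes their difference. This is a simplification and not the paper's method: the posterior probabilities and posterior odds-ratio (postOR) are replaced here by a plain difference of partial correlations, and the function name and toy data are assumptions.

```python
import numpy as np

def partial_correlations(expr):
    """Full-order partial correlations from the inverse covariance matrix.

    expr: array of shape (n_samples, n_genes); assumes n_samples > n_genes
    so that the sample covariance matrix is invertible.
    """
    precision = np.linalg.inv(np.cov(expr, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Toy chain x -> y -> z. In condition A, y drives z; in condition B, z is decoupled.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = x + rng.normal(size=n)
z_a = y + rng.normal(size=n)
z_b = rng.normal(size=n)

pc_a = partial_correlations(np.column_stack([x, y, z_a]))
pc_b = partial_correlations(np.column_stack([x, y, z_b]))
delta = pc_a - pc_b  # simple stand-in for a differential-connectivity score
```

In condition A the x–z partial correlation is close to zero even though x and z are marginally correlated, because y explains the association; the y–z entry of `delta` flags the link that differs between conditions.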
Affiliation(s)
- Jen-hwa Chu
  - Channing Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston MA 02115, USA.
|
36
|
De Jager N, Hudson NJ, Reverter A, Wang YH, Nagaraj SH, Cafe LM, Greenwood PL, Barnard RT, Kongsuwan KP, Dalrymple BP. Chronic exposure to anabolic steroids induces the muscle expression of oxytocin and a more than fiftyfold increase in circulating oxytocin in cattle. Physiol Genomics 2011; 43:467-78. [PMID: 21325062 DOI: 10.1152/physiolgenomics.00226.2010] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Indexed: 11/22/2022] Open
Abstract
Molecular mechanisms in skeletal muscle associated with anabolic steroid treatment of cattle are unclear and we aimed to characterize transcriptional changes. Cattle were chronically exposed (68 ± 20 days) to a steroid hormone implant containing 200 mg trenbolone acetate and 20 mg estradiol (Revalor-H). Biopsy samples from 48 cattle (half treated) from longissimus dorsi (LD) muscle under local anesthesia were collected. Gene expression levels were profiled by microarray, covering 16,944 unique bovine genes: 121 genes were differentially expressed (DE) due to the implant (99.99% posterior probability of not being false positives). Among DE genes, a decrease in expression of a number of fat metabolism-associated genes, likely reflecting the lipid storage activity of intramuscular adipocytes, was observed. The expression of IGF1 and genes related to the extracellular matrix, slow twitch fibers, and cell cycle (including SOX8, a satellite cell marker) was increased in the treated muscle. Unexpectedly, a very large 21- (microarray) to 97 (real time quantitative PCR)-fold higher expression of the mRNA encoding the neuropeptide hormone oxytocin was observed in treated muscle. We also observed an ∼50-fold higher level of circulating oxytocin in the plasma of treated animals at the time of biopsy. Using a coexpression network strategy OXTR was identified as more likely than IGF1R to be a major mediator of the muscle response to Revalor-H. A re-investigation of in vivo cattle LD muscle samples during early to mid-fetal development identified a >128-fold increased expression of OXT, coincident with myofiber differentiation and fusion. We propose that oxytocin may be involved in mediating the anabolic effects of Revalor-H treatment.
Affiliation(s)
- Nadia De Jager
  - Australian Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale, New South Wales
  - Commonwealth Scientific and Industrial Research Organisation Livestock Industries, Queensland Bioscience Precinct
  - School of Chemistry and Molecular Biosciences, Faculty of Science and
- Nicholas J. Hudson
  - Australian Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale, New South Wales
  - Commonwealth Scientific and Industrial Research Organisation Livestock Industries, Queensland Bioscience Precinct
- Antonio Reverter
  - Australian Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale, New South Wales
  - Commonwealth Scientific and Industrial Research Organisation Livestock Industries, Queensland Bioscience Precinct
- Yong-Hong Wang
  - Australian Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale, New South Wales
  - Commonwealth Scientific and Industrial Research Organisation Livestock Industries, Queensland Bioscience Precinct
- Shivashankar H. Nagaraj
  - Commonwealth Scientific and Industrial Research Organisation Livestock Industries, Queensland Bioscience Precinct
- Linda M. Cafe
  - Australian Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale, New South Wales
  - Industry & Investment NSW, Beef Industry Centre, University of New England, Armidale, New South Wales, Australia
- Paul L. Greenwood
  - Australian Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale, New South Wales
  - Industry & Investment NSW, Beef Industry Centre, University of New England, Armidale, New South Wales, Australia
- Ross T. Barnard
  - School of Molecular and Microbial Sciences, Centre for Infectious Disease Research, University of Queensland, St. Lucia, Queensland; and
- Kritaya P. Kongsuwan
  - Commonwealth Scientific and Industrial Research Organisation Livestock Industries, Queensland Bioscience Precinct
- Brian P. Dalrymple
  - Australian Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale, New South Wales
  - Commonwealth Scientific and Industrial Research Organisation Livestock Industries, Queensland Bioscience Precinct
|
37
|
A Boolean-based systems biology approach to predict novel genes associated with cancer: Application to colorectal cancer. BMC Syst Biol 2011; 5:35. [PMID: 21352556 PMCID: PMC3051904 DOI: 10.1186/1752-0509-5-35] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 10/01/2010] [Accepted: 02/26/2011] [Indexed: 12/21/2022]
Abstract
Background Cancer has remarkable complexity at the molecular level, with multiple genes, proteins, pathways and regulatory interconnections being affected. We introduce a systems biology approach to study cancer that formally integrates the available genetic, transcriptomic, epigenetic and molecular knowledge on cancer biology and, as a proof of concept, we apply it to colorectal cancer. Results We first classified all the genes in the human genome into cancer-associated and non-cancer-associated genes based on extensive literature mining. We then selected a set of functional attributes proven to be highly relevant to cancer biology that includes protein kinases, secreted proteins, transcription factors, post-translational modifications of proteins, DNA methylation and tissue specificity. These cancer-associated genes were used to extract 'common cancer fingerprints' through these molecular attributes, and a Boolean logic was implemented in such a way that both the expression data and functional attributes could be rationally integrated, allowing for the generation of a guilt-by-association algorithm to identify novel cancer-associated genes. Finally, these candidate genes are interlaced with the known cancer-related genes in a network analysis aimed at identifying highly conserved gene interactions that impact cancer outcome. We demonstrate the effectiveness of this approach using colorectal cancer as a test case and identify several novel candidate genes that are classified according to their functional attributes. These genes include the following: 1) secreted proteins as potential biomarkers for the early detection of colorectal cancer (FXYD1, GUCA2B, REG3A); 2) kinases as potential drug candidates to prevent tumor growth (CDC42BPB, EPHB3, TRPM6); and 3) potential oncogenic transcription factors (CDK8, MEF2C, ZIC2). 
Conclusion We argue that this is a holistic approach that faithfully mimics cancer characteristics, efficiently predicts novel cancer-associated genes and has universal applicability to the study and advancement of cancer research.
|
38
|
Liu BH, Yu H, Tu K, Li C, Li YX, Li YY. DCGL: an R package for identifying differentially coexpressed genes and links from gene expression microarray data. Bioinformatics 2010; 26:2637-8. [PMID: 20801914 PMCID: PMC2951087 DOI: 10.1093/bioinformatics/btq471] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022]
Abstract
SUMMARY Gene coexpression analysis was developed to explore gene interconnection at the expression level from a systems perspective, and differential coexpression analysis (DCEA), which examines the change in gene expression correlation between two conditions, was accordingly designed as a complementary technique to traditional differential expression analysis (DEA). Since there is a shortage of DCEA tools, we implemented in an R package 'DCGL' five DCEA methods for identification of differentially coexpressed genes and differentially coexpressed links, including three currently popular methods and two novel algorithms described in a companion paper. DCGL can serve as an easy-to-use tool to facilitate differential coexpression analyses. CONTACT yyli@scbit.org and yxli@scbit.org SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bao-Hong Liu
  - School of Life Science and Technology, Tongji University, Shanghai 200092, P R China
|
39
|
de la Fuente A. From 'differential expression' to 'differential networking' - identification of dysfunctional regulatory networks in diseases. Trends Genet 2010; 26:326-33. [PMID: 20570387 DOI: 10.1016/j.tig.2010.05.001] [Citation(s) in RCA: 308] [Impact Index Per Article: 20.5] [Received: 02/05/2010] [Revised: 04/28/2010] [Accepted: 05/03/2010] [Indexed: 01/09/2023]
Abstract
Understanding diseases requires identifying the differences between healthy and affected tissues. Gene expression data have revolutionized the study of diseases by making it possible to simultaneously consider thousands of genes. The identification of disease-associated genes requires studying the genes in the context of the regulatory systems they are involved in. A major goal is to identify specific regulatory networks that are dysfunctional in a given disease state. Although we still have not reached a stage where the elucidation of differential regulatory networks is commonly feasible, recent advances have described the first steps towards this goal - the identification of differential coexpression networks. This review describes the shift from differential gene expression to differential networking and outlines how this shift will affect the study of the genetic basis of disease.
Affiliation(s)
- Alberto de la Fuente
  - CRS4 Bioinformatica, Polaris Edificio 3, Località Piscina Manna, 09010 Pula (CA), Italy.
|
40
|
Moreno-Sánchez N, Rueda J, Carabaño MJ, Reverter A, McWilliam S, González C, Díaz C. Skeletal muscle specific genes networks in cattle. Funct Integr Genomics 2010; 10:609-18. [PMID: 20524025 PMCID: PMC2990504 DOI: 10.1007/s10142-010-0175-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 03/10/2010] [Revised: 04/21/2010] [Accepted: 04/30/2010] [Indexed: 11/29/2022]
Abstract
While physiological differences across skeletal muscles have been described, the differential gene expression underlying them and the way these genes interact to perform specific biological processes remain largely to be elucidated. The purpose of the present study was, firstly, to profile by cDNA microarrays the differential gene expression between two skeletal muscle types, Psoas major (PM) and Flexor digitorum (FD), in beef cattle and then to interpret the results in the context of a bovine gene coexpression network, detecting possible changes in connectivity across the skeletal muscle system. Eighty-four genes were differentially expressed (DE) between muscles. Approximately 54% encoded metabolic enzymes and structural-contractile proteins. DE genes were involved in similar processes and functions, but the proportion of genes in each category varied within each muscle. A correlation matrix was obtained for 61 out of the 84 DE genes from a gene coexpression network. Different groups of coexpression were observed, the largest one having 28 metabolic and contractile genes, up-regulated in PM, and mainly encoding fast-glycolytic fibre structural components and glycolytic enzymes. In FD, genes related to cell support seemed to constitute its identity feature and did not positively correlate to the rest of DE genes in FD. Moreover, changes in connectivity for some DE genes were observed in the different gene ontologies. Our results confirm the existence of a muscle dependent transcription and coexpression pattern and suggest the necessity of integrating different muscle types to build comprehensive networks for the transcriptional landscape of bovine skeletal muscle.
Affiliation(s)
- Natalia Moreno-Sánchez
  - Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Ctra de A Coruña km 7.5, 28040 Madrid, Spain.
|
41
|
Leonardson AS, Zhu J, Chen Y, Wang K, Lamb JR, Reitman M, Emilsson V, Schadt EE. The effect of food intake on gene expression in human peripheral blood. Hum Mol Genet 2010; 19:159-69. [PMID: 19837700 PMCID: PMC2792154 DOI: 10.1093/hmg/ddp476] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022] Open
Abstract
Human gene expression traits have been shown to be dependent on gender, age and time of day in blood and other tissues. However, other factors that may impact gene expression have not been systematically explored. For example, in studies linking blood gene expression to obesity related traits, whether the fasted or fed state will be the most informative is an open question. Here, we employed a two-arm cross-over design to perform a genome-wide survey of gene expression in human peripheral blood to address explicitly this type of question. We were able to distinguish expression changes due to individual and time-specific effects from those due to food intake. We demonstrate that the transcriptional response to food intake is robust by constructing a classifier from the gene expression traits with >90% accuracy classifying individuals as being in the fasted or fed state. Gene expression traits that were best able to discriminate the fasted and fed states were more heritable and achieved greater coherence with respect to pathways associated with metabolic traits. The connectivity structure among gene expression traits was explored in the context of coexpression networks. Changes in the connectivity structure were observed between the fasted and fed states. We demonstrate that differential expression and differential connectivity are two complementary ways to characterize changes between fasted and fed states. Both gene sets were significantly enriched for genes associated with obesity related traits. Our results suggest that the pair of fasted/fed blood expression profiles provide more comprehensive information about an individual's metabolic states.
Affiliation(s)
- Amy S Leonardson
  - Rosetta Inpharmatics, LLC, Merck & Co., Inc., Seattle, WA 98109, USA
|
42
|
Reverter A, Hudson NJ, Nagaraj SH, Pérez-Enciso M, Dalrymple BP. Regulatory impact factors: unraveling the transcriptional regulation of complex traits from expression data. Bioinformatics 2010; 26:896-904. [PMID: 20144946 DOI: 10.1093/bioinformatics/btq051] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.6] [Indexed: 01/20/2023]
Abstract
MOTIVATION Although transcription factors (TF) play a central regulatory role, their detection from expression data is limited due to their low, and often sparse, expression. In order to fill this gap, we propose a regulatory impact factor (RIF) metric to identify critical TF from gene expression data. RESULTS To substantiate the generality of RIF, we explore a set of experiments spanning a wide range of scenarios including breast cancer survival, fat, gonads and sex differentiation. We show that the strength of RIF lies in its ability to simultaneously integrate three sources of information into a single measure: (i) the change in correlation existing between the TF and the differentially expressed (DE) genes; (ii) the amount of differential expression of DE genes; and (iii) the abundance of DE genes. As a result, RIF analysis assigns an extreme score to those TF that are consistently most differentially co-expressed with the highly abundant and highly DE genes (RIF1), and to those TF with the most altered ability to predict the abundance of DE genes (RIF2). We show that RIF analysis alone recovers well-known experimentally validated TF for the processes studied. The TF identified confirm the importance of PPAR signaling in adipose development and the importance of transduction of estrogen signals in breast cancer survival and sexual differentiation. We argue that RIF has universal applicability, and advocate its use as a promising hypotheses generating tool for the systematic identification of novel TF not yet documented as critical.
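The three ingredients listed in this abstract (change in TF-to-DE-gene correlation, amount of differential expression, and abundance of DE genes) can be combined as in the sketch below. The exact weighting is an assumption made for illustration and should be checked against the paper before use; the function name `rif_scores`, its arguments, and the toy data are hypothetical.

```python
import numpy as np

def rif_scores(tf_a, tf_b, de_a, de_b):
    """Illustrative RIF1/RIF2-style scores for each transcription factor.

    tf_a, tf_b: (n_samples, n_tf) TF expression in conditions A and B.
    de_a, de_b: (n_samples, n_de) expression of DE genes in A and B.
    """
    def cross_corr(x, y):
        # Pearson correlation of every column of x with every column of y.
        xz = (x - x.mean(axis=0)) / x.std(axis=0)
        yz = (y - y.mean(axis=0)) / y.std(axis=0)
        return xz.T @ yz / x.shape[0]

    r_a, r_b = cross_corr(tf_a, de_a), cross_corr(tf_b, de_b)
    e_a, e_b = de_a.mean(axis=0), de_b.mean(axis=0)

    abundance = (e_a + e_b) / 2   # (iii) abundance of each DE gene
    diff_expr = e_a - e_b         # (ii) amount of differential expression
    diff_corr = r_a - r_b         # (i) change in TF-to-DE-gene correlation

    # Assumed form: abundance- and DE-weighted squared differential correlation.
    rif1 = np.mean(abundance * diff_expr * diff_corr ** 2, axis=1)
    # Assumed form: change in how well the TF predicts DE-gene abundance.
    rif2 = np.mean((e_a * r_a) ** 2 - (e_b * r_b) ** 2, axis=1)
    return rif1, rif2

# Toy data: TF 0 tracks the DE genes in condition A only; TF 1 never does.
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
rng = np.random.default_rng(2)
tfs = np.column_stack([np.sin(t), np.cos(2 * t)])
de_a = 10 + np.sin(t)[:, None] + 0.1 * rng.normal(size=(40, 3))
de_b = 6 + np.sin(3 * t)[:, None] + 0.1 * rng.normal(size=(40, 3))
rif1, rif2 = rif_scores(tfs, tfs, de_a, de_b)
```

Note that the TF expression itself is identical in both conditions here, so a differential expression test on the TFs would find nothing, while both scores single out TF 0.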
Affiliation(s)
- Antonio Reverter
  - Bioinformatics Group, CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St. Lucia, Brisbane, Queensland 4067, Australia.
|
43
|
Hudson NJ, Reverter A, Dalrymple BP. A differential wiring analysis of expression data correctly identifies the gene containing the causal mutation. PLoS Comput Biol 2009; 5:e1000382. [PMID: 19412532 PMCID: PMC2671163 DOI: 10.1371/journal.pcbi.1000382] [Citation(s) in RCA: 149] [Impact Index Per Article: 9.3] [Received: 10/07/2008] [Accepted: 04/01/2009] [Indexed: 11/18/2022] Open
Abstract
Transcription factor (TF) regulation is often post-translational. TF modifications such as reversible phosphorylation and missense mutations, which can act independent of TF expression level, are overlooked by differential expression analysis. Using bovine Piedmontese myostatin mutants as proof-of-concept, we propose a new algorithm that correctly identifies the gene containing the causal mutation from microarray data alone. The myostatin mutation releases the brakes on Piedmontese muscle growth by translating a dysfunctional protein. Compared to a less muscular non-mutant breed we find that myostatin is not differentially expressed at any of ten developmental time points. Despite this challenge, the algorithm identifies the myostatin ‘smoking gun’ through a coordinated, simultaneous, weighted integration of three sources of microarray information: transcript abundance, differential expression, and differential wiring. By asking the novel question “which regulator is cumulatively most differentially wired to the abundant most differentially expressed genes?” it yields the correct answer, “myostatin”. Our new approach identifies causal regulatory changes by globally contrasting co-expression network dynamics. The entirely data-driven ‘weighting’ procedure emphasises regulatory movement relative to the phenotypically relevant part of the network. In contrast to other published methods that compare co-expression networks, significance testing is not used to eliminate connections.

Evolution, development, and cancer are governed by regulatory circuits where the central nodes are transcription factors. Consequently, there is great interest in methods that can identify the causal mutation/perturbation responsible for any circuit rewiring. The most widely available high-throughput technology, the microarray, assays the transcriptome. However, many regulatory perturbations are post-transcriptional. This means that they are overlooked by traditional differential gene expression analysis. We hypothesised that by viewing biological systems as networks one could identify causal mutations and perturbations by examining those regulators whose position in the network changes the most. Using muscular myostatin mutant cattle as a proof-of-concept, we propose an analysis that succeeds based solely on microarray expression data from just 27 animals. Our analysis differs from competing network approaches in that we do not use significance testing to eliminate connections. All connections are contrasted, no matter how weak. Further, the identity of target genes is maintained throughout the analysis. Finally, the analysis is ‘weighted’ such that movement relative to the phenotypically most relevant part of the network is emphasised. By identifying the question to which myostatin is the answer, we present a comparison of network connectivity that is potentially generalisable.
Affiliation(s)
- Nicholas J. Hudson
  - Food Futures Flagship and Livestock Industries, Commonwealth Scientific and Industrial Research Organisation, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
- Antonio Reverter
  - Food Futures Flagship and Livestock Industries, Commonwealth Scientific and Industrial Research Organisation, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
- Brian P. Dalrymple
  - Food Futures Flagship and Livestock Industries, Commonwealth Scientific and Industrial Research Organisation, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
|
44
|
Muir WM, Rosa GJM, Pittendrigh BR, Xu S, Rider SD, Fountain M, Ogas J. A mixture model approach for the analysis of small exploratory microarray experiments. Comput Stat Data Anal 2009; 53:1566-1576. [PMID: 20160862 DOI: 10.1016/j.csda.2008.06.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]
Abstract
The microarray is an important and powerful tool for prescreening of genes for further research. However, alternative solutions are needed to increase power in small microarray experiments. Traditional parametric and even non-parametric tests lack power and have distributional problems in such small experiments. A mixture model is described that is performed directly on expression differences, assuming that genes in alternative treatments are expressed or not in all combinations: (i) not expressed in either condition, (ii) expressed only under the first condition, (iii) expressed only under the second condition, and (iv) expressed under both conditions, giving rise to 4 possible clusters with two treatments. The approach is termed a Mean-Difference-Mixture-Model (MD-MM) method. Accuracy and power of the MD-MM were compared to other commonly used methods, using simulations, microarray data, and quantitative real time PCR (qRT-PCR). The MD-MM was found to be generally superior to other methods in most situations. The advantage was greatest in situations where there were few replicates, poor signal to noise ratios, or non-homogeneous variances.
Affiliation(s)
- W M Muir
  - Dept. Animal Sciences, Purdue University, W. Lafayette IN 47907
|
45
|
Pérez-Enciso M, Ferraz ALJ, Ojeda A, López-Béjar M. Impact of breed and sex on porcine endocrine transcriptome: a bayesian biometrical analysis. BMC Genomics 2009; 10:89. [PMID: 19239697 PMCID: PMC2656523 DOI: 10.1186/1471-2164-10-89] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 08/05/2008] [Accepted: 02/24/2009] [Indexed: 11/17/2022] Open
Abstract
Background Transcriptome variability is due to genetic and environmental causes, much like any other complex phenotype. Ascertaining the transcriptome differences between individuals is an important step to understand how selection and genetic drift may affect gene expression. To that end, extant divergent livestock breeds offer an ideal genetic material. Results We have analyzed with microarrays five tissues from the endocrine axis (hypothalamus, adenohypophysis, thyroid gland, gonads and fat tissue) of 16 pigs from both sexes pertaining to four extreme breeds (Duroc, Large White, Iberian and a cross with SinoEuropean hybrid line). Using a Bayesian linear model approach, we observed that the largest breed variability corresponded to the male gonads, and was larger than at the remaining tissues, including ovaries. Measurement of sex hormones in peripheral blood at slaughter did not detect any breed-related differences. Not unexpectedly, the gonads were the tissue with the largest number of sex biased genes. There was a strong correlation between sex and breed bias expression, although the most breed biased genes were not the most sex biased genes. A combined analysis of connectivity and differential expression suggested three biological processes as being primarily different between breeds: spermatogenesis, muscle differentiation and several metabolic processes. Conclusion These results suggest that differences across breeds in gene expression of the male gonads are larger than in other endocrine tissues in the pig. Nevertheless, the strong presence of breed biased genes in the male gonads cannot be explained solely by changes in spermatogenesis nor by differences in the reproductive tract development.
Affiliation(s)
- Miguel Pérez-Enciso
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain.
|
46
|
Reverter A, Ingham A, Dalrymple BP. Mining tissue specificity, gene connectivity and disease association to reveal a set of genes that modify the action of disease causing genes. BioData Min 2008; 1:8. [PMID: 18822114 PMCID: PMC2556670 DOI: 10.1186/1756-0381-1-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Received: 01/07/2008] [Accepted: 09/19/2008] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND The tissue specificity of gene expression has been linked to a number of significant outcomes including level of expression, and differential rates of polymorphism, evolution and disease association. Recent studies have also shown the importance of exploring differential gene connectivity and sequence conservation in the identification of disease-associated genes. However, no study relates gene interactions with tissue specificity and disease association. METHODS We adopted an a priori approach making as few assumptions as possible to analyse the interplay among gene-gene interactions with tissue specificity and its subsequent likelihood of association with disease. We mined three large datasets comprising expression data drawn from massively parallel signature sequencing across 32 tissues, describing a set of 55,606 true positive interactions for 7,197 genes, and microarray expression results generated during the profiling of systemic inflammation, from which 126,543 interactions among 7,090 genes were reported. RESULTS Amongst the myriad of complex relationships identified between expression, disease, connectivity and tissue specificity, some interesting patterns emerged. These include elevated rates of expression and network connectivity in housekeeping and disease-associated tissue-specific genes. We found that disease-associated genes are more likely to show tissue specific expression and most frequently interact with other disease genes. Using the thresholds defined in these observations, we develop a guilt-by-association algorithm and discover a group of 112 non-disease annotated genes that predominantly interact with disease-associated genes, impacting on disease outcomes. CONCLUSION We conclude that parameters such as tissue specificity and network connectivity can be used in combination to identify a group of genes, not previously confirmed as disease causing, that are involved in interactions with disease causing genes. 
Our guilt-by-association algorithm should be useful for the discovery of additional modifiers of genetic diseases, and more generally, for the ability to associate genes of unknown function to clusters of genes with defined functions allowing for novel biological inference that can be subsequently validated.
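A guilt-by-association rule of the kind described here can be sketched in a few lines. The thresholds `min_fraction` and `min_degree`, the function name, and the gene labels are hypothetical placeholders; the abstract does not state the thresholds the authors derived, so this shows only the general shape of such a filter.

```python
def guilt_by_association(edges, disease_genes, min_fraction=0.5, min_degree=2):
    """Flag non-disease genes whose interaction partners are mostly disease genes.

    edges: iterable of (gene_a, gene_b) interaction pairs.
    disease_genes: set of genes already annotated as disease-associated.
    min_fraction, min_degree: hypothetical thresholds for illustration.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    candidates = {}
    for gene, partners in neighbors.items():
        if gene in disease_genes or len(partners) < min_degree:
            continue
        fraction = len(partners & disease_genes) / len(partners)
        if fraction >= min_fraction:
            candidates[gene] = fraction
    return candidates

# Toy network: D1 and D2 are annotated disease genes.
edges = [("G1", "D1"), ("G1", "D2"), ("G1", "N1"),
         ("G2", "N1"), ("G2", "N2"), ("D1", "D2")]
candidates = guilt_by_association(edges, {"D1", "D2"})
```

Here only G1 is flagged, because two of its three partners are disease genes. In practice the thresholds would be tuned on the observed distributions, as the abstract indicates the authors did.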
Affiliation(s)
- Antonio Reverter
  - Computational and Systems Biology, CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St. Lucia, Brisbane, Queensland 4067, Australia
- Aaron Ingham
  - Computational and Systems Biology, CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St. Lucia, Brisbane, Queensland 4067, Australia
- Brian P Dalrymple
  - Computational and Systems Biology, CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St. Lucia, Brisbane, Queensland 4067, Australia
|
47
|
Reverter A, Chan EKF. Combining partial correlation and an information theory approach to the reversed engineering of gene co-expression networks. Bioinformatics 2008; 24:2491-7. [PMID: 18784117 DOI: 10.1093/bioinformatics/btn482] [Citation(s) in RCA: 222] [Impact Index Per Article: 13.1] [Indexed: 11/14/2022]
Abstract
MOTIVATION We present PCIT, an algorithm for the reconstruction of gene co-expression networks (GCN) that combines the concept of the partial correlation coefficient with information theory to identify significant gene-to-gene associations defining edges in the reconstruction of GCN. The properties of PCIT are examined in the context of the topology of the reconstructed network including connectivity structure, clustering coefficient and sensitivity. RESULTS We apply PCIT to a series of simulated datasets with varying levels of complexity in terms of number of genes and experimental conditions, as well as to three real datasets. Results show that, as opposed to the constant cutoff approach commonly used in the literature, the PCIT algorithm can identify and allow for more moderate, yet not less significant, estimates of correlation (r) to still establish a connection in the GCN. We show that PCIT is more sensitive than established methods and capable of detecting functionally validated gene-gene interactions coming from absolute r values as low as 0.3. These bona fide associations, which often relate to genes with low variation in expression patterns, are beyond the detection limits of conventional fixed-threshold methods, and would be overlooked by studies relying on those methods. AVAILABILITY FORTRAN 90 source code to perform the PCIT algorithm is available as Supplementary File 1.
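To make the contrast with a fixed correlation cutoff concrete, the following is a rough sketch of a trio-based filter built on first-order partial correlations. The abstract does not spell out the decision rule, so the tolerance `eps` and the discard condition below follow a commonly described form of PCIT and are an assumption here; the function names are hypothetical, and the sketch should be checked against the paper and its FORTRAN 90 source before any real use.

```python
import itertools
import numpy as np


def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b given c."""
    return (r_ab - r_ac * r_bc) / np.sqrt((1 - r_ac**2) * (1 - r_bc**2))


def pcit_like_filter(r):
    """Return a boolean matrix of retained edges for correlation matrix r.

    For each trio (x, y, z) a local tolerance eps is the mean ratio of
    partial to direct correlation; an edge is dropped when its |r| is
    smaller than eps times both other edges in the trio (assumed rule).
    """
    r = np.asarray(r, dtype=float)
    n = r.shape[0]
    keep = ~np.eye(n, dtype=bool)
    for x, y, z in itertools.combinations(range(n), 3):
        rxy, rxz, ryz = r[x, y], r[x, z], r[y, z]
        mags = np.abs([rxy, rxz, ryz])
        # Skip degenerate trios to avoid division by zero (assumption).
        if mags.min() == 0 or mags.max() >= 1 - 1e-12:
            continue
        eps = (partial_corr(rxy, rxz, ryz) / rxy
               + partial_corr(rxz, rxy, ryz) / rxz
               + partial_corr(ryz, rxy, rxz) / ryz) / 3
        # Test each edge of the trio against the other two.
        for (a, b, rab), o1, o2 in (((x, y, rxy), rxz, ryz),
                                    ((x, z, rxz), rxy, ryz),
                                    ((y, z, ryz), rxy, rxz)):
            if abs(rab) < abs(eps * o1) and abs(rab) < abs(eps * o2):
                keep[a, b] = keep[b, a] = False
    return keep


# Genes 0 and 1 are correlated only through gene 2.
r_demo = np.array([[1.0, 0.25, 0.5],
                   [0.25, 1.0, 0.5],
                   [0.5, 0.5, 1.0]])
edges_kept = pcit_like_filter(r_demo)
```

In this toy matrix the 0-1 edge is dropped because it is fully explained by gene 2, while the two 0.5 edges survive, so no single global threshold is needed. For expression data, `np.corrcoef(expr)` on a genes-by-conditions array would supply `r`.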
Affiliation(s)
- Antonio Reverter
- CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, Brisbane, Queensland 4067, Australia.
|
48
|
Principal components analysis based methodology to identify differentially expressed genes in time-course microarray data. BMC Bioinformatics 2008; 9:267. [PMID: 18534040 PMCID: PMC2435549 DOI: 10.1186/1471-2105-9-267] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 01/30/2008] [Accepted: 06/06/2008] [Indexed: 11/25/2022] Open
Abstract
Background Time-course microarray experiments are increasingly used to characterize dynamic biological processes. In these experiments, the goal is to identify genes that are differentially expressed over time between different biological conditions. These differentially expressed genes can reveal the changes in biological processes caused by the change in condition, which is essential for understanding differences in dynamics. Results In this paper, we propose a novel method for finding differentially expressed genes in time-course data across two biological conditions (say C1 and C2). We model the expression at C1 using Principal Component Analysis (PCA) and represent the expression profile of each gene as a linear combination of the dominant Principal Components (PCs). The expression data from C2 are then projected onto the PCA model and scores are extracted. The difference between the scores is evaluated using a hypothesis test to quantify the significance of differential expression. We evaluate the proposed method in two case studies: (1) the heat shock response of wild-type and HSF1 knockout mice, and (2) the cell cycle in wild-type and Fkh1/Fkh2 knockout yeast strains. Conclusion In both cases, the proposed method identified biologically significant genes.
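The projection step described in this abstract can be sketched as follows, assuming genes are rows and time points are columns, and that both conditions share the same genes and time points. The function name and `n_components` are hypothetical, and the hypothesis test is left out because the abstract does not specify it.

```python
import numpy as np


def pca_score_differences(c1, c2, n_components=2):
    """Project C2 onto a PCA model fitted on C1 and return score differences.

    c1, c2: arrays of shape (genes, time_points) with matching rows/columns.
    Returns an array of shape (genes, n_components).
    """
    c1 = np.asarray(c1, dtype=float)
    c2 = np.asarray(c2, dtype=float)
    mean = c1.mean(axis=0)

    # Rows of vt are the principal components (temporal patterns) of C1.
    _, _, vt = np.linalg.svd(c1 - mean, full_matrices=False)
    pcs = vt[:n_components]

    # Scores: coordinates of each gene profile in the C1 PC basis.
    scores_c1 = (c1 - mean) @ pcs.T
    scores_c2 = (c2 - mean) @ pcs.T
    return scores_c2 - scores_c1


rng = np.random.default_rng(0)
c1_demo = rng.normal(size=(20, 4))
c2_demo = c1_demo.copy()
delta = np.array([2.0, -1.0, 0.5, 3.0])
c2_demo[0] += delta  # only gene 0 changes between conditions
diff = pca_score_differences(c1_demo, c2_demo, n_components=4)
```

Genes whose profiles are unchanged between conditions get zero score differences, so ranking genes by the norm of their row in `diff` gives a simple, untested-for-significance ordering of candidates.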
|
49
|
Moser RJ, Reverter A, Lehnert SA. Gene expression profiling of porcine peripheral blood leukocytes after infection with Actinobacillus pleuropneumoniae. Vet Immunol Immunopathol 2007; 121:260-74. [PMID: 18054086 DOI: 10.1016/j.vetimm.2007.10.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 06/07/2007] [Revised: 08/02/2007] [Accepted: 10/11/2007] [Indexed: 01/15/2023]
Abstract
The gene expression profile of peripheral blood leukocytes (PBL) from extreme performing pigs after infection with Actinobacillus pleuropneumoniae was analysed using a custom complementary DNA (cDNA) microarray and quantitative reverse transcription-PCR (qRT-PCR). Four high performing animals with low disease-score (HP), three low performing animals with high disease-score (LP) and one medium performing animal with medium disease-score (MP) were selected for microarray profiling. PBL RNA from these eight pigs collected before and at 24h after APP infection, was examined. The study identified 92 genes that were up-regulated and four genes that were down-regulated in PBL RNA from HP pigs compared to LP pigs. The majority of differentially expressed (DE) genes were identified by virtue of their elevated expression in the HP animals at 24h post-infection. A large number of annotated DE genes are involved in innate immune response pathways. The gene expression profile of 10 DE candidate genes was further explored across the entire pig population in the same infection trial using qRT-PCR. Considerable animal-to-animal variation in PBL gene expression was observed, especially in the LP group. The qRT-PCR analysis suggested that only one true LP pig might be present in this study, which contributes significantly to the differential expression profile of the selected genes in HP animals following APP infection. This study has therefore identified a set of genes which could serve as molecular indicators for an effective immune response to APP in pigs and which could also serve as source for gene marker development in molecular genetics studies of heritable immune traits.
Affiliation(s)
- Ralf J Moser
- CSIRO Livestock Industries, St Lucia 4067, Australia.
|
50
|
Lehnert SA, Reverter A, Byrne KA, Wang Y, Nattrass GS, Hudson NJ, Greenwood PL. Gene expression studies of developing bovine longissimus muscle from two different beef cattle breeds. BMC Dev Biol 2007; 7:95. [PMID: 17697390 PMCID: PMC2031903 DOI: 10.1186/1471-213x-7-95] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.8] [Received: 03/21/2007] [Accepted: 08/16/2007] [Indexed: 12/03/2022]
Abstract
BACKGROUND The muscle fiber number and fiber composition of muscle is largely determined during prenatal development. In order to discover genes that are involved in determining adult muscle phenotypes, we studied the gene expression profile of developing fetal bovine longissimus muscle from animals with two different genetic backgrounds using a bovine cDNA microarray. Fetal longissimus muscle was sampled at 4 stages of myogenesis and muscle maturation: primary myogenesis (d 60), secondary myogenesis (d 135), as well as beginning (d 195) and final stages (birth) of functional differentiation of muscle fibers. All fetuses and newborns (total n = 24) were from Hereford dams and crossed with either Wagyu (high intramuscular fat) or Piedmontese (GDF8 mutant) sires, genotypes that vary markedly in muscle and compositional characteristics later in postnatal life. RESULTS We obtained expression profiles of three individuals for each time point and genotype to allow comparisons across time and between sire breeds. Quantitative reverse transcription-PCR analysis of RNA from developing longissimus muscle was able to validate the differential expression patterns observed for a selection of differentially expressed genes, with one exception. We detected large-scale changes in temporal gene expression between the four developmental stages in genes coding for extracellular matrix and for muscle fiber structural and metabolic proteins. FSTL1 and IGFBP5 were two genes implicated in growth and differentiation that showed developmentally regulated expression levels in fetal muscle. An abundantly expressed gene with no functional annotation was found to be developmentally regulated in the same manner as muscle structural proteins. We also observed differences in gene expression profiles between the two different sire breeds. Wagyu-sired calves showed higher expression of fatty acid binding protein 5 (FABP5) RNA at birth. 
The developing longissimus muscle of fetuses carrying the Piedmontese mutation shows an emphasis on glycolytic muscle biochemistry and a large-scale up-regulation of the translational machinery at birth. We also document evidence for timing differences in differentiation events between the two breeds. CONCLUSION Taken together, these findings provide a detailed description of molecular events accompanying skeletal muscle differentiation in the bovine, as well as gene expression differences that may underpin the phenotype differences between the two breeds. In addition, this study has highlighted a non-coding RNA, which is abundantly expressed and developmentally regulated in bovine fetal muscle.
Affiliation(s)
- Sigrid A Lehnert
- Cooperative Research Centre for Cattle and Beef Quality, Australia
- CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St Lucia 4067, Australia
- Antonio Reverter
- Cooperative Research Centre for Cattle and Beef Quality, Australia
- CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St Lucia 4067, Australia
- Keren A Byrne
- Cooperative Research Centre for Cattle and Beef Quality, Australia
- CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St Lucia 4067, Australia
- Yonghong Wang
- Cooperative Research Centre for Cattle and Beef Quality, Australia
- CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St Lucia 4067, Australia
- Greg S Nattrass
- Cooperative Research Centre for Cattle and Beef Quality, Australia
- South Australian Research & Development Institute (SARDI), Livestock Systems, Roseworthy 5371, Australia
- Nicholas J Hudson
- CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St Lucia 4067, Australia
- School of Integrative Biology, University of Queensland, St Lucia 4072, Australia
- Paul L Greenwood
- Cooperative Research Centre for Cattle and Beef Quality, Australia
- Beef Industry Centre of Excellence, NSW Department of Primary Industries, JSF Barker Building, University of New England, Armidale 2351, Australia
|