1
The effect of developmental variation on expression QTLs in a multi parental Caenorhabditis elegans population. G3 (Bethesda, Md.) 2024; 14:jkad273. [PMID: 38015660 PMCID: PMC10849341 DOI: 10.1093/g3journal/jkad273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 09/21/2023] [Accepted: 10/27/2023] [Indexed: 11/30/2023]
Abstract
Regulation of gene expression plays a crucial role in developmental processes and adaptation to changing environments. Expression quantitative trait locus (eQTL) mapping is a technique used to study the genetic regulation of gene expression using the transcriptomes of recombinant inbred lines (RILs). Typically, the age of the inbred lines at the time of RNA sampling is carefully controlled. This is necessary because the developmental process causes changes in gene expression, complicating the interpretation of eQTL mapping experiments. However, due to genetics and variation in ambient micro-environments, organisms can differ in their "developmental age," even if they are of the same chronological age. As a result, eQTL patterns are affected by developmental variation in gene expression. The model organism Caenorhabditis elegans is particularly suited for studying the effect of developmental variation on eQTL mapping patterns. In a span of days, C. elegans transitions from embryo through 4 larval stages to adult while undergoing massive changes to its transcriptome. Here, we use C. elegans to investigate the effect of developmental age variation on eQTL patterns and present a normalization procedure. We used dynamical eQTL mapping, which includes the developmental age as a cofactor, to separate the variation in development from genotypic variation and explain variation in gene expression levels. We compare classical single marker eQTL mapping and dynamical eQTL mapping using RNA-seq data of ∼200 multi-parental RILs of C. elegans. The results show that (1) many eQTLs are caused by developmental variation, (2) most trans-bands are developmental QTLs, and (3) dynamical eQTL mapping detects additional eQTLs not found with classical eQTL mapping. We recommend that correction for variation in developmental age should be strongly considered in eQTL mapping studies given the large impact of processes like development on the transcriptome.
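The contrast between classical single-marker mapping and mapping with developmental age as a cofactor can be sketched on simulated data. This is a minimal illustration only, not the authors' pipeline: the linear models, the simulated effect sizes, and the names `fit_r2`, `genotype`, `age`, and `expression` are all assumptions made for the example. Here the marker shifts developmental age, and age alone drives expression, so the marker looks like an eQTL until age is included in the model.

```python
import numpy as np

def fit_r2(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return beta, 1.0 - ss_res / ss_tot

# Simulated data (hypothetical values, chosen only for illustration).
rng = np.random.default_rng(0)
n = 200                                              # roughly the number of RILs
genotype = rng.integers(0, 2, n).astype(float)       # one marker, two parental alleles
age = 48.0 + 2.0 * genotype + rng.normal(0, 1.0, n)  # marker shifts developmental age
expression = 0.5 * age + rng.normal(0, 0.5, n)       # expression depends on age only

# Classical model: expression ~ genotype
beta_classic, r2_classic = fit_r2(genotype, expression)

# Model with cofactor: expression ~ age + genotype
beta_dyn, r2_dyn = fit_r2(np.column_stack([age, genotype]), expression)

print("genotype effect, classical model:", round(beta_classic[1], 2))
print("genotype effect, with age cofactor:", round(beta_dyn[2], 2))
```

In this simulation the genotype coefficient is large in the classical model and close to zero once age is a cofactor, which is the pattern one would expect for a developmental QTL.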
2
Integrated multi-omics analyses and genome-wide association studies reveal prime candidate genes of metabolic and vegetative growth variation in canola. The Plant Journal: for Cell and Molecular Biology 2024; 117:713-728. [PMID: 37964699 DOI: 10.1111/tpj.16524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 10/17/2023] [Accepted: 10/23/2023] [Indexed: 11/16/2023]
Abstract
Genome-wide association studies (GWAS) have identified thousands of genetic loci associated with complex plant traits, including many traits of agronomical importance. However, functional interpretation of GWAS results remains challenging because of large candidate regions due to linkage disequilibrium. High-throughput omics technologies, such as genomics, transcriptomics, proteomics and metabolomics, open new avenues for integrative systems biological analyses and help to nominate systems information supported (prime) candidate genes. In the present study, we capitalise on a diverse canola population with 477 spring-type lines which was previously analysed by high-throughput phenotyping of growth-related traits and by RNA sequencing and metabolite profiling for multi-omics-based hybrid performance prediction. We deepened the phenotypic data analysis, now providing 123 time-resolved image-based traits, to gain insight into the complex relations during early vegetative growth and reanalysed the transcriptome data based on the latest Darmor-bzh v10 genome assembly. Genome-wide association testing revealed 61,298 robust quantitative trait loci (QTL) including 187 metabolite QTL, 56,814 expression QTL and 4,297 phenotypic QTL, many clustered in pronounced hotspots. Combining information about QTL colocalisation across omics layers and correlations between omics features allowed us to discover prime candidate genes for metabolic and vegetative growth variation. Prioritised candidate genes for early biomass accumulation include A06p05760.1_BnaDAR (PIAL1), A10p16280.1_BnaDAR, C07p48260.1_BnaDAR (PRL1) and C07p48510.1_BnaDAR (CLPR4). Moreover, we observed unequal effects of the Brassica A and C subgenomes on early biomass production.
3
Learning gene networks under SNP perturbation using SNP and allele-specific expression data. bioRxiv: the preprint server for biology 2023:2023.10.23.563661. [PMID: 37961468 PMCID: PMC10634764 DOI: 10.1101/2023.10.23.563661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Allele-specific expression quantification from RNA-seq reads provides opportunities to study the control of gene regulatory networks by cis-acting and trans-acting genetic variants. Many existing methods perform a single-gene and single-SNP association analysis to identify expression quantitative trait loci (eQTLs), and place the eQTLs against known gene networks for functional interpretation. Instead, we view eQTL data as a capture of the effects of perturbation of the gene regulatory system by a large number of genetic variants and reconstruct a gene network perturbed by eQTLs. We introduce a statistical framework called CiTruss for simultaneously learning a gene network and cis-acting and trans-acting eQTLs that perturb this network, given population allele-specific expression and SNP data. CiTruss uses a multi-level conditional Gaussian graphical model to model trans-acting eQTLs perturbing the expression of both alleles in the gene network at the top level and cis-acting eQTLs perturbing the expression of each allele at the bottom level. We derive a transformation of this model that allows efficient learning for large-scale human data. Our analysis of the GTEx and LG×SM advanced intercross line mouse data for multiple tissue types with CiTruss provides new insights into the genetics of gene regulation. CiTruss revealed that gene networks consist of local subnetworks over proximally located genes and global subnetworks over genes scattered across the genome, and that several aspects of gene regulation by eQTLs such as the impact of genetic diversity, pleiotropy, tissue-specific gene regulation, and local and long-range linkage disequilibrium among eQTLs can be explained through these local and global subnetworks.
4
Network reconstruction for trans acting genetic loci using multi-omics data and prior information. Genome Med 2022; 14:125. [PMID: 36344995 PMCID: PMC9641770 DOI: 10.1186/s13073-022-01124-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 10/11/2022] [Indexed: 11/09/2022]
Abstract
BACKGROUND: Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view on biological systems and their integration is crucial for gaining insights into complex regulatory processes. These data can be used to explain disease-related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor and use of biological prior information can improve network inference. However, previous efforts were limited in the types of priors used or have only been applied to model systems. In this study, we reconstruct the regulatory networks underlying trans-QTL hotspots using human cohort data and data-driven prior information. METHODS: We devised a new strategy to integrate QTL with human population scale multi-omics data. State-of-the-art network inference methods including BDgraph and glasso were applied to these data. Comprehensive prior information to guide network inference was manually curated from large-scale biological databases. The inference approach was extensively benchmarked using simulated data and cross-cohort replication analyses. Best performing methods were subsequently applied to real-world human cohort data. RESULTS: Our benchmarks showed that prior-based strategies outperform methods without prior information in simulated data and show better replication across datasets. Application of our approach to human cohort data highlighted two novel regulatory networks related to schizophrenia and lean body mass for which we generated novel functional hypotheses. CONCLUSIONS: We demonstrate that existing biological knowledge can improve the integrative analysis of networks underlying trans associations and generate novel hypotheses about regulatory mechanisms.
5
A consensus map for quality traits in durum wheat based on genome-wide association studies and detection of ortho-meta QTL across cereal species. Front Genet 2022; 13:982418. [PMID: 36110219 PMCID: PMC9468538 DOI: 10.3389/fgene.2022.982418] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/30/2022] [Accepted: 07/21/2022] [Indexed: 11/13/2022]
Abstract
The present work focused on the identification of durum wheat QTL hotspots from a collection of genome-wide association studies for quality traits, such as grain protein content and composition, yellow color, fiber, grain microelement content (iron, magnesium, potassium, selenium, sulfur, calcium, cadmium), kernel vitreousness, semolina, and dough quality tests. For the first time, a total of 10 GWAS, comprising 395 marker-trait associations (MTA) on 57 quality traits, with more than 1,500 genotypes from 9 association panels, were used to investigate consensus QTL hotspots representative of a wide range of durum wheat genetic variation. MTA were distributed across all chromosomes of the A and B genomes, with a minimum of 15 MTA on chromosome 5B and a maximum of 45 on chromosome 7A, and an average of 28 MTA per chromosome. The MTA were almost equally distributed between the A (48%) and B (52%) genomes and allowed the identification of 94 QTL hotspots. Synteny maps for QTL were also constructed for Zea mays, Brachypodium, and Oryza sativa, and candidate gene identification allowed the association of genes involved in biological processes playing a major role in the control of quality traits.
6
Integration of eQTL Analysis and GWAS Highlights Regulation Networks in Cotton under Stress Condition. Int J Mol Sci 2022; 23:ijms23147564. [PMID: 35886912 PMCID: PMC9324452 DOI: 10.3390/ijms23147564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 07/02/2022] [Accepted: 07/05/2022] [Indexed: 12/04/2022]
Abstract
Cotton (genus Gossypium) is one of the most economically important crops in the world. Here, we used RNA-seq to quantify gene expression in a collection of G. arboreum seedlings and performed eGWAS on 28,382 expressed genes. We identified a total of 30,089 eQTLs in 10,485 genes, of which >90% regulated their target genes in trans. Using luciferase assays, we confirmed that different cis-eQTL haplotypes could affect promoter activity. We found ~6600 genes associated with ~1300 eQTL hotspots. Moreover, hotspot 309 regulates the expression of 325 genes with roles in stem length, fresh weight, seed germination rate, and genes related to cell wall biosynthesis and salt stress. A transcriptome-wide association study (TWAS) identified 19 candidate genes associated with cotton growth and the salt stress response. The variation in gene expression across the population played an essential role in population differentiation. Only a small number of the differentially expressed genes between South China, the Yangtze River region, and the Yellow River region sites were located in different chromosomal regions. The eQTLs found across the duplicated gene pairs showed conservative cis- or trans-regulation, and the expression levels of gene pairs were correlated. This study provides new insights into the evolution of gene expression regulation in cotton, and identifies eQTLs in stress-related genes for use in breeding improved cotton varieties.
7
The Expression Quantitative Trait Loci in Immune Response Genes Impact the Characteristics and Survival of Colorectal Cancer. Diagnostics (Basel) 2022; 12:diagnostics12020315. [PMID: 35204406 PMCID: PMC8871427 DOI: 10.3390/diagnostics12020315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Revised: 01/24/2022] [Accepted: 01/24/2022] [Indexed: 02/05/2023]
Abstract
The impact of germline variants on the regulation of the expression of tumor microenvironment (TME)-based immune response genes remains unclear. Expression quantitative trait loci (eQTL) provide insight into the effect of downstream target genes (eGenes) regulated by germline-associated variants (eVariants). Through eQTL analyses, we illustrated the relationships between germline eVariants, TME-based immune response eGenes, and clinical outcomes. In this study, both RNA sequencing data from primary tumor and germline whole-genome sequencing data were collected from patients with stage III colorectal cancer (CRC). Ninety-nine high-risk subjects were subjected to immune response gene expression analyses. Seventy-seven subjects remained for further analysis after quality control, of which twenty-two patients (28.5%) experienced tumor recurrence. We found that 65 eQTL, including 60 germline eVariants and 22 TME-based eGenes, impacted the survival of cancer patients. For the recurrence prediction model, 41 differentially expressed genes (DEGs) achieved the best area under the receiver operating characteristic curve of 0.93. In total, 19 survival-associated eGenes were identified among the DEGs. Most of these genes were related to the regulation of lymphocytes and cytokines. A high expression of HGF, CCR5, IL18, FCER1G, TDO2, IFITM2, and LAPTM5 was significantly associated with a poor prognosis. In addition, the FCER1G eGene was associated with tumor invasion, tumor nodal stage, and tumor site. The eVariants that regulate the TME-based expression of FCER1G, including rs2118867 and rs12124509, were determined to influence survival and chromatin binding preferences. We also demonstrated that FCER1G and co-expressed genes in TME were related to the aggregation of leukocytes via pathway analysis. 
By analyzing the eQTL from the cancer genome using germline variants and TME-based RNA sequencing, we identified the eQTL in immune response genes that impact colorectal cancer characteristics and survival.
8
Dissection of quantitative trait loci in the Lachancea waltii yeast species highlights major hotspots. G3 (Bethesda, Md.) 2021; 11:jkab242. [PMID: 34544138 PMCID: PMC8496267 DOI: 10.1093/g3journal/jkab242] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2021] [Accepted: 07/06/2021] [Indexed: 11/30/2022]
Abstract
Dissecting the genetic basis of complex traits remains a real challenge. The budding yeast Saccharomyces cerevisiae has become a model organism for studying quantitative traits, successfully increasing our knowledge in many aspects. However, the exploration of the genotype-phenotype relationship in non-model yeast species could provide a deeper insight into the genetic basis of complex traits. Here, we have studied this relationship in the Lachancea waltii species, which diverged from the S. cerevisiae lineage prior to the whole-genome duplication. By performing linkage mapping analyses in this species, we identified 86 quantitative trait loci (QTL) impacting growth in a large number of conditions. The distribution of these loci across the genome has revealed two major QTL hotspots. The first hotspot corresponds to a general growth QTL, impacting a wide range of conditions. By contrast, the second hotspot highlighted a trade-off, with an allele that is disadvantageous in drug-free conditions but proved to be advantageous in the presence of several drugs. Finally, we compared the QTL detected in L. waltii with those previously identified for the same traits in a closely related species, Lachancea kluyveri. This analysis clearly showed the absence of shared QTL across these species. Altogether, our results represent a first step toward the exploration of the genetic architecture of quantitative traits across different yeast species.
9
Ghost QTL and hotspots in experimental crosses: novel approach for modeling polygenic effects. Genetics 2021; 217:6067404. [PMID: 33789342 DOI: 10.1093/genetics/iyaa041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2020] [Accepted: 12/10/2020] [Indexed: 11/14/2022]
Abstract
Ghost quantitative trait loci (QTL) are false discoveries in QTL mapping that arise due to the "accumulation" of the polygenic effects, uniformly distributed over the genome. The locations on the chromosome that are strongly correlated with the total of the polygenic effects depend on a specific sample correlation structure determined by the genotypes at all loci. The problem is particularly severe when the same genotypes are used to study multiple QTL, e.g. using recombinant inbred lines or studying the expression QTL. In this case, the ghost QTL phenomenon can lead to false hotspots, where multiple QTL show apparent linkage to the same locus. We illustrate the problem using the classic backcross design and suggest that it can be solved by the application of the extended mixed effect model, where the random effects are allowed to have a nonzero mean. We provide formulas for estimating the thresholds for the corresponding t-test statistics and use them in the stepwise selection strategy, which allows for a simultaneous detection of several QTL. Extensive simulation studies illustrate that our approach eliminates ghost QTL/false hotspots, while preserving a high power of true QTL detection.
10
A statistical framework for QTL hotspot detection. G3: Genes, Genomes, Genetics 2021; 11:6151767. [PMID: 33638985 PMCID: PMC8049418 DOI: 10.1093/g3journal/jkab056] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/04/2021] [Accepted: 02/11/2021] [Indexed: 11/13/2022]
Abstract
Quantitative trait loci (QTL) hotspots (genomic locations enriched in QTL) are a common and notable feature when collecting many QTL for various traits in many areas of biological studies. QTL hotspots are important and attractive since they are highly informative and may harbor genes for the quantitative traits. Current statistical methods for QTL hotspot detection use either the individual-level data from genetical genomics experiments or the summarized data from public QTL databases. These methods may ignore the correlation structure among traits, neglect the magnitude of LOD scores for the QTL, or incur a very high computational cost, which often lead to the detection of excessive spurious hotspots, failure to discover biologically interesting hotspots composed of a small-to-moderate number of QTL with strong LOD scores, and computational intractability, respectively. In this article, we describe a statistical framework for QTL hotspot detection that can handle both types of data and address all of these problems at once. Our statistical framework operates directly on the QTL matrix, and hence has a very low computational cost, and takes advantage of the QTL mapping results to assist the detection analysis. Two special devices, trait grouping and the top γn,α profile, are introduced into the framework. The trait grouping attempts to group the traits controlled by closely linked or pleiotropic QTL together into the same trait groups and randomly allocates these QTL together across the genomic positions separately by trait group to account for the correlation structure among traits, so as to have the ability to obtain much stricter thresholds and dismiss spurious hotspots. 
The top γn,α profile is designed to outline the LOD-score pattern of QTL in a hotspot across the different hotspot architectures, so that it can serve to identify and characterize the types of QTL hotspots with varying sizes and LOD-score distributions. Real examples, numerical analysis, and simulation study are performed to validate our statistical framework, investigate the detection properties, and also compare with the current methods in QTL hotspot detection. The results demonstrate that the proposed statistical framework can effectively accommodate the correlation structure among traits, identify the types of hotspots, and still keep the notable features of easy implementation and fast computation for practical QTL hotspot detection.
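The idea of operating directly on a QTL matrix can be sketched with a naive permutation threshold. This is an assumption-laden baseline, not the framework described in the abstract: it shuffles each trait's QTL positions independently, so it ignores the correlation structure among traits that the trait-grouping device is designed to handle, and it ignores LOD-score magnitudes. The function name `hotspot_threshold`, the bin layout, and the simulated data are all hypothetical.

```python
import numpy as np

def hotspot_threshold(qtl_matrix, n_perm=200, alpha=0.05, seed=0):
    """Naive permutation threshold for the number of QTL per genomic bin.

    qtl_matrix: (traits x bins) array with 1 where a trait has a QTL in a bin.
    Each trait's row is shuffled independently, the largest per-bin count is
    recorded for every permutation, and the (1 - alpha) quantile is returned.
    """
    rng = np.random.default_rng(seed)
    max_counts = np.empty(n_perm)
    for p in range(n_perm):
        permuted = np.array([rng.permutation(row) for row in qtl_matrix])
        max_counts[p] = permuted.sum(axis=0).max()
    return np.quantile(max_counts, 1 - alpha)

# Simulated QTL matrix (hypothetical): 300 traits, 50 bins, one QTL per trait.
rng = np.random.default_rng(1)
n_traits, n_bins = 300, 50
positions = rng.integers(0, n_bins, n_traits)
positions[:40] = 10                      # plant a hotspot: 40 traits share bin 10
qtl = np.zeros((n_traits, n_bins), dtype=int)
qtl[np.arange(n_traits), positions] = 1

counts = qtl.sum(axis=0)                 # QTL per bin
threshold = hotspot_threshold(qtl)
hotspots = np.flatnonzero(counts > threshold)
print("threshold:", threshold, "hotspot bins:", hotspots)
```

Because the rows are permuted independently, correlated traits would inflate bin counts relative to this null, which is one way such a baseline can report spurious hotspots.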
11
Integrative genomic analysis of blood pressure and related phenotypes in rats. Dis Model Mech 2021; 14:dmm048090. [PMID: 34010951 PMCID: PMC8188887 DOI: 10.1242/dmm.048090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/26/2020] [Accepted: 03/23/2021] [Indexed: 12/12/2022]
Abstract
Despite remarkable progress made in human genome-wide association studies, there remains a substantial gap between statistical evidence for genetic associations and functional comprehension of the underlying mechanisms governing these associations. As a means of bridging this gap, we performed genomic analysis of blood pressure (BP) and related phenotypes in spontaneously hypertensive rats (SHR) and their substrain, stroke-prone SHR (SHRSP), both of which are unique genetic models of severe hypertension and cardiovascular complications. By integrating whole-genome sequencing, transcriptome profiling, genome-wide linkage scans (maximum n=1415), fine congenic mapping (maximum n=8704), pharmacological intervention and comparative analysis with transcriptome-wide association study (TWAS) datasets, we searched for causal genes and causal pathways for the tested traits. The overall results validated the polygenic architecture of elevated BP compared with a non-hypertensive control strain, Wistar Kyoto rats (WKY); e.g. inter-strain BP differences between SHRSP and WKY could be largely explained by an aggregate of BP changes in seven SHRSP-derived consomic strains. We identified 26 potential target genes, including rat homologs of human TWAS loci, for the tested traits. In this study, we re-discovered 18 genes that had previously been determined to contribute to hypertension or cardiovascular phenotypes. Notably, five of these genes belong to the kallikrein-kinin/renin-angiotensin systems (KKS/RAS), in which the most prominent differential expression between hypertensive and non-hypertensive alleles could be detected in rat Klk1 paralogs. In combination with a pharmacological intervention, we provide in vivo experimental evidence supporting the presence of key disease pathways, such as KKS/RAS, in a rat polygenic hypertension model.
12
Integrated Genome-Wide Analysis of MicroRNA Expression Quantitative Trait Loci in Pig Longissimus Dorsi Muscle. Front Genet 2021; 12:644091. [PMID: 33859669 PMCID: PMC8042294 DOI: 10.3389/fgene.2021.644091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/19/2020] [Accepted: 02/24/2021] [Indexed: 01/19/2023]
Abstract
Determining mechanisms regulating complex traits in pigs is essential to improve the production efficiency of this globally important protein source. MicroRNAs (miRNAs) are a class of non-coding RNAs known to post-transcriptionally regulate gene expression affecting numerous phenotypes, including those important to the pig industry. To facilitate a more comprehensive understanding of the regulatory mechanisms controlling growth, carcass composition, and meat quality phenotypes in pigs, we integrated miRNA and gene expression data from longissimus dorsi muscle samples with genotypic and phenotypic data from the same animals. We identified 23 miRNA expression Quantitative Trait Loci (miR-eQTL) at the genome-wide level and examined their potential effects on these important production phenotypes through miRNA target prediction, correlation, and colocalization analyses. One miR-eQTL miRNA, miR-874, has target genes that colocalize with phenotypic QTL for 12 production traits across the genome including backfat thickness, dressing percentage, muscle pH at 24 h post-mortem, and cook yield. The results of our study reveal genomic regions underlying variation in miRNA expression and identify miRNAs and genes for future validation of their regulatory effects on traits of economic importance to the global pig industry.
13
Network Analysis Prioritizes DEWAX and ICE1 as the Candidate Genes for Major eQTL Hotspots in Seed Germination of Arabidopsis thaliana. G3: Genes, Genomes, Genetics 2020; 10:4215-4226. [PMID: 32963085 PMCID: PMC7642920 DOI: 10.1534/g3.120.401477] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
Abstract
Seed germination is characterized by a constant change of gene expression across different time points. These changes are related to specific processes, which eventually determine the onset of seed germination. To get a better understanding of the regulation of gene expression during seed germination, we performed quantitative trait locus mapping of gene expression (eQTL) at four important seed germination stages (primary dormant, after-ripened, six hours after imbibition, and radicle protrusion) using Arabidopsis thaliana Bay x Sha recombinant inbred lines (RILs). The mapping displayed the distinctness of the eQTL landscape for each stage. We found several eQTL hotspots across stages associated with the regulation of expression of a large number of genes. Interestingly, an eQTL hotspot on chromosome five collocates with hotspots for phenotypic and metabolic QTL in the same population. Finally, we constructed a gene co-expression network to prioritize the regulatory genes for two major eQTL hotspots. The network analysis prioritizes transcription factors DEWAX and ICE1 as the most likely regulatory genes for these hotspots. Together, we have revealed that the genetic regulation of gene expression is dynamic along the course of seed germination.
14
The Gene scb-1 Underlies Variation in Caenorhabditis elegans Chemotherapeutic Responses. G3: Genes, Genomes, Genetics 2020; 10:2353-2364. [PMID: 32385045 PMCID: PMC7341127 DOI: 10.1534/g3.120.401310] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/14/2022]
Abstract
Pleiotropy, the concept that a single gene controls multiple distinct traits, is prevalent in most organisms and has broad implications for medicine and agriculture. The identification of the molecular mechanisms underlying pleiotropy has the power to reveal previously unknown biological connections between seemingly unrelated traits. Additionally, the discovery of pleiotropic genes increases our understanding of both genetic and phenotypic complexity by characterizing novel gene functions. Quantitative trait locus (QTL) mapping has been used to identify several pleiotropic regions in many organisms. However, gene knockout studies are needed to eliminate the possibility of tightly linked, non-pleiotropic loci. Here, we use a panel of 296 recombinant inbred advanced intercross lines of Caenorhabditis elegans and a high-throughput fitness assay to identify a single large-effect QTL on the center of chromosome V associated with variation in responses to eight chemotherapeutics. We validate this QTL with near-isogenic lines and pair genome-wide gene expression data with drug response traits to perform mediation analysis, leading to the identification of a pleiotropic candidate gene, scb-1, for some of the eight chemotherapeutics. Using deletion strains created by genome editing, we show that scb-1, which was previously implicated in response to bleomycin, also underlies responses to other double-strand DNA break-inducing chemotherapeutics. This finding provides new evidence for the role of scb-1 in the nematode drug response and highlights the power of mediation analysis to identify causal genes.
15
Abstract
Background: MicroRNAs (miRNAs) are important regulators of gene expression and may influence phenotypes and disease traits. The connection between genetics and miRNA expression can be determined through expression quantitative trait loci (eQTL) analysis, which has been extensively used in a variety of tissues, and in both human and model organisms. miRNAs play an important role in brain-related diseases, but eQTL studies of miRNA in brain tissue are limited. We aim to catalog miRNA eQTL in brain tissue using miRNA expression measured on a recombinant inbred mouse panel. Because samples were collected without any intervention or treatment (naïve), the panel allows characterization of genetic influences on miRNA expression levels. We used brain RNA expression levels of 881 miRNAs and 1416 genomic locations to identify miRNA eQTL. To address multiple testing, we employed permutation p-values and subsequent zero permutation p-value correction. We also investigated the underlying biology of miRNA regulation using additional analyses, including hotspot analysis to search for regions controlling multiple miRNAs, and Bayesian network analysis to identify scenarios where a miRNA mediates the association between genotype and mRNA expression. We used addiction-related phenotypes to illustrate the utility of our results. Results: Thirty-eight miRNA eQTL were identified after appropriate multiple testing corrections. Ten of these miRNAs had target genes enriched for brain-related pathways and mapped to four miRNA eQTL hotspots. Bayesian network analysis revealed four biological networks relating genetic variation, miRNA expression and gene expression. Conclusions: Our extensive evaluation of miRNA eQTL provides valuable insight into the role of miRNA regulation in brain tissue. Our miRNA eQTL analysis and extended statistical exploration identify miRNA candidates in brain for future study.
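A permutation p-value, and the reason a zero value needs correcting, can be sketched as follows. The abstract does not say which correction was applied; the (b + 1) / (B + 1) estimator used here is a common choice and is an assumption, as are the function name `permutation_pvalue`, the correlation test statistic, and the simulated data.

```python
import numpy as np

def permutation_pvalue(genotype, expression, n_perm=999, seed=0):
    """Permutation test of association between one marker and one expression trait.

    Returns (raw, corrected): raw = b / B can be exactly zero when no permuted
    statistic reaches the observed one; corrected = (b + 1) / (B + 1) never is.
    """
    rng = np.random.default_rng(seed)
    observed = abs(np.corrcoef(genotype, expression)[0, 1])
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(expression)   # break the genotype-expression link
        if abs(np.corrcoef(genotype, permuted)[0, 1]) >= observed:
            exceed += 1
    raw = exceed / n_perm
    corrected = (exceed + 1) / (n_perm + 1)
    return raw, corrected

# Simulated marker with a strong effect (hypothetical values).
rng = np.random.default_rng(2)
genotype = np.repeat([0.0, 1.0], 30)             # 60 strains, two alleles
expression = 2.0 * genotype + rng.normal(0, 1.0, 60)

raw, corrected = permutation_pvalue(genotype, expression)
print("raw:", raw, "corrected:", corrected)
```

With a finite number of permutations, a strongly associated marker will often give a raw estimate of zero, which cannot be carried into downstream multiple-testing procedures that take logarithms or rank p-values; the corrected estimate is bounded below by 1 / (B + 1).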
|
16
|
A systems genetics analysis in Eucalyptus reveals coordination of metabolic pathways associated with xylan modification in wood-forming tissues. THE NEW PHYTOLOGIST 2019; 223:1952-1972. [PMID: 31144333 DOI: 10.1111/nph.15972] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/17/2018] [Accepted: 05/01/2019] [Indexed: 06/09/2023]
Abstract
Acetyl- and methylglucuronic acid decorations of xylan, the dominant hemicellulose in secondary cell walls (SCWs) of woody dicots, affect its interaction with cellulose and lignin to determine SCW structure and extractability. Genes and pathways involved in these modifications may be targets for genetic engineering; however, little is known about the regulation of xylan modifications in woody plants. To address this, we assessed genetic and gene expression variation associated with xylan modification in developing xylem of Eucalyptus grandis × Eucalyptus urophylla interspecific hybrids. Expression quantitative trait locus (eQTL) mapping identified potential regulatory polymorphisms affecting gene expression modules associated with xylan modification. We identified 14 putative xylan modification genes that are members of five expression modules sharing seven trans-eQTL hotspots. The xylan modification genes are prevalent in two expression modules. The first comprises nucleotide sugar interconversion pathways supplying the essential precursors for cellulose and xylan biosynthesis. The second contains genes responsible for phenylalanine biosynthesis and S-adenosylmethionine biosynthesis required for glucuronic acid and monolignol methylation. Co-expression and co-regulation analyses also identified four metabolic sources of acetyl coenzyme A that appear to be transcriptionally coordinated with xylan modification. Our systems genetics analysis may provide new avenues for metabolic engineering to alter wood SCW biology for enhanced biomass processability.
|
17
|
Dissecting the genetic architecture of a stepwise infection process. Mol Ecol 2019; 28:3942-3957. [PMID: 31283079 DOI: 10.1111/mec.15166] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2019] [Revised: 06/27/2019] [Accepted: 06/28/2019] [Indexed: 02/06/2023]
Abstract
How a host fights infection depends on an ordered sequence of steps, beginning with attempts to prevent a pathogen from establishing an infection, through to steps that mitigate a pathogen's control of host resources or minimize the damage caused during infection. Yet empirically characterizing the genetic basis of these steps remains challenging. Although each step is likely to have a unique genetic and environmental signature, and may therefore respond to selection in different ways, events that occur earlier in the infection process can mask or overwhelm the contributions of subsequent steps. In this study, we dissect the genetic architecture of a stepwise infection process using a quantitative trait locus (QTL) mapping approach. We control for variation at the first line of defence against a bacterial pathogen and expose downstream genetic variability related to the host's ability to mitigate the damage pathogens cause. In our model, the water-flea Daphnia magna, we found a single major effect QTL, explaining 64% of the variance, that is linked to the host's ability to completely block pathogen entry by preventing their attachment to the host oesophagus; this is consistent with the detection of this locus in previous studies. In susceptible hosts allowing attachment, however, a further 23 QTLs, explaining between 5% and 16% of the variance, were mapped to traits related to the expression of disease. The general lack of pleiotropy and epistasis for traits related to the different stages of the infection process, together with the wide distribution of QTLs across the genome, highlights the modular nature of a host's defence portfolio, and the potential for each different step to evolve independently. We discuss how isolating the genetic basis of individual steps can help to resolve discussion over the genetic architecture of host resistance.
|
18
|
Conserved properties of genetic architecture of renal and fat transcriptomes in rat models of insulin resistance. Dis Model Mech 2019; 12:dmm.038539. [PMID: 31213483 PMCID: PMC6679378 DOI: 10.1242/dmm.038539] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2019] [Accepted: 05/20/2019] [Indexed: 12/19/2022] Open
Abstract
To define renal molecular mechanisms that are affected by permanent hyperglycaemia and might promote phenotypes relevant to diabetic nephropathy, we carried out linkage analysis of genome-wide gene transcription in the kidneys of F2 offspring from the Goto-Kakizaki (GK) rat model of type 2 diabetes and normoglycaemic Brown Norway (BN) rats. We mapped 2526 statistically significant expression quantitative trait loci (eQTLs) in the cross. More than 40% of eQTLs mapped in the close vicinity of the linked transcripts, underlying possible cis-regulatory mechanisms of gene expression. We identified eQTL hotspots on chromosomes 5 and 9 regulating the expression of 80-165 genes, sex or cross direction effects, and enriched metabolic and immunological processes by segregating GK alleles. Comparative analysis with adipose tissue eQTLs in the same cross showed that 496 eQTLs, in addition to the top enriched biological pathways, are conserved in the two tissues. Extensive similarities in eQTLs mapped in the GK rat and in the spontaneously hypertensive rat (SHR) suggest a common aetiology of disease phenotypes common to the two strains, including insulin resistance, which is a prominent pathophysiological feature in both GK rats and SHRs. Our data shed light on shared and tissue-specific molecular mechanisms that might underlie aetiological aspects of insulin resistance in the context of spontaneously occurring hyperglycaemia and hypertension. Summary: Kidney and fat expression QTL mapping in rat models of spontaneously occurring insulin resistance associated with either diabetes or hypertension reveals conserved gene expression regulation, suggesting shared aetiology of disease phenotypes.
|
19
|
A Statistical Procedure for Genome-Wide Detection of QTL Hotspots Using Public Databases with Application to Rice. G3-GENES GENOMES GENETICS 2019; 9:439-452. [PMID: 30541929 PMCID: PMC6385979 DOI: 10.1534/g3.118.200922] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Genome-wide detection of quantitative trait loci (QTL) hotspots underlying variation in many molecular and phenotypic traits has been a key step in various biological studies since the QTL hotspots are highly informative and can be linked to the genes for the quantitative traits. Several statistical methods have been proposed to detect QTL hotspots. These hotspot detection methods rely heavily on permutation tests performed on summarized QTL data or individual-level data (with genotypes and phenotypes) from the genetical genomics experiments. In this article, we propose a statistical procedure for QTL hotspot detection by using the summarized QTL (interval) data collected in public web-accessible databases. First, a simple statistical method based on the uniform distribution is derived to convert the QTL interval data into the expected QTL frequency (EQF) matrix. And then, to account for the correlation structure among traits, the QTL for correlated traits are grouped together into the same categories to form a reduced EQF matrix. Furthermore, a permutation algorithm on the EQF elements or on the QTL intervals is developed to compute a sliding scale of EQF thresholds, ranging from strict to liberal, for assessing the significance of QTL hotspots. With grouping, much stricter thresholds can be obtained to avoid the detection of spurious hotspots. Real example analysis and simulation study are carried out to illustrate our procedure, evaluate the performances and compare with other methods. It shows that our procedure can control the genome-wide error rates at the target levels, provide appropriate thresholds for correlated data and is comparable to the methods using individual-level data in hotspot detection. Depending on the thresholds used, more than 100 hotspots are detected in GRAMENE rice database. We also perform a genome-wide comparative analysis of the detected hotspots and the known genes collected in the Rice Q-TARO database. 
The comparative analysis reveals that the hotspots and genes are conformable in the sense that they co-localize closely and are functionally related to relevant traits. Our statistical procedure can provide a framework for exploring the networks among QTL hotspots, genes and quantitative traits in biological studies. The R codes that produce both numerical and graphical outputs of QTL hotspot detection in the genome are available on the worldwide web http://www.stat.sinica.edu.tw/chkao/.
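The first step described in this abstract, converting QTL intervals into an expected QTL frequency (EQF) matrix under a uniform distribution, can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names, the single-chromosome setup, the bin width, and the handling of zero-length intervals are assumptions. Each QTL is assumed to lie uniformly within its reported interval, so a bin receives the fraction of the interval that overlaps it.

```python
import math


def eqf_row(intervals, chrom_length, bin_width):
    """Expected QTL frequency per bin for one trait on one chromosome.

    intervals: list of (start, end) QTL intervals, in the same units as
    chrom_length (for example cM). Each QTL contributes a total mass of 1,
    spread over the bins in proportion to overlap (uniform assumption).
    """
    n_bins = math.ceil(chrom_length / bin_width)
    row = [0.0] * n_bins
    for start, end in intervals:
        length = end - start
        if length <= 0:
            # Assumed handling of a point QTL: all mass goes to its bin.
            idx = min(int(start // bin_width), n_bins - 1)
            row[idx] += 1.0
            continue
        for j in range(n_bins):
            lo = j * bin_width
            hi = min(lo + bin_width, chrom_length)
            overlap = max(0.0, min(end, hi) - max(start, lo))
            row[j] += overlap / length
    return row


def hotspot_scores(eqf_matrix):
    """Column sums of the EQF matrix: expected number of QTL per bin."""
    return [sum(col) for col in zip(*eqf_matrix)]


# Two hypothetical traits on a 4-unit chromosome with bins of width 1.
trait_a = eqf_row([(0.5, 2.5)], chrom_length=4.0, bin_width=1.0)
trait_b = eqf_row([(1.0, 2.0)], chrom_length=4.0, bin_width=1.0)
scores = hotspot_scores([trait_a, trait_b])
```

Bins with high column sums are hotspot candidates; in the procedure described above, their significance would then be judged against permutation-derived EQF thresholds, which this sketch does not implement.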
|
20
|
Shared Genomic Regions Underlie Natural Variation in Diverse Toxin Responses. Genetics 2018; 210:1509-1525. [PMID: 30341085 PMCID: PMC6283156 DOI: 10.1534/genetics.118.301311] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2018] [Accepted: 10/16/2018] [Indexed: 01/25/2023] Open
Abstract
Phenotypic complexity is caused by the contributions of environmental factors and multiple genetic loci, interacting or acting independently. Studies of yeast and Arabidopsis often find that the majority of natural variation across phenotypes is attributable to independent additive quantitative trait loci (QTL). Detected loci in these organisms explain most of the estimated heritable variation. By contrast, many heritable components underlying phenotypic variation in metazoan models remain undetected. Before the relative impacts of additive and interactive variance components on metazoan phenotypic variation can be dissected, high replication and precise phenotypic measurements are required to obtain sufficient statistical power to detect loci contributing to this missing heritability. Here, we used a panel of 296 recombinant inbred advanced intercross lines of Caenorhabditis elegans and a high-throughput fitness assay to detect loci underlying responses to 16 different toxins, including heavy metals, chemotherapeutic drugs, pesticides, and neuropharmaceuticals. Using linkage mapping, we identified 82 QTL that underlie variation in responses to these toxins, and predicted the relative contributions of additive loci and genetic interactions across various growth parameters. Additionally, we identified three genomic regions that impact responses to multiple classes of toxins. These QTL hotspots could represent common factors impacting toxin responses. We went further to generate near-isogenic lines and chromosome substitution strains, and then experimentally validated these QTL hotspots, implicating additive and interactive loci that underlie toxin-response variation.
|
21
|
Temporal genetic association and temporal genetic causality methods for dissecting complex networks. Nat Commun 2018; 9:3980. [PMID: 30266904 PMCID: PMC6162292 DOI: 10.1038/s41467-018-06203-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2017] [Accepted: 08/23/2018] [Indexed: 12/27/2022] Open
Abstract
A large amount of panomic data has been generated in populations for understanding causal relationships in complex biological systems. Both genetic and temporal models can be used to establish causal relationships among molecular, cellular, or phenotypical traits, but with limitations. To fully utilize high-dimension temporal and genetic data, we develop a multivariate polynomial temporal genetic association (MPTGA) approach for detecting temporal genetic loci (teQTLs) of quantitative traits monitored over time in a population and a temporal genetic causality test (TGCT) for inferring causal relationships between traits linked to the locus. We apply MPTGA and TGCT to simulated data sets and a yeast F2 population in response to rapamycin, and demonstrate increased power to detect teQTLs. We identify a teQTL hotspot locus interacting with rapamycin treatment, infer putative causal regulators of the teQTL hotspot, and experimentally validate RRD1 as the causal regulator for this teQTL hotspot.
|
22
|
Abstract
The majority of gene loci that have been associated with type 2 diabetes play a role in pancreatic islet function. To evaluate the role of islet gene expression in the etiology of diabetes, we sensitized a genetically diverse mouse population with a Western diet high in fat (45% kcal) and sucrose (34%) and carried out genome-wide association mapping of diabetes-related phenotypes. We quantified mRNA abundance in the islets and identified 18,820 expression QTL. We applied mediation analysis to identify candidate causal driver genes at loci that affect the abundance of numerous transcripts. These include two genes previously associated with monogenic diabetes (PDX1 and HNF4A), as well as three genes with nominal association with diabetes-related traits in humans (FAM83E, IL6ST, and SAT2). We grouped transcripts into gene modules and mapped regulatory loci for modules enriched with transcripts specific for α-cells, and another specific for δ-cells. However, no single module enriched for β-cell-specific transcripts, suggesting heterogeneity of gene expression patterns within the β-cell population. A module enriched in transcripts associated with branched-chain amino acid metabolism was the most strongly correlated with physiological traits that reflect insulin resistance. Although the mice in this study were not overtly diabetic, the analysis of pancreatic islet gene expression under dietary-induced stress enabled us to identify correlated variation in groups of genes that are functionally linked to diabetes-associated physiological traits. Our analysis suggests an expected degree of concordance between diabetes-associated loci in the mouse and those found in human populations, and demonstrates how the mouse can provide evidence to support nominal associations found in human genome-wide association mapping.
|
23
|
Systems genomics study reveals expression quantitative trait loci, regulator genes and pathways associated with boar taint in pigs. PLoS One 2018; 13:e0192673. [PMID: 29438444 PMCID: PMC5811030 DOI: 10.1371/journal.pone.0192673] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2017] [Accepted: 01/29/2018] [Indexed: 01/14/2023] Open
Abstract
Boar taint is an offensive odour and/or taste from a proportion of non-castrated male pigs caused by skatole and androstenone accumulation during sexual maturity. Castration is widely used to avoid boar taint but is currently under debate because of animal welfare concerns. This study aimed to identify expression quantitative trait loci (eQTLs) with potential effects on boar taint compounds to improve breeding possibilities for reduced boar taint. Danish Landrace male boars with low, medium and high genetic merit for skatole and human nose score (HNS) were slaughtered at ~100 kg. Gene expression profiles were obtained by RNA-Seq, and genotype data were obtained by an Illumina 60K Porcine SNP chip. Following quality control and filtering, 10,545 and 12,731 genes from liver and testis were included in the eQTL analysis, together with 20,827 SNP variants. A total of 205 and 109 single-tissue eQTLs associated with 102 and 58 unique genes were identified in liver and testis, respectively. By employing a multivariate Bayesian hierarchical model, 26 eQTLs were identified as significant multi-tissue eQTLs. The highest densities of eQTLs were found on pig chromosomes SSC12, SSC1, SSC13, SSC9 and SSC14. Functional characterisation of eQTLs revealed functions within regulation of androgen and the intracellular steroid hormone receptor signalling pathway and of xenobiotic metabolism by cytochrome P450 system and cellular response to oestradiol. A QTL enrichment test revealed 89 QTL traits curated by the Animal Genome PigQTL database to be significantly overlapped by the genomic coordinates of cis-acting eQTLs. Finally, a subset of 35 cis-acting eQTLs overlapped with known boar taint QTL traits. These eQTLs could be useful in the development of a DNA test for boar taint but careful monitoring of other overlapping QTL traits should be performed to avoid any negative consequences of selection.
|
24
|
Genetical genomics of quality related traits in potato tubers using proteomics. BMC PLANT BIOLOGY 2018; 18:20. [PMID: 29361908 PMCID: PMC5781343 DOI: 10.1186/s12870-018-1229-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/18/2017] [Accepted: 01/09/2018] [Indexed: 05/21/2023]
Abstract
BACKGROUND Recent advances in ~omics technologies such as transcriptomics, metabolomics and proteomics along with genotypic profiling have permitted the genetic dissection of complex traits such as quality traits in non-model species. To get more insight into the genetic factors underlying variation in quality traits related to carbohydrate and starch metabolism and cold sweetening, we determined the protein content and composition in potato tubers using 2D-gel electrophoresis in a diploid potato mapping population. During the analysis, proteins from the patatin family were excluded to ensure a better representation of the other proteins. RESULTS We subsequently performed pQTL analyses for all other proteins with a sufficient representation in the population and established a relationship between proteins and 26 potato tuber quality traits (e.g. flesh colour, enzymatic discoloration) by co-localization on the genetic map and a direct correlation study of protein abundances and phenotypic traits. Over 1643 unique protein spots were detected in total over the two harvests. We were able to map pQTLs for over 300 different protein spots, some of which co-localized with traits such as starch content and cold sweetening. pQTLs were observed on every chromosome, although they were not evenly distributed over the chromosomes. The largest number of pQTLs was found for chromosome 8 and the lowest for chromosome 10. For some 20 protein spots, multiple QTLs were observed. CONCLUSIONS From this analysis, hotspot areas for protein QTLs were identified on chromosomes 3, 5, 8 and 9. The hotspot on chromosome 3 coincided with a QTL previously identified for total protein content and had more than 23 pQTLs in the region from 70 to 80 cM. Some of the co-localizing protein spots associated with some of the most interesting tuber quality traits were identified, albeit far fewer than we had anticipated at the onset of the experiments.
|
25
|
Abstract
BACKGROUND The genetics underlying body mass and growth are key to understanding a wide range of topics in biology, both evolutionary and developmental. Body mass and growth traits are affected by many genetic variants of small effect. This complicates genetic mapping of growth and body mass. Experimental intercrosses between individuals from divergent populations allow us to map naturally occurring genetic variants for selected traits, such as body mass, by linkage mapping. By simultaneously measuring traits and intermediary molecular phenotypes, such as gene expression, one can use integrative genomics to search for potential causative genes. RESULTS In this study, we use a linkage mapping approach to map growth traits (N = 471) and liver gene expression (N = 130) in an advanced intercross of wild Red Junglefowl and domestic White Leghorn layer chickens. We find 16 loci for growth traits, and 1463 loci for liver gene expression, as measured by microarrays. Of these, the genes TRAK1, OSBPL8, YEATS4, CEP55, and PIP4K2B are identified as strong candidates for growth loci in the chicken. We also show a high degree of sex-specific gene-regulation, with almost every gene expression locus exhibiting sex-interactions. Finally, several trans-regulatory hotspots were found, one of which coincides with a major growth locus. CONCLUSIONS These findings not only serve to identify several strong candidates affecting growth, but also show how sex-specificity and local gene-regulation affect growth regulation in the chicken.
|
26
|
Trait Mapping Approaches Through Linkage Mapping in Plants. PLANT GENETICS AND MOLECULAR BIOLOGY 2018; 164:53-82. [DOI: 10.1007/10_2017_49] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/11/2023]
|
27
|
Time-course expression QTL-atlas of the global transcriptional response of wheat to Fusarium graminearum. PLANT BIOTECHNOLOGY JOURNAL 2017; 15:1453-1464. [PMID: 28332274 PMCID: PMC5633761 DOI: 10.1111/pbi.12729] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/01/2016] [Revised: 01/11/2017] [Accepted: 03/16/2017] [Indexed: 05/09/2023]
Abstract
Fusarium head blight is a devastating disease of small grain cereals such as bread wheat (Triticum aestivum). The pathogen switches from a biotrophic to a necrotrophic lifestyle in the course of disease development, forcing its host to adapt its defence strategies. Using a genetical genomics approach, we illustrate genome-wide reconfigurations of genetic control over transcript abundances between two decisive time points after inoculation with the causative pathogen Fusarium graminearum. Whole transcriptome measurements have been recorded for 163 lines of a wheat doubled haploid population segregating for several resistance genes, yielding 15 552 eQTL at 30 h and 15 888 eQTL at 50 h after inoculation. The genetic map saturated with transcript abundance-derived markers identified a novel QTL on chromosome 6A, besides the previously reported QTL Fhb1 and Qfhs.ifa-5A. We find a highly different distribution of eQTL between time points, with about 40% of eQTL being unique for the respective assessed time points. Moreover, for more than 20% of genes governed by eQTL at either time point, genetic control changes over time. These changes are reflected in the dynamic compositions of three major regulatory hotspots on chromosomes 2B, 4A and 5A. In particular, control of defence-related biological mechanisms concentrated in the hotspot at 4A shifts to hotspot 2B as the disease progresses. Hotspots do not colocalize with phenotypic QTL, and within their intervals no higher than expected number of eQTL was detected. Thus, resistance conferred by either QTL is mediated by few or single genes.
|
28
|
A Gene Module-Based eQTL Analysis Prioritizing Disease Genes and Pathways in Kidney Cancer. Comput Struct Biotechnol J 2017; 15:463-470. [PMID: 29158875 PMCID: PMC5683705 DOI: 10.1016/j.csbj.2017.09.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2017] [Revised: 09/16/2017] [Accepted: 09/24/2017] [Indexed: 12/17/2022] Open
Abstract
Clear cell renal cell carcinoma (ccRCC) is the most common and most aggressive form of renal cell cancer (RCC). The incidence of RCC has increased steadily in recent years. The pathogenesis of renal cell cancer remains poorly understood. Many of the tumor suppressor genes, oncogenes, and dysregulated pathways in ccRCC need to be revealed for improvement of the overall clinical outlook of the disease. Here, we developed a systems biology approach to prioritize the somatic mutated genes that lead to dysregulation of pathways in ccRCC. The method integrated multi-layer information to infer causative mutations and disease genes. First, we identified differential gene modules in ccRCC by coupling transcriptome and protein-protein interactions. Each of these modules consisted of interacting genes that were involved in similar biological processes and their combined expression alterations were significantly associated with disease type. Then, subsequent gene module-based eQTL analysis revealed somatic mutated genes that had driven the expression alterations of differential gene modules. Our study yielded a list of candidate disease genes, including several known ccRCC causative genes such as BAP1 and PBRM1, as well as novel genes such as NOD2, RRM1, CSRNP1, SLC4A2, TTLL1 and CNTN1. The differential gene modules and their driver genes revealed by our study provided a new perspective for understanding the molecular mechanisms underlying the disease. Moreover, we validated the results in independent ccRCC patient datasets. Our study provided a new method for prioritizing disease genes and pathways.
Key Words
- AUC, Area Under Curve
- Causative mutation
- DEG, Differentially expressed gene
- DGM, Differential gene module
- Gene module
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- Pathways
- Protein-protein interaction
- RCC, Renal cell cancer
- ROC, Receiver Operating Characteristic
- SVM, Support vector machine
- TCGA, The Cancer Genome Atlas
- ccRCC
- ccRCC, Clear cell renal cell carcinoma
- eQTL
- eQTL, Expression quantitative trait loci
|
29
|
Dynamic Role of trans Regulation of Gene Expression in Relation to Complex Traits. Am J Hum Genet 2017; 100:571-580. [PMID: 28285768 DOI: 10.1016/j.ajhg.2017.02.003] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2016] [Accepted: 02/01/2017] [Indexed: 11/29/2022] Open
Abstract
Identifying causal genetic variants and understanding their mechanisms of effect on traits remains a challenge in genome-wide association studies (GWASs). In particular, how genetic variants (i.e., trans-eQTLs) affect expression of remote genes (i.e., trans-eGenes) remains unknown. We hypothesized that some trans-eQTLs regulate expression of distant genes by altering the expression of nearby genes (cis-eGenes). Using published GWAS datasets with 39,165 single-nucleotide polymorphisms (SNPs) associated with 1,960 traits, we explored whole blood gene expression associations of trait-associated SNPs in 5,257 individuals from the Framingham Heart Study. We identified 2,350 trans-eQTLs (at p < 10⁻⁷); more than 80% of them were found to have cis-associated eGenes. Mediation testing suggested that for 35% of trans-eQTL-trans-eGene pairs in different chromosomes and 90% of pairs in the same chromosome, the disease-associated SNP may alter expression of the trans-eGene via cis-eGene expression. In addition, we identified 13 trans-eQTL hotspots, affecting from ten to hundreds of genes, suggesting the existence of master genetic regulators. Using causal inference testing, we searched causal variants across eight cardiometabolic traits (BMI, systolic and diastolic blood pressure, LDL cholesterol, HDL cholesterol, total cholesterol, triglycerides, and fasting blood glucose) and identified several cis-eGenes (ALDH2 for systolic and diastolic blood pressure, MCM6 and DARS for total cholesterol, and TRIB1 for triglycerides) that were causal mediators for the corresponding traits, as well as examples of trans-mediators (TAGAP for LDL cholesterol). The finding of extensive evidence of genome-wide mediation effects suggests a critical role of cryptic gene regulation underlying many disease traits.
|
30
|
Efficient inference for genetic association studies with multiple outcomes. Biostatistics 2017; 18:618-636. [DOI: 10.1093/biostatistics/kxx007] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2016] [Accepted: 02/06/2017] [Indexed: 02/04/2023] Open
Abstract
SUMMARY
Combined inference for heterogeneous high-dimensional data is critical in modern biology, where clinical and various kinds of molecular data may be available from a single study. Classical genetic association studies regress a single clinical outcome on many genetic variants one by one, but there is an increasing demand for joint analysis of many molecular outcomes and genetic variants in order to unravel functional interactions. Unfortunately, most existing approaches to joint modeling are either too simplistic to be powerful or are impracticable for computational reasons. Inspired by Richardson and others (2010, Bayesian Statistics 9), we consider a sparse multivariate regression model that allows simultaneous selection of predictors and associated responses. As Markov chain Monte Carlo (MCMC) inference on such models can be prohibitively slow when the number of genetic variants exceeds a few thousand, we propose a variational inference approach which produces posterior information very close to that of MCMC inference, at a much reduced computational cost. Extensive numerical experiments show that our approach outperforms popular variable selection methods and tailored Bayesian procedures, dealing within hours with problems involving hundreds of thousands of genetic variants and tens to hundreds of clinical or molecular outcomes.
|
31
|
AraQTL - workbench and archive for systems genetics in Arabidopsis thaliana. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2017; 89:1225-1235. [PMID: 27995664 DOI: 10.1111/tpj.13457] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/07/2016] [Revised: 11/24/2016] [Accepted: 12/06/2016] [Indexed: 06/06/2023]
Abstract
Genetical genomics studies uncover genome-wide genetic interactions between genes and their transcriptional regulators. High-throughput measurement of gene expression in recombinant inbred line populations has enabled investigation of the genetic architecture of variation in gene expression. This has the potential to enrich our understanding of the molecular mechanisms affected by and underlying natural variation. Moreover, it contributes to the systems biology of natural variation, as a substantial number of experiments have resulted in a valuable amount of interconnectable phenotypic, molecular and genotypic data. A number of genetical genomics studies have been published for Arabidopsis thaliana, uncovering many expression quantitative trait loci (eQTLs). However, these complex data are not easily accessible to the plant research community, leaving most of the valuable genetic interactions unexplored as cross-analysis of these studies is a major effort. We address this problem with AraQTL (http://www.bioinformatics.nl/AraQTL/), an easily accessible workbench and database for comparative analysis and meta-analysis of all published Arabidopsis eQTL datasets. AraQTL provides a workbench for comparing, re-using and extending upon the results of these experiments. For example, one can easily screen a physical region for specific local eQTLs that could harbour candidate genes for phenotypic QTLs, or detect gene-by-environment interactions by comparing eQTLs under different conditions.
|
32
|
Regulatory Architecture of Gene Expression Variation in the Threespine Stickleback Gasterosteus aculeatus. G3-GENES GENOMES GENETICS 2017; 7:165-178. [PMID: 27836907 PMCID: PMC5217106 DOI: 10.1534/g3.116.033241] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022]
Abstract
Much adaptive evolutionary change is underlain by mutational variation in regions of the genome that regulate gene expression rather than in the coding regions of the genes themselves. An understanding of the role of gene expression variation in facilitating local adaptation will be aided by an understanding of underlying regulatory networks. Here, we characterize the genetic architecture of gene expression variation in the threespine stickleback (Gasterosteus aculeatus), an important model in the study of adaptive evolution. We collected transcriptomic and genomic data from 60 half-sib families using an expression microarray and genotyping-by-sequencing, and located expression quantitative trait loci (eQTL) underlying the variation in gene expression in liver tissue using an interval mapping approach. We identified eQTL for several thousand expression traits. Expression was influenced by polymorphism in both cis- and trans-regulatory regions. Trans-eQTL clustered into hotspots. We did not identify master transcriptional regulators in hotspot locations: rather, the presence of hotspots may be driven by complex interactions between multiple transcription factors. One observed hotspot colocated with a QTL recently found to underlie salinity tolerance in the threespine stickleback. However, most other observed hotspots did not colocate with regions of the genome known to be involved in adaptive divergence between marine and freshwater habitats.
|
33
|
eQTLs Regulating Transcript Variations Associated with Rapid Internode Elongation in Deepwater Rice. FRONTIERS IN PLANT SCIENCE 2017; 8:1753. [PMID: 29081784 PMCID: PMC5645499 DOI: 10.3389/fpls.2017.01753] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/18/2017] [Accepted: 09/25/2017] [Indexed: 05/09/2023]
Abstract
To avoid low oxygen, oxygen deficiency or oxygen deprivation, deepwater rice cultivated in floodplains can develop elongated internodes in response to submergence. Knowledge of the gene regulatory networks underlying rapid internode elongation is important for an understanding of the evolution and adaptation of major crops in response to flooding. To elucidate the genetic and molecular basis controlling their deepwater response, we used microarrays and performed expression quantitative trait loci (eQTL) and phenotypic QTL (phQTL) analyses of internode samples of 85 recombinant inbred line (RIL) populations of non-deepwater (Taichung 65)- and deepwater rice (Bhadua). After evaluating the phenotypic response of the RILs exposed to submergence, confirming the genotypes of the populations, and generating 188 genetic markers, we identified 10,047 significant eQTLs comprising 2,902 cis-eQTLs and 7,145 trans-eQTLs, and three significant eQTL hotspots on chromosomes 1, 4, and 12 that affect the expression of many genes. The hotspots on chromosomes 1 and 4 were located at different positions from the phQTLs detected in this study and in previous studies. We then regarded the eQTL hotspots as key regulatory points to infer causal regulatory networks of deepwater response including rapid internode elongation. Our results suggest that the downstream regulation of the eQTL hotspots on chromosomes 1 and 4 is independent, and that the target genes are partially regulated by SNORKEL1 and SNORKEL2 genes (SK1/2), key ethylene response factors. Subsequent bioinformatic analyses, including gene ontology-based annotation, functional enrichment analysis and promoter enrichment analysis, help to enhance our understanding of SK1/2-dependent and -independent pathways. One remarkable observation is that the functional categories related to photosynthesis and light signaling are significantly over-represented in the candidate target genes of SK1/2. The combined results of these investigations together with genetical genomics approaches using structured populations with a deepwater response are also discussed in the context of current molecular models concerning the rapid internode elongation in deepwater rice. This study provides new insights into the underlying genetic architecture of gene expression regulating the response to flooding in deepwater rice and will be an important community resource for analyses on the genetic basis of deepwater responses.
|
34
|
Abstract
INTRODUCTION Seed germination is inherently related to seed metabolism, which changes throughout its maturation, desiccation and germination processes. The metabolite content of a seed and its ability to germinate are determined by underlying genetic architecture and environmental effects during development. OBJECTIVE This study aimed to assess an integrative approach to explore genetics modulating seed metabolism in different developmental stages and the link between seed metabolic- and germination traits. METHODS We have utilized gas chromatography-time-of-flight/mass spectrometry (GC-TOF/MS) metabolite profiling to characterize tomato seeds during dry and imbibed stages. We describe, for the first time in tomato, the use of a so-called generalized genetical genomics (GGG) model to study the interaction between genetics, environment and seed metabolism using 100 tomato recombinant inbred lines (RILs) derived from a cross between Solanum lycopersicum and Solanum pimpinellifolium. RESULTS QTLs were found for over two-thirds of the metabolites within several QTL hotspots. The transition from dry to 6 h imbibed seeds was associated with programmed metabolic switches. Significant correlations varied among individual metabolites and the obtained clusters were significantly enriched for metabolites involved in specific biochemical pathways. CONCLUSIONS Extensive genetic variation in metabolite abundance was uncovered. Numerous identified genetic regions that coordinate groups of metabolites were detected and these will contain plausible candidate genes. The combined analysis of germination phenotypes and metabolite profiles provides a strong indication for the hypothesis that metabolic composition is related to germination phenotypes and thus to seed performance.
|
35
|
A METHYLATION-TO-EXPRESSION FEATURE MODEL FOR GENERATING ACCURATE PROGNOSTIC RISK SCORES AND IDENTIFYING DISEASE TARGETS IN CLEAR CELL KIDNEY CANCER. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2017; 22:509-520. [PMID: 27897002 PMCID: PMC5177986 DOI: 10.1142/9789813207813_0047] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/18/2022]
Abstract
Many researchers now have available multiple high-dimensional molecular and clinical datasets when studying a disease. As we enter this multi-omic era of data analysis, new approaches that combine different levels of data (e.g. at the genomic and epigenomic levels) are required to fully capitalize on this opportunity. In this work, we outline a new approach to multi-omic data integration, which combines molecular and clinical predictors as part of a single analysis to create a prognostic risk score for clear cell renal cell carcinoma. The approach integrates data in multiple ways and yet creates models that are relatively straightforward to interpret and with a high level of performance. Furthermore, the proposed process of data integration captures relationships in the data that represent highly disease-relevant functions.
|
36
|
Abstract
Systems genetics stems from systems biology and similarly employs integrative modeling approaches to describe the perturbations and phenotypic effects observed in a complex system. However, in the case of systems genetics the main source of perturbation is naturally occurring genetic variation, which can be analyzed at the systems-level to explain the observed variation in phenotypic traits. In contrast with conventional single-variant association approaches, the success of systems genetics has been in the identification of gene networks and molecular pathways that underlie complex disease. In addition, systems genetics has proven useful in the discovery of master trans-acting genetic regulators of functional networks and pathways, which in many cases revealed unexpected gene targets for disease. Here we detail the central components of a fully integrated systems genetics approach to complex disease, starting from assessment of genetic and gene expression variation, linking DNA sequence variation to mRNA (expression QTL mapping), gene regulatory network analysis and mapping the genetic control of regulatory networks. By summarizing a few illustrative (and successful) examples, we highlight how different data-modeling strategies can be effectively integrated in a systems genetics study.
|
37
|
Deciphering the regulation of porcine genes influencing growth, fatness and yield-related traits through genetical genomics. Mamm Genome 2016; 28:130-142. [PMID: 27942838 DOI: 10.1007/s00335-016-9674-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 09/05/2016] [Accepted: 11/25/2016] [Indexed: 10/20/2022]
Abstract
Genetical genomics approaches aim at identifying quantitative trait loci for molecular traits, also known as intermediate phenotypes, such as gene expression, that could link variation in genetic information to physiological traits. In the current study, an expression GWAS has been carried out on an experimental Iberian × Landrace backcross in order to identify the genomic regions regulating the gene expression of those genes whose expression is correlated with growth, fat deposition, and premium cut yield measures in pig. The analyses were conducted exploiting Porcine 60K SNP BeadChip genotypes and Porcine Expression Microarray data hybridized on mRNA from Longissimus dorsi muscle. In order to focus the analysis on productive traits and reduce the number of analyses, only those probesets whose expression showed significant correlation with at least one of the seven phenotypes of interest were selected for the eGWAS. A total of 63 eQTL regions were identified with effects on 36 different transcripts. Those eQTLs overlapping with phenotypic QTLs on SSC4, SSC9, SSC13, and SSC17 chromosomes previously detected in the same animal material were further analyzed. Moreover, candidate genes and SNPs were analyzed. Among the most promising results, a long non-coding RNA, ALDBSSCG0000001928, was identified, whose expression is correlated with premium cut yield. Association analysis and in silico sequence domain annotation support TXNRD3 polymorphisms as candidate to regulate ALDBSSCG0000001928 expression, which can be involved in the transcriptional regulation of surrounding genes, affecting productive and meat quality traits.
|
38
|
Interference with ethylene perception at receptor level sheds light on auxin and transcriptional circuits associated with the climacteric ripening of apple fruit (Malus x domestica Borkh.). THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2016; 88:963-975. [PMID: 27531564 DOI: 10.1111/tpj.13306] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 04/15/2016] [Revised: 08/09/2016] [Accepted: 08/11/2016] [Indexed: 05/08/2023]
Abstract
Apple (Malus x domestica Borkh.) is a model species for studying the metabolic changes that occur at the onset of ripening in fruit crops, and the physiological mechanisms that are governed by the hormone ethylene. In this study, to dissect the climacteric interplay in apple, a multidisciplinary approach was employed. To this end, a comprehensive analysis of gene expression together with the investigation of several physiological entities (texture, volatilome and content of polyphenolic compounds) was performed throughout fruit development and ripening. The transcriptomic profiling was conducted with two microarray platforms: a dedicated custom array (iRIPE) and a whole genome array specifically enriched with ripening-related genes for apple (WGAA). The transcriptomic and phenotypic changes following the application of 1-methylcyclopropene (1-MCP), an ethylene inhibitor leading to important modifications in overall fruit physiology, were also highlighted. The integrative comparative network analysis showed both negative and positive correlations between ripening-related transcripts and the accumulation of specific metabolites or texture components. The ripening distortion caused by the inhibition of ethylene perception, in addition to affecting the ethylene pathway, stimulated the de-repression of auxin-related genes, transcription factors and photosynthetic genes. Overall, the comprehensive repertoire of results obtained here advances the elucidation of the multi-layered climacteric mechanism of fruit ripening, thus suggesting a possible transcriptional circuit governed by hormones and transcription factors.
|
39
|
Transcriptome Profiling in Rat Inbred Strains and Experimental Cross Reveals Discrepant Genetic Architecture of Genome-Wide Gene Expression. G3-GENES GENOMES GENETICS 2016; 6:3671-3683. [PMID: 27646706 PMCID: PMC5100866 DOI: 10.1534/g3.116.033274] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 01/13/2023]
Abstract
To test the impact of genetic heterogeneity on cis- and trans-mediated mechanisms of gene expression regulation, we profiled the transcriptome of adipose tissue in 20 inbred congenic strains derived from diabetic Goto-Kakizaki (GK) rats and Brown-Norway (BN) controls, which contain well-defined blocks (1-183 Mb) of genetic polymorphisms, and in 123 genetically heterogeneous rats of a (GK × BN)F2 offspring. Within each congenic we identified 73-1351 differentially expressed genes (DEGs), only 7.7% of which mapped within the congenic blocks, and which may be regulated in cis. The remainder localized outside the blocks, and therefore must be regulated in trans. Most trans-regulated genes exhibited approximately twofold expression changes, consistent with monoallelic expression. Altered biological pathways were replicated between congenic strains sharing blocks of genetic polymorphisms, but polymorphisms at different loci also had redundant effects on transcription of common distant genes and pathways. We mapped 2735 expression quantitative trait loci (eQTL) in the F2 cross, including 26% predominantly cis-regulated genes, which validated DEGs in congenic strains. A hotspot of >300 eQTL in a 10 cM region of chromosome 1 was enriched in DEGs in a congenic strain. However, many DEGs among GK, BN and congenic strains did not replicate as eQTL in F2 hybrids, demonstrating distinct mechanisms of gene expression when alleles segregate in an outbred population or are fixed homozygous across the entire genome or in short genomic regions. Our analysis provides conceptual advances in our understanding of the complex architecture of genome expression and pathway regulation, and suggests a prominent impact of epistasis and monoallelic expression on gene transcription.
|
40
|
Association of a Network of Interferon-Stimulated Genes with a Locus Encoding a Negative Regulator of Non-conventional IKK Kinases and IFNB1. Cell Rep 2016; 17:425-435. [PMID: 27705791 DOI: 10.1016/j.celrep.2016.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2015] [Revised: 08/11/2016] [Accepted: 09/02/2016] [Indexed: 11/25/2022]
Abstract
Functional genomic analysis of gene expression in mice allowed us to identify a quantitative trait locus (QTL) linked in trans to the expression of 190 gene transcripts and in cis to the expression of only two genes, one of which was Ypel5. Most of the trans-expression QTL genes were interferon-stimulated genes (ISGs), and their expression in mouse macrophage cell lines was stimulated in an IFNB1-dependent manner by Ypel5 silencing. In human HEK293T cells, YPEL5 silencing enhanced the induction of IFNB1 by pattern recognition receptors and phosphorylation of TBK1/IKBKE kinases, whereas co-immunoprecipitation experiments revealed that YPEL5 interacted physically with IKBKE. We thus found that the Ypel5 gene (contained in a locus linked to a network of ISGs in mice) is a negative regulator of IFNB1 production and innate immune responses that interacts functionally and physically with TBK1/IKBKE kinases.
|
41
|
Abstract
Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease.
|
42
|
Genetics of water use physiology in locally adapted Arabidopsis thaliana. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2016; 251:12-22. [PMID: 27593459 DOI: 10.1016/j.plantsci.2016.03.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 01/07/2016] [Revised: 03/26/2016] [Accepted: 03/28/2016] [Indexed: 06/06/2023]
Abstract
Identifying the genetic basis of adaptation to climate has long been a goal in evolutionary biology and has applications in agriculture. Adaptation to drought represents one important aspect of local adaptation, and drought is the major factor limiting agricultural yield. We examined local adaptation between Swedish and Italian Arabidopsis thaliana ecotypes, whose local environments show contrasting levels of water availability. To identify quantitative trait loci (QTL) controlling water use physiology traits and adaptive trait QTL (genomic regions where trait QTL and fitness QTL colocalize), we performed QTL mapping on 374 F9 recombinant inbred lines in well-watered and terminal drought conditions. We found 72 QTL (32 in well-watered, 31 in drought, 9 for plasticity) across five water use physiology traits: δ13C, rosette area, dry rosette weight, leaf water content and percent leaf nitrogen. Some of these genomic regions colocalize with fitness QTL and with other physiology QTL in defined hotspots. In addition, we found evidence of both constitutive and inducible water use physiology QTL. Finally, we identified highly divergent candidate genes in silico. Our results suggest that many genes with minor effects may influence adaptation through water use physiology and that pleiotropic water use physiology QTL have fitness consequences.
|
43
|
Hypothalamic transcriptomes of 99 mouse strains reveal trans eQTL hotspots, splicing QTLs and novel non-coding genes. eLife 2016; 5. [PMID: 27623010 PMCID: PMC5053804 DOI: 10.7554/elife.15614] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 02/27/2016] [Accepted: 09/12/2016] [Indexed: 12/19/2022]
Abstract
Previous studies had shown that the integration of genome wide expression profiles, in metabolic tissues, with genetic and phenotypic variance, provided valuable insight into the underlying molecular mechanisms. We used RNA-Seq to characterize hypothalamic transcriptome in 99 inbred strains of mice from the Hybrid Mouse Diversity Panel (HMDP), a reference resource population for cardiovascular and metabolic traits. We report numerous novel transcripts supported by proteomic analyses, as well as novel non coding RNAs. High resolution genetic mapping of transcript levels in HMDP, reveals both local and trans expression Quantitative Trait Loci (eQTLs) demonstrating 2 trans eQTL 'hotspots' associated with expression of hundreds of genes. We also report thousands of alternative splicing events regulated by genetic variants. Finally, comparison with about 150 metabolic and cardiovascular traits revealed many highly significant associations. Our data provide a rich resource for understanding the many physiologic functions mediated by the hypothalamus and their genetic regulation. DOI:http://dx.doi.org/10.7554/eLife.15614.001 Metabolism is a term that describes all the chemical reactions that are involved in keeping a living organism alive. Diseases related to metabolism – such as obesity, heart disease and diabetes – are a major health problem in the Western world. The causes of these diseases are complex and include both environmental factors, such as diet and exercise, and genetics. Indeed, many genetic variants that contribute to obesity have been uncovered in both humans and mice. However, it is only dimly understood how these genetic variants affect the underlying networks of interacting genes that cause metabolic disorders. Measuring gene activity or expression, and tracing how genetic instructions are carried from DNA into RNA and proteins, can reliably identify groups of genes that correlate with metabolic traits in specific organs. 
This strategy was successfully used in previous studies to reveal new information about abnormalities linked to obesity in specific tissues such as the liver and fat tissues. It was also shown that this approach might suggest new molecules that could be targeted to treat metabolic disorders. A brain region called the hypothalamus is key to the control of metabolism, including feeding behavior and obesity. Hasin-Brumshtein et al. set out to explore gene expression in the hypothalamus of 99 different strains of mice, in the hope that the data will help identify new connections between gene expression and metabolism. This approach showed that thousands of new and known genes are expressed in the mouse hypothalamus, some of which coded for proteins, and some of which did not. Hasin-Brumshtein et al. uncovered two genetic variants that controlled the expression of hundreds of other genes. Further analysis then revealed thousands of genetic variants that regulated the expression of, and type of RNA (so-called "spliceforms") produced from neighboring genes. Also, the expression of many individual genes showed significant similarities with about 150 metabolic measurements that had been evaluated previously in the mice. This new dataset is a unique resource that can be coupled with different approaches to test existing ideas and develop new ones about the role of particular genes or genetic mechanisms in obesity. Future studies will likely focus on new genes that show strong associations with attributes that are relevant to metabolic disorders, such as insulin levels, weight and fat mass. DOI:http://dx.doi.org/10.7554/eLife.15614.002
|
44
|
Molecular QTL discovery incorporating genomic annotations using Bayesian false discovery rate control. Ann Appl Stat 2016. [DOI: 10.1214/16-aoas952] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 11/19/2022]
|
45
|
Systems genetics reveals key genetic elements of drought induced gene regulation in diploid potato. PLANT, CELL & ENVIRONMENT 2016; 39:1895-1908. [PMID: 27353051 DOI: 10.1111/pce.12744] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 11/30/2015] [Revised: 03/01/2016] [Accepted: 03/03/2016] [Indexed: 06/06/2023]
Abstract
In plants, tolerance to drought stress is a result of numerous minor effect loci in which transcriptional regulation contributes significantly to the observed phenotypes. Under severe drought conditions, a major expression quantitative trait locus (eQTL) hotspot was identified on chromosome five in potato. A putative Nuclear factor y subunit C4 was identified as a key candidate in the regulatory cascade in response to drought. Further investigation of the eQTL hotspots suggests a role for a putative Homeobox leucine zipper protein 12 in relation to drought in potato. Genes strongly co-expressed with Homeobox leucine zipper protein 12 were plant growth regulators responsive to water deficit stress in Arabidopsis thaliana, implying a possible conserved mechanism. Integrative analysis of genetic, genomic, phenotypic and transcriptomic data provided insights into the downstream functional components of the drought response. The abscisic acid- and environmental stress-inducible protein TAS14 was highly induced by severe drought in potato and acts as a reliable biomarker for the level of stress perceived by the plant. The systems genetics approach supported a role for multiple genes responsive to severe drought stress in Solanum tuberosum. The combination of gene regulatory networks, expression quantitative trait loci mapping and phenotypic analysis proved useful for candidate gene selection.
|
46
|
Quantitative trait gene Slit2 positively regulates murine hematopoietic stem cell numbers. Sci Rep 2016; 6:31412. [PMID: 27503415 PMCID: PMC4977545 DOI: 10.1038/srep31412] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/16/2016] [Accepted: 07/21/2016] [Indexed: 12/30/2022]
Abstract
Hematopoietic stem cells (HSC) demonstrate natural variation in number and function. The genetic factors responsible for the variations (or quantitative traits) are largely unknown. We previously identified a gene whose differential expression underlies the natural variation of HSC numbers in C57BL/6 (B6) and DBA/2 (D2) mice. We now report the finding of another gene, Slit2, on chromosome 5 that also accounts for variation in HSC number. In reciprocal chromosome 5 congenic mice, introgressed D2 alleles increased HSC numbers, whereas B6 alleles had the opposite effect. Using gene array and quantitative polymerase chain reaction, we identified Slit2 as a quantitative trait gene whose expression was positively correlated with the number of HSCs. Ectopic expression of Slit2 not only increased the number of the long-term colony forming HSCs, but also enhanced their repopulation capacity upon transplantation. Therefore, Slit2 is a novel quantitative trait gene and a positive regulator of the number and function of murine HSCs. This finding suggests that Slit2 may be a potential therapeutic target for the effective in vitro and in vivo expansion of HSCs without compromising normal hematopoiesis.
|
47
|
Recent Perspective of Next Generation Sequencing: Applications in Molecular Plant Biology and Crop Improvement. ACTA ACUST UNITED AC 2016. [DOI: 10.1007/s40011-016-0770-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/11/2022]
|
48
|
Genomic architecture of phenotypic divergence between two hybridizing plant species along an elevational gradient. AOB PLANTS 2016; 8:plw022. [PMID: 27083198 PMCID: PMC4887755 DOI: 10.1093/aobpla/plw022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/06/2015] [Accepted: 03/19/2016] [Indexed: 05/03/2023]
Abstract
Knowledge of the genetic basis of phenotypic divergence between species and how such divergence is caused and maintained is crucial to an understanding of speciation and the generation of biodiversity. The hybrid zone between Senecio aethnensis and S. chrysanthemifolius on Mount Etna, Sicily, provides a well-studied example of species divergence in response to conditions at different elevations, despite hybridization and gene flow. Here, we investigate the genetic architecture of divergence between these two species using a combination of quantitative trait locus (QTL) mapping and genetic differentiation measures based on genetic marker analysis. A QTL architecture characterized by physical QTL clustering, epistatic interactions between QTLs, and pleiotropy was identified, and is consistent with the presence of divergent QTL complexes resistant to gene flow. A role for divergent selection between species was indicated by significant negative associations between levels of interspecific genetic differentiation at mapped marker gene loci and map distance from QTLs and hybrid incompatibility loci. Within-species selection contributing to interspecific differentiation was evidenced by negative associations between interspecific genetic differentiation and genetic diversity within species. These results show that the two Senecio species, while subject to gene flow, maintain divergent genomic regions consistent with local selection within species and selection against hybrids between species which, in turn, contribute to the maintenance of their distinct phenotypic differences.
|
49
|
POEM: Identifying Joint Additive Effects on Regulatory Circuits. Front Genet 2016; 7:48. [PMID: 27148351 PMCID: PMC4835676 DOI: 10.3389/fgene.2016.00048] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/17/2015] [Accepted: 03/17/2016] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Expression quantitative trait locus (eQTL) mapping tackles the problem of identifying variations in DNA sequence that affect the transcriptional regulatory network. Major computational efforts are aimed at characterizing the joint effects of several eQTLs acting in concert to govern the expression of the same genes. Yet progress toward a comprehensive prediction of such joint effects is limited. For example, existing eQTL methods commonly discover interacting loci affecting the expression levels of a module of co-regulated genes. Such "modularization" approaches, however, focus on epistatic relations and thus have limited utility for additive (non-epistatic) effects. RESULTS: Here we present POEM (Pairwise effect On Expression Modules), a methodology for identifying pairwise eQTL effects on gene modules. POEM is specifically designed to achieve high performance in the case of additive joint effects. We applied POEM to transcription profiles measured in bone marrow-derived dendritic cells across a population of genotyped mice. Our study reveals widespread additive, trans-acting pairwise effects on gene modules, characterizes their organizational principles, and highlights high-order interconnections between modules within the immune signaling network. These analyses elucidate the central role of additive pairwise effects in regulatory circuits and provide computational tools for future investigations into the interplay between eQTLs. AVAILABILITY: The software described in this article is available at csgi.tau.ac.il/POEM/.
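The distinction between additive and epistatic pairwise effects can be made concrete with a small sketch. This is not the POEM algorithm; it is a generic illustration that assumes two biallelic loci coded 0/1 (as in inbred lines) and a single expression trait. The function name `pairwise_effects` and the toy data are hypothetical. Under a purely additive model the interaction contrast of the four genotype-class means is zero; a nonzero contrast indicates epistasis.

```python
from statistics import mean

def pairwise_effects(g1, g2, expr):
    """Return (additive effect of locus 1, additive effect of locus 2,
    interaction contrast) from the four two-locus genotype-class means.
    Assumes genotypes are coded 0/1 (hypothetical coding for illustration)."""
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for a, b, y in zip(g1, g2, expr):
        cells[(a, b)].append(y)
    if any(not values for values in cells.values()):
        raise ValueError("every two-locus genotype class needs at least one sample")
    m = {key: mean(values) for key, values in cells.items()}
    # Main effect of each locus, averaged over the other locus.
    add1 = ((m[1, 0] + m[1, 1]) - (m[0, 0] + m[0, 1])) / 2
    add2 = ((m[0, 1] + m[1, 1]) - (m[0, 0] + m[1, 0])) / 2
    # Departure from additivity: zero when the two effects simply sum.
    inter = m[1, 1] - m[1, 0] - m[0, 1] + m[0, 0]
    return add1, add2, inter

g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
additive = pairwise_effects(g1, g2, [1.0, 3.0, 2.0, 4.0])   # (1.0, 2.0, 0.0)
epistatic = pairwise_effects(g1, g2, [1.0, 1.0, 1.0, 4.0])  # (1.5, 1.5, 3.0)
```

In the first toy trait the two loci contribute independently, so the interaction contrast vanishes; in the second, expression rises only when both loci carry allele 1, which is the epistatic case that modularization approaches target.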
|
50
|
The Dissection of Expression Quantitative Trait Locus Hotspots. Genetics 2016; 202:1563-74. [PMID: 26837753 DOI: 10.1534/genetics.115.183624] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 10/09/2015] [Accepted: 01/27/2016] [Indexed: 02/03/2023]
Abstract
Studies of the genetic loci that contribute to variation in gene expression frequently identify loci with broad effects on gene expression: expression quantitative trait locus hotspots. We describe a set of exploratory graphical methods as well as a formal likelihood-based test for assessing whether a given hotspot is due to one or multiple polymorphisms. We first look at the pattern of effects of the locus on the expression traits that map to the locus: the direction of the effects and the degree of dominance. A second technique is to focus on the individuals that exhibit no recombination event in the region, apply dimensionality reduction (e.g., with linear discriminant analysis), and compare the phenotype distribution in the nonrecombinant individuals to that in the recombinant individuals: if the recombinant individuals display a different expression pattern than the nonrecombinant individuals, this indicates the presence of multiple causal polymorphisms. In the formal likelihood-based test, we compare a two-locus model, with each expression trait affected by one or the other locus, to a single-locus model. We apply our methods to a large mouse intercross with gene expression microarray data on six tissues.
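The idea behind comparing a two-locus model with a single-locus model can be sketched as follows. This is a deliberately simplified illustration, not the authors' test: it assumes two fixed candidate markers with numeric genotype codes, treats the expression traits as independent Gaussian responses, and ignores the search over positions and any residual correlation among traits. All function names are hypothetical.

```python
import math

def rss(x, y):
    """Residual sum of squares of the least-squares line y ~ a + b*x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    fitted = sxy ** 2 / sxx if sxx > 0 else 0.0
    return max(syy - fitted, 1e-12)  # floor avoids log(0) on a perfect fit

def gaussian_loglik(x, y):
    """Maximized Gaussian log-likelihood of y ~ a + b*x, constants dropped."""
    n = len(y)
    return -0.5 * n * math.log(rss(x, y) / n)

def lod_two_vs_one(geno_a, geno_b, traits):
    """LOD-style score: each trait assigned to its better marker (two-locus
    model) versus all traits assigned to one shared marker (single-locus)."""
    ll_a = [gaussian_loglik(geno_a, y) for y in traits]
    ll_b = [gaussian_loglik(geno_b, y) for y in traits]
    single = max(sum(ll_a), sum(ll_b))
    two = sum(max(a, b) for a, b in zip(ll_a, ll_b))
    return (two - single) / math.log(10)

# Toy data: two uncorrelated markers and small fixed noise.
geno_a = [0, 0, 0, 1, 1, 1, 0, 1]
geno_b = [0, 1, 0, 1, 0, 1, 1, 0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0, 0.05]
trait_a1 = [2 * g + e for g, e in zip(geno_a, noise)]
trait_a2 = [3 * g + e for g, e in zip(geno_a, noise)]
trait_b1 = [2 * g + e for g, e in zip(geno_b, noise)]

same_locus = lod_two_vs_one(geno_a, geno_b, [trait_a1, trait_a2])
split_loci = lod_two_vs_one(geno_a, geno_b, [trait_a1, trait_b1])
```

When every trait maps to the same marker the score is zero, because the two-locus assignment collapses to the single-locus one; when traits split between the markers the score is positive. A real analysis would also need a significance threshold, which this sketch does not provide.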
|