1
Long E, Williams J, Zhang H, Choi J. An evolving understanding of multiple causal variants underlying genetic association signals. Am J Hum Genet 2025; 112:741-750. [PMID: 39965570 DOI: 10.1016/j.ajhg.2025.01.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2024] [Revised: 01/15/2025] [Accepted: 01/21/2025] [Indexed: 02/20/2025]
Abstract
Understanding how genetic variation contributes to phenotypic variation is a fundamental question in genetics. Genome-wide association studies (GWASs) have discovered numerous genetic associations with various human phenotypes, most of which contain co-inherited variants in strong linkage disequilibrium (LD) with indistinguishable statistical significance. The experimental and analytical difficulty in identifying the "causal variant" among the co-inherited variants has traditionally led mechanistic studies to focus on relatively simple loci, where a single functional variant is presumed to explain most of the association signal and affect a target gene. The notion that a single causal variant is responsible for an association signal, while other variants in LD are merely correlated, has often been assumed in functional studies. However, emerging evidence powered by high-throughput experimental tools and context-specific functional databases argues that even a single independent signal may involve multiple functional variants in strong LD, each contributing to the observed genetic association. In this perspective, we articulate this evolving understanding of causal variants through examples from both traditional locus-by-locus approaches and more recent high-throughput functional studies. We then discuss the implications and prospects of this notion in understanding the genetic architecture of complex traits and interpreting the variant-level causality in GWAS follow-up studies.
Affiliation(s)
- Erping Long
  - State Key Laboratory of Respiratory Health and Multimorbidity, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jacob Williams
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Haoyu Zhang
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Jiyeon Choi
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
2
Widney KA, Yang DD, Rusch LM, Copley SD. CRISPR-Cas9-assisted genome editing in E. coli elevates the frequency of unintended mutations. bioRxiv 2024:2024.03.19.584922. [PMID: 38562785 PMCID: PMC10983943 DOI: 10.1101/2024.03.19.584922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Cas-assisted lambda Red recombineering techniques have rapidly become a mainstay of bacterial genome editing. Such techniques have been used to construct both individual mutants and massive libraries to assess the effects of genomic changes. We have found that a commonly used Cas9-assisted editing method results in unintended mutations elsewhere in the genome in 26% of edited clones. The unintended mutations are frequently found over 200 kb from the intended edit site and even over 10 kb from potential off-target sites. We attribute the high frequency of unintended mutations to error-prone polymerases expressed in response to dsDNA breaks introduced at the edit site. Most unintended mutations occur in regulatory or coding regions and thus may have phenotypic effects. Our findings highlight the risks associated with genome editing techniques involving dsDNA breaks in E. coli and likely other bacteria and emphasize the importance of sequencing the genomes of edited cells to ensure the absence of unintended mutations.
Affiliation(s)
- Karl A. Widney
  - Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA
  - Department of Biochemistry, University of Colorado Boulder, Boulder, CO, 80309, USA
  - Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO, 80205, USA
- Dong-Dong Yang
  - Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA
  - Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO, 80205, USA
- Leo M. Rusch
  - Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA
  - Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO, 80205, USA
- Shelley D. Copley
  - Department of Molecular, Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA
  - Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO, 80205, USA
3
Richer S, Tian Y, Schoenfelder S, Hurst L, Murrell A, Pisignano G. Widespread allele-specific topological domains in the human genome are not confined to imprinted gene clusters. Genome Biol 2023; 24:40. [PMID: 36869353 PMCID: PMC9983196 DOI: 10.1186/s13059-023-02876-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/18/2022] [Accepted: 02/13/2023] [Indexed: 03/05/2023]
Abstract
BACKGROUND There is widespread interest in the three-dimensional chromatin conformation of the genome and its impact on gene expression. However, these studies frequently do not consider parent-of-origin differences, such as genomic imprinting, which result in monoallelic expression. In addition, genome-wide allele-specific chromatin conformation associations have not been extensively explored. There are few accessible bioinformatic workflows for investigating allelic conformation differences and these require pre-phased haplotypes which are not widely available. RESULTS We developed a bioinformatic pipeline, "HiCFlow," that performs haplotype assembly and visualization of parental chromatin architecture. We benchmarked the pipeline using prototype haplotype phased Hi-C data from GM12878 cells at three disease-associated imprinted gene clusters. Using Region Capture Hi-C and Hi-C data from human cell lines (1-7HB2, IMR-90, and H1-hESCs), we can robustly identify the known stable allele-specific interactions at the IGF2-H19 locus. Other imprinted loci (DLK1 and SNRPN) are more variable and there is no "canonical imprinted 3D structure," but we could detect allele-specific differences in A/B compartmentalization. Genome-wide, when topologically associating domains (TADs) are unbiasedly ranked according to their allele-specific contact frequencies, a set of allele-specific TADs could be defined. These occur in genomic regions of high sequence variation. In addition to imprinted genes, allele-specific TADs are also enriched for allele-specific expressed genes. We find loci that have not previously been identified as allele-specific expressed genes such as the bitter taste receptors (TAS2Rs). CONCLUSIONS This study highlights the widespread differences in chromatin conformation between heterozygous loci and provides a new framework for understanding allele-specific expressed genes.
Affiliation(s)
- Stephen Richer
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Yuan Tian
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
  - UCL Cancer Institute, University College London, Paul O'Gorman Building, London, UK
- Laurence Hurst
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Adele Murrell
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Giuseppina Pisignano
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
4
Bai J, Zhong JY, Liao W, Hu R, Chen L, Wu XJ, Liu SP. iTRAQ-based proteomic analysis reveals potential regulatory networks in dust mite-related asthma treated with subcutaneous allergen immunotherapy. Mol Med Rep 2020; 22:3607-3620. [PMID: 32901873 PMCID: PMC7533450 DOI: 10.3892/mmr.2020.11472] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/22/2019] [Accepted: 02/24/2020] [Indexed: 12/29/2022]
Abstract
Asthma is one of the most common childhood chronic diseases worldwide. Subcutaneous immunotherapy (SCIT) is commonly used in the treatment of house dust mite (HDM)‑related asthma in children. However, the therapeutic mechanism of SCIT in asthma remains unclear. The present study aimed to investigate the molecular biomarkers associated with HDM‑related asthma in asthmatic children prior and subsequent to SCIT treatment compared with those in healthy children via proteomic analysis. The study included a control group (30 healthy children), ‑Treatment group (30 children with HDM‑related allergic asthma) and +Treatment group (30 children with HDM‑related allergic asthma treated with SCIT). An isobaric labeling with relative and absolute quantification‑based method was used to analyze serum proteome changes to detect differentially expressed proteins, while functional enrichment and protein‑protein interaction network analysis were used to select candidate biomarkers. A total of 72 differentially expressed proteins were detected in the ‑Treatment, +Treatment and control groups. A total of 33 and 57 differentially expressed proteins were observed in the ‑Treatment vs. control and +Treatment vs. control groups, respectively. Through bioinformatics analysis, 5 candidate proteins [keratin 1 (KRT1), apolipoprotein B (APOB), fibronectin 1, antithrombin III (SERPINC1) and α‑1‑antitrypsin (SERPINA1)] were selected for validation by western blotting; among them, 4 proteins (KRT1, APOB, SERPINC1 and SERPINA1) showed robust reproducibility in asthma and control samples. This study illustrated the changes in proteome regulation following SCIT treatment for asthma. The 4 identified proteins may serve as potential biomarkers prior and subsequent to SCIT treatment, and help elucidate the molecular regulation mechanisms of SCIT to treat HDM‑related asthma.
Affiliation(s)
- Jun Bai
  - Department of Pediatrics, Foshan Maternal and Children's Hospital Affiliated to Southern Medical University, Foshan, Guangdong 528000, P.R. China
- Jia-Yong Zhong
  - Institute of Pediatrics, Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou, Guangdong 510623, P.R. China
- Wang Liao
  - Department of Pediatrics, Foshan Maternal and Children's Hospital Affiliated to Southern Medical University, Foshan, Guangdong 528000, P.R. China
- Ruo Hu
  - School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510000, P.R. China
- Liang Chen
  - Department of Pediatrics, Foshan Maternal and Children's Hospital Affiliated to Southern Medical University, Foshan, Guangdong 528000, P.R. China
- Xian-Jin Wu
  - Key Laboratory of Research and Utilization of Ethnomedicinal Plant Resources of Hunan Province, Key Laboratory of Hunan Higher Education for Western Hunan Medicinal Plant and Ethnobotany, College of Biological and Food Engineering, Huaihua University, Huaihua, Hunan 418008, P.R. China
- Shuang-Ping Liu
  - Chronic Disease Research Center, Medical College, Dalian University, Dalian, Liaoning 116622, P.R. China
5
Li Y, Dong J, Xiao H, Zhang S, Wang B, Cui M, Fan S. Gut commensal derived-valeric acid protects against radiation injuries. Gut Microbes 2020; 11:789-806. [PMID: 31931652 PMCID: PMC7524389 DOI: 10.1080/19490976.2019.1709387] [Citation(s) in RCA: 110] [Impact Index Per Article: 22.0] [Indexed: 02/07/2023]
Abstract
BACKGROUND Hematopoietic and intestinal systems side effects are frequently found in patients who suffered from accidental or medical radiation exposure. In this case, we investigated the effects of gut microbiota produced-valeric acid (VA) on radiation-induced injuries. METHODS Mice were exposed to total body irradiation (TBI) or total abdominal irradiation (TAI) to mimic accidental or clinical scenarios. High-performance liquid chromatography (HPLC) was performed to assess short-chain fatty acids (SCFAs) in fecal pellets. Oral gavage with VA was used to mitigate radiation-induced toxicity. Gross examination was performed to assess tissue injuries of thymus, spleen and small intestine. High-throughput sequencing was used to characterize the gut microbiota profile. Isobaric tags for relative and absolute quantitation (iTRAQ) were performed to analyze the difference of protein profile. Hydrodynamic-based gene delivery assay was performed to silence KRT1 in vivo. RESULTS VA exerted the most significant radioprotection among the SCFAs. In detail, VA replenishment elevated the survival rate of irradiated mice, protected hematogenic organs, improved gastrointestinal (GI) tract function and intestinal epithelial integrity in irradiated mice. High-throughput sequencing and iTRAQ showed that oral gavage of VA restored the enteric bacteria taxonomic proportions, reprogrammed the small intestinal protein profile of mice following TAI exposure. Importantly, keratin 1 (KRT1) played a pivotal role in the radioprotection of VA. CONCLUSIONS Our findings provide new insights into gut microbiota-produced VA and underpin that VA might be employed as a therapeutic option to mitigate radiation injury in pre-clinical settings.
Affiliation(s)
- Yuan Li
  - Tianjin Key Laboratory of Radiation Medicine and Molecular Nuclear Medicine, Institute of Radiation Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
- Jiali Dong
  - Tianjin Key Laboratory of Radiation Medicine and Molecular Nuclear Medicine, Institute of Radiation Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
- Huiwen Xiao
  - Tianjin Key Laboratory of Radiation Medicine and Molecular Nuclear Medicine, Institute of Radiation Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
- Shuqin Zhang
  - Tianjin Key Laboratory of Radiation Medicine and Molecular Nuclear Medicine, Institute of Radiation Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
- Bin Wang
  - Tianjin Key Laboratory of Radiation Medicine and Molecular Nuclear Medicine, Institute of Radiation Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
- Ming Cui
  - Tianjin Key Laboratory of Radiation Medicine and Molecular Nuclear Medicine, Institute of Radiation Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
  - Contact: Ming Cui; Saijun Fan
- Saijun Fan
  - Tianjin Key Laboratory of Radiation Medicine and Molecular Nuclear Medicine, Institute of Radiation Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin, China
6
Xuan A, Song Y, Bu C, Chen P, El-Kassaby YA, Zhang D. Changes in DNA Methylation in Response to 6-Benzylaminopurine Affect Allele-Specific Gene Expression in Populus Tomentosa. Int J Mol Sci 2020; 21:E2117. [PMID: 32204454 PMCID: PMC7139286 DOI: 10.3390/ijms21062117] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/02/2020] [Revised: 03/13/2020] [Accepted: 03/17/2020] [Indexed: 12/30/2022]
Abstract
Cytokinins play important roles in the growth and development of plants. Physiological and photosynthetic characteristics are common indicators to measure the growth and development in plants. However, few reports have described the molecular mechanisms of physiological and photosynthetic changes in response to cytokinin, particularly in woody plants. DNA methylation is an essential epigenetic modification that dynamically regulates gene expression in response to the external environment. In this study, we examined genome-wide DNA methylation variation and transcriptional variation in poplar (Populus tomentosa) after short-term treatment with the synthetic cytokinin 6-benzylaminopurine (6-BA). We identified 460 significantly differentially methylated regions (DMRs) in response to 6-BA treatment. Transcriptome analysis showed that 339 protein-coding genes, 262 long non-coding RNAs (lncRNAs), and 15,793 24-nt small interfering RNAs (siRNAs) were differentially expressed under 6-BA treatment. Among these, 79% were differentially expressed between alleles in P. tomentosa, and 102,819 allele-specific expression (ASE) loci in 19,200 genes were detected showing differences in ASE levels after 6-BA treatment. Combined DNA methylation and gene expression analysis demonstrated that DNA methylation plays an important role in regulating allele-specific gene expression. To further investigate the relationship between these 6-BA-responsive genes and phenotypic variation, we performed SNP analysis of 460 6-BA-responsive DMRs via re-sequencing using a natural population of P. tomentosa, and we identified 206 SNPs that were significantly associated with growth and wood properties. Association analysis indicated that 53% of loci with allele-specific expression had primarily dominant effects on poplar traits. Our comprehensive analyses of P. tomentosa DNA methylation and the regulation of allele-specific gene expression suggest that DNA methylation is an important regulator of imbalanced expression between allelic loci.
Affiliation(s)
- Anran Xuan
  - National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
- Yuepeng Song
  - National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
- Chenhao Bu
  - National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
- Panfei Chen
  - National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
- Yousry A. El-Kassaby
  - Department of Forest and Conservation Sciences, Faculty of Forestry, Forest Sciences Centre, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Deqiang Zhang
  - National Engineering Laboratory for Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, China
7
Identification of rs11615992 as a novel regulatory SNP for human P2RX7 by allele-specific expression. Mol Genet Genomics 2019; 295:23-30. [PMID: 31410611 DOI: 10.1007/s00438-019-01598-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/03/2018] [Accepted: 07/26/2019] [Indexed: 12/12/2022]
Abstract
P2RX7 (purinergic receptor P2X 7) is an important membrane ion channel and involved in multiple physiological processes. One non-synonymous SNP on P2RX7, rs3751143, had been proven to reduce ion channel function and further associated with multiple diseases. However, it was still unclear whether there were other cis-regulatory elements for P2RX7, which might further contribute to related diseases. Allele-specific expression (ASE) is a robust and sensitive approach to identify the potential functional region in human genome. In the current study, we measured ASE on rs3751143 in lung tissues and observed a consistent excess of A allele over C (P = 0.001), which indicated that SNP(s) in linkage disequilibrium (LD) could regulate P2RX7 expression. By analyzing the 1000 genomes project data for Chinese, one SNP locating ~ 5 kb away and downstream of P2RX7, rs11615992, was disclosed to be in strong LD with rs3751143. The dual-luciferase assay confirmed that rs11615992 could alter target gene expression in lung cell line. Through chromosome conformation capture, it was verified that the region surrounding rs11615992 could interact with P2RX7 promoter and effect as an enhancer. By chromatin immunoprecipitation, the related transcription factor POU2F1 (POU class 2 homeobox 1) was recognized to bind the region spanning rs11615992. Our work identified a novel long-distance cis-regulatory SNP for P2RX7, which might contribute to multiple diseases.
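The abstract does not say which statistical test produced the reported P value. As an illustration only, allelic imbalance at a heterozygous marker SNP is often checked with an exact binomial test against an expected 50:50 allele ratio. The sketch below assumes that approach; the function name and the read counts are hypothetical and are not taken from the study.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are not lost to rounding error.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical allele-specific read counts at the marker SNP (not study data).
a_reads, c_reads = 70, 30
p_value = binomial_two_sided_p(a_reads, a_reads + c_reads)
print(f"A fraction = {a_reads / (a_reads + c_reads):.2f}, p = {p_value:.2e}")
```

A small p-value here only indicates that the two alleles are not expressed equally; as the abstract notes, the causal regulatory variant may be the marker SNP itself or another SNP in LD with it.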
8
Li XX, Peng T, Gao J, Feng JG, Wu DD, Yang T, Zhong L, Fu WP, Sun C. Allele-specific expression identified rs2509956 as a novel long-distance cis-regulatory SNP for SCGB1A1, an important gene for multiple pulmonary diseases. Am J Physiol Lung Cell Mol Physiol 2019; 317:L456-L463. [PMID: 31322430 DOI: 10.1152/ajplung.00275.2018] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
SCGB1A1 (secretoglobin family 1A member 1) is an important protein for multiple pulmonary diseases, especially asthma, chronic obstructive pulmonary disease, and lung cancer. One single-nucleotide polymorphism (SNP) at 5'-untranslated region of SCGB1A1, rs3741240, has been suggested to be associated with reduced protein expression and further asthma susceptibility. However, it was still unclear whether there were other cis-regulatory elements for SCGB1A1 that might further contribute to pulmonary diseases. Allele-specific expression (ASE) is a novel approach to identify the functional region in human genome. In the present study, we measured ASE on rs3741240 in lung tissues and observed a consistent excess of G allele over A (P < 10⁻⁶), which indicated that this SNP or the one(s) in linkage disequilibrium (LD) could regulate SCGB1A1 expression. By analyzing 1000 Genomes Project data for Chinese, one SNP locating ~10.2 kb away and downstream of SCGB1A1, rs2509956, was identified to be in strong LD with rs3741240. Reporter gene assay confirmed that both SNPs could regulate gene expression in the lung cell. By chromosome conformation capture, it was verified that the region surrounding rs2509956 could interact with SCGB1A1 promoter region and act as an enhancer. Through chromatin immunoprecipitation and overexpression assay, the related transcription factor RELA (RELA proto-oncogene, NF-kB subunit) was recognized to bind the region spanning rs2509956. Our work identified a novel long-distance cis-regulatory SNP for SCGB1A1, which might contribute to multiple pulmonary diseases.
Affiliation(s)
- Xiu-Xiong Li
  - National Engineering Laboratory for Resource Development of Endangered Crude Drugs in Northwest China, Key Laboratory of the Ministry of Education for Medicinal Resources and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China
- Tao Peng
  - National Engineering Laboratory for Resource Development of Endangered Crude Drugs in Northwest China, Key Laboratory of the Ministry of Education for Medicinal Resources and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China
- Jing Gao
  - National Engineering Laboratory for Resource Development of Endangered Crude Drugs in Northwest China, Key Laboratory of the Ministry of Education for Medicinal Resources and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China
- Jia-Gang Feng
  - Department of Respiratory Critical Care Medicine, The First Affiliated Hospital of Kunming Medical University, Kunming, People's Republic of China
- Dan-Dan Wu
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, People's Republic of China
- Ting Yang
  - Department of Respiratory Critical Care Medicine, The First Affiliated Hospital of Kunming Medical University, Kunming, People's Republic of China
- Li Zhong
  - National Engineering Laboratory for Resource Development of Endangered Crude Drugs in Northwest China, Key Laboratory of the Ministry of Education for Medicinal Resources and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China
  - Provincial Demonstration Center for Experimental Biology Education, Shaanxi Normal University, Xi'an, People's Republic of China
- Wei-Ping Fu
  - Department of Respiratory Critical Care Medicine, The First Affiliated Hospital of Kunming Medical University, Kunming, People's Republic of China
- Chang Sun
  - National Engineering Laboratory for Resource Development of Endangered Crude Drugs in Northwest China, Key Laboratory of the Ministry of Education for Medicinal Resources and Natural Pharmaceutical Chemistry, College of Life Sciences, Shaanxi Normal University, Xi'an, People's Republic of China
9
Phasing quality assessment in a brown layer population through family- and population-based software. BMC Genet 2019; 20:57. [PMID: 31311514 PMCID: PMC6636125 DOI: 10.1186/s12863-019-0759-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/08/2018] [Accepted: 06/23/2019] [Indexed: 01/05/2023]
Abstract
Background Haplotype data contains more information than genotype data and provides possibilities such as imputing low frequency variants, inferring points of recombination, detecting recurrent mutations, mapping linkage disequilibrium (LD), studying selection signatures, estimating IBD probabilities, etc. In addition, haplotype structure is used to assess genetic diversity and expected accuracy in genomic selection programs. Nevertheless, the quality and efficiency of phasing has rarely been a subject of thorough study but was assessed mainly as a by-product in imputation quality studies. Moreover, phasing studies based on data of a poultry population are non-existent. The aim of this study was to evaluate the phasing quality of FImpute and Beagle, two of the most used phasing software. Results We simulated ten replicated samples of a layer population comprising 888 individuals from a real SNP dataset of 580 k and a pedigree of 12 generations. Chromosomes analyzed were 1, 7 and 20. We measured the percentage of SNPs that were phased equally between true and phased haplotypes (Eqp), proportion of individuals completely correctly phased, number of incorrectly phased SNPs or Breakpoints (Bkp) and the length of inverted haplotype segments. Results were obtained for three different groups of individuals, with no parents or offspring genotyped in the dataset, with only one parent, and with both parents, respectively. The phasing was performed with Beagle (v3.3 and v4.1) and FImpute v2.2 (with and without pedigree). Eqp values ranged from 88 to 100%, with the best results from haplotypes phased with Beagle v4.1 and FImpute with pedigree information and at least one parent genotyped. FImpute haplotypes showed a higher number of Bkp than Beagle. As a consequence, switched haplotype segments were longer for Beagle than for FImpute. 
Conclusion We concluded that for the dataset applied in this study Beagle v4.1 or FImpute with pedigree information and at least one parent genotyped in the data set were the best alternatives for obtaining high quality phased haplotypes. Electronic supplementary material The online version of this article (10.1186/s12863-019-0759-3) contains supplementary material, which is available to authorized users.
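The phasing-quality measures named in the abstract (Eqp, Bkp, and the length of inverted segments) can be illustrated with a small sketch. It reflects one plausible reading of those definitions, not the authors' implementation: haplotypes are compared only at heterozygous SNPs, the haplotype labels are treated as arbitrary, a breakpoint is counted wherever agreement with the truth flips between adjacent SNPs, and segment length is counted in SNPs rather than base pairs. The function name and the toy data are hypothetical.

```python
def phasing_metrics(true_hap, phased_hap):
    """Compare one inferred haplotype with the true one at heterozygous SNPs.

    Both inputs are equal-length sequences of 0/1 alleles.
    Returns (eqp_percent, n_breakpoints, inverted_segment_lengths).
    """
    if len(true_hap) != len(phased_hap) or not true_hap:
        raise ValueError("haplotypes must be non-empty and of equal length")

    match = [t == p for t, p in zip(true_hap, phased_hap)]
    # Haplotype labels are arbitrary, so orient to the better-matching strand.
    if sum(match) < len(match) / 2:
        match = [not m for m in match]

    eqp = 100.0 * sum(match) / len(match)
    # A breakpoint is a change in agreement between adjacent SNPs.
    breakpoints = sum(1 for a, b in zip(match, match[1:]) if a != b)

    # Lengths (in SNPs) of consecutive runs phased opposite to the truth.
    inverted, run = [], 0
    for m in match:
        if not m:
            run += 1
        elif run:
            inverted.append(run)
            run = 0
    if run:
        inverted.append(run)
    return eqp, breakpoints, inverted


# Toy example (not study data): one inverted block of three SNPs.
truth = [0, 1, 1, 0, 1, 0, 0, 1]
inferred = [0, 1, 1, 1, 0, 1, 0, 1]
eqp, bkp, segments = phasing_metrics(truth, inferred)
print(eqp, bkp, segments)
```

Under this reading, fewer breakpoints for the same Eqp imply longer inverted segments, which matches the abstract's observation that Beagle showed fewer Bkp but longer switched segments than FImpute.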
10
Choudhury O, Chakrabarty A, Emrich SJ. Highly Accurate and Efficient Data-Driven Methods for Genotype Imputation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1107-1116. [PMID: 28574365 DOI: 10.1109/tcbb.2017.2708701] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
High-throughput sequencing techniques have generated massive quantities of genotype data. Haplotype phasing has proven to be a useful and effective method for analyzing these data. However, the quality of phasing is undermined due to missing information. Imputation provides an effective means of improving the underlying genotype information. For model organisms, imputation can rely on an available reference genotype panel and a physical or genetic map. For non-model organisms, which often do not have a genotype panel, it is important to design an imputation technique that does not rely on reference data. Here, we present Accurate Data-Driven Imputation Technique (ADDIT), which is composed of two data-driven algorithms capable of handling data generated from model and non-model organisms. The non-model variant of ADDIT (referred to as ADDIT-NM) employs statistical inference methods to impute missing genotypes, whereas the model variant (referred to as ADDIT-M) leverages a supervised learning-based approach for imputation. We demonstrate that both variants of ADDIT are more accurate, faster, and require less memory than leading state-of-the-art imputation tools using model (human) and non-model (maize, apple, and grape) genotype data. Software Availability: The source code of ADDIT and test data sets are available at https://github.com/NDBL/ADDIT.
11
Im C, Sapkota Y, Moon W, Kawashima M, Nakamura M, Tokunaga K, Yasui Y. Genome-wide haplotype association analysis of primary biliary cholangitis risk in Japanese. Sci Rep 2018; 8:7806. [PMID: 29773854 PMCID: PMC5958065 DOI: 10.1038/s41598-018-26112-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/16/2018] [Accepted: 04/30/2018] [Indexed: 12/16/2022]
Abstract
Primary biliary cholangitis (PBC) susceptibility loci have largely been discovered through single SNP association testing. In this study, we report genic haplotype patterns associated with PBC risk genome-wide in two Japanese cohorts. Among the 74 genic PBC risk haplotype candidates we detected with a novel methodological approach in a discovery cohort of 1,937 Japanese, nearly two-thirds were replicated (49 haplotypes, Bonferroni-corrected P < 6.8 × 10⁻⁴) in an independent Japanese cohort (N = 949). Along with corroborating known PBC-associated loci (TNFSF15, HLA-DRA), risk haplotypes may potentially model cis-interactions that regulate gene expression. For example, one replicated haplotype association (9q32-9q33.1, OR = 1.7, P = 3.0 × 10⁻²¹) consists of intergenic SNPs outside of the human leukocyte antigen (HLA) region that overlap regulatory histone mark peaks in liver and blood cells, and are significantly associated with TNFSF8 expression in whole blood. We also replicated a novel haplotype association involving non-HLA SNPs mapped to UMAD1 (7p21.3; OR = 15.2, P = 3.9 × 10⁻⁹) that overlap enhancer peaks in liver and memory Th cells. Our analysis demonstrates the utility of haplotype association analyses in discovering and characterizing PBC susceptibility loci.
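As a quick arithmetic check on the replication threshold quoted in this abstract, the sketch below divides a family-wise error rate by the 74 haplotype candidates. The value of 0.05 for the family-wise rate is an assumption (the abstract does not state it), and the function name is hypothetical; the result is merely consistent with the reported cutoff.

```python
# Assumption: a family-wise alpha of 0.05, which the abstract does not state.
ALPHA = 0.05
N_CANDIDATES = 74  # genic haplotype candidates taken forward to replication


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_CANDIDATES)
print(f"{threshold:.1e}")  # prints 6.8e-04
```

Under that assumption, 0.05 / 74 rounds to 6.8 × 10⁻⁴, matching the cutoff reported for the 49 replicated haplotypes.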
Affiliation(s)
- Cindy Im
  - School of Public Health, University of Alberta, Edmonton, Alberta, T6G 1C9, Canada
- Yadav Sapkota
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Wonjong Moon
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Minae Kawashima
  - Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, 113-0033, Japan
- Minoru Nakamura
  - Department of Hepatology, Nagasaki University Graduate School of Biomedical Sciences and Clinical Research Center, National Hospital Organization Nagasaki Medical Center, Omura, Nagasaki, 856-8562, Japan
- Katsushi Tokunaga
  - Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, 113-0033, Japan
- Yutaka Yasui
  - School of Public Health, University of Alberta, Edmonton, Alberta, T6G 1C9, Canada
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
|
12
|
Nariai N, Greenwald WW, DeBoever C, Li H, Frazer KA. Efficient Prioritization of Multiple Causal eQTL Variants via Sparse Polygenic Modeling. Genetics 2017; 207:1301-1312. [PMID: 29074555 PMCID: PMC5714449 DOI: 10.1534/genetics.117.300435] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/09/2017] [Accepted: 10/13/2017] [Indexed: 11/18/2022] Open
Abstract
Expression quantitative trait loci (eQTL) studies have typically used single-variant association analysis to identify genetic variants correlated with gene expression. However, this approach has several drawbacks: causal variants cannot be distinguished from nonfunctional variants in strong linkage disequilibrium, combined effects from multiple causal variants cannot be captured, and low-frequency (<5% MAF) eQTL variants are difficult to identify. While these issues possibly could be overcome by using sparse polygenic models, which associate multiple genetic variants with gene expression simultaneously, the predictive performance of these models for eQTL studies has not been evaluated. Here, we assessed the ability of three sparse polygenic models (Lasso, Elastic Net, and BSLMM) to identify causal variants, and compared their efficacy to single-variant association analysis and a fine-mapping model. Using simulated data, we determined that, while these methods performed similarly when there was one causal SNP present at a gene, BSLMM substantially outperformed single-variant association analysis for prioritizing causal eQTL variants when multiple causal eQTL variants were present (1.6- to 5.2-fold higher recall at 20% precision), and identified up to 2.3-fold more low frequency variants as the top eQTL SNP. Analysis of real RNA-seq and whole-genome sequencing data of 131 iPSC samples showed that the eQTL SNPs identified by BSLMM had a higher functional enrichment in DHS sites and were more often low-frequency than those identified with single-variant association analysis. Our study showed that BSLMM is a more effective approach than single-variant association analysis for prioritizing multiple causal eQTL variants at a single gene.
Affiliation(s)
- Naoki Nariai
  - Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, California 92093-0761
- William W Greenwald
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, California 92093-0761
- Christopher DeBoever
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, California 92093-0761
- He Li
  - Institute for Genomic Medicine, University of California, San Diego, La Jolla, California 92093-0761
- Kelly A Frazer
  - Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, California 92093-0761
  - Institute for Genomic Medicine, University of California, San Diego, La Jolla, California 92093-0761
|
13
|
Miar Y, Sargolzaei M, Schenkel FS. A comparison of different algorithms for phasing haplotypes using Holstein cattle genotypes and pedigree data. J Dairy Sci 2017; 100:2837-2849. [PMID: 28161175 DOI: 10.3168/jds.2016-11590] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 06/11/2016] [Accepted: 12/09/2016] [Indexed: 01/25/2023]
Abstract
Phasing genotypes to haplotypes is becoming increasingly important due to its applications in the study of diseases, population and evolutionary genetics, imputation, and so on. Several studies have focused on the development of computational methods that infer haplotype phase from population genotype data. The aim of this study was to compare phasing algorithms implemented in Beagle, Findhap, FImpute, Impute2, and ShapeIt2 software using 50k and 777k (HD) genotyping data. Six scenarios were considered: no-parents, sire-progeny pairs, sire-dam-progeny trios, each with and without pedigree information in Holstein cattle. Algorithms were compared with respect to their phasing accuracy and computational efficiency. In the studied population, Beagle and FImpute were more accurate than other phasing algorithms. Across scenarios, phasing accuracies for Beagle and FImpute were 99.49-99.90% and 99.44-99.99% for 50k, respectively, and 99.90-99.99% and 99.87-99.99% for HD, respectively. Generally, FImpute resulted in higher accuracy when genotypic information of at least one parent was available. In the absence of parental genotypes and pedigree information, Beagle and Impute2 (with double the default number of states) were slightly more accurate than FImpute. Findhap gave high phasing accuracy when parents' genotypes and pedigree information were available. In terms of computing time, Findhap was the fastest algorithm followed by FImpute. FImpute was 30 to 131, 87 to 786, and 353 to 1,400 times faster across scenarios than Beagle, ShapeIt2, and Impute2, respectively. In summary, FImpute and Beagle were the most accurate phasing algorithms. Moreover, the low computational requirement of FImpute makes it an attractive algorithm for phasing genotypes of large livestock populations.
Affiliation(s)
- Younes Miar
  - Department of Animal Science and Aquaculture, Dalhousie University, Truro, Nova Scotia, Canada B2N 5E3
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada N1G 2W1
- Mehdi Sargolzaei
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada N1G 2W1
  - The Semex Alliance, Guelph, Ontario, Canada N1H 6J2
- Flavio S Schenkel
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, Ontario, Canada N1G 2W1
|
14
|
Discovering Single Nucleotide Polymorphisms Regulating Human Gene Expression Using Allele Specific Expression from RNA-seq Data. Genetics 2016; 204:1057-1064. [PMID: 27765809 DOI: 10.1534/genetics.115.177246] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/11/2015] [Accepted: 09/07/2016] [Indexed: 12/20/2022] Open
Abstract
The study of the genetics of gene expression is of considerable importance to understanding the nature of common, complex diseases. The most widely applied approach to identifying relationships between genetic variation and gene expression is the expression quantitative trait loci (eQTL) approach. Here, we increased the computational power of eQTL with an alternative and complementary approach based on analyzing allele specific expression (ASE). We designed a novel analytical method to identify cis-acting regulatory variants based on genome sequencing and measurements of ASE from RNA-sequencing (RNA-seq) data. We evaluated the power and resolution of our method using simulated data. We then applied the method to map regulatory variants affecting gene expression in lymphoblastoid cell lines (LCLs) from 77 unrelated northern and western European individuals (CEU), which were part of the HapMap project. A total of 2309 SNPs were identified as being associated with ASE patterns. The SNPs associated with ASE were enriched within promoter regions and were significantly more likely to signal strong evidence for a regulatory role. Finally, among the candidate regulatory SNPs, we identified 108 SNPs that were previously associated with human immune diseases. With further improvements in quantifying ASE from RNA-seq, the application of our method to other datasets is expected to accelerate our understanding of the biological basis of common diseases.
|
15
|
The genetic architecture of autism spectrum disorders (ASDs) and the potential importance of common regulatory genetic variants. Science China-Life Sciences 2016; 58:968-75. [PMID: 26335735 DOI: 10.1007/s11427-012-4336-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/11/2022]
Abstract
Currently, there is great interest in identifying genetic variants that contribute to the risk of developing autism spectrum disorders (ASDs), due in part to recent increases in the frequency of diagnosis of these disorders worldwide. While there is nearly universal agreement that ASDs are complex diseases, with multiple genetic and environmental contributing factors, there is less agreement concerning the relative importance of common vs rare genetic variants in ASD liability. Recent observations that rare mutations and copy number variants (CNVs) are frequently associated with ASDs, combined with reduced fecundity of individuals with these disorders, have led to the hypothesis that ASDs are caused primarily by de novo or rare genetic mutations. Based on this model, large-scale whole-genome DNA sequencing has been proposed as the most appropriate method for discovering ASD liability genes. While this approach will undoubtedly identify many novel candidate genes and produce important new insights concerning the genetic causes of these disorders, a full accounting of the genetics of ASDs will be incomplete absent an understanding of the contributions of common regulatory variants, which are likely to influence ASD liability by modifying the effects of rare variants or, by assuming unfavorable combinations, directly produce these disorders. Because it is not yet possible to identify regulatory genetic variants by examination of DNA sequences alone, their identification will require experimentation. In this essay, I discuss these issues and describe the advantages of measurements of allelic expression imbalance (AEI) of mRNA expression for identifying cis-acting regulatory variants that contribute to ASDs.
|
16
|
Ballester M, Revilla M, Puig-Oliveras A, Marchesi JAP, Castelló A, Corominas J, Fernández AI, Folch JM. Analysis of the porcine APOA2 gene expression in liver, polymorphism identification and association with fatty acid composition traits. Anim Genet 2016; 47:552-9. [PMID: 27296287 DOI: 10.1111/age.12462] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Accepted: 04/26/2016] [Indexed: 12/20/2022]
Abstract
APOA2 is a protein implicated in triglyceride, fatty acid and glucose metabolism. In pigs, the APOA2 gene is located on pig chromosome 4 (SSC4) in a QTL region affecting fatty acid composition, fatness and growth traits. In this study, we evaluated APOA2 as a candidate gene for meat quality traits in an Iberian × Landrace backcross population. The APOA2:c.131T>A polymorphism, located in exon 3 of APOA2 and determining a missense mutation, was associated with the percentage of hexadecenoic acid [C16:1(n-9)], linoleic acid [C18:2(n-6)], α-linolenic acid [C18:3(n-3)], dihomo-gamma-linolenic acid [C20:3(n-6)] and polyunsaturated fatty acids (PUFAs) in backfat. Furthermore, this SNP was associated with the global mRNA expression levels of APOA2 in liver and was used as a marker to determine allelic expression imbalance by pyrosequencing. We determined an overexpression of the T allele in heterozygous samples with a mean ratio of 2.8 (T/A), observing a high variability in the allelic expression among individuals. This result suggests that complex regulatory mechanisms, beyond a single polymorphism (e.g. epigenetic effects or multiple cis-acting polymorphisms), may be regulating APOA2 gene expression.
Affiliation(s)
- M Ballester
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Campus UAB, Bellaterra, 08193, Barcelona, Spain
  - Plant and Animal Genomics, Centre de Recerca en Agrigenòmica (Consorci CSIC-IRTA-UAB-UB), Edifici CRAG, Campus UAB, Bellaterra, 08193, Barcelona, Spain
  - IRTA, Genètica i Millora Animal, Torre Marimon, 08140, Caldes de Montbui, Spain
- M Revilla
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Campus UAB, Bellaterra, 08193, Barcelona, Spain
  - Plant and Animal Genomics, Centre de Recerca en Agrigenòmica (Consorci CSIC-IRTA-UAB-UB), Edifici CRAG, Campus UAB, Bellaterra, 08193, Barcelona, Spain
- A Puig-Oliveras
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Campus UAB, Bellaterra, 08193, Barcelona, Spain
  - Plant and Animal Genomics, Centre de Recerca en Agrigenòmica (Consorci CSIC-IRTA-UAB-UB), Edifici CRAG, Campus UAB, Bellaterra, 08193, Barcelona, Spain
- J A P Marchesi
  - Plant and Animal Genomics, Centre de Recerca en Agrigenòmica (Consorci CSIC-IRTA-UAB-UB), Edifici CRAG, Campus UAB, Bellaterra, 08193, Barcelona, Spain
- A Castelló
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Campus UAB, Bellaterra, 08193, Barcelona, Spain
  - Plant and Animal Genomics, Centre de Recerca en Agrigenòmica (Consorci CSIC-IRTA-UAB-UB), Edifici CRAG, Campus UAB, Bellaterra, 08193, Barcelona, Spain
- J Corominas
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Campus UAB, Bellaterra, 08193, Barcelona, Spain
  - Plant and Animal Genomics, Centre de Recerca en Agrigenòmica (Consorci CSIC-IRTA-UAB-UB), Edifici CRAG, Campus UAB, Bellaterra, 08193, Barcelona, Spain
- A I Fernández
  - Departamento de Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), 28040, Madrid, Spain
- J M Folch
  - Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Campus UAB, Bellaterra, 08193, Barcelona, Spain
  - Plant and Animal Genomics, Centre de Recerca en Agrigenòmica (Consorci CSIC-IRTA-UAB-UB), Edifici CRAG, Campus UAB, Bellaterra, 08193, Barcelona, Spain
|
17
|
A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals. Nat Commun 2016; 7:11101. [PMID: 27089393 PMCID: PMC4837449 DOI: 10.1038/ncomms11101] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 06/21/2015] [Accepted: 02/19/2016] [Indexed: 02/07/2023] Open
Abstract
Large-scale sequencing in the 1000 Genomes Project has revealed multitudes of single nucleotide variants (SNVs). Here, we provide insights into the functional effect of these variants using allele-specific behaviour. This can be assessed for an individual by mapping ChIP-seq and RNA-seq reads to a personal genome, and then measuring 'allelic imbalances' between the numbers of reads mapped to the paternal and maternal chromosomes. We annotate variants associated with allele-specific binding and expression in 382 individuals by uniformly processing 1,263 functional genomics data sets, developing approaches to reduce the heterogeneity between data sets due to overdispersion and mapping bias. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. We also found SNVs for which we can anticipate allelic imbalance from the disruption of a binding motif. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).
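The read-count comparison described in this abstract can be illustrated with a minimal sketch: an exact two-sided binomial test of whether reads at a heterozygous site split evenly between the two parental haplotypes. This is an illustrative simplification, not the authors' pipeline. A plain binomial null does not model the overdispersion or mapping bias the abstract mentions, and the function name and read counts are hypothetical.

```python
from math import comb


def allelic_imbalance_p(hap_a_reads: int, hap_b_reads: int,
                        p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Assumes reads are independent draws with probability p_null of coming
    from haplotype A (no overdispersion, no mapping bias). Intended for
    modest read depths; very large counts can overflow float arithmetic.
    """
    n = hap_a_reads + hap_b_reads
    if n == 0:
        return 1.0  # no reads, no evidence either way
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
           for k in range(n + 1)]
    observed = pmf[hap_a_reads]
    # Sum the probability of every outcome at most as likely as the observed
    # one; the small tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts: a balanced site and a strongly imbalanced site.
balanced = allelic_imbalance_p(10, 10)
imbalanced = allelic_imbalance_p(20, 0)
```

With these hypothetical counts the balanced site gives a p-value of 1, while the site with all 20 reads on one haplotype gives 2 × 0.5²⁰. In practice a model that allows for overdispersion would be a more faithful match to the approach the abstract outlines.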
|
18
|
Alakus H, Bollschweiler E, Hölscher AH, Warnecke-Eberz U, Frazer KA, Harismendy O, Lowy AM, Mönig SP, Eberz PM, Maus M, Drebber U, Siffert W, Metzger R. Homozygous GNAS 393C-allele carriers with locally advanced esophageal cancer fail to benefit from platinum-based preoperative chemoradiotherapy. Ann Surg Oncol 2014; 21:4375-82. [PMID: 24986238 DOI: 10.1245/s10434-014-3843-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/31/2014] [Indexed: 01/13/2023]
Abstract
BACKGROUND Currently, patients with locally advanced esophageal cancer receive neoadjuvant chemoradiotherapy, but only about half of these patients benefit from this treatment. GNAS T393C has been shown to predict the postoperative course in solid tumors and may therefore be useful for treatment stratification. The aim of the present study was to determine if the single-nucleotide polymorphism GNAS T393C can be used for treatment stratification in esophageal cancer patients. METHODS A total of 596 patients underwent surgical resection for esophageal carcinoma from 1996 to 2008; 279 patients received chemoradiotherapy prior to surgery (RTX-SURG group). All patients and a reference group of 820 healthy White individuals were genotyped for GNAS T393C. RESULTS The 5-year survival rate for the 317 patients who underwent esophagectomy as initial treatment (SURG group) was 57% for homozygous C-allele carriers (n = 99) and 43% for T-allele carriers (n = 218; log-rank test p = 0.025). Multivariate analysis revealed the GNAS T393C genotype (p = 0.034), pT (p < 0.001), pN (p < 0.001) and age (p < 0.001) as prognostic of survival. Homozygous C-allele carriers with a locally advanced tumor stage (cT3/T4, n = 129) in the SURG group had a 5-year survival rate of 37%, which, remarkably, exceeded the 5-year survival rate of 30% for the entire RTX-SURG group (n = 279). In the RTX-SURG group, the GNAS T393C genotype did not show any prognostic significance. CONCLUSIONS Patients with locally advanced esophageal cancer and a homozygous GNAS 393C genotype do not benefit from platinum-based neoadjuvant chemoradiotherapy, indicating that these patients should be treated by alternative treatment strategies.
Affiliation(s)
- Hakan Alakus
  - Department of General, Visceral and Cancer Surgery, University of Cologne, Cologne, Germany
|
19
|
Langbein L, Reichelt J, Eckhart L, Praetzel-Wunder S, Kittstein W, Gassler N, Schweizer J. New facets of keratin K77: interspecies variations of expression and different intracellular location in embryonic and adult skin of humans and mice. Cell Tissue Res 2013; 354:793-812. [PMID: 24057875 DOI: 10.1007/s00441-013-1716-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 04/23/2013] [Accepted: 07/19/2013] [Indexed: 01/08/2023]
Abstract
The differential expression of keratins is central to the formation of various epithelia and their appendages. Structurally, the type II keratin K77 is closely related to K1, the prototypical type II keratin of the suprabasal epidermis. Here, we perform a developmental study on K77 expression in human and murine skin. In both species, K77 is expressed in the suprabasal fetal epidermis. While K77 appears after K1 in the human epidermis, the opposite is true for the murine tissue. This species-specific pattern of expression is also found in conventional and organotypic cultures of human and murine keratinocytes. Ultrastructure investigation shows that, in contrast to K77 intermediate filaments of mice, those of the human ortholog are not attached to desmosomes. After birth, K77 disappears without deleterious consequences from human epidermis while it is maintained in the adult mouse epidermis, where its presence has so far gone unnoticed. After targeted Krt1 gene deletion in mice, K77 is normally expressed but fails to functionally replace K1. Besides the epidermis, both human and mouse K77 are present in luminal duct cells of eccrine sweat glands. The demonstration of a K77 ortholog in platypus but not in non-mammalian vertebrates identifies K77 as an evolutionarily ancient component of the mammalian integument that has evolved different patterns of intracellular distribution and adult tissue expression in primates.
Affiliation(s)
- Lutz Langbein
  - Genetics of Skin Carcinogenesis, A110, German Cancer Research Center, Im Neuenheimer Feld 280, 69120, Heidelberg, Germany
|
20
|
Murani E, Ponsuksili S, Reyer H, Wittenburg D, Wimmers K. Expression variation of the porcine ADRB2 has a complex genetic background. Mol Genet Genomics 2013; 288:615-25. [PMID: 23996144 DOI: 10.1007/s00438-013-0776-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/25/2013] [Accepted: 08/19/2013] [Indexed: 11/25/2022]
Abstract
Porcine adrenergic receptor beta 2 (ADRB2) gene exhibits differential allelic expression in skeletal muscle, and its genetic variation has been associated with muscle pH. Exploring the molecular-genetic background of expression variation for porcine ADRB2 will provide insight into the mechanisms driving its regulatory divergence and may also contribute to unraveling the genetic basis of muscle-related traits in pigs. In the present study, we therefore examined haplotype effects on the expression of porcine ADRB2 in four tissues: longissimus dorsi muscle, liver, subcutaneous fat, and spleen. The diversity and structure of haplotypes of the proximal gene region segregating in German commercial breeds were characterized. Seven haplotypes falling into three clades were identified. Two clades including five haplotypes most likely originated from introgression of Asian genetics during formation of modern breeds. Expression analyses revealed that the Asian-derived haplotypes increase expression of the porcine ADRB2 compared to the major, wild-type haplotype independently of tissue type. In addition, several tissue-specific differences in the expression of the Asian-derived haplotypes were found. Inspection of haplotype sequences showed that differentially expressed haplotypes exhibit polymorphisms in a polyguanine tract located in the core promoter region. These findings demonstrate that expression variation of the porcine ADRB2 has a complex genetic basis and suggest that the promoter polyguanine tract is causally involved. This study highlights the challenges of finding causal genetic variants underlying complex traits.
Affiliation(s)
- Eduard Murani
  - Institute for Genome Biology, Leibniz Institute for Farm Animal Biology (FBN), Wilhelm-Stahl-Allee 2, 18196, Dummerstorf, Germany
|
21
|
Connelly CF, Skelly DA, Dunham MJ, Akey JM. Population genomics and transcriptional consequences of regulatory motif variation in globally diverse Saccharomyces cerevisiae strains. Mol Biol Evol 2013; 30:1605-13. [PMID: 23619145 DOI: 10.1093/molbev/mst073] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022] Open
Abstract
Noncoding genetic variation is known to significantly influence gene expression levels in a growing number of specific cases; however, the patterns of genome-wide noncoding variation present within populations, the evolutionary forces acting on noncoding variants, and the relative effects of regulatory polymorphisms on transcript abundance are not well characterized. Here, we address these questions by analyzing patterns of regulatory variation in motifs for 177 DNA binding proteins in 37 strains of Saccharomyces cerevisiae. Across S. cerevisiae strains, we found considerable polymorphism in regulatory motifs (mean π = 0.005) as well as diversity in regulatory motifs (mean 0.91 motif differences per regulatory region). Population genetics analyses reveal that motifs are under purifying selection, and there is considerable heterogeneity in the magnitude of selection across different motifs. Finally, we obtained RNA-Seq data in 22 strains and identified 49 polymorphic DNA sequence motifs in 30 distinct genes that are significantly associated with transcriptional differences between strains. In 22 of these genes, there was a single polymorphic motif associated with expression in the upstream region. Our results provide comprehensive insights into the evolutionary trajectory of regulatory variation in yeast and the characteristics of a compendium of regulatory alleles.
|
22
|
Gaur U, Li K, Mei S, Liu G. Research progress in allele-specific expression and its regulatory mechanisms. J Appl Genet 2013; 54:271-83. [PMID: 23609142 DOI: 10.1007/s13353-013-0148-y] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 01/07/2013] [Revised: 03/22/2013] [Accepted: 04/03/2013] [Indexed: 12/12/2022]
Abstract
Although the majority of genes are expressed equally from both alleles, some genes are differentially expressed. The preferential expression of one allele under the influence of regulatory factors is termed allele-specific expression (ASE). It is one of the important genetic factors that lead to phenotypic variation and can be used to identify variation in gene regulation factors. ASE can reflect mechanisms such as DNA methylation, histone modifications, and non-coding RNA function. Here, we survey progress in ASE studies, what this simple yet very effective approach can offer functional genomics, and its possible implications for better understanding the mechanisms underlying complex traits.
Affiliation(s)
- Uma Gaur
  - Institute of Animal Science and Veterinary Medicine, Hubei Academy of Agricultural Sciences, Yaoyuan No. 1, Nanhu, Hongshan District, Wuhan, 430064, People's Republic of China
|
23
|
Martin A, Orgogozo V. The Loci of repeated evolution: a catalog of genetic hotspots of phenotypic variation. Evolution 2013; 67:1235-50. [PMID: 23617905 DOI: 10.1111/evo.12081] [Citation(s) in RCA: 227] [Impact Index Per Article: 18.9] [Received: 11/20/2012] [Accepted: 01/26/2013] [Indexed: 12/11/2022]
Abstract
What is the nature of the genetic changes underlying phenotypic evolution? We have catalogued 1008 alleles described in the literature that cause phenotypic differences among animals, plants, and yeasts. Surprisingly, evolution of similar traits in distinct lineages often involves mutations in the same gene ("gene reuse"). This compilation yields three important qualitative implications about repeated evolution. First, the apparent evolution of similar traits by gene reuse can be traced back to two alternatives, either several independent causative mutations or a single original mutational event followed by sorting processes. Second, hotspots of evolution, defined as the repeated occurrence of de novo mutations at orthologous loci and causing similar phenotypic variation, are omnipresent in the literature with more than 100 examples covering various levels of analysis, including numerous gain-of-function events. Finally, several alleles of large effect have been shown to result from the aggregation of multiple small-effect mutations at the same hotspot locus, thus reconciling micromutationist theories of adaptation with the empirical observation of large-effect variants. Although data heterogeneity and experimental biases prevented us from extracting quantitative trends, our synthesis highlights the existence of genetic paths of least resistance leading to viable evolutionary change.
Affiliation(s)
- Arnaud Martin
  - Department of Ecology and Evolutionary Biology, Cornell University, Corson Hall, 215 Tower Road, Ithaca, New York, 14853, USA
|
24
|
Olbromski R, Siadkowska E, Zelazowska B, Zwierzchowski L. Allelic gene expression imbalance of bovine IGF2, LEP and CCL2 genes in liver, kidney and pituitary. Mol Biol Rep 2012. [PMID: 23184004 PMCID: PMC3538019 DOI: 10.1007/s11033-012-2161-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022]
Abstract
Allelic expression imbalance (AEI) is an important genetic factor that causes differences in phenotypic traits and can be heritable. Studying AEI can be useful in searching for factors that modulate gene expression and can help in understanding the molecular mechanisms underlying phenotypic changes. Although AEI is commonly recognized in many species and many genes are known to show it, this phenomenon has not been studied on a larger scale in cattle. Using the pyrosequencing method, we analyzed a set of 29 bovine genes in order to find those that have preferential allelic expression. The study was conducted in three tissues: liver, pituitary and kidney. Of the genes studied, three showed allelic expression imbalance: LEP (leptin), IGF2 (insulin-like growth factor 2), and CCL2 (chemokine C-C motif ligand 2).
Collapse
Affiliation(s)
- R Olbromski
- Department of Molecular Biology, Institute of Genetics and Animal Breeding, Polish Academy of Sciences (IGAB PAS), Jastrzębiec, 05-552, Magdalenka, Poland.
|
25
|
Teare MD, Pinyakorn S, Heighway J, Santibanez Koref MF. Comparing methods for mapping cis acting polymorphisms using allelic expression ratios. PLoS One 2011; 6:e28636. [PMID: 22174852 PMCID: PMC3236754 DOI: 10.1371/journal.pone.0028636] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 07/27/2011] [Accepted: 11/11/2011] [Indexed: 02/04/2023] Open
Abstract
Genome wide association studies frequently reveal associations between disease susceptibility and polymorphisms outside coding regions. Such associations cannot always be explained by linkage disequilibrium with changes affecting the transcription products. This has stimulated the interest in characterising sequence variation influencing gene expression levels, in particular in changes acting in cis. Differences in transcription between the two alleles at an autosomal locus can be used to test the association between candidate polymorphisms and the modulation of gene expression in cis. This type of approach requires at least one transcribed polymorphism and one candidate polymorphism. In the past five years, different methods have been proposed to analyse such data. Here we use simulations and real data sets to compare the power of some of these methods. The results show that when it is not possible to determine the phase between the transcribed and potentially cis acting allele there is some advantage in using methods that estimate phased genotype and effect on expression simultaneously. However when the phase can be determined, simple regression models seem preferable because of their simplicity and flexibility. The simulations and the analysis of experimental data suggest that in the majority of situations, methods that assume a lognormal distribution of the allelic expression ratios are both robust to deviations from this assumption and more powerful than alternatives that do not make these assumptions.
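The phase-known case mentioned in this abstract can be illustrated with a very small regression. The Python sketch below is not taken from the paper: it assumes each individual is heterozygous at the transcribed polymorphism, that the log allelic expression ratio is modelled as `beta * phase_code`, and that `phase_code` is +1 when the candidate allele sits on the haplotype in the numerator of the ratio, -1 when it sits on the other haplotype, and 0 when the individual is homozygous at the candidate site. The function name and the example values are hypothetical.

```python
import math

def fit_cis_effect(log_ratios, phase_codes):
    """Least-squares slope through the origin: log_ratio ~ beta * phase_code.

    Assumption (not from the source): no intercept term, so an individual
    homozygous at the candidate site is expected to show a ratio of 1.
    """
    sxy = sum(x * y for x, y in zip(phase_codes, log_ratios))
    sxx = sum(x * x for x in phase_codes)
    if sxx == 0:
        raise ValueError("no individual is heterozygous at the candidate site")
    return sxy / sxx

# Hypothetical allelic expression ratios and phase codes for four individuals
ratios = [1.5, 1 / 1.5, 1.0, 1.5]
phase_codes = [1, -1, 0, 1]
beta = fit_cis_effect([math.log(r) for r in ratios], phase_codes)
fold_change = math.exp(beta)  # estimated cis effect on the ratio scale
```

Working on the log scale matches the abstract's observation that methods assuming lognormally distributed ratios performed well; a full analysis would also need a standard error and a test of `beta` against zero.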
Affiliation(s)
- Marion Dawn Teare
- School of Health and Related Research, University of Sheffield, Sheffield, United Kingdom.
|
26
|
Xu X, Wang H, Zhu M, Sun Y, Tao Y, He Q, Wang J, Chen L, Saffen D. Next-generation DNA sequencing-based assay for measuring allelic expression imbalance (AEI) of candidate neuropsychiatric disorder genes in human brain. BMC Genomics 2011; 12:518. [PMID: 22013986 PMCID: PMC3228908 DOI: 10.1186/1471-2164-12-518] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 06/09/2011] [Accepted: 10/20/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Common genetic variants that regulate gene expression are widely suspected to contribute to the etiology and phenotypic variability of complex diseases. Although high-throughput, microarray-based assays have been developed to measure differences in mRNA expression among independent samples, these assays often lack the sensitivity to detect rare mRNAs and the reproducibility to quantify small changes in mRNA expression. By contrast, PCR-based allelic expression imbalance (AEI) assays, which use a "marker" single nucleotide polymorphism (mSNP) in the mRNA to distinguish expression from pairs of genetic alleles in individual samples, have high sensitivity and accuracy, allowing differences in mRNA expression greater than 1.2-fold to be quantified with high reproducibility. In this paper, we describe the use of an efficient PCR/next-generation DNA sequencing-based assay to analyze allele-specific differences in mRNA expression for candidate neuropsychiatric disorder genes in human brain. RESULTS: Using our assay, we successfully analyzed AEI for 70 candidate neuropsychiatric disorder genes in 52 independent human brain samples. Among these genes, 62/70 (89%) showed AEI ratios greater than 1 ± 0.2 in at least one sample and 8/70 (11%) showed no AEI. Arranging log2AEI ratios in increasing order from negative-to-positive values revealed highly reproducible distributions of log2AEI ratios that are distinct for each gene/marker SNP combination. Mathematical modeling suggests that these log2AEI distributions can provide important clues concerning the number, location and contributions of cis-acting regulatory variants to mRNA expression. CONCLUSIONS: We have developed a highly sensitive and reproducible method for quantifying AEI of mRNA expressed in human brain. Importantly, this assay allowed quantification of differential mRNA expression for many candidate disease genes entirely missed in previously published microarray-based studies of mRNA expression in human brain. Given the ability of next-generation sequencing technology to generate large numbers of independent sequencing reads, our method should be suitable for analyzing 100 to 200 candidate genes in 100 samples in a single experiment. We believe that this is the appropriate scale for investigating variation in mRNA expression for defined sets of candidate disorder genes, allowing, for example, comprehensive coverage of genes that function within biological pathways implicated in specific disorders. The combination of AEI measurements and mathematical modeling described in this study can assist in identifying SNPs that correlate with mRNA expression. Alleles of these SNPs (individually or as sets) that accurately predict high- or low-mRNA expression should be useful as markers in genetic association studies aimed at linking candidate genes to specific neuropsychiatric disorders.
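The AEI quantities named in this abstract (an AEI ratio, its log2 transform, and the 1 ± 0.2 threshold) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: it assumes the AEI ratio is the allele read-count ratio in cDNA divided by the same ratio in genomic DNA (a common normalization that the abstract does not state), and it reads "greater than 1 ± 0.2" as a ratio lying outside the interval from 0.8 to 1.2. All function names and read counts are hypothetical.

```python
import math

def aei_ratio(cdna_a, cdna_b, gdna_a, gdna_b):
    """AEI ratio for one sample heterozygous at the marker SNP.

    Assumption: the cDNA allele ratio is divided by the gDNA allele ratio
    to correct for allele-specific assay bias.
    """
    return (cdna_a / cdna_b) / (gdna_a / gdna_b)

def shows_aei(ratio, tolerance=0.2):
    """True when the ratio lies outside 1 +/- tolerance."""
    return abs(ratio - 1.0) > tolerance

def sorted_log2_aei(ratios):
    """log2 AEI ratios arranged from most negative to most positive."""
    return sorted(math.log2(r) for r in ratios)

# Hypothetical read counts for one sample
ratio = aei_ratio(cdna_a=600, cdna_b=400, gdna_a=500, gdna_b=500)
imbalanced = shows_aei(ratio)

# Hypothetical ratios for one gene/marker SNP across four samples
profile = sorted_log2_aei([2.0, 0.5, 1.0, 4.0])
```

The sorted log2 profile corresponds to the per-gene distributions the abstract describes; the log scale makes a twofold excess of either allele symmetric around zero.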
Affiliation(s)
- Xiang Xu
- Institutes of Brain Science, Fudan University, 138 Yixueyuan Road, Shanghai 200032, China
|
27
|
Abstract
Determination of haplotype phase is becoming increasingly important as we enter the era of large-scale sequencing because many of its applications, such as imputing low-frequency variants and characterizing the relationship between genetic variation and disease susceptibility, are particularly relevant to sequence data. Haplotype phase can be generated through laboratory-based experimental methods, or it can be estimated using computational approaches. We assess the haplotype phasing methods that are available, focusing in particular on statistical methods, and we discuss the practical aspects of their application. We also describe recent developments that may transform this field, particularly the use of identity-by-descent for computational phasing.
Affiliation(s)
- Sharon R. Browning
- Department of Biostatistics, University of Washington, Seattle WA 98195, USA
- Brian L. Browning
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle WA 98195, USA
|
28
|
Suk EK, McEwen GK, Duitama J, Nowick K, Schulz S, Palczewski S, Schreiber S, Holloway DT, McLaughlin S, Peckham H, Lee C, Huebsch T, Hoehe MR. A comprehensively molecular haplotype-resolved genome of a European individual. Genome Res 2011; 21:1672-85. [PMID: 21813624 DOI: 10.1101/gr.125047.111] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.1] [Indexed: 02/07/2023]
Abstract
Independent determination of both haplotype sequences of an individual genome is essential to relate genetic variation to genome function, phenotype, and disease. To address the importance of phase, we have generated the most complete haplotype-resolved genome to date, "Max Planck One" (MP1), by fosmid pool-based next generation sequencing. Virtually all SNPs (>99%) and 80,000 indels were phased into haploid sequences of up to 6.3 Mb (N50 ~1 Mb). The completeness of phasing allowed determination of the concrete molecular haplotype pairs for the vast majority of genes (81%) including potential regulatory sequences, of which >90% were found to be constituted by two different molecular forms. A subset of 159 genes with potentially severe mutations in either cis or trans configurations exemplified in particular the role of phase for gene function, disease, and clinical interpretation of personal genomes (e.g., BRCA1). Extended genomic regions harboring manifold combinations of physically and/or functionally related genes and regulatory elements were resolved into their underlying "haploid landscapes," which may define the functional genome. Moreover, the majority of genes and functional sequences were found to contain individual or rare SNPs, which cannot be phased from population data alone, emphasizing the importance of molecular phasing for characterizing a genome in its molecular individuality. Our work provides the foundation to understand that the distinction of molecular haplotypes is essential to resolve the (inherently individual) biology of genes, genomes, and disease, establishing a reference point for "phase-sensitive" personal genomics. MP1's annotated haploid genomes are available as a public resource.
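The cis versus trans distinction that this abstract highlights can be made concrete with a small helper. The Python sketch below is only an illustration under stated assumptions: genotypes are written in a VCF-style phased notation (`"0|1"` means the reference allele on the first haplotype and the alternate allele on the second), both variants are heterozygous, and both were phased within the same phase block. The function name is hypothetical.

```python
def configuration(genotype_a, genotype_b):
    """Classify two phased heterozygous variants as 'cis' or 'trans'.

    Assumes VCF-style phased genotypes such as "0|1" or "1|0" that belong
    to the same phase block, so haplotype order is comparable.
    """
    alleles_a = genotype_a.split("|")
    alleles_b = genotype_b.split("|")
    for alleles in (alleles_a, alleles_b):
        if sorted(alleles) != ["0", "1"]:
            raise ValueError("expected a phased heterozygous genotype")
    # Same haplotype carries both alternate alleles -> cis; otherwise trans.
    return "cis" if alleles_a == alleles_b else "trans"

same_copy = configuration("0|1", "0|1")
different_copies = configuration("0|1", "1|0")
```

Two severe mutations in trans leave no intact copy of the gene, whereas two in cis leave one copy untouched, which is why unphased genotypes alone cannot settle the clinical interpretation.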
Affiliation(s)
- Eun-Kyung Suk
- Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
|
29
|
Xiao R, Scott LJ. Detection of cis-acting regulatory SNPs using allelic expression data. Genet Epidemiol 2011; 35:515-25. [PMID: 21769929 DOI: 10.1002/gepi.20601] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/02/2011] [Revised: 05/09/2011] [Accepted: 05/20/2011] [Indexed: 11/06/2022]
Abstract
Allelic expression (AE) imbalance between the two alleles of a gene can be used to detect cis-acting regulatory SNPs (rSNPs) in individuals heterozygous for a transcribed SNP (tSNP). In this paper, we propose three tests for AE analysis focusing on phase-unknown data and any degree of linkage disequilibrium (LD) between the rSNP and tSNP: a test based on the minimum P-value of a one-sided F test and a two-sided t test (proposed previously for phase-unknown data), a test that combines the F and t tests, and a mixture-model-based test. We compare these three tests to the F and t tests and an existing regression-based test for phase-known data. We show that the ranking of the tests based on power depends most strongly on the magnitude of the LD between the rSNP and tSNP. For phase-unknown data, we find that under a range of scenarios, our proposed tests have higher power than the F and t tests when LD between the rSNP and tSNP is moderate (between roughly 0.2 and 0.8). We further demonstrate that the presence of a second ungenotyped rSNP almost never invalidates the proposed tests nor substantially changes their power rankings. For detection of cis-acting regulatory SNPs using phase-unknown AE data, we recommend the F test when the rSNP and tSNP are in or near linkage equilibrium (LD below 0.2); the t test when the two SNPs are in strong LD (LD above 0.7); and the mixture-model-based test for intermediate LD levels (LD between 0.2 and 0.7).
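The closing recommendation of this abstract amounts to a simple decision rule, sketched below in Python. This is an illustrative reading rather than code from the paper: the LD measure is taken to be a value between 0 and 1 (the symbol is missing from the text as extracted), the strong-LD cutoff is read as "above 0.7" because the intermediate range ends at 0.7, and the exact boundary values 0.2 and 0.7 are assigned to the intermediate class by assumption. The function name is hypothetical.

```python
def recommend_test(ld):
    """Suggest a test for phase-unknown allelic expression data.

    ld: linkage disequilibrium between the rSNP and tSNP, on a 0-1 scale.
    Boundary handling at exactly 0.2 and 0.7 is an assumption.
    """
    if not 0.0 <= ld <= 1.0:
        raise ValueError("LD must lie between 0 and 1")
    if ld < 0.2:
        return "F test"
    if ld > 0.7:
        return "t test"
    return "mixture-model-based test"

choices = [recommend_test(value) for value in (0.05, 0.45, 0.9)]
```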
Affiliation(s)
- Rui Xiao
- Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA, USA.
|
30
|
Zhang X, Cal AJ, Borevitz JO. Genetic architecture of regulatory variation in Arabidopsis thaliana. Genome Res 2011; 21:725-33. [PMID: 21467266 DOI: 10.1101/gr.115337.110] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 11/24/2022]
Abstract
Studying the genetic regulation of expression variation is a key method to dissect complex phenotypic traits. To examine the genetic architecture of regulatory variation in Arabidopsis thaliana, we performed genome-wide association (GWA) mapping of gene expression in an F(1) hybrid diversity panel. At a genome-wide false discovery rate (FDR) of 0.2, an associated single nucleotide polymorphism (SNP) explains >38% of trait variation. In comparison with SNPs that are distant from the genes to which they were associated, locally associated SNPs are preferentially found in regions with extended linkage disequilibrium (LD) and have distinct population frequencies of the derived alleles (where Arabidopsis lyrata has the ancestral allele), suggesting that different selective forces are acting. Locally associated SNPs tend to have additive inheritance, whereas distantly associated SNPs are primarily dominant. In contrast to results from mapping of expression quantitative trait loci (eQTL) in linkage studies, we observe extensive allelic heterogeneity for local regulatory loci in our diversity panel. By association mapping of allele-specific expression (ASE), we detect a significant enrichment for cis-acting variation in local regulatory variation. In addition to gene expression variation, association mapping of splicing variation reveals both local and distant genetic regulation for intron and exon level traits. Finally, we identify candidate genes for 59 diverse phenotypic traits that were mapped to eQTL.
Affiliation(s)
- Xu Zhang
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA
|
31
|
Tung J, Akinyi MY, Mutura S, Altmann J, Wray GA, Alberts SC. Allele-specific gene expression in a wild nonhuman primate population. Mol Ecol 2011; 20:725-39. [PMID: 21226779 DOI: 10.1111/j.1365-294x.2010.04970.x] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.5] [Indexed: 12/27/2022]
Abstract
Natural populations hold enormous potential for evolutionary genetic studies, especially when phenotypic, genetic and environmental data are all available on the same individuals. However, untangling the genotype-phenotype relationship in natural populations remains a major challenge. Here, we describe results of an investigation of one class of phenotype, allele-specific gene expression (ASGE), in the well-studied natural population of baboons of the Amboseli basin, Kenya. ASGE measurements identify cases in which one allele of a gene is overexpressed relative to the alternative allele of the same gene, within individuals, thus providing a control for background genetic and environmental effects. Here, we characterize the incidence of ASGE in the Amboseli baboon population, focusing on the genetic and environmental contributions to ASGE in a set of eleven genes involved in immunity and defence. Within this set, we identify evidence for common ASGE in four genes. We also present examples of two relationships between cis-regulatory genetic variants and the ASGE phenotype. Finally, we identify one case in which this relationship is influenced by a novel gene-environment interaction. Specifically, the dominance rank of an individual's mother during its early life (an aspect of that individual's social environment) influences the expression of the gene CCL5 via an interaction with cis-regulatory genetic variation. These results illustrate how environmental and ecological data can be integrated into evolutionary genetic studies of functional variation in natural populations. They also highlight the potential importance of early life environmental variation in shaping the genetic architecture of complex traits in wild mammals.
Affiliation(s)
- J Tung
- Department of Biology, Duke University, PO Box 90338, Durham, NC 27708, USA; Institute for Genome Sciences & Policy, Durham, NC 27708, USA.
|
32
|
Groth M, Wiegand C, Szafranski K, Huse K, Kramer M, Rosenstiel P, Schreiber S, Norgauer J, Platzer M. Both copy number and sequence variations affect expression of human DEFB4. Genes Immun 2010; 11:458-66. [PMID: 20445567 DOI: 10.1038/gene.2010.19] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 02/07/2023]
Abstract
Copy number variations (CNVs) were found to contribute massively to the variability of genomes. One of the best-studied CNV regions is the beta-defensin cluster (DEFB) on 8p23.1. Individual DEFB copy numbers (CNs) between 2 and 12 were found, with low CNs predisposing to Crohn's disease. A further level of complexity is represented by sequence variations between copies (multisite variations, MSVs). To address the relation of DEFB CN and MSV to the expression of beta-defensin genes, we analyzed DEFB4 expression in B-lymphoblastoid cell lines (LCLs) and primary keratinocytes (normal human epidermal keratinocyte, NHEK) before and after stimulation with lipopolysaccharide, tumor necrosis factor-alpha (TNF-alpha) and interferon-gamma (IFN-gamma). Moreover, we quantified one DEFB4 MSV in DNA and mRNA as a marker for variant-specific expression (VSE) and resequenced a region of approximately 2 kb upstream of DEFB4 in LCLs. We found a strong correlation of DEFB CN and DEFB4 expression in 16 LCLs, although several LCLs with very different CNs exhibit similar expression levels. Quantification of the MSV revealed VSE with consistently lower expression of one variant. Costimulation of NHEKs with TNF-alpha/IFN-gamma leads to a synergistic increase in total DEFB4 expression and suppresses VSE. Analysis of the DEFB4 promoter region showed a remarkably high density of sequence variability (approximately 1 MSV/41 bp).
Affiliation(s)
- M Groth
- Genome Analysis, Leibniz Institute for Age Research-Fritz Lipmann Institute, Jena, Germany.
|
33
|
Yuferov V, Levran O, Proudnikov D, Nielsen DA, Kreek MJ. Search for genetic markers and functional variants involved in the development of opiate and cocaine addiction and treatment. Ann N Y Acad Sci 2010; 1187:184-207. [PMID: 20201854 PMCID: PMC3769182 DOI: 10.1111/j.1749-6632.2009.05275.x] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Indexed: 01/11/2023]
Abstract
Addiction to opiates and illicit use of psychostimulants is a chronic, relapsing brain disease that, if left untreated, can cause major medical, social, and economic problems. This article reviews recent progress in studies of association of gene variants with vulnerability to develop opiate and cocaine addictions, focusing primarily on genes of the opioid and monoaminergic systems. In addition, we provide the first evidence of a cis-acting polymorphism and a functional haplotype in the PDYN gene, of significantly higher DNA methylation rate of the OPRM1 gene in the lymphocytes of heroin addicts, and significant differences in genotype frequencies of three single-nucleotide polymorphisms of the P-glycoprotein gene (ABCB1) between "higher" and "lower" methadone doses in methadone-maintained patients. In genomewide and multigene association studies, we found association of several new genes and new variants of known genes with heroin addiction. Finally, we describe the development and application of a novel technique: molecular haplotyping for studies in genetics of drug addiction.
MESH Headings
- ATP Binding Cassette Transporter, Subfamily B
- ATP Binding Cassette Transporter, Subfamily B, Member 1/genetics
- Catechol O-Methyltransferase/genetics
- Cocaine-Related Disorders/genetics
- Cocaine-Related Disorders/therapy
- Enkephalins/genetics
- Epigenesis, Genetic
- Genetic Markers
- Genetic Variation
- Genome-Wide Association Study
- Haplotypes
- Humans
- Methadone/metabolism
- Methadone/therapeutic use
- Opioid-Related Disorders/genetics
- Opioid-Related Disorders/therapy
- Pharmacogenetics
- Protein Precursors/genetics
- Receptor, Melanocortin, Type 2/genetics
- Receptor, Serotonin, 5-HT1B/genetics
- Receptors, Opioid, kappa/genetics
- Receptors, Opioid, mu/genetics
- Tryptophan Hydroxylase/genetics
Affiliation(s)
- Vadim Yuferov
- The Laboratory of the Biology of Addictive Diseases, The Rockefeller University, New York, New York
- Orna Levran
- The Laboratory of the Biology of Addictive Diseases, The Rockefeller University, New York, New York
- Dmitri Proudnikov
- The Laboratory of the Biology of Addictive Diseases, The Rockefeller University, New York, New York
- David A. Nielsen
- The Laboratory of the Biology of Addictive Diseases, The Rockefeller University, New York, New York
- Mary Jeanne Kreek
- The Laboratory of the Biology of Addictive Diseases, The Rockefeller University, New York, New York
|
34
|
Sun C, Southard C, Witonsky DB, Olopade OI, Di Rienzo A. Allelic imbalance (AI) identifies novel tissue-specific cis-regulatory variation for human UGT2B15. Hum Mutat 2010; 31:99-107. [PMID: 19847790 PMCID: PMC2922057 DOI: 10.1002/humu.21145] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Indexed: 11/10/2022]
Abstract
Allelic imbalance (AI) is a powerful tool to identify cis-regulatory variation for gene expression. UGT2B15 is an important enzyme involved in the metabolism of multiple endobiotics and xenobiotics. In this study, we measured the relative expression of two alleles at this gene by using SNP rs1902023:G>T. An excess of the G over the T allele was consistently observed in liver (P<0.001), but not in breast (P=0.06) samples, suggesting that SNPs in strong linkage disequilibrium with G253T regulate UGT2B15 expression in liver. Seven such SNPs were identified by resequencing the promoter and exon 1, which define two distinct haplotypes. Reporter gene assays confirmed that one haplotype displayed approximately 20% higher promoter activity compared to the other major haplotype in liver HepG2 (P<0.001), but not in breast MCF-7 (P=0.540) cells. Reporter gene assays with additional constructs pointed to rs34010522:G>T and rs35513228:C>T as the cis-regulatory variants; both SNPs were also evaluated in LNCaP and Caco-2 cells. By ChIP, we showed that the transcription factor Nrf2 binds to the region spanning rs34010522:G>T in all four cell lines. Our results provide a good example for how AI can be used to identify cis-regulatory variation and gain insights into the tissue specific regulation of gene expression.
Affiliation(s)
- Chang Sun
- Department of Human Genetics, University of Chicago, Chicago, IL 60637
- David B. Witonsky
- Department of Human Genetics, University of Chicago, Chicago, IL 60637
- Anna Di Rienzo
- Department of Human Genetics, University of Chicago, Chicago, IL 60637
|
35
|
Maia AT, Spiteri I, Lee AJX, O'Reilly M, Jones L, Caldas C, Ponder BAJ. Extent of differential allelic expression of candidate breast cancer genes is similar in blood and breast. Breast Cancer Res 2009; 11:R88. [PMID: 20003265 PMCID: PMC2815552 DOI: 10.1186/bcr2458] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 06/27/2009] [Revised: 11/10/2009] [Accepted: 12/10/2009] [Indexed: 12/31/2022] Open
Abstract
Introduction: Normal gene expression variation is thought to play a central role in inter-individual variation and susceptibility to disease. Regulatory polymorphisms in cis-acting elements result in the unequal expression of alleles. Differential allelic expression (DAE) in heterozygote individuals could be used to develop a new approach to discover regulatory breast cancer susceptibility loci. As access to large numbers of fresh breast tissue to perform such studies is difficult, a suitable surrogate test tissue must be identified for future studies. Methods: We measured differential allelic expression of 12 candidate genes possibly related to breast cancer susceptibility (BRCA1, BRCA2, C1qA, CCND3, EMSY, GPX1, GPX4, MLH3, MTHFR, NBS1, TP53 and TRXR2) in breast tissue (n = 40) and fresh blood (n = 170) of healthy individuals and EBV-transformed lymphoblastoid cells (n = 19). Differential allelic expression ratios were determined by Taqman assay. Ratio distributions were compared using t-test and Wilcoxon rank sum test, for mean ratios and variances respectively. Results: We show that differential allelic expression is common among these 12 candidate genes and is comparable between breast and blood (fresh and transformed lymphoblasts) in a significant proportion of them. We found that eight out of nine genes with DAE in breast and fresh blood were comparable, as were 10 out of 11 genes between breast and transformed lymphoblasts. Conclusions: Our findings support the use of differential allelic expression in blood as a surrogate for breast tissue in future studies on predisposition to breast cancer.
Affiliation(s)
- Ana-Teresa Maia
- Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre and Department of Oncology, University of Cambridge, Robinson Way, Cambridge CB2 0RE, UK.
|
36
|
Abstract
Variation in gene expression constitutes an important source of biological variability within and between populations that is likely to contribute significantly to phenotypic diversity. Recent conceptual, technical, and methodological advances have enabled the genome-scale dissection of transcriptional variation. Here, we outline common approaches for detecting gene expression quantitative trait loci, and summarize the insights gleaned from these studies regarding the genetic architecture of transcriptional variation and the nature of regulatory alleles. Particular emphasis is placed on human studies, and we discuss experimental designs that ensure that increasingly large and complex studies continue to advance our understanding of gene expression variation. We conclude by discussing the evolution of gene expression levels, and we explore prospects for leveraging new technological developments to investigate inherited variation in gene expression in even greater depth.
Affiliation(s)
- Daniel A Skelly
- Department of Genome Sciences, University of Washington, Seattle, Washington, 98195, USA.
|
37
|
Identification of a Cis-acting regulatory polymorphism in a Eucalypt COBRA-like gene affecting cellulose content. Genetics 2009; 183:1153-64. [PMID: 19737751 DOI: 10.1534/genetics.109.106591] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.6] [Indexed: 11/18/2022] Open
Abstract
Populations with low linkage disequilibrium (LD) offer unique opportunities to study functional variants influencing quantitative traits. We exploited the low LD in forest trees to identify functional polymorphisms in a Eucalyptus nitens COBRA-like gene (EniCOBL4A), whose Arabidopsis homolog has been implicated in cellulose deposition. Linkage analysis in a full-sib family revealed that EniCOBL4A is the most strongly associated marker in a quantitative trait locus (QTL) region for cellulose content. Analysis of LD by genotyping 11 common single-nucleotide polymorphisms (SNPs) and a simple sequence repeat (SSR) in an association population revealed that LD declines within the length of the gene. Using association studies we fine mapped the effect of the gene to SNP7, a synonymous SNP in exon 5, which occurs between two small haplotype blocks. We observed patterns of allelic expression imbalance (AEI) and differential binding of nuclear proteins to the SNP7 region that indicate that SNP7 is a cis-acting regulatory polymorphism affecting allelic expression. We also observed AEI in SNP7 heterozygotes in a full-sib family that is linked to heritable allele-specific methylation near SNP7. This study demonstrates the potential to reveal functional polymorphisms underlying quantitative traits in low LD populations.
|
38
|
Tung J, Fédrigo O, Haygood R, Mukherjee S, Wray GA. Genomic features that predict allelic imbalance in humans suggest patterns of constraint on gene expression variation. Mol Biol Evol 2009; 26:2047-59. [PMID: 19506001 PMCID: PMC2734157 DOI: 10.1093/molbev/msp113] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Accepted: 05/26/2009] [Indexed: 12/29/2022] Open
Abstract
Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. 
These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary constraint.
Affiliation(s)
- Jenny Tung
- Department of Biology, Duke University, Durham, NC, USA.
|
39
|
Warner LR, Babbitt CC, Primus AE, Severson TF, Haygood R, Wray GA. Functional consequences of genetic variation in primates on tyrosine hydroxylase (TH) expression in vitro. Brain Res 2009; 1288:1-8. [DOI: 10.1016/j.brainres.2009.06.086] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 03/18/2009] [Revised: 06/22/2009] [Accepted: 06/25/2009] [Indexed: 11/16/2022]
|
40
|
Yuferov V, Ji F, Nielsen DA, Levran O, Ho A, Morgello S, Shi R, Ott J, Kreek MJ. A functional haplotype implicated in vulnerability to develop cocaine dependence is associated with reduced PDYN expression in human brain. Neuropsychopharmacology 2009; 34:1185-97. [PMID: 18923396 PMCID: PMC2778041 DOI: 10.1038/npp.2008.187] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Dynorphin peptides and the kappa-opioid receptor are important in the rewarding properties of cocaine, heroin, and alcohol. We tested polymorphisms of the prodynorphin gene (PDYN) for association with cocaine dependence and cocaine/alcohol codependence. We genotyped six single nucleotide polymorphisms (SNPs), located in the promoter region, exon 4 coding, and 3' untranslated region, in 106 Caucasians and 204 African Americans who were cocaine dependent, cocaine/alcohol codependent, or controls. In Caucasians, we found point-wise significant associations of 3'UTR SNPs (rs910080, rs910079, and rs2235749) with cocaine dependence and cocaine/alcohol codependence. These SNPs are in high linkage disequilibrium, comprising a haplotype block. The haplotype CCT was significantly experiment-wise associated with cocaine dependence and with combined cocaine dependence and cocaine/alcohol codependence (false discovery rate, q=0.04 and 0.03, respectively). We investigated allele-specific gene expression of PDYN, using SNP rs910079 as a reporter, in postmortem human brains from eight heterozygous subjects, using SNaPshot assay. There was significantly lower expression for C allele (rs910079), with ratios ranging from 0.48 to 0.78, indicating lower expression of the CCT haplotype of PDYN in both the caudate and nucleus accumbens. Analysis of total PDYN expression in 43 postmortem brains also showed significantly lower levels of preprodynorphin mRNA in subjects having the risk CCT haplotype. This study provides evidence that a 3'UTR PDYN haplotype, implicated in vulnerability to develop cocaine addiction and/or cocaine/alcohol codependence, is related to lower mRNA expression of the PDYN gene in human dorsal and ventral striatum.
Affiliation(s)
- Vadim Yuferov
- Laboratory of the Biology of Addictive Diseases, The Rockefeller University, New York, NY 10065, USA.
|
41
|
Allele-specific expression and gene methylation in the control of CYP1A2 mRNA level in human livers. THE PHARMACOGENOMICS JOURNAL 2009; 9:208-17. [DOI: 10.1038/tpj.2009.4] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 12/14/2022]
|
42
|
Nica AC, Dermitzakis ET. Using gene expression to investigate the genetic basis of complex disorders. Hum Mol Genet 2009; 17:R129-34. [PMID: 18852201 DOI: 10.1093/hmg/ddn285] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Indexed: 11/14/2022] Open
Abstract
The identification of complex disease susceptibility loci through genome-wide association studies (GWAS) has recently become possible and is now a method of choice for investigating the genetic basis of complex traits. The number of results from such studies is constantly increasing but the challenge lying forward is to identify the biological context in which these statistically significant candidate variants act. Regulatory variation plays an important role in shaping phenotypic differences among individuals and thus is very likely to also influence disease susceptibility. As such, integrating gene expression data and other disease relevant intermediate phenotypes with GWAS results could potentially help prioritize fine-mapping efforts and provide a shortcut to disease biology. Combining these different levels of information in a meaningful way is however not trivial. In the present review, we outline the several approaches that have been explored so far in this sense and their achievements. We also discuss the limitations of the methods and how upcoming technological developments could help circumvent these limitations. Overall, such efforts will be very helpful in understanding initially regulatory effects on disease and disease etiology in general.
Affiliation(s)
- Alexandra C Nica
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge CB10 1HH, UK
|
43
|
Abstract
Gene expression levels vary heritably, with approximately 25-35% of the loci affecting expression acting in cis. We characterized standing cis-regulatory variation among 16 wild-derived strains of Drosophila melanogaster. Our experiment's robust biological and technical replication enabled precise estimates of variation in allelic expression on a high-throughput SNP genotyping platform. We observed concordant, significant differential allelic expression (DAE) in 7/10 genes queried with multiple SNPs, and every member of a set of eight additional, one-assay genes suggest significant DAE. Four of the high-confidence, multiple-assay genes harbor three or more statistically distinguishable allelic classes, often at intermediate frequency. Numerous intermediate-frequency, detectable regulatory polymorphisms cast doubt on a model in which cis-acting variation is a product of deleterious mutations of large effect. Comparing our data to predictions of population genetics theory using coalescent simulations, we estimate that a typical gene harbors 7-15 cis-regulatory sites (nucleotides) at which a selectively neutral mutation would elicit an observable expression phenotype. If standing cis-regulatory variation is actually slightly deleterious, the true mutational target size is larger.
|
44
|
Zhang X, Pan F, Wang W, Nobel A. Mining Non-Redundant High Order Correlations in Binary Data. PROCEEDINGS OF THE VLDB ENDOWMENT. INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES 2008; 1:1178-1188. [PMID: 20485469 PMCID: PMC2871700 DOI: 10.14778/1453856.1453981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2023]
Abstract
Many approaches have been proposed to find correlations in binary data. Usually, these methods focus on pair-wise correlations. In biology applications, it is important to find correlations that involve more than just two features. Moreover, a set of strongly correlated features should be non-redundant in the sense that the correlation is strong only when all the interacting features are considered together. Removing any feature will greatly reduce the correlation. In this paper, we explore the problem of finding non-redundant high order correlations in binary data. The high order correlations are formalized using multi-information, a generalization of pairwise mutual information. To reduce the redundancy, we require any subset of a strongly correlated feature subset to be weakly correlated. Such feature subsets are referred to as Non-redundant Interacting Feature Subsets (NIFS). Finding all NIFSs is computationally challenging, because in addition to enumerating feature combinations, we also need to check all their subsets for redundancy. We study several properties of NIFSs and show that these properties are useful in developing efficient algorithms. We further develop two sets of upper and lower bounds on the correlations, which can be incorporated in the algorithm to prune the search space. A simple and effective pruning strategy based on pair-wise mutual information is also developed to further prune the search space. The efficiency and effectiveness of our approach are demonstrated through extensive experiments on synthetic and real-life datasets.
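The notion of a non-redundant high order correlation can be illustrated with a small sketch. It assumes the standard definition of multi-information (the sum of the marginal entropies minus the joint entropy); the function names and the XOR toy data are hypothetical and are not taken from the paper, and the paper's own thresholds and pruning bounds are not reproduced here.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(observations):
    """Shannon entropy (in bits) of a list of hashable observations."""
    counts = Counter(observations)
    n = len(observations)
    return -sum(c / n * log2(c / n) for c in counts.values())


def multi_information(rows, cols):
    """Sum of marginal entropies minus joint entropy for the chosen columns.

    For two columns this reduces to pairwise mutual information.
    """
    marginals = sum(entropy([row[c] for row in rows]) for c in cols)
    joint = entropy([tuple(row[c] for c in cols) for row in rows])
    return marginals - joint


# Hypothetical toy data: the third feature is the XOR of the first two,
# and every (x1, x2) combination occurs equally often.
rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]

all_three = multi_information(rows, (0, 1, 2))
pairs = [multi_information(rows, p) for p in ((0, 1), (0, 2), (1, 2))]
```

In this toy case the three features together carry one bit of multi-information while every pair carries none, which is the pattern the abstract describes: the correlation is strong only when all interacting features are considered together.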
Affiliation(s)
- Xiang Zhang
- Department of Computer Science, University of North Carolina at Chapel Hill
- Feng Pan
- Department of Computer Science, University of North Carolina at Chapel Hill
- Wei Wang
- Department of Computer Science, University of North Carolina at Chapel Hill
- Andrew Nobel
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill
|
45
|
Fullerton JM, Willis-Owen SAG, Yalcin B, Shifman S, Copley RR, Miller SR, Bhomra A, Davidson S, Oliver PL, Mott R, Flint J. Human-mouse quantitative trait locus concordance and the dissection of a human neuroticism locus. Biol Psychiatry 2008; 63:874-83. [PMID: 18083140 DOI: 10.1016/j.biopsych.2007.10.019] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 08/10/2007] [Revised: 09/28/2007] [Accepted: 10/17/2007] [Indexed: 12/01/2022]
Abstract
BACKGROUND Exploiting synteny between mouse and human disease loci has been proposed as a cost-effective method for the identification of human susceptibility genes. Here we explore its utility in an analysis of a human personality trait, neuroticism, which can be modeled in mice by tests of emotionality. We investigated a mouse emotionality locus on chromosome 1 that contains no annotated genes but abuts four regulators of G protein signaling, one of which (rgs2) has been previously identified as a quantitative trait gene for emotionality. This locus is syntenic with a human region that has been consistently implicated in the genetic aetiology of neuroticism.
METHODS The functional candidacy of 29 murine sequence variants was tested by a combination of gel shift and transient transfection assays. Murine sequences that contained functional variants and exhibited significant cross-species conservation were prioritized for investigation in humans. Genetic association with neuroticism was tested in 1869 high and 2032 low unrelated individuals scored for neuroticism, selected from the extremes of 88,141 people from southwest England.
RESULTS Fifteen sequence variants contributed to variation in the expression of rgs18, the gene lying at the edge of the quantitative trait loci (QTL) interval. There was no evidence of association between neuroticism and single nucleotide polymorphisms (SNPs) lying in the human regions homologous to those of mouse functional variants. One SNP, rs6428058, in a region of sequence conservation 644 kb upstream of RGS18, showed significant association (p = .000631).
CONCLUSIONS It is unlikely that a single variant is responsible for the mouse emotionality locus on chromosome 1. This level of underlying genetic complexity means that although cross-species QTL concordance may be invaluable for the identification of human disease loci, it is unlikely to be as informative in the identification of human disease-causing variants.
Affiliation(s)
- Janice M Fullerton
- Wellcome Trust Centre for Human Genetics, Headington, Oxford, United Kingdom
|
46
|
Differential allelic expression in the human genome: a robust approach to identify genetic and epigenetic cis-acting mechanisms regulating gene expression. PLoS Genet 2008; 4:e1000006. [PMID: 18454203 PMCID: PMC2265535 DOI: 10.1371/journal.pgen.1000006] [Citation(s) in RCA: 188] [Impact Index Per Article: 11.1] [Received: 01/10/2007] [Accepted: 01/15/2008] [Indexed: 11/19/2022] Open
Abstract
The recent development of whole genome association studies has led to the robust identification of several loci involved in different common human diseases. Interestingly, some of the strongest signals of association observed in these studies arise from non-coding regions located in very large introns or far away from any annotated genes, raising the possibility that these regions are involved in the etiology of the disease through some unidentified regulatory mechanisms. These findings highlight the importance of better understanding the mechanisms leading to inter-individual differences in gene expression in humans. Most of the existing approaches developed to identify common regulatory polymorphisms are based on linkage/association mapping of gene expression to genotypes. However, these methods have some limitations, notably their cost and the requirement of extensive genotyping information from all the individuals studied which limits their applications to a specific cohort or tissue. Here we describe a robust and high-throughput method to directly measure differences in allelic expression for a large number of genes using the Illumina Allele-Specific Expression BeadArray platform and quantitative sequencing of RT-PCR products. We show that this approach allows reliable identification of differences in the relative expression of the two alleles larger than 1.5-fold (i.e., deviations of the allelic ratio larger than 60∶40) and offers several advantages over the mapping of total gene expression, particularly for studying humans or outbred populations. Our analysis of more than 80 individuals for 2,968 SNPs located in 1,380 genes confirms that differential allelic expression is a widespread phenomenon affecting the expression of 20% of human genes and shows that our method successfully captures expression differences resulting from both genetic and epigenetic cis-acting mechanisms.
We describe a new methodology to identify individual differences in the expression of the two copies of one gene. This is achieved by comparing the mRNA level of the two alleles using a heterozygous polymorphism in the transcript as marker. We show that this approach allows an exhaustive survey of cis-acting regulation in the genome; we can identify allelic expression differences due to epigenetic mechanisms of gene regulation (e.g. imprinting or X-inactivation) as well as differences due to the presence of polymorphisms in regulatory elements. The direct comparison of the expression of both alleles nullifies possible trans-acting regulatory effects (that influence equally both alleles) and thus complements the findings from gene expression association studies. Our approach can be easily applied to any cohort of interest for a wide range of studies. It notably allows following up association signals and testing whether a gene sitting on a particular haplotype is over- or under-expressed, or can be used for screening cancer tissues for aberrant gene expression due to newly arisen mutations or alteration of the methylation patterns.
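The correspondence between a 60∶40 allelic ratio and a 1.5-fold expression difference, as stated in the abstract, can be checked with a short sketch. The function names and the example counts below are hypothetical illustrations, not part of the published method, and the sketch ignores the measurement noise and replication that a real allelic expression assay would need to model.

```python
def allelic_fraction(count_a, count_b):
    """Fraction of the allelic signal that comes from allele A."""
    return count_a / (count_a + count_b)


def allelic_fold_change(count_a, count_b):
    """Fold difference between the higher- and lower-expressed allele."""
    high, low = max(count_a, count_b), min(count_a, count_b)
    if low == 0:
        return float("inf")  # only one allele detected
    return high / low


def exceeds_threshold(count_a, count_b, min_fold=1.5):
    """True when the imbalance is larger than min_fold (1.5-fold by default)."""
    return allelic_fold_change(count_a, count_b) > min_fold


# Hypothetical signal from a heterozygous individual: a 60:40 split sits
# exactly at the 1.5-fold boundary, while a 65:35 split exceeds it.
boundary = allelic_fold_change(60, 40)
```

Because both alleles are measured within the same individual, the ratio is unaffected by trans-acting factors that influence both copies equally, which is why the abstract treats it as a direct readout of cis-acting effects.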
|
47
|
|
48
|
A genome-wide approach to identifying novel-imprinted genes. Hum Genet 2007; 122:625-34. [DOI: 10.1007/s00439-007-0440-1] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Received: 06/15/2007] [Accepted: 10/11/2007] [Indexed: 12/01/2022]
|
49
|
Brissett NC, Pitcher RS, Juarez R, Picher AJ, Green AJ, Dafforn TR, Fox GC, Blanco L, Doherty AJ. Structure of a NHEJ polymerase-mediated DNA synaptic complex. Science 2007; 318:456-9. [PMID: 17947582 DOI: 10.1126/science.1145112] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.7] [Indexed: 11/02/2022]
Abstract
Nonhomologous end joining (NHEJ) is a critical DNA double-strand break (DSB) repair pathway required to maintain genome stability. Many prokaryotes possess a minimalist NHEJ apparatus required to repair DSBs during stationary phase, composed of two conserved core proteins, Ku and ligase D (LigD). The crystal structure of Mycobacterium tuberculosis polymerase domain of LigD mediating the synapsis of two noncomplementary DNA ends revealed a variety of interactions, including microhomology base pairing, mismatched and flipped-out bases, and 3' termini forming hairpin-like ends. Biochemical and biophysical studies confirmed that polymerase-induced end synapsis also occurs in solution. We propose that this DNA synaptic structure reflects an intermediate bridging stage of the NHEJ process, before end processing and ligation, with both the polymerase and the DNA sequence playing pivotal roles in determining the sequential order of synapsis and remodeling before end joining.
Affiliation(s)
- Nigel C Brissett
- Genome Damage and Stability Centre, University of Sussex, Brighton BN1 9RQ, UK
|
50
|
McGregor AP, Orgogozo V, Delon I, Zanet J, Srinivasan DG, Payre F, Stern DL. Morphological evolution through multiple cis-regulatory mutations at a single gene. Nature 2007; 448:587-90. [PMID: 17632547 DOI: 10.1038/nature05988] [Citation(s) in RCA: 238] [Impact Index Per Article: 13.2] [Received: 11/16/2006] [Accepted: 06/05/2007] [Indexed: 12/26/2022]
Abstract
One central, and yet unsolved, question in evolutionary biology is the relationship between the genetic variants segregating within species and the causes of morphological differences between species. The classic neo-darwinian view postulates that species differences result from the accumulation of small-effect changes at multiple loci. However, many examples support the possible role of larger abrupt changes in the expression of developmental genes in morphological evolution. Although this evidence might be considered a challenge to a neo-darwinian micromutationist view of evolution, there are currently few examples of the actual genes causing morphological differences between species. Here we examine the genetic basis of a trichome pattern difference between Drosophila species, previously shown to result from the evolution of a single gene, shavenbaby (svb), probably through cis-regulatory changes. We first identified three distinct svb enhancers from D. melanogaster driving reporter gene expression in partly overlapping patterns that together recapitulate endogenous svb expression. All three homologous enhancers from D. sechellia drive expression in modified patterns, in a direction consistent with the evolved svb expression pattern. To test the influence of these enhancers on the actual phenotypic difference, we conducted interspecific genetic mapping at a resolution sufficient to recover multiple intragenic recombinants. This functional analysis revealed that independent genetic regions upstream of svb that overlap the three identified enhancers are collectively required to generate the D. sechellia trichome pattern. Our results demonstrate that the accumulation of multiple small-effect changes at a single locus underlies the evolution of a morphological difference between species. These data support the view that alleles of large effect that distinguish species may sometimes reflect the accumulation of multiple mutations of small effect at select genes.
Affiliation(s)
- Alistair P McGregor
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey 08544, USA
|