Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

122
(from Reference Citation Analysis)

Article PDFs (34)

Cited by > 0 (97)

Searched Name

Juan R. González

Number	Citation Analysis
1	Green space exposure and blood DNA methylation at birth and in childhood - A multi-cohort study. ENVIRONMENT INTERNATIONAL 2024;188:108684. [PMID: 38776651 DOI: 10.1016/j.envint.2024.108684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 03/21/2024] [Accepted: 04/21/2024] [Indexed: 05/25/2024] Abstract: Green space exposure has been associated with improved mental, physical and general health. However, the underlying biological mechanisms remain largely unknown. The aim of this study was to investigate the association between green space exposure and cord and child blood DNA methylation. Data from eight European birth cohorts with a total of 2,988 newborns and 1,849 children were used. Two indicators of residential green space exposure were assessed: (i) surrounding greenness (satellite-based Normalized Difference Vegetation Index (NDVI) in buffers of 100 m and 300 m) and (ii) proximity to green space (having a green space ≥ 5,000 m² within a distance of 300 m). For these indicators we assessed two exposure windows: (i) pregnancy, and (ii) the period from pregnancy to child blood DNA methylation assessment, named as cumulative exposure. DNA methylation was measured with the Illumina 450K or EPIC arrays. To identify differentially methylated positions (DMPs) we fitted robust linear regression models between pregnancy green space exposure and cord blood DNA methylation and between cumulative green space exposure and child blood DNA methylation. Two sensitivity analyses were conducted: (i) without adjusting for cellular composition, and (ii) adjusting for air pollution. Cohort results were combined through fixed-effect inverse variance weighted meta-analyses. Differentially methylated regions (DMRs) were identified from meta-analysed results using the Enmix-combp and DMRcate methods.
There was no statistical evidence of pregnancy or cumulative exposures associating with any DMP (False Discovery Rate, FDR, p-value < 0.05). However, surrounding greenness exposure was inversely associated with four DMRs (three in cord blood and one in child blood) annotated to ADAMTS2, KCNQ1DN, SLC6A12 and SDK1 genes. Results did not change substantially in the sensitivity analyses. Overall, we found little evidence of the association between green space exposure and blood DNA methylation. Although we identified associations between surrounding greenness exposure with four DMRs, these findings require replication. Key Words: Child blood; Cord blood; DMP; DMR; DNA methylation; Green space
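As a reading aid for entry 1, the fixed-effect inverse variance weighted pooling named in the abstract can be sketched as follows. This is a minimal illustration of the general technique, not the authors' pipeline: the function name `fixed_effect_meta` and the example numbers are hypothetical, and the study itself applied the method per CpG site with dedicated software that is not specified here.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool per-cohort effect estimates with inverse-variance weights.

    Each cohort is weighted by 1 / SE^2, so more precise cohorts
    contribute more to the pooled estimate.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical per-cohort estimates for a single site (not data from the study).
pooled, pooled_se = fixed_effect_meta([0.0, 1.0], [1.0, 0.5])
```

With these made-up inputs the second cohort has a quarter of the variance of the first, so it receives four times the weight and pulls the pooled estimate toward its own value.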
2	Clonal chromosomal mosaicism and loss of chromosome Y in elderly men increase vulnerability for SARS-CoV-2. Commun Biol 2024;7:202. [PMID: 38374351 PMCID: PMC10876565 DOI: 10.1038/s42003-024-05805-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 01/11/2024] [Indexed: 02/21/2024] Abstract: The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2, COVID-19) had an estimated overall case fatality ratio of 1.38% (pre-vaccination), being 53% higher in males and increasing exponentially with age. Among 9578 individuals diagnosed with COVID-19 in the SCOURGE study, we found 133 cases (1.42%) with detectable clonal mosaicism for chromosome alterations (mCA) and 226 males (5.08%) with acquired loss of chromosome Y (LOY). Individuals with clonal mosaic events (mCA and/or LOY) showed a 54% increase in the risk of COVID-19 lethality. LOY is associated with transcriptomic biomarkers of immune dysfunction, pro-coagulation activity and cardiovascular risk. Interferon-induced genes involved in the initial immune response to SARS-CoV-2 are also down-regulated in LOY. Thus, mCA and LOY underlie at least part of the sex-biased severity and mortality of COVID-19 in aging patients. Given its potential therapeutic and prognostic relevance, evaluation of clonal mosaicism should be implemented as a biomarker of COVID-19 severity in elderly people. Key Words: genetics research; infectious diseases; molecular medicine. MESH Headings: Male; Humans; Aged; SARS-CoV-2/genetics; Mosaicism; COVID-19/genetics; Chromosomes, Human, Y; Aging
3	Epimutation detection in the clinical context: guidelines and a use case from a new Bioconductor package. Epigenetics 2023;18:2230670. [PMID: 37409354 DOI: 10.1080/15592294.2023.2230670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/07/2023] Abstract: Epimutations are rare alterations of the normal DNA methylation pattern at specific loci, which can lead to rare diseases. Methylation microarrays enable genome-wide epimutation detection, but technical limitations prevent their use in clinical settings: methods applied to rare diseases' data cannot be easily incorporated into standard analysis pipelines, while epimutation methods implemented in R packages (ramr) have not been validated for rare diseases. We have developed epimutacions, a Bioconductor package (https://bioconductor.org/packages/release/bioc/html/epimutacions.html). epimutacions implements two previously reported methods and four new statistical approaches to detect epimutations, along with functions to annotate and visualize epimutations. Additionally, we have developed a user-friendly Shiny app to facilitate epimutation detection (https://github.com/isglobal-brge/epimutacionsShiny) for non-bioinformatician users. We first compared the performance of epimutacions and ramr packages using three public datasets with experimentally validated epimutations. Methods in epimutacions had a high performance at low sample sizes and outperformed methods in ramr. Second, we used two general population children cohorts (INMA and HELIX) to determine the technical and biological factors that affect epimutation detection, providing guidelines on how to design the experiments and preprocess the data. In these cohorts, most epimutations did not correlate with detectable regional gene expression changes. Finally, we exemplified how epimutacions can be used in a clinical context.
We ran epimutacions in a cohort of children with autism disorder and identified novel recurrent epimutations in candidate genes for autism. Overall, we present epimutacions, a new Bioconductor package for incorporating epimutation detection into rare disease diagnosis, and provide guidelines for the design and data analyses. Key Words: Epigenetics; bioinformatics; epidemiology; rare disease
4	Prenatal environmental exposures associated with sex differences in childhood obesity and neurodevelopment. BMC Med 2023;21:142. [PMID: 37046291 PMCID: PMC10099694 DOI: 10.1186/s12916-023-02815-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 03/06/2023] [Indexed: 04/14/2023] Abstract: BACKGROUND Obesity and neurodevelopmental delay are complex traits that often co-occur and differ between boys and girls. Prenatal exposures are believed to influence children's obesity, but it is unknown whether exposures of pregnant mothers can confer a different risk of obesity between sexes, and whether they can affect neurodevelopment. METHODS We analyzed data from 1044 children from the HELIX project, comprising 93 exposures during pregnancy, and clinical, neuropsychological, and methylation data during childhood (5-11 years). Using exposome-wide interaction analyses, we identified prenatal exposures with the highest sexual dimorphism in obesity risk, which were used to create a multiexposure profile. We applied causal random forest to classify individuals into two environments: E1 and E0. E1 consists of a combination of exposure levels where girls have significantly less risk of obesity than boys, as compared to E0, which consists of the remaining combination of exposure levels. We investigated whether the association between sex and neurodevelopmental delay also differed between E0 and E1. We used methylation data to perform an epigenome-wide association study between the environments to see the effect of belonging to E1 or E0 at the molecular level.
RESULTS We observed that E1 was defined by the combination of low dairy consumption, non-smokers' cotinine levels in blood, low facility richness, and the presence of green spaces during pregnancy (OR_interaction = 0.070, P = 2.59 × 10^-5). E1 was also associated with a lower risk of neurodevelopmental delay in girls, based on neuropsychological tests of non-verbal intelligence (OR_interaction = 0.42, P = 0.047) and working memory (OR_interaction = 0.31, P = 0.02). In line with this, several neurodevelopmental functions were enriched in significant differentially methylated probes between E1 and E0. CONCLUSIONS The risk of obesity can be different for boys and girls in certain prenatal environments. We identified an environment combining four exposure levels that protect girls from obesity and neurodevelopment delay. The combination of single exposures into multiexposure profiles using causal inference can help determine populations at risk. Key Words: Causal inference; Childhood obesity; DNA methylation; Multiexposure profile; Neurodevelopment; Prenatal environment; Sexual dimorphism. Grants: 308333 (HELIX project) Seventh Framework Programme; 874583 (ATHLETE project) Preventing Disease Programme; WT101597MA Wellcome Trust; MR/N024397/1 MRF; 6-04-2014_31V-66 Lithuanian Agency for Science Innovation and Technology; CEX2018-000806-S Ministerio de Ciencia e Innovación; SLT017/20/00006 Departament de Salut, Generalitat de Catalunya; SLT017/20/000119 Departament de Salut, Generalitat de Catalunya
5	Sex Differences in the Association between Risk of Anterior Cruciate Ligament Rupture and COL5A1 Polymorphisms in Elite Footballers. Genes (Basel) 2022;14:33. [PMID: 36672775 PMCID: PMC9858943 DOI: 10.3390/genes14010033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 12/15/2022] [Accepted: 12/20/2022] [Indexed: 12/25/2022] Abstract: BACKGROUND Single-nucleotide polymorphisms (SNPs) in collagen genes are predisposing factors for anterior cruciate ligament (ACL) rupture. Although these events are more frequent in females, the sex-specific risk of reported SNPs has not been evaluated. PURPOSE We aimed to assess the sex-specific risk of historic non-contact ACL rupture considering candidate SNPs in genes previously associated with muscle, tendon, ligament and ACL injury in elite footballers. STUDY DESIGN This was a cohort genetic association study. METHODS Forty-six (twenty-four females) footballers playing for the first team of FC Barcelona (Spain) during the 2020-21 season were included in the study. We evaluated the association between a history of non-contact ACL rupture before July 2022 and 108 selected SNPs, stratified by sex. SNPs with nominally significant associations in one sex were then tested for their interactions with sex on ACL. RESULTS Seven female (29%) and one male (4%) participants had experienced non-contact ACL rupture during their professional football career before the last date of observation. We found a significant association between the rs13946 C/C genotype and ACL injury in women footballers (p = 0.017). No significant associations were found in male footballers. The interaction between rs13946 and sex was significant (p = 0.027). We found that the C-allele of rs13946 was exclusive to one haplotype of five SNPs spanning COL5A1.
CONCLUSIONS The present study suggests the role of SNPs in genes encoding for collagens as female risk factors for ACL injury in football players. CLINICAL RELEVANCE The genetic profiling of athletes at high risk of ACL rupture can contribute to sex-specific strategies for injury prevention in footballers. Key Words: COL5A1; anterior cruciate ligament; collagen; female; football; injury; rs12722; rs13946; sex differences; single-nucleotide polymorphisms; team sport. MESH Headings: Humans; Male; Female; Anterior Cruciate Ligament Injuries/genetics; Anterior Cruciate Ligament; Sex Characteristics; Collagen/genetics; Genotype; Collagen Type V/genetics
6	Multi-omics signatures of the human early life exposome. Nat Commun 2022;13:7024. [PMID: 36411288 PMCID: PMC9678903 DOI: 10.1038/s41467-022-34422-2] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 04/15/2021] [Accepted: 10/25/2022] [Indexed: 11/23/2022] Abstract: Environmental exposures during early life play a critical role in life-course health, yet the molecular phenotypes underlying environmental effects on health are poorly understood. In the Human Early Life Exposome (HELIX) project, a multi-centre cohort of 1301 mother-child pairs, we associate individual exposomes consisting of >100 chemical, outdoor, social and lifestyle exposures assessed in pregnancy and childhood, with multi-omics profiles (methylome, transcriptome, proteins and metabolites) in childhood. We identify 1170 associations, 249 in pregnancy and 921 in childhood, which reveal potential biological responses and sources of exposure. Pregnancy exposures, including maternal smoking, cadmium and molybdenum, are predominantly associated with child DNA methylation changes. In contrast, childhood exposures are associated with features across all omics layers, most frequently the serum metabolome, revealing signatures for diet, toxic chemical compounds, essential trace elements, and weather conditions, among others. Our comprehensive and unique resource of all associations (https://helixomics.isglobal.org/) will serve to guide future investigation into the biological imprints of the early life exposome.
Key Words: risk factors; systems biology; prognostic markers. MESH Headings: Pregnancy; Female; Humans; Exposome; Environmental Exposure/adverse effects; Cohort Studies; Metabolome; Transcriptome. Grants: MR/N024397/1 Medical Research Council; MR/S019669/1 Medical Research Council; MR/S03532X/1 Medical Research Council; R21 ES029681 NIEHS NIH HHS. The study has received funding from the European Community’s Seventh Framework Programme (FP7/2007-206) under grant agreement no 308333 (HELIX project) and the H2020-EU.3.1.2. - Preventing Disease Programme under grant agreement no 874583 (ATHLETE project).
7	Software Application Profile: ShinyDataSHIELD—an R Shiny application to perform federated non-disclosive data analysis in multicohort studies. Int J Epidemiol 2022. [DOI: 10.1093/ije/dyac201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Abstract: Motivation DataSHIELD is an open-source software infrastructure enabling the analysis of data distributed across multiple databases (federated data) without leaking individuals’ information (non-disclosive). It has applications in many scientific domains, ranging from biosciences to social sciences and including high-throughput genomic studies. R is the language used to interact with (and build) DataSHIELD. This creates difficulties for researchers who do not have experience writing R code or lack the time to learn how to use the DataSHIELD functions. To help new researchers use the DataSHIELD infrastructure and to improve the user-friendliness for experienced researchers, we present ShinyDataSHIELD. Implementation ShinyDataSHIELD is a web application with an R backend that serves as a graphical user interface (GUI) to the DataSHIELD infrastructure. General features The version of the application presented here includes modules to perform: (i) exploratory analysis through descriptive summary statistics and graphical representations (scatter plots, histograms, heatmaps and boxplots); (ii) statistical modelling (generalized linear fixed and mixed-effects models, survival analysis through Cox regression); (iii) genome-wide association studies (GWAS); and (iv) omic analysis (transcriptomics, epigenomics and multi-omic integration).
Availability ShinyDataSHIELD is publicly hosted online [https://datashield-demo.obiba.org/], the source code and user guide are deposited on Zenodo DOI 10.5281/zenodo.6500323, freely available to non-commercial users under ‘Commons Clause’ License Condition v1.0. Docker images are also available [https://hub.docker.com/r/brgelab/shiny-data-shield].
8	Mass Spectrometry Identification of Biomarkers in Extracellular Vesicles From Plasmodium vivax Liver Hypnozoite Infections. Mol Cell Proteomics 2022;21:100406. [PMID: 36030044 PMCID: PMC9520272 DOI: 10.1016/j.mcpro.2022.100406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2021] [Revised: 08/12/2022] [Accepted: 08/20/2022] [Indexed: 01/18/2023] Abstract: Latent liver stages termed hypnozoites cause relapsing Plasmodium vivax malaria infection and represent a major obstacle in the goal of malaria elimination. Hypnozoites are clinically undetectable, and presently, there are no biomarkers of this persistent parasite reservoir in the human liver. Here, we have identified parasite and human proteins associated with extracellular vesicles (EVs) secreted from in vivo infections exclusively containing hypnozoites. We used P. vivax-infected human liver-chimeric (huHEP) FRG KO mice treated with the schizonticidal experimental drug MMV048 as hypnozoite infection model. Immunofluorescence-based quantification of P. vivax liver forms showed that MMV048 removed schizonts from chimeric mice livers. Proteomic analysis of EVs derived from FRG huHEP mice showed that human EV cargo from infected FRG huHEP mice contain inflammation markers associated with active schizont replication and identified 66 P. vivax proteins. To identify hypnozoite-specific proteins associated with EVs, we mined the proteome data from MMV048-treated mice and performed an analysis involving intragroup and intergroup comparisons across all experimental conditions followed by a peptide compatibility analysis with predicted spectra to warrant robust identification. Only one protein fulfilled this stringent top-down selection, a putative filamin domain-containing protein.
This study sets the stage to unveil biological features of human liver infections and identify biomarkers of hypnozoite infection associated with EVs. Key Words: Plasmodium vivax; biomarkers; extracellular vesicles; humanized mouse model; hypnozoites; proteomics; schizonticidal experimental drug MMV048
9	The early-life exposome modulates the effect of polymorphic inversions on DNA methylation. Commun Biol 2022;5:455. [PMID: 35550596 PMCID: PMC9098634 DOI: 10.1038/s42003-022-03380-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2021] [Accepted: 04/19/2022] [Indexed: 11/14/2022] Abstract: Polymorphic genomic inversions are chromosomal variants with intrinsic variability that play important roles in evolution, environmental adaptation, and complex traits. We investigated the DNA methylation patterns of three common human inversions, at 8p23.1, 16p11.2, and 17q21.31 in 1,009 blood samples from children from the Human Early Life Exposome (HELIX) project and in 39 prenatal heart tissue samples. We found inversion-state specific methylation patterns within and nearby flanking each inversion region in both datasets. Additionally, numerous inversion-exposure interactions on methylation levels were identified from early-life exposome data comprising 64 exposures. For instance, children homozygous at inv-8p23.1 and higher meat intake were more susceptible to TDH hypermethylation (P = 3.8 × 10⁻²²); being the inversion, exposure, and gene known risk factors for adult obesity. Inv-8p23.1 associated hypermethylation of GATA4 was also detected across numerous exposures. Our data suggests that the pleiotropic influence of inversions during development and lifetime could be substantially mediated by allele-specific methylation patterns which can be modulated by the exposome. Analysis of the relationship between presence of common DNA sequence inversions and DNA methylation patterns suggests a role for environmental exposures (such as food intake) in mediating inversion state-specific methylation patterns.
10	Systematic Collaborative Reanalysis of Genomic Data Improves Diagnostic Yield in Neurologic Rare Diseases. J Mol Diagn 2022;24:529-542. [PMID: 35569879 DOI: 10.1016/j.jmoldx.2022.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Revised: 12/16/2021] [Accepted: 02/03/2022] [Indexed: 11/26/2022] Abstract: Many patients experiencing a rare disease remain undiagnosed even after genomic testing. Reanalysis of existing genomic data has been shown to increase diagnostic yield, although there are few systematic and comprehensive reanalysis efforts that enable collaborative interpretation and future reinterpretation. The Undiagnosed Rare Disease Program of Catalonia project collated previously inconclusive good quality genomic data (panels, exomes, and genomes) and standardized phenotypic profiles from 323 families (543 individuals) with a neurologic rare disease. The data were reanalyzed systematically to identify relatedness, runs of homozygosity, consanguinity, single-nucleotide variants, insertions and deletions, and copy number variants. Data were shared and collaboratively interpreted within the consortium through a customized Genome-Phenome Analysis Platform, which also enables future data reinterpretation. Reanalysis of existing genomic data provided a diagnosis for 20.7% of the patients, including 1.8% diagnosed after the generation of additional genomic data to identify a second pathogenic heterozygous variant. Diagnostic rate was significantly higher for family-based exome/genome reanalysis compared with singleton panels. Most new diagnoses were attributable to recent gene-disease associations (50.8%), additional or improved bioinformatic analysis (19.7%), and standardized phenotyping data integrated within the Undiagnosed Rare Disease Program of Catalonia Genome-Phenome Analysis Platform functionalities (18%).
11	teff: estimation of Treatment EFFects on transcriptomic data using causal random forest. Bioinformatics 2022;38:3124-3125. [DOI: 10.1093/bioinformatics/btac269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Revised: 04/07/2022] [Accepted: 04/11/2022] [Indexed: 11/14/2022] Abstract: Motivation Causal inference on high dimensional feature data can be used to find a profile of patients who will benefit the most from treatment rather than no treatment. However, there is a need for usable implementations for transcriptomic data. We developed teff that applies random causal forest on gene expression data to target individuals with high expected treatment effects. Results We extracted a profile of high benefit of treating psoriasis with brodalumab and observed that it was associated with higher T cell abundance in non-lesional skin at baseline and a lower response for etanercept in an independent study. Individual patient targeting with causal inference profiling can inform patients on choosing between treatments before the intervention begins. Availability and Implementation teff is an R package available at https://teff-package.github.io Supplementary information Supplementary data are available at Bioinformatics online.
12	Identification of autosomal cis expression quantitative trait methylation (cis eQTMs) in children's blood. eLife 2022;11:65310. [PMID: 35302492 PMCID: PMC8933004 DOI: 10.7554/elife.65310] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/04/2020] [Accepted: 02/11/2022] [Indexed: 12/12/2022] Abstract: Background The identification of expression quantitative trait methylation (eQTMs), defined as associations between DNA methylation levels and gene expression, might help the biological interpretation of epigenome-wide association studies (EWAS). We aimed to identify autosomal cis eQTMs in children's blood, using data from 832 children of the Human Early Life Exposome (HELIX) project. Methods Blood DNA methylation and gene expression were measured with the Illumina 450K and the Affymetrix HTA v2 arrays, respectively. The relationship between methylation levels and expression of nearby genes (1 Mb window centered at the transcription start site, TSS) was assessed by fitting 13.6 M linear regressions adjusting for sex, age, cohort, and blood cell composition. Results We identified 39,749 blood autosomal cis eQTMs, representing 21,966 unique CpGs (eCpGs, 5.7% of total CpGs) and 8,886 unique transcript clusters (eGenes, 15.3% of total transcript clusters, equivalent to genes). In 87.9% of these cis eQTMs, the eCpG was located at <250 kb from eGene's TSS; and 58.8% of all eQTMs showed an inverse relationship between the methylation and expression levels. Only around half of the autosomal cis-eQTMs eGenes could be captured through annotation of the eCpG to the closest gene. eCpGs had less measurement error and were enriched for active blood regulatory regions and for CpGs reported to be associated with environmental exposures or phenotypic traits.
In 40.4% of the eQTMs, the CpG and the eGene were both associated with at least one genetic variant. The overlap of autosomal cis eQTMs in children's blood with those described in adults was small (13.8%), and age-shared cis eQTMs tended to be proximal to the TSS and enriched for genetic variants. Conclusions This catalogue of autosomal cis eQTMs in children's blood can help the biological interpretation of EWAS findings and is publicly available at https://helixomics.isglobal.org/ and at Dryad (doi:10.5061/dryad.fxpnvx0t0). Funding The study has received funding from the European Community's Seventh Framework Programme (FP7/2007-206) under grant agreement no 308333 (HELIX project); the H2020-EU.3.1.2. - Preventing Disease Programme under grant agreement no 874583 (ATHLETE project); from the European Union's Horizon 2020 research and innovation programme under grant agreement no 733206 (LIFECYCLE project), and from the European Joint Programming Initiative "A Healthy Diet for a Healthy Life" (JPI HDHL and Instituto de Salud Carlos III) under the grant agreement no AC18/00006 (NutriPROGRAM project). The genotyping was supported by the projects PI17/01225 and PI17/01935, funded by the Instituto de Salud Carlos III and co-funded by European Union (ERDF, "A way to make Europe") and the Centro Nacional de Genotipado-CEGEN (PRB2-ISCIII). BiB received core infrastructure funding from the Wellcome Trust (WT101597MA) and a joint grant from the UK Medical Research Council (MRC) and Economic and Social Science Research Council (ESRC) (MR/N024397/1). INMA data collections were supported by grants from the Instituto de Salud Carlos III, CIBERESP, and the Generalitat de Catalunya-CIRIT. KANC was funded by the grant of the Lithuanian Agency for Science Innovation and Technology (6-04-2014_31V-66). The Norwegian Mother, Father and Child Cohort Study is supported by the Norwegian Ministry of Health and Care Services and the Ministry of Education and Research. 
The Rhea project was financially supported by European projects (EU FP6-2003-Food-3-NewGeneris, EU FP6. STREP Hiwate, EU FP7 ENV.2007.1.2.2.2. Project No 211250 Escape, EU FP7-2008-ENV-1.2.1.4 Envirogenomarkers, EU FP7-HEALTH-2009- single stage CHICOS, EU FP7 ENV.2008.1.2.1.6. Proposal No 226285 ENRIECO, EU- FP7- HEALTH-2012 Proposal No 308333 HELIX), and the Greek Ministry of Health (Program of Prevention of obesity and neurodevelopmental disorders in preschool children, in Heraklion district, Crete, Greece: 2011-2014; "Rhea Plus": Primary Prevention Program of Environmental Risk Factors for Reproductive Health, and Child Health: 2012-15). We acknowledge support from the Spanish Ministry of Science and Innovation through the "Centro de Excelencia Severo Ochoa 2019-2023" Program (CEX2018-000806-S), and support from the Generalitat de Catalunya through the CERCA Program. MV-U and CR-A were supported by a FI fellowship from the Catalan Government (FI-DGR 2015 and #016FI_B 00272). MC received funding from Instituto Carlos III (Ministry of Economy and Competitiveness) (CD12/00563 and MS16/00128). Key Words: DNA methylation; blood; children; eQTM; epidemiology; epigenetics; genetics; genomics; global health; human; transcription
13	Fully exploiting SNP arrays: a systematic review on the tools to extract underlying genomic structure. Brief Bioinform 2022;23:6535682. [PMID: 35211719 PMCID: PMC8921734 DOI: 10.1093/bib/bbac043] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/19/2021] [Revised: 01/25/2022] [Accepted: 01/28/2022] [Indexed: 12/12/2022] Abstract: Single nucleotide polymorphisms (SNPs) are the most abundant type of genomic variation and the most accessible to genotype in large cohorts. However, they individually explain a small proportion of phenotypic differences between individuals. Ancestry, collective SNP effects, structural variants, somatic mutations or even differences in historic recombination can potentially explain a high percentage of genomic divergence. These genetic differences can be infrequent or laborious to characterize; however, many of them leave distinctive marks on the SNPs across the genome allowing their study in large population samples. Consequently, several methods have been developed over the last decade to detect and analyze different genomic structures using SNP arrays, to complement genome-wide association studies and determine the contribution of these structures to explain the phenotypic differences between individuals. We present an up-to-date collection of available bioinformatics tools that can be used to extract relevant genomic information from SNP array data including population structure and ancestry; polygenic risk scores; identity-by-descent fragments; linkage disequilibrium; heritability and structural variants such as inversions, copy number variants, genetic mosaicisms and recombination histories.
From a systematic review of recently published applications of the methods, we describe the main characteristics of R packages, command-line tools and desktop applications, both free and commercial, to help make the most of a large amount of publicly available SNP data. Key Words: GWAS; SNP arrays; bioinformatic methods; genomic structures; software; structural variants
14	Meta-analysis of epigenome-wide association studies in newborns and children show widespread sex differences in blood DNA methylation. MUTATION RESEARCH. REVIEWS IN MUTATION RESEARCH 2022;789:108415. [PMID: 35690418 PMCID: PMC9623595 DOI: 10.1016/j.mrrev.2022.108415] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/20/2021] [Revised: 02/27/2022] [Accepted: 03/08/2022] [Indexed: 11/26/2022] Abstract: BACKGROUND: Among children, sex-specific differences in disease prevalence, age of onset, and susceptibility have been observed in health conditions including asthma, immune response, metabolic health, some pediatric and adult cancers, and psychiatric disorders. Epigenetic modifications such as DNA methylation may play a role in the sexual differences observed in diseases and other physiological traits. METHODS: We performed a meta-analysis of the association of sex and cord blood DNA methylation at over 450,000 CpG sites in 8438 newborns from 17 cohorts participating in the Pregnancy And Childhood Epigenetics (PACE) Consortium. We also examined associations of child sex with DNA methylation in older children ages 5.5-10 years from 8 cohorts (n = 4268). RESULTS: In newborn blood, sex was associated at Bonferroni level significance with differences in DNA methylation at 46,979 autosomal CpG sites (p < 1.3 × 10⁻⁷) after adjusting for white blood cell proportions and batch. Most of those sites had lower methylation levels in males than in females. Of the differentially methylated CpG sites identified in newborn blood, 68% (31,727) met look-up level significance (p < 1.1 × 10⁻⁶) in older children and had methylation differences in the same direction. CONCLUSIONS: This is a large-scale meta-analysis examining sex differences in DNA methylation in newborns and older children.
Expanding upon previous studies, we replicated previous findings and identified additional autosomal sites with sex-specific differences in DNA methylation. Differentially methylated sites were enriched in genes involved in cancer, psychiatric disorders, and cardiovascular phenotypes. Key Words: Children; Cord blood; DNA methylation; EWAS; Sex. MeSH Headings: Adolescent; Child; DNA Methylation/genetics; Epigenesis, Genetic; Epigenome; Epigenomics; Female; Humans; Infant, Newborn; Male; Pregnancy; Sex Characteristics. Grants: Z01 ES025045 Intramural NIH HHS; P30 ES017885 NIEHS NIH HHS; MR/S036520/1 Medical Research Council; MR/S009310/1 Medical Research Council; 001 World Health Organization; Z01 ES049019 Intramural NIH HHS
15	The early-life exposome and epigenetic age acceleration in children. ENVIRONMENT INTERNATIONAL 2021;155:106683. [PMID: 34144479 DOI: 10.1016/j.envint.2021.106683] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 03/04/2021] [Revised: 06/01/2021] [Accepted: 06/01/2021] [Indexed: 06/12/2023] Abstract: The early-life exposome influences future health and accelerated biological aging has been proposed as one of the underlying biological mechanisms. We investigated the association between more than 100 exposures assessed during pregnancy and in childhood (including indoor and outdoor air pollutants, built environment, green environments, tobacco smoking, lifestyle exposures, and biomarkers of chemical pollutants), and epigenetic age acceleration in 1,173 children aged 7 years old from the Human Early-Life Exposome project. Age acceleration was calculated based on Horvath's Skin and Blood clock using child blood DNA methylation measured by Infinium HumanMethylation450 BeadChips. We performed an exposure-wide association study between prenatal and childhood exposome and age acceleration. Maternal tobacco smoking during pregnancy was nominally associated with increased age acceleration. For childhood exposures, indoor particulate matter absorbance (PM_abs) and parental smoking were nominally associated with an increase in age acceleration. Exposure to the organic pesticide dimethyl dithiophosphate and the persistent pollutant polychlorinated biphenyl-138 (inversely associated with child body mass index) were protective for age acceleration. None of the associations remained significant after multiple-testing correction. Pregnancy and childhood exposure to tobacco smoke and childhood exposure to indoor PM_abs may accelerate epigenetic aging from an early age.
Key Words: Aging; Childhood; Environmental exposures; Epigenetic age acceleration; Pregnancy. MeSH Headings: Acceleration; Child; DNA Methylation; Environmental Exposure; Environmental Pollutants/analysis; Environmental Pollutants/toxicity; Epigenesis, Genetic; Exposome; Female; Humans; Pregnancy. Grants: MR/S03532X/1 Medical Research Council; WT101597MA Wellcome Trust; MR/N024397/1 Medical Research Council
16	methylclock: a Bioconductor package to estimate DNA methylation age. Bioinformatics 2021;37:1759-1760. [PMID: 32960939 DOI: 10.1093/bioinformatics/btaa825] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 04/21/2020] [Revised: 07/23/2020] [Accepted: 09/08/2020] [Indexed: 11/12/2022] Abstract: MOTIVATION: Ageing is a biological and psychosocial process related to diseases and mortality. It correlates with changes in DNA methylation (DNAm) in all human tissues. Therefore, epigenetic markers can be used to estimate biological age using DNAm profiling across tissues. RESULTS: We developed a Bioconductor package that allows computation of several existing DNAm adult/childhood and gestational age clocks. Functions to visualize the DNAm age prediction versus chronological age and the correlation between DNAm clocks are also available as well as other features, such as missing data imputation of cell types' estimates, that are required for DNAm age clocks. AVAILABILITY AND IMPLEMENTATION: https://github.com/isglobal-brge/methylclock. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
17	Variability of multi-omics profiles in a population-based child cohort. BMC Med 2021;19:166. [PMID: 34289836 PMCID: PMC8296694 DOI: 10.1186/s12916-021-02027-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/30/2021] [Accepted: 06/08/2021] [Indexed: 12/17/2022] Abstract: BACKGROUND: Multiple omics technologies are increasingly applied to detect early, subtle molecular responses to environmental stressors for future disease risk prevention. However, there is an urgent need for further evaluation of stability and variability of omics profiles in healthy individuals, especially during childhood. METHODS: We aimed to estimate intra-, inter-individual and cohort variability of multi-omics profiles (blood DNA methylation, gene expression, miRNA, proteins and serum and urine metabolites) measured 6 months apart in 156 healthy children from five European countries. We further performed a multi-omics network analysis to establish clusters of co-varying omics features and assessed the contribution of key variables (including biological traits and sample collection parameters) to omics variability. RESULTS: All omics displayed a large range of intra- and inter-individual variability depending on each omics feature, although all presented a highest median intra-individual variability. DNA methylation was the most stable profile (median 37.6% inter-individual variability) while gene expression was the least stable (6.6%). Among the least stable features, we identified 1% cross-omics co-variation between CpGs and metabolites (e.g. glucose and CpGs related to obesity and type 2 diabetes). Explanatory variables, including age and body mass index (BMI), explained up to 9% of serum metabolite variability.
CONCLUSIONS: Methylation and targeted serum metabolomics are the most reliable omics to implement in single time-point measurements in large cross-sectional studies. In the case of metabolomics, sample collection and individual traits (e.g. BMI) are important parameters to control for improved comparability, at the study design or analysis stage. This study will be valuable for the design and interpretation of epidemiological studies that aim to link omics signatures to disease, environmental exposures, or both. Key Words: Children; Cross-omics; DNA methylation; Exposome; Metabolomics; Multi-omics; Population study; Variability; mRNA; miRNA. MeSH Headings: Child; Cohort Studies; Cross-Sectional Studies; DNA Methylation; Diabetes Mellitus, Type 2; Humans; MicroRNAs. Grants: MR/N024397/1 Medical Research Council; H2020 Health
18	Extreme Downregulation of Chromosome Y and Cancer Risk in Men. J Natl Cancer Inst 2021;112:913-920. [PMID: 31945786 DOI: 10.1093/jnci/djz232] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 07/12/2019] [Revised: 10/31/2019] [Accepted: 12/11/2019] [Indexed: 12/14/2022] Abstract: BACKGROUND: Understanding the biological differences between sexes in cancer is essential for personalized treatment and prevention. We hypothesized that the extreme downregulation of chromosome Y gene expression (EDY) is a signature of cancer risk in men and the functional mediator of the reported association between the mosaic loss of chromosome Y (LOY) and cancer. METHODS: We advanced a method to measure EDY from transcriptomic data. We studied EDY across 47 nondiseased tissues from the Genotype Tissue-Expression Project (n = 371) and its association with cancer status across 12 cancer studies from The Cancer Genome Atlas (n = 1774) and seven other studies (n = 7562). Associations of EDY with cancer status and presence of loss-of-function mutations in chromosome X were tested with logistic regression models, and a Fisher's test was used to assess genome-wide association of EDY with the proportion of copy number gains. All statistical tests were two-sided. RESULTS: EDY was likely to occur in multiple nondiseased tissues (P < .001) and was statistically significantly associated with the EGFR tyrosine kinase inhibitor resistance pathway (false discovery rate = 0.028). EDY strongly associated with cancer risk in men (odds ratio [OR] = 3.66, 95% confidence interval [CI] = 1.58 to 8.46, P = .002), adjusted by LOY and age, and its variability was largely explained by several genes of the nonrecombinant region whose chromosome X homologs showed loss-of-function mutations that co-occurred with EDY during cancer (OR = 2.82, 95% CI = 1.32 to 6.01, P = .007).
EDY associated with a high proportion of EGFR amplifications (OR = 5.64, 95% CI = 3.70 to 8.59, false discovery rate < 0.001) and EGFR overexpression along with SRY hypomethylation and nonrecombinant region hypermethylation, indicating alternative causes of EDY in cancer other than LOY. EDY associations were independently validated for different cancers and exposure to smoking, and its status was accurately predicted from individual methylation patterns. CONCLUSIONS: EDY is a male-specific signature of cancer susceptibility that supports the escape from X-inactivation tumor suppressor hypothesis for genes that protect women compared with men from cancer risk.
19	Publisher Correction: MLIP genotype as a predictor of pharmacological response in primary open-angle glaucoma and ocular hypertension. Sci Rep 2021;11:8237. [PMID: 33837244 PMCID: PMC8035325 DOI: 10.1038/s41598-021-87653-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
20	Orchestrating privacy-protected big data analyses of data from different resources with R and DataSHIELD. PLoS Comput Biol 2021;17:e1008880. [PMID: 33784300 PMCID: PMC8034722 DOI: 10.1371/journal.pcbi.1008880] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/23/2020] [Revised: 04/09/2021] [Accepted: 03/17/2021] [Indexed: 01/31/2023] Abstract: Combined analysis of multiple, large datasets is a common objective in the health- and biosciences. Existing methods tend to require researchers to physically bring data together in one place or follow an analysis plan and share results. Developed over the last 10 years, the DataSHIELD platform is a collection of R packages that reduce the challenges of these methods. These include ethico-legal constraints which limit researchers' ability to physically bring data together and the analytical inflexibility associated with conventional approaches to sharing results. The key feature of DataSHIELD is that data from research studies stay on a server at each of the institutions that are responsible for the data. Each institution has control over who can access their data. The platform allows an analyst to pass commands to each server and the analyst receives results that do not disclose the individual-level data of any study participants. DataSHIELD uses Opal which is a data integration system used by epidemiological studies and developed by the OBiBa open source project in the domain of bioinformatics. However, until now the analysis of big data with DataSHIELD has been limited by the storage formats available in Opal and the analysis capabilities available in the DataSHIELD R packages.
We present a new architecture ("resources") for DataSHIELD and Opal to allow large, complex datasets to be used at their original location, in their original format and with external computing facilities. We provide some real big data analysis examples in genomics and geospatial projects. For genomic data analyses, we also illustrate how to extend the resources concept to address specific big data infrastructures such as GA4GH or EGA, and make use of shell commands. Our new infrastructure will help researchers to perform data analyses in a privacy-protected way from existing data sharing initiatives or projects. To help researchers use this framework, we describe selected packages and present an online book (https://isglobal-brge.github.io/resource_bookdown). MeSH Headings: Big Data; Computer Security; Databases, Factual; Genomics; Geographic Information Systems; Humans; Software. Grants: Wellcome Trust; 108439/A/15/Z Medical Research Council; 108439/A/15/Z Wellcome Trust
21	MLIP genotype as a predictor of pharmacological response in primary open-angle glaucoma and ocular hypertension. Sci Rep 2021;11:1583. [PMID: 33452295 PMCID: PMC7810753 DOI: 10.1038/s41598-020-80954-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/24/2020] [Accepted: 12/24/2020] [Indexed: 11/20/2022] Abstract: Predicting the therapeutic response to ocular hypotensive drugs is crucial for the clinical treatment and management of glaucoma. Our aim was to identify a possible genetic contribution to the response to current pharmacological treatments of choice in a white Mediterranean population with primary open-angle glaucoma (POAG) or ocular hypertension (OH). We conducted a prospective, controlled, randomized, partial crossover study that included 151 patients of both genders, aged 18 years and older, diagnosed with and requiring pharmacological treatment for POAG or OH in one or both eyes. We sought to identify copy number variants (CNVs) associated with differences in pharmacological response, using a DNA pooling strategy of carefully phenotyped treatment responders and non-responders, treated for a minimum of 6 weeks with a beta-blocker (timolol maleate) and/or prostaglandin analog (latanoprost). Diurnal intraocular pressure reduction and comparative genome-wide CNVs were analyzed. Our finding that copy number alleles of an intronic portion of the MLIP gene are a predictor of pharmacological response to beta blockers and prostaglandin analogs could be used as a biomarker to guide first-tier POAG and OH treatment. Our finding improves understanding of the genetic factors modulating pharmacological response in POAG and OH, and represents an important contribution to the establishment of a personalized approach to the treatment of glaucoma.
22	Urinary metabolite quantitative trait loci in children and their interaction with dietary factors. Hum Mol Genet 2020;29:3830-3844. [PMID: 33283231 DOI: 10.1093/hmg/ddaa257] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/18/2020] [Revised: 11/26/2020] [Accepted: 11/30/2020] [Indexed: 11/14/2022] Abstract: Human metabolism is influenced by genetic and environmental factors. Previous studies have identified over 23 loci associated with more than 26 urine metabolite levels in adults, which are known as urinary metabolite quantitative trait loci (metabQTLs). The aim of the present study is the identification for the first time of urinary metabQTLs in children and their interaction with dietary patterns. Association between genome-wide genotyping data and 44 urine metabolite levels measured by proton nuclear magnetic resonance spectroscopy was tested in 996 children from the Human Early Life Exposome project. Twelve statistically significant urine metabQTLs were identified, involving 11 unique loci and 10 different metabolites. Comparison with previous findings in adults revealed that six metabQTLs were already known, one had been described in serum, and three involved the same locus as other reported metabQTLs but had different urinary metabolites. The remaining two metabQTLs represent novel urine metabolite-locus associations, which are reported for the first time in this study [single nucleotide polymorphism (SNP) rs12575496 for taurine, and the missense SNP rs2274870 for 3-hydroxyisobutyrate]. Moreover, it was found that urinary taurine levels were affected by the combined action of genetic variation and dietary patterns of meat intake as well as by the interaction of this SNP with beverage intake dietary patterns. Overall, we identified 12 urinary metabQTLs in children, including two novel associations.
While a substantial part of the identified loci affected urinary metabolite levels both in children and in adults, the metabQTL for taurine seemed to be specific to children and interacted with dietary patterns.
23	Female-specific risk of Alzheimer's disease is associated with tau phosphorylation processes: A transcriptome-wide interaction analysis. Neurobiol Aging 2020;96:104-108. [PMID: 32977080 DOI: 10.1016/j.neurobiolaging.2020.08.020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/14/2020] [Revised: 08/25/2020] [Accepted: 08/25/2020] [Indexed: 01/09/2023] Abstract: The levels of tau phosphorylation differ between sexes in Alzheimer's disease (AD). Transcriptome-wide associations of sex by disease interaction could indicate whether specific genes underlie sex differences in tau pathology; however, no such study has been reported yet. We report the first analysis of the effect of the interaction between disease status and sex on differential gene expression, meta-analyzing transcriptomic data from the 3 largest publicly available case-control studies (N = 785) in the brain to date. A total of 128 genes, significantly associated with sex-AD interactions, were enriched in phosphoproteins (false discovery rate (FDR) = 0.001). High and consistent associations were found for the overexpression of NCL (FDR = 0.002), whose phosphorylated protein generates an epitope against neurofibrillary tangles, and KIF2A (FDR = 0.005), a microtubule-associated motor protein gene. Transcriptome-wide interaction analyses suggest sex-modulated tau phosphorylation, at sites like Thr231, Ser199, or Ser202, that could increase the risk of AD in women and indicate sex-specific strategies for intervention and prevention.
Key Words: Alzheimer's disease; Differential expression; Expression array; Female risk; Gene expression; KIF2A; NCL; Neurofibrillary tangles; Phosphorylation; Sexual dimorphism; Tau; Transcriptome; Transcriptome-wide interaction analysis. MeSH Headings: Alzheimer Disease/etiology; Alzheimer Disease/genetics; Epitopes; Female; Gene Expression Profiling; Genetic Association Studies; Humans; Kinesins/genetics; Male; Neurofibrillary Tangles/genetics; Phosphoproteins/genetics; Phosphoproteins/metabolism; Phosphorylation/genetics; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Risk; Sex Characteristics; Transcriptome/genetics; tau Proteins/metabolism; Nucleolin
24	MADloy: robust detection of mosaic loss of chromosome Y from genotype-array-intensity data. BMC Bioinformatics 2020;21:533. [PMID: 33225898 PMCID: PMC7682048 DOI: 10.1186/s12859-020-03768-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/30/2019] [Accepted: 09/20/2020] [Indexed: 12/19/2022] Abstract: BACKGROUND: Accurate protocols and methods to robustly detect the mosaic loss of chromosome Y (mLOY) are needed given its reported role in cancer, several age-related disorders and overall male mortality. Intensity SNP-array data have been used to infer mLOY status and to determine its prominent role in male disease. However, discrepancies of reported findings can be due to the uncertainty and variability of the methods used for mLOY detection and to the differences in the tissue-matrix used. RESULTS: We created a publicly available software tool called MADloy (Mosaic Alteration Detection for LOY) that incorporates existing methods and includes a new robust approach, allowing efficient calling in large studies and comparisons between methods. MADloy optimizes mLOY calling by correctly modeling the underlying reference population with no-mLOY status and incorporating B-deviation information. We observed improvements in the calling accuracy over previous methods, using experimentally validated samples, and an increment in the statistical power to detect associations with disease and mortality, using simulation studies and real dataset analyses. To understand discrepancies in mLOY detection across different tissues, we applied MADloy to detect the increment of mLOY cellularity in blood in 18 individuals after 3 years and to confirm that its detection in saliva was sub-optimal (41%).
We additionally applied MADloy to detect the down-regulation of genes on chromosome Y in kidney and bladder tumors with mLOY, and to perform pathway analyses for the detection of mLOY in blood. CONCLUSIONS: MADloy is a new software tool implemented in R for the easy and robust calling of mLOY status across different tissues, aimed at facilitating its study in large epidemiological studies. Key Words: Bioconductor; Loss of chromosome Y; SNP array
25	Identifying chromosomal subpopulations based on their recombination histories advances the study of the genetic basis of phenotypic traits. Genome Res 2020;30:1802-1814. [PMID: 33203765 PMCID: PMC7706724 DOI: 10.1101/gr.258301.119] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/16/2019] [Accepted: 10/22/2020] [Indexed: 02/06/2023] Abstract: Recombination is a main source of genetic variability. However, the potential role of the variation generated by recombination in phenotypic traits, including diseases, remains unexplored because there is currently no method to infer chromosomal subpopulations based on recombination pattern differences. We developed recombClust, a method that uses SNP-phased data to detect differences in historic recombination in a chromosome population. We validated our method by performing simulations and by using real data to accurately predict the alleles of well-known recombination modifiers, including common inversions in Drosophila melanogaster and human, and the chromosomes under selective pressure at the lactase locus in humans. We then applied recombClust to the complex human 1q21.1 region, where nonallelic homologous recombination produces deleterious phenotypes. We discovered and validated the presence of two different recombination histories in these regions that significantly associated with the differential expression of ANKRD35 in whole blood and that were in high linkage with variants previously associated with hypertension. By detecting differences in historic recombination, our method opens a way to assess the influence of recombination variation in phenotypic traits.
26	In utero and childhood exposure to tobacco smoke and multi-layer molecular signatures in children. BMC Med 2020;18:243. [PMID: 32811491 PMCID: PMC7437049 DOI: 10.1186/s12916-020-01686-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/07/2020] [Accepted: 06/29/2020] [Indexed: 02/08/2023] Abstract: BACKGROUND: The adverse health effects of early life exposure to tobacco smoking have been widely reported. In spite of this, the underlying molecular mechanisms of in utero and postnatal exposure to tobacco smoke are only partially understood. Here, we aimed to identify multi-layer molecular signatures associated with exposure to tobacco smoke in these two exposure windows. METHODS: We investigated the associations of maternal smoking during pregnancy and childhood secondhand smoke (SHS) exposure with molecular features measured in 1203 European children (mean age 8.1 years) from the Human Early Life Exposome (HELIX) project. Molecular features, covering 4 layers, included blood DNA methylation and gene and miRNA transcription, plasma proteins, and sera and urinary metabolites. RESULTS: Maternal smoking during pregnancy was associated with DNA methylation changes at 18 loci in child blood. DNA methylation at 5 of these loci was related to expression of the nearby genes. However, the expression of these genes themselves was only weakly associated with maternal smoking. Conversely, childhood SHS was not associated with blood DNA methylation or transcription patterns, but with reduced levels of several serum metabolites and with increased plasma PAI1 (plasminogen activator inhibitor-1), a protein that inhibits fibrinolysis.
Some of the in utero and childhood smoking-related molecular marks showed dose-response trends, with stronger effects with higher dose or longer duration of the exposure. CONCLUSION: In this first study covering multi-layer molecular features, pregnancy and childhood exposure to tobacco smoke were associated with distinct molecular phenotypes in children. The persistent and dose-dependent changes in the methylome make CpGs good candidates to develop biomarkers of past exposure. Moreover, compared to methylation, the weak association of maternal smoking in pregnancy with gene expression suggests different reversal rates and a methylation-based memory to past exposures. Finally, certain metabolites and protein markers evidenced potential early biological effects of postnatal SHS, such as fibrinolysis. Key Words: Children; DNA methylation; Metabolomics; Molecular phenotypes; Omics; Pregnancy; Secondhand smoke; Tobacco smoking; Transcription; miRNA. MeSH Headings: Adolescent; Biomarkers/blood; Child; Child, Preschool; DNA Methylation/genetics; Female; Humans; Infant; Infant, Newborn; Male; Pregnancy; Prenatal Exposure Delayed Effects/chemically induced; Tobacco Smoke Pollution/adverse effects. Grants: MR/S03532X/1 Medical Research Council; WT101597MA Wellcome Trust; MR/N024397/1 Medical Research Council; European Community's Seventh Framework Programme (FP7/2007-206)
27	Polymorphic Inversions Underlie the Shared Genetic Susceptibility of Obesity-Related Diseases. Am J Hum Genet 2020;106:846-858. [PMID: 32470372 DOI: 10.1016/j.ajhg.2020.04.017] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/07/2020] [Accepted: 04/28/2020] [Indexed: 11/25/2022] Abstract: The burden of several common diseases including obesity, diabetes, hypertension, asthma, and depression is increasing in most world populations. However, the mechanisms underlying the numerous epidemiological and genetic correlations among these disorders remain largely unknown. We investigated whether common polymorphic inversions underlie the shared genetic influence of these disorders. We performed an inversion association analysis including 21 inversions and 25 obesity-related traits on a total of 408,898 Europeans and validated the results in 67,299 independent individuals. Seven inversions were associated with multiple diseases while inversions at 8p23.1, 16p11.2, and 11q13.2 were strongly associated with the co-occurrence of obesity with other common diseases. Transcriptome analysis across numerous tissues revealed strong candidate genes for obesity-related traits. Analyses in human pancreatic islets indicated the potential mechanism of inversions in the susceptibility of diabetes by disrupting the cis-regulatory effect of SNPs from their target genes. Our data underscore the role of inversions as major genetic contributors to the joint susceptibility to common complex diseases. Key Words: asthma; common diseases; diabetes; disease co-occurrence; genetic inversions; genomic variation; human traits; hypertension; obesity; obesity-related diseases
28	Independent Multiple Factor Association Analysis for Multiblock Data in Imaging Genetics. Neuroinformatics 2020;17:583-592. [PMID: 30903541 DOI: 10.1007/s12021-019-09416-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/02/2023] Abstract: Multivariate methods have the potential to better capture complex relationships that may exist between different biological levels. Multiple Factor Analysis (MFA) is one of the most popular methods to obtain factor scores and measures of discrepancy between data sets. However, singular value decomposition in MFA is based on PCA, which is adequate only if the data is normally distributed, linear or stationary. In addition, including strongly correlated variables can overemphasize the contribution of the estimated components. In this work, we introduced a novel method referred to as Independent Multifactorial Analysis (ICA-MFA) to derive relevant features from multiscale data. This method is an extended implementation of MFA, where the component value decomposition is based on Independent Component Analysis. In addition, ICA-MFA incorporates a predictive step based on an Independent Component Regression. We evaluated and compared the performance of ICA-MFA with both the MFA method and traditional univariate analyses in a simulation study. We showed how ICA-MFA explained up to 10-fold more variance than MFA and univariate methods. We applied the proposed algorithm in a study of 4057 individuals belonging to the population-based Rotterdam Study with available genetic and neuroimaging data, as well as information about executive cognitive functioning. Specifically, we used ICA-MFA to detect relevant genetic features related to structural brain regions, which in turn were involved in the mechanisms of executive cognitive function.
The proposed strategy makes it possible to determine the degree to which the whole set of genetic and/or neuroimaging markers contribute to the variability of the symptomatology jointly, rather than individually. While univariate results and MFA combinations only explained a limited proportion of variance (less than 2%), our method increased the explained variance (10%) and allowed the identification of significant components that maximize the variance explained in the model. The potential application of the ICA-MFA algorithm constitutes an important aspect of integrating multivariate multiscale data, specifically in the field of Neurogenetics. Key Words: Data integration; ICA-MFA; Imaging genetics; Modelling; Neurogenetics
29	Dose and time effects of solar-simulated ultraviolet radiation on the in vivo human skin transcriptome. Br J Dermatol 2019;182:1458-1468. [PMID: 31529490 PMCID: PMC7318624 DOI: 10.1111/bjd.18527] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Accepted: 09/10/2019] [Indexed: 12/18/2022] Abstract Background: Terrestrial ultraviolet (UV) radiation causes erythema, oxidative stress, DNA mutations and skin cancer. Skin can adapt to these adverse effects by DNA repair, apoptosis, keratinization and tanning. Objectives: To investigate the transcriptional response to fluorescent solar‐simulated radiation (FSSR) in sun‐sensitive human skin in vivo. Methods: Seven healthy male volunteers were exposed to 0, 3 and 6 standard erythemal doses (SED). Skin biopsies were taken at 6 h and 24 h after exposure. Gene and microRNA expression were quantified with next generation sequencing. A set of candidate genes was validated by quantitative polymerase chain reaction (qPCR); and wavelength dependence was examined in other volunteers through microarrays. Results: The number of differentially expressed genes increased with FSSR dose and decreased between 6 and 24 h. Six hours after 6 SED, 4071 genes were differentially expressed, but only 16 genes were affected at 24 h after 3 SED. Genes for apoptosis and keratinization were prominent at 6 h, whereas inflammation and immunoregulation genes were predominant at 24 h. Validation by qPCR confirmed the altered expression of nine genes detected under all conditions; genes related to DNA repair and apoptosis; immunity and inflammation; pigmentation; and vitamin D synthesis. In general, candidate genes also responded to UVA1 (340–400 nm) and/or UVB (300 nm), but with variations in wavelength dependence and peak expression time. Only four microRNAs were differentially expressed by FSSR.
Conclusions: The UV radiation doses of this acute study are readily achieved daily during holidays in the sun, suggesting that the skin transcriptional profile of ‘typical’ holiday makers is markedly deregulated. What's already known about this topic? The skin's transcriptional profile underpins its adverse (i.e. inflammation) and adaptive molecular, cellular and clinical responses (i.e. tanning, hyperkeratosis) to solar ultraviolet radiation. Few studies have assessed microRNA and gene expression in vivo in humans, and there is a lack of information on dose, time and waveband effects. What does this study add? Acute doses of fluorescent solar‐simulated radiation (FSSR), of similar magnitude to those received daily in holiday situations, markedly altered the skin's transcriptional profiles. The number of differentially expressed genes was FSSR‐dose‐dependent, reached a peak at 6 h and returned to baseline at 24 h. The initial transcriptional response involved apoptosis and keratinization, followed by inflammation and immune modulation. In these conditions, microRNA expression was less affected than gene expression. Linked Comment: Hart. Br J Dermatol 2020; 182:1328–1329.
30	Common polymorphic inversions at 17q21.31 and 8p23.1 associate with cancer prognosis. Hum Genomics 2019;13:57. [PMID: 31753042 PMCID: PMC6873427 DOI: 10.1186/s40246-019-0242-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/01/2019] [Accepted: 10/09/2019] [Indexed: 12/18/2022] Abstract BACKGROUND: Chromosomal inversions are structural genetic variants where a chromosome segment changes its orientation. While sporadic de novo inversions are known genetic risk factors for cancer susceptibility, it is unknown if common polymorphic inversions are also associated with the prognosis of common tumors, as they have been linked to other complex diseases. We studied the association of two well-characterized human inversions at 17q21.31 and 8p23.1 with the prognosis of lung, liver, breast, colorectal, and stomach cancers. RESULTS: Using data from The Cancer Genome Atlas (TCGA), we observed that inv8p23.1 was associated with overall survival in breast cancer and that inv17q21.31 was associated with overall survival in stomach cancer. In the meta-analysis of two independent studies, inv17q21.31 heterozygosity was significantly associated with colorectal disease-free survival. We found that the association was mediated by the de-methylation of cg08283464 and cg03999934, also linked to lower disease-free survival. CONCLUSIONS: Our results suggest that chromosomal inversions are important genetic factors of tumor prognosis, likely affecting changes in methylation patterns.
Key Words: Cancer prognosis; Chromosomal inversions; DNA methylation; Gene expression; Genetic epidemiology. MESH Headings: Adolescent; Adult; Aged; Aged, 80 and over; Chromosome Inversion/genetics; Chromosomes, Human, Pair 17/genetics; Chromosomes, Human, Pair 8/genetics; CpG Islands/genetics; DNA Methylation/genetics; Disease-Free Survival; Female; Gene Expression Regulation, Neoplastic; Genetic Association Studies; Genetic Predisposition to Disease; Humans; Male; Middle Aged; Neoplasms/genetics; Polymorphism, Genetic; Prognosis; Young Adult. Grants: MTM2015-68140-R, Ministerio de Economía, Industria y Competitividad, Gobierno de España (ES); #016FI_B 00272, Agència de Gestió d'Ajuts Universitaris i de Recerca
31	scoreInvHap: Inversion genotyping for genome-wide association studies. PLoS Genet 2019;15:e1008203. [PMID: 31269027 PMCID: PMC6608898 DOI: 10.1371/journal.pgen.1008203] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/28/2018] [Accepted: 05/17/2019] [Indexed: 02/02/2023] Abstract Polymorphic inversions contribute to adaptation and phenotypic variation. However, large multi-centric association studies of inversions remain challenging. We present scoreInvHap, a method to genotype inversions from SNP data for genome-wide association studies (GWASs), overcoming important limitations of current methods and outperforming them in accuracy and applicability. scoreInvHap calls individual inversion-genotypes from a similarity score to the SNPs of experimentally validated references. It can be used on different sources of SNP data, including those with low SNP coverage such as exome sequencing, and is easily adaptable to genotype new inversions, either in humans or in other species. We present 20 human inversions that can be reliably and easily genotyped with scoreInvHap to discover their role in complex human traits, and illustrate a first genome-wide association study of experimentally-validated human inversions. scoreInvHap is implemented in R and it is freely available from Bioconductor. Chromosomal inversions are structural variants consisting of an orientation change of a chromosome segment. Inversions have been linked to some phenotypic differences between individuals and to genetic divergence. However, their overall contribution to complex diseases is largely underdetermined as there are no high-throughput methods to call inversion-genotypes in large cohort studies. Here, we propose a new method, scoreInvHap, to call individual inversion genotypes from their haplotype similarity.
We show that scoreInvHap has a high performance when analyzing heterogeneous sources of SNP data. Our current implementation contains 20 human inversions that can be readily genotyped in existing GWAS datasets. We exemplify the utility of scoreInvHap by running the first genome-wide association of experimentally validated inversions and a multi-centric inversion association study. All in all, scoreInvHap can substantially contribute to increase our knowledge of the role of chromosomal inversions in complex diseases by re-analyzing data from existing genetic association studies. MESH Headings: Genome-Wide Association Study/methods; Genotyping Techniques; Humans; Polymorphism, Single Nucleotide; Sequence Inversion; Software. Grants: Ministerio de Economía y Competitividad; Agència de Gestió d’Ajuts Universitaris i de Recerca
32	Assessment of Susceptibility Risk Factors for ADHD in Imaging Genetic Studies. J Atten Disord 2019;23:671-681. [PMID: 27535943 DOI: 10.1177/1087054716664408] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/16/2022] Abstract OBJECTIVE: ADHD consists of a count of symptoms that often presents heterogeneity due to overdispersion and excess of zeros. Statistical inference is usually based on a dichotomous outcome that is underpowered. The main goal of this study was to determine a suitable probability distribution to analyze ADHD symptoms in Imaging Genetic studies. METHOD: We used two independent population samples of children to evaluate the consistency of the standard probability distributions based on count data for describing ADHD symptoms. RESULTS: We showed that the zero-inflated negative binomial (ZINB) distribution provided the best power for modeling ADHD symptoms. ZINB reveals a genetic variant, rs273342 (Microtubule-Associated Protein [MAPRE2]), associated with ADHD (p value = 2.73E-05). This variant was also associated with perivascular volumes (Virchow-Robin spaces; p values < 1E-03). No associations were found when using a dichotomous definition. CONCLUSION: We suggest that an appropriate modeling of ADHD symptoms increases statistical power to establish significant risk factors. Key Words: ADHD symptoms; Imaging Genetics; MAPRE2; Virchow–Robin space; basal ganglia perivascular volumes; childhood; count data; zero-inflated negative binomial
33	When pitch adds to volume: coregulation of transcript diversity predicts gene function. BMC Genomics 2018;19:926. [PMID: 30545302 PMCID: PMC6293560 DOI: 10.1186/s12864-018-5263-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2018] [Accepted: 11/19/2018] [Indexed: 11/16/2022] Abstract Background: Genes coregulate their overall transcript volumes to perform their physiological functions. However, it is unknown if they additionally coregulate their transcript diversities. We studied the reliability, consistency and functional associations of co-splicing correlations of genes of interest, across two independent studies, multiple tissues and two statistical methods. We thoroughly investigated the reproducibility of co-splicing correlations of APP, the candidate gene of Alzheimer’s disease (AD). We then studied how co-splicing correlations in different tissues contributed to predict functional interactions of three other genes and finally computed co-splicing frequency for 17 thousand genes across 52 human tissues. Results: We replicated co-splicing correlations between APP and 5 AD-related genes and reproduced expected enrichment of APP co-splicing in synaptic vesicle cycle and proteasome pathways. We observed novel associations for tissue vulnerability to disease with enrichment in APP co-splicing, co-expression and epistasis in AD. APP co-splicing was the strongest predictor and replicated between studies. We confirmed known gene interactions of PRPF8 and GRIA1 in testis and brain cortex, and observed a novel interaction of FGFR2, in breast and prostate, modulated by cancer risk-variants. We produced a co-splicing map across 52 human tissues to help predict the function of over 17 thousand genes.
Conclusions: We show that coregulation of transcript diversities provides novel biological insights in gene physiology and helps to interpret GWAS results. Co-splicing correlations are reliable and frequent and should be further pursued to help predict gene function. Our results additionally support current AD interventions aiming at the ubiquitin proteasome pathway but unveil the need to consider transcript diversity in addition to volume to assess treatment response and susceptibility to the disease. Electronic supplementary material: The online version of this article (10.1186/s12864-018-5263-z) contains supplementary material, which is available to authorized users. Key Words: Alternative splicing; Alzheimer’s disease; Co-splicing; Epistasis; Gene function prediction; RNA-sequencing; Transcription diversity; Transcriptome
34	A systemic approach to identify signaling pathways activated during short-term exposure to traffic-related urban air pollution from human blood. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2018;25:29572-29583. [PMID: 30141164 DOI: 10.1007/s11356-018-3009-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/14/2017] [Accepted: 08/17/2018] [Indexed: 06/08/2023] Abstract The molecular mechanisms that promote pathologic alterations in human physiology mediated by short-term exposure to traffic pollutants remain poorly understood. This work aimed to develop mechanistic networks to determine which specific pathways are activated by real-world exposures to traffic-related air pollution (TRAP) during rest and moderate physical activity (PA). A controlled crossover study comparing whole blood gene expression before and after short-term exposure to high and low levels of TRAP was performed together with systems biology analysis. Twenty-eight healthy volunteers aged between 21 and 53 years were recruited. These subjects were exposed for 2 h to different pollution levels (high and low TRAP levels), while either cycling or resting. Global transcriptome profiling of each condition was performed on human whole blood samples. Microarray analysis was performed to obtain differentially expressed genes (DEG), which were used as the initial input for GeneMANIA software to obtain protein-protein interaction (PPI) networks. Two networks were found reflecting high or low TRAP levels, which shared only 5.6 and 15.5% of their nodes, suggesting that specific cell signaling pathways are activated in each environmental condition. However, gene ontology analysis of each PPI network suggests that each level of TRAP regulates common members of the NF-κB signaling pathway.
Our work provides the first approach describing mechanistic networks to understand TRAP effects on a system level. Key Words: Human interactome; Moderate physical activity; Systems biology; Traffic-related air pollution. MESH Headings: Adult; Air Pollutants/adverse effects; Air Pollutants/analysis; Cross-Over Studies; Environmental Exposure/adverse effects; Environmental Exposure/analysis; Exercise; Female; Gene Expression Profiling; Humans; Male; Middle Aged; NF-kappa B/genetics; Particulate Matter/adverse effects; Particulate Matter/analysis; RNA/blood; Signal Transduction/drug effects; Signal Transduction/genetics; Spain; Time Factors; Traffic-Related Pollution/adverse effects; Traffic-Related Pollution/analysis; Transcriptome/drug effects; Urbanization; Young Adult
35	Sparse multiple factor analysis to integrate genetic data, neuroimaging features, and attention-deficit/hyperactivity disorder domains. Int J Methods Psychiatr Res 2018;27:e1738. [PMID: 30105890 PMCID: PMC6877273 DOI: 10.1002/mpr.1738] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/05/2017] [Revised: 05/17/2018] [Accepted: 06/26/2018] [Indexed: 11/09/2022] Abstract OBJECTIVES: We proposed the application of a multivariate cross-sectional framework based on a combination of a variable selection method and a multiple factor analysis (MFA) in order to identify complex meaningful biological signals related to attention-deficit/hyperactivity disorder (ADHD) symptoms and hyperactivity/inattention domains. METHODS: The study included 135 children from the general population with genomic and neuroimaging data. ADHD symptoms were assessed using a questionnaire based on ADHD-DSM-IV criteria. In all analyses, the raw sum scores of the hyperactivity and inattention domains and total ADHD were used. The analytical framework comprised two steps. First, a zero-inflated negative binomial linear model via penalized maximum likelihood (LASSO-ZINB) was fitted. Second, the most predictive features obtained with LASSO-ZINB were used as input for the MFA. RESULTS: We observed significant relationships between ADHD symptoms and hyperactivity and inattention domains with white matter, gray matter regions, and cerebellum, as well as with loci within chromosome 1. CONCLUSIONS: Multivariate methods can be used to advance the neurobiological characterization of complex diseases, improving the statistical power with respect to univariate methods, allowing the identification of meaningful biological signals in Imaging Genetic studies.
Key Words: ADHD; Imaging Genetics; LASSO-ZINB; multiple factor analysis; neurogenetics
36	Strategies for integrated analysis in imaging genetics studies. Neurosci Biobehav Rev 2018;93:57-70. [PMID: 29944960 DOI: 10.1016/j.neubiorev.2018.06.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/29/2017] [Revised: 04/30/2018] [Accepted: 06/15/2018] [Indexed: 02/06/2023] Abstract Imaging Genetics (IG) integrates neuroimaging and genomic data from the same individual, deepening our knowledge of the biological mechanisms behind neurodevelopmental domains and neurological disorders. Although the literature on IG has grown exponentially over the past years, the majority of studies have mainly analyzed associations between candidate brain regions and individual genetic variants. However, this strategy is not designed to deal with the complexity of neurobiological mechanisms underlying behavioral and neurodevelopmental domains. Moreover, larger sample sizes and increased multidimensionality of this type of data represent a challenge for standardizing modeling procedures in IG research. This review provides a systematic update of the methods and strategies currently used in IG studies, and serves as an analytical framework for researchers working in this field. To complement the functionalities of the Neuroconductor framework, we also describe existing R packages that implement these methodologies. In addition, we present an overview of how these methodological approaches are applied in integrating neuroimaging and genetic data. Key Words: Analytical strategies; Genetics; Imaging genetics; Neuroconductor; Neuroimaging
37	psygenet2r: a R/Bioconductor package for the analysis of psychiatric disease genes. Bioinformatics 2017;33:4004-4006. [PMID: 28961763 PMCID: PMC5860088 DOI: 10.1093/bioinformatics/btx506] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/10/2017] [Revised: 08/04/2017] [Accepted: 08/08/2017] [Indexed: 11/12/2022] Abstract MOTIVATION: Psychiatric disorders have a great impact on morbidity and mortality. Genotype-phenotype resources for psychiatric diseases are key to enable the translation of research findings to better care of patients. PsyGeNET is a knowledge resource on psychiatric diseases and their genes, developed by text mining and curated by domain experts. RESULTS: We present psygenet2r, an R package that contains a variety of functions for leveraging the PsyGeNET database and facilitating its analysis and interpretation. The package offers different types of queries to the database along with a variety of analysis and visualization tools, including the study of the anatomical structures in which the genes are expressed and gaining insight into the genes' molecular function. Psygenet2r is especially suited for network medicine analysis of psychiatric disorders. AVAILABILITY AND IMPLEMENTATION: The package is implemented in R and is available under MIT license from Bioconductor (http://bioconductor.org/packages/release/bioc/html/psygenet2r.html). CONTACT: juanr.gonzalez@isglobal.org or laura.furlong@upf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. MESH Headings: Data Mining; Databases, Genetic; Genes; Humans; Mental Disorders/genetics; Software
38	Redundancy analysis allows improved detection of methylation changes in large genomic regions. BMC Bioinformatics 2017;18:553. [PMID: 29237399 PMCID: PMC5729265 DOI: 10.1186/s12859-017-1986-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 02/24/2017] [Accepted: 12/05/2017] [Indexed: 01/12/2023] Abstract BACKGROUND: DNA methylation is an epigenetic process that regulates gene expression. Methylation can be modified by environmental exposures and changes in the methylation patterns have been associated with diseases. Methylation microarrays measure methylation levels at more than 450,000 CpGs in a single experiment, and the most common analysis strategy is to perform a single probe analysis to find methylation probes associated with the outcome of interest. However, methylation changes usually occur at the regional level: for example, genomic structural variants can affect methylation patterns in regions up to several megabases in length. Existing DMR methods provide lists of Differentially Methylated Regions (DMRs) of up to only a few kilobases in length, and cannot check if a target region is differentially methylated. Therefore, these methods are not suitable to evaluate methylation changes in large regions. To address these limitations, we developed a new DMR approach based on redundancy analysis (RDA) that assesses whether a target region is differentially methylated. RESULTS: Using simulated and real datasets, we compared our approach to three common DMR detection methods (Bumphunter, blockFinder, and DMRcate). We found that Bumphunter underestimated methylation changes and blockFinder showed poor performance. DMRcate showed poor power in the simulated datasets and low specificity in the real data analysis.
Our method showed very high performance in all simulation settings, even with small sample sizes and subtle methylation changes, while controlling type I error. Other advantages of our method are: 1) it estimates the degree of association between the DMR and the outcome; 2) it can analyze a targeted region of interest; and 3) it can evaluate the simultaneous effects of different variables. The proposed methodology is implemented in MEAL, a Bioconductor package designed to facilitate the analysis of methylation data. CONCLUSIONS: We propose a multivariate approach to decipher whether an outcome of interest alters the methylation pattern of a region of interest. The method is designed to analyze large target genomic regions and outperforms the three most popular methods for detecting DMRs. Our method can evaluate factors with more than two levels or the simultaneous effect of more than one continuous variable, which is not possible with the state-of-the-art methods. Key Words: DNA methylation; Epigenomics; Gene expression; Microarray; Region analysis. MESH Headings: Breast Neoplasms/genetics; DNA Methylation/genetics; Databases, Genetic; Epigenesis, Genetic; Female; Genome/genetics; Genomics/methods; Humans. Grants: MTM2015-68140-R, Ministerio de Economía y Competitividad; #016FI_B 00272, Departament d'Innovació, Universitats i Empresa, Generalitat de Catalunya
39	The acute effects of ultraviolet radiation on the blood transcriptome are independent of plasma 25OHD₃. ENVIRONMENTAL RESEARCH 2017;159:239-248. [PMID: 28822308 DOI: 10.1016/j.envres.2017.07.045] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/07/2017] [Revised: 07/05/2017] [Accepted: 07/25/2017] [Indexed: 06/07/2023] Abstract The molecular basis of many health outcomes attributed to solar ultraviolet radiation (UVR) is unknown. We tested the hypothesis that they may originate from transcriptional changes in blood cells. This was determined by assessing the effect of fluorescent solar simulated radiation (FSSR) on the transcriptional profile of peripheral blood pre- and 6h, 24h and 48h post-exposure in nine healthy volunteers. Expression of 20 genes was down-regulated and one was up-regulated at 6h after FSSR. All recovered to baseline expression at 24h or 48h. These genes have been associated with immune regulation, cancer and blood pressure; health effects attributed to vitamin D via solar UVR exposure. Plasma 25-hydroxyvitamin D₃ [25OHD₃] levels increased over time after FSSR and were maximal at 48h. The increase was more pronounced in participants with low basal 25OHD₃ levels. Mediation analyses suggested that changes in gene expression due to FSSR were independent of 25OHD₃ and blood cell subpopulations. Key Words: Blood; Gene expression; MiRNA expression; Solar ultraviolet radiation; Vitamin D. MESH Headings: Adult; Blood/metabolism; Calcifediol/blood; Humans; Male; Transcriptome; Ultraviolet Rays/adverse effects; United Kingdom; Vitamins/blood; Young Adult
40	A systematic comparison of statistical methods to detect interactions in exposome-health associations. Environ Health 2017;16:74. [PMID: 28709428 PMCID: PMC5513197 DOI: 10.1186/s12940-017-0277-6] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 11/10/2016] [Accepted: 06/11/2017] [Indexed: 05/20/2023] Abstract BACKGROUND: There is growing interest in examining the simultaneous effects of multiple exposures and, more generally, the effects of mixtures of exposures, as part of the exposome concept (being defined as the totality of human environmental exposures from conception onwards). Uncovering such combined effects is challenging owing to the large number of exposures, several of them being highly correlated. We performed a simulation study in an exposome context to compare the performance of several statistical methods that have been proposed to detect statistical interactions. METHODS: Simulations were based on an exposome including 237 exposures with a realistic correlation structure. We considered several statistical regression-based methods, including two-step Environment-Wide Association Study (EWAS₂), the Deletion/Substitution/Addition (DSA) algorithm, the Least Absolute Shrinkage and Selection Operator (LASSO), Group-Lasso INTERaction-NET (GLINTERNET), a three-step method based on regression trees and finally Boosted Regression Trees (BRT). We assessed the performance of each method in terms of model size, predictive ability, sensitivity and false discovery rate. RESULTS: GLINTERNET and DSA had better overall performance than the other methods, with GLINTERNET having better properties in terms of selecting the true predictors (sensitivity) and of predictive ability, while DSA had a lower number of false positives.
In terms of ability to capture interaction terms, GLINTERNET and DSA had again the best performances, with the same trade-off between sensitivity and false discovery proportion. When GLINTERNET and DSA failed to select an exposure truly associated with the outcome, they tended to select a highly correlated one. When interactions were not present in the data, using variable selection methods that allowed for interactions had only slight costs in performance compared to methods that only searched for main effects. CONCLUSIONS: GLINTERNET and DSA provided better performance in detecting two-way interactions, compared to other existing methods. Key Words: Exposome; Interactions; Variable selection. MESH Headings: Environmental Exposure; Environmental Health/methods; Environmental Monitoring/methods; Environmental Pollutants/toxicity; Humans; Models, Statistical. Grants: Seventh Framework Programme; Ministerio de Economía y Competitividad
41	Novel genes involved in severe early-onset obesity revealed by rare copy number and sequence variants. PLoS Genet 2017;13:e1006657. [PMID: 28489853 PMCID: PMC5443539 DOI: 10.1371/journal.pgen.1006657] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 08/01/2016] [Revised: 05/24/2017] [Accepted: 02/26/2017] [Indexed: 12/26/2022] Abstract Obesity is a multifactorial disorder with high heritability (50–75%), which is probably higher in early-onset and severe cases. Although rare monogenic forms and several genes and regions of susceptibility, including copy number variants (CNVs), have been described, the genetic causes underlying the disease still remain largely unknown. We searched for rare CNVs (>100kb in size, altering genes and present in <1/2000 population controls) in 157 Spanish children with non-syndromic early-onset obesity (EOO: body mass index >3 standard deviations above the mean at <3 years of age) using SNP array molecular karyotypes. We then performed case control studies (480 EOO cases/480 non-obese controls) with the validated CNVs and rare sequence variants (RSVs) detected by targeted resequencing of selected CNV genes (n = 14), and also studied the inheritance patterns in available first-degree relatives. A higher burden of gain-type CNVs was detected in EOO cases versus controls (OR = 1.71, p-value = 0.0358). In addition to a gain of the NPY gene in a familial case with EOO and attention deficit hyperactivity disorder, likely pathogenic CNVs included gains of glutamate receptors (GRIK1, GRM7) and the X-linked gastrin-peptide receptor (GRPR), all inherited from obese parents. Putatively functional RSVs absent in controls were also identified in EOO cases at NPY, GRIK1 and GRPR.
A patient with a heterozygous deletion disrupting two contiguous and related genes, SLCO4C1 and SLCO6A1, also had a missense RSV at SLCO4C1 on the other allele, suggestive of a recessive model. The genes identified showed a clear enrichment of shared co-expression partners with known genes strongly related to obesity, reinforcing their role in the pathophysiology of the disease. Our data reveal a higher burden of rare CNVs and RSVs in several related genes in patients with EOO compared to controls, and implicate NPY, GRPR, two glutamate receptors and SLCO4C1 in highly penetrant forms of familial obesity. Although there is strong evidence for a high genetic component of obesity, the underlying genetic causes are largely unknown, mostly due to the highly heterogeneous nature of the disorder. In this work, we have focused on the most severe end of the spectrum, severe obesity with early-onset in childhood, which is more likely due to genetic alterations. We screened for rare copy number variation (CNV) a sample of 157 Spanish children with early-onset obesity using molecular karyotypes and then studied the genes altered by CNVs in 480 cases and 480 non-obese controls. We identified a higher burden of gain-type CNVs in cases as well as several CNVs and sequence variants that were specific of the obese population. Interestingly, the genes identified shared co-expression partners with known obesity genes. Among those, the genes encoding the neuropeptide Y (NPY), two glutamate receptors (GRIK1, GRM7), the X-linked gastrin-peptide receptor (GRPR), and the organic anion transporter (SLCO4C1) are novel obesity candidate genes that may contribute to highly penetrant forms of familial obesity. 
MESH Headings: Case-Control Studies; DNA Copy Number Variations; Female; Genetic Loci; Humans; Male; Neuropeptide Y/genetics; Obesity/diagnosis; Obesity/genetics; Organic Anion Transporters/genetics; Pedigree; Polymorphism, Single Nucleotide; Receptors, Kainic Acid/genetics; Receptors, Metabotropic Glutamate/genetics. Grants: Ministerio de Sanidad, Servicios Sociales e Igualdad (ES); Agència de Gestió d’Ajuts Universitaris i de Recerca; Generalitat de Catalunya; Secretaría de Estado de Investigación, Desarrollo e Innovación; Federación Española de Enfermedades Raras (ES); Fundación de Endocrinología y Nutrición; Ministerio de Ciencia e Innovación (ES); Statistical Genetics Network.
42	Polymorphisms in the SNRPN gene are associated with obesity susceptibility in a Spanish population. J Gene Med 2017;19. [PMID: 28387446 DOI: 10.1002/jgm.2956] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2017] [Revised: 03/15/2017] [Accepted: 04/04/2017] [Indexed: 12/12/2022] Open Abstract BACKGROUND SNRPN, which codes for the RNA-binding SmN protein, is a candidate gene for Prader-Willi syndrome. One characteristic of this neuroendocrine disorder is hyperphagia resulting in extreme obesity later in life. In the present study, we aimed to assess whether variability within this gene could be implicated in obesity susceptibility. METHODS A case-control study was performed including 265 unrelated patients with nonsyndromic and early-onset severe obesity, belonging to high-risk obesity families from Spanish ancestry; 184 healthy control individuals were included representative of the same genetic background and sex-matched. Forty-nine single nucleotide polymorphisms (SNPs) spanning the entire SNRPN gene were selected and genotyped using the Sequenom MassARRAY platform (Sequenom Inc., San Diego, CA, USA). RESULTS The four SNPs, rs12905653, rs752874, rs1391516 and rs2047433, were found to be nominally associated with obesity (p < 0.03). The diversity haplotype distribution among cases and controls identified the combination rs12905653-T/rs8028366-A/rs4028395-T as being strongly and inversely associated with obesity (odds ratio = 0.49; p = 0.0006). A genetic risk score was built based on rs12905653, rs1391516 and rs2047433 SNPs and each unit increase in genetic risk score increased the obesity risk by 49% (odds ratio = 1.49, 95% confidence interval = 1.24-1.80). CONCLUSIONS To our knowledge, this is the first study reporting an association between variability in the SNRPN gene and the risk of being obese. 
Interestingly, it was the major allele of each SNP that was found to be associated with the risk of weight gain. Further studies analyzing this locus and the possible additive deleterious capability of SNP combinations could be useful for demonstrating the development of obesity. Key Words: BMI; SNRPN gene; Spanish population; case-control study; genetic susceptibility; obesity.
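The per-unit odds ratio quoted in the entry above (1.49 per unit of genetic risk score) compounds multiplicatively if the score enters a logistic model linearly on the log-odds scale. The abstract does not spell out the model, so that is an assumption here, and the function name `odds_multiplier` is purely illustrative. A minimal sketch:

```python
def odds_multiplier(per_unit_or: float, units: int) -> float:
    """Multiplicative change in the odds after `units` one-unit increases.

    Assumes (hypothetically) a logistic model in which every one-unit
    increase of the score multiplies the odds by the same ratio.
    """
    return per_unit_or ** units


if __name__ == "__main__":
    # Odds ratio of 1.49 per unit, taken from the abstract above.
    for k in range(1, 4):
        print(k, round(odds_multiplier(1.49, k), 2))
```

Under that assumption, a two-unit difference in the score corresponds to odds about 2.2 times higher, not 98% higher, because the effect multiplies rather than adds.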
43	MultiDataSet: an R package for encapsulating multiple data sets with application to omic data integration. BMC Bioinformatics 2017;18:36. [PMID: 28095799 PMCID: PMC5240259 DOI: 10.1186/s12859-016-1455-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2016] [Accepted: 12/24/2016] [Indexed: 12/01/2022] Open Abstract BACKGROUND Reduction in the cost of genomic assays has generated large amounts of biomedical-related data. As a result, current studies perform multiple experiments in the same subjects. While Bioconductor's methods and classes implemented in different packages manage individual experiments, there is not a standard class to properly manage different omic datasets from the same subjects. In addition, most R/Bioconductor packages that have been designed to integrate and visualize biological data often use basic data structures with no clear general methods, such as subsetting or selecting samples. RESULTS To cover this need, we have developed MultiDataSet, a new R class based on Bioconductor standards, designed to encapsulate multiple data sets. MultiDataSet deals with the usual difficulties of managing multiple and non-complete data sets while offering a simple and general way of subsetting features and selecting samples. We illustrate the use of MultiDataSet in three common situations: 1) performing integration analysis with third party packages; 2) creating new methods and functions for omic data integration; 3) encapsulating new unimplemented data from any biological experiment. CONCLUSIONS MultiDataSet is a suitable class for data integration under R and Bioconductor framework. 
Key Words: Data infrastructure; Data integration; Data organization; Omics data; R. MESH Headings: DNA Methylation; Gene Expression; Genomics/methods; Humans; Multivariate Analysis; Software. Grants: Ministerio de Economía y Competitividad (ES); Agència de Gestió d’Ajuts Universitaris i de Recerca; Seventh Framework Programme.
44	Imaging genetics in attention-deficit/hyperactivity disorder and related neurodevelopmental domains: state of the art. Brain Imaging Behav 2016;11:1922-1931. [DOI: 10.1007/s11682-016-9663-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 01/08/2023]
45	A Systematic Comparison of Linear Regression-Based Statistical Methods to Assess Exposome-Health Associations. ENVIRONMENTAL HEALTH PERSPECTIVES 2016;124:1848-1856. [PMID: 27219331 PMCID: PMC5132632 DOI: 10.1289/ehp172] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/26/2015] [Revised: 01/12/2016] [Accepted: 04/28/2016] [Indexed: 05/17/2023] Abstract BACKGROUND The exposome constitutes a promising framework to improve understanding of the effects of environmental exposures on health by explicitly considering multiple testing and avoiding selective reporting. However, exposome studies are challenged by the simultaneous consideration of many correlated exposures. OBJECTIVES We compared the performances of linear regression-based statistical methods in assessing exposome-health associations. METHODS In a simulation study, we generated 237 exposure covariates with a realistic correlation structure and with a health outcome linearly related to 0 to 25 of these covariates. Statistical methods were compared primarily in terms of false discovery proportion (FDP) and sensitivity. RESULTS On average over all simulation settings, the elastic net and sparse partial least-squares regression showed a sensitivity of 76% and an FDP of 44%; Graphical Unit Evolutionary Stochastic Search (GUESS) and the deletion/substitution/addition (DSA) algorithm revealed a sensitivity of 81% and an FDP of 34%. The environment-wide association study (EWAS) underperformed these methods in terms of FDP (average FDP, 86%) despite a higher sensitivity. Performances decreased considerably when assuming an exposome exposure matrix with high levels of correlation between covariates. 
CONCLUSIONS Correlation between exposures is a challenge for exposome research, and the statistical methods investigated in this study were limited in their ability to efficiently differentiate true predictors from correlated covariates in a realistic exposome context. Although GUESS and DSA provided a marginally better balance between sensitivity and FDP, they did not outperform the other multivariate methods across all scenarios and properties examined, and computational complexity and flexibility should also be considered when choosing between these methods. Citation: Agier L, Portengen L, Chadeau-Hyam M, Basagaña X, Giorgis-Allemand L, Siroux V, Robinson O, Vlaanderen J, González JR, Nieuwenhuijsen MJ, Vineis P, Vrijheid M, Slama R, Vermeulen R. 2016. A systematic comparison of linear regression-based statistical methods to assess exposome-health associations. Environ Health Perspect 124:1848-1856; http://dx.doi.org/10.1289/EHP172. MESH Headings: Environmental Exposure; Environmental Monitoring/methods; Environmental Pollutants/toxicity; Humans; Linear Models.
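The two criteria used in the entry above, sensitivity and false discovery proportion (FDP), can be computed for one simulated data set as sketched below. This is an illustration only, not the authors' simulation code: the function name, the exposure labels, and the conventions for empty sets (FDP of 0 when nothing is selected, NaN sensitivity when no exposure is truly associated) are assumptions made here.

```python
def sensitivity_and_fdp(selected, truly_associated):
    """Return (sensitivity, FDP) for one variable-selection result.

    sensitivity = true positives / number of truly associated exposures
    FDP         = false positives / number of selected exposures
    """
    selected = set(selected)
    truly_associated = set(truly_associated)
    true_positives = len(selected & truly_associated)
    false_positives = len(selected) - true_positives

    # Assumed conventions for the degenerate cases.
    if truly_associated:
        sensitivity = true_positives / len(truly_associated)
    else:
        sensitivity = float("nan")
    fdp = false_positives / len(selected) if selected else 0.0
    return sensitivity, fdp


if __name__ == "__main__":
    # Hypothetical example: 5 true predictors, 4 exposures selected.
    sens, fdp = sensitivity_and_fdp(
        selected=["e1", "e2", "e7", "e9"],
        truly_associated=["e1", "e2", "e3", "e4", "e5"],
    )
    print(f"sensitivity={sens:.2f}, FDP={fdp:.2f}")
```

In this made-up case two of the five true predictors are recovered and two of the four selections are false, which shows why a method can score well on one criterion and poorly on the other.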
46	Genetic polymorphisms associated with increased risk of developing chronic myelogenous leukemia. Oncotarget 2016;6:36269-77. [PMID: 26474455 PMCID: PMC4742176 DOI: 10.18632/oncotarget.5915] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2015] [Accepted: 09/14/2015] [Indexed: 12/22/2022] Open Abstract Little is known about inherited factors associated with the risk of developing chronic myelogenous leukemia (CML). We used a dedicated DNA chip containing 16 561 single nucleotide polymorphisms (SNPs) covering 1 916 candidate genes to analyze 437 CML patients and 1 144 healthy control individuals. Single SNP association analysis identified 139 SNPs that passed multiple comparisons (1% false discovery rate). The HDAC9, AVEN, SEMA3C, IKBKB, GSTA3, RIPK1 and FGF2 genes were each represented by three SNPs, the PSM family by four SNPs and the SLC15A1 gene by six. Haplotype analysis showed that certain combinations of rare alleles of these genes increased the risk of developing CML by more than two or three-fold. A classification tree model identified five SNPs belonging to the genes PSMB10, TNFRSF10D, PSMB2, PPARD and CYP26B1, which were associated with CML predisposition. A CML-risk-allele score was created using these five SNPs. This score was accurate for discriminating CML status (AUC: 0.61, 95%CI: 0.58-0.64). Interestingly, the score was associated with age at diagnosis and the average number of risk alleles was significantly higher in younger patients. The risk-allele score showed the same distribution in the general population (HapMap CEU samples) as in our control individuals and was associated with differential gene expression patterns of two genes (VAPA and TDRKH). 
In conclusion, we describe haplotypes and a genetic score that are significantly associated with a predisposition to develop CML. The SNPs identified will also serve to drive fundamental research on the putative role of these genes in CML development. Key Words: CML; SNPs; genetic predisposition; myeloid leukemia.
47	Ancient Haplotypes at the 15q24.2 Microdeletion Region Are Linked to Brain Expression of MAN2C1 and Children's Intelligence. PLoS One 2016;11:e0157739. [PMID: 27355585 PMCID: PMC4927142 DOI: 10.1371/journal.pone.0157739] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2016] [Accepted: 06/05/2016] [Indexed: 11/26/2022] Open Abstract The chromosome bands 15q24.1-15q24.3 contain a complex region with numerous segmental duplications that predispose to regional microduplications and microdeletions, both of which have been linked to intellectual disability, speech delay and autistic features. The region may also harbour common inversion polymorphisms whose functional and phenotypic manifestations are unknown. Using single nucleotide polymorphism (SNP) data, we detected four large contiguous haplotype-genotypes at 15q24 with Mendelian inheritance in 2,562 trios, African origin, high population stratification and reduced recombination rates. Although the haplotype-genotypes have been most likely generated by decreased or absent recombination among them, we could not confirm that they were the product of inversion polymorphisms in the region. One of the blocks was composed of three haplotype-genotypes (N1a, N1b and N2), which significantly correlated with intelligence quotient (IQ) in 2,735 children of European ancestry from three independent population cohorts. Homozygosity for N2 was associated with lower verbal IQ (2.4-point loss, p-value = 0.01), while homozygosity for N1b was associated with 3.2-point loss in non-verbal IQ (p-value = 0.0006). The three alleles strongly correlated with expression levels of MAN2C1 and SNUPN in blood and brain. 
Homozygosity for N2 correlated with over-expression of MAN2C1 over many brain areas but the occipital cortex where N1b homozygous highly under-expressed. Our population-based analyses suggest that MAN2C1 may contribute to the verbal difficulties observed in microduplications and to the intellectual disability of microdeletion syndromes, whose characteristic dosage increment and removal may affect different brain areas. Collapse Key Words Collapse MESH Headings Animals Brain/metabolism Child Chromosome Aberrations Chromosome Deletion Chromosome Disorders/genetics Chromosomes, Human, Pair 15 Cohort Studies Ethiopia Evolution, Molecular Genome, Human Genotype Haplotypes Homozygote Humans In Situ Hybridization, Fluorescence Intellectual Disability/genetics Intelligence/genetics Intelligence Tests Macaca mulatta Mannosidases/genetics Mice Phenotype Polymorphism, Single Nucleotide Pongo Rats alpha-Mannosidase Collapse Grants Spanish Ministry of Science and Innovation Instituto de Salud Carlos III Generalitat de Catalunya European Commission Fundacio La Marato de TV3, Generalitat de Catalunya Conselleria de Sanitat Generalitat Valenciana Fundacion Roger Torne Canadian Institutes of Health Research Heart and Stroke Foundation of Quebec Canadian Foundation for Innovation Erasmus Medical Centre, Rotterdam, Erasmus University Rotterdam and the Netherlands Organization for Health Research and Development Netherlands Organization for Health Research and Development Estonian Government Estonian Research Roadmap through Estonian Ministry of Education and Research, Center of Excellence in Genomics Center of Translational Genomics, University of Tartu Estonian Research Council IUT2-2 grant and European Regional Development Fund Chief Scientist Office of the Scottish Government, the Royal Society, the MRC Human Genetics Unit, Arthritis Research UK European Union framework program 6 EUROSPAN project Spanish Ministry of Health and FEDER Spanish Ministry of Economy and Competiveness 
ICREA-Academia program
48	APOE and MS4A6A interact with GnRH signaling in Alzheimer's disease: Enrichment of epistatic effects. Alzheimers Dement 2016;13:493-497. [DOI: 10.1016/j.jalz.2016.05.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/16/2015] [Revised: 05/04/2016] [Accepted: 05/22/2016] [Indexed: 10/21/2022]
49	Sixteen new lung function signals identified through 1000 Genomes Project reference panel imputation. Nat Commun 2015;6:8658. [PMID: 26635082 PMCID: PMC4686825 DOI: 10.1038/ncomms9658] [Citation(s) in RCA: 92] [Impact Index Per Article: 10.2] [Received: 02/05/2015] [Accepted: 09/17/2015] [Indexed: 01/11/2023] Abstract Lung function measures are used in the diagnosis of chronic obstructive pulmonary disease. In 38,199 European ancestry individuals, we studied genome-wide association of forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC) and FEV1/FVC with 1000 Genomes Project (phase 1)-imputed genotypes and followed up top associations in 54,550 Europeans. We identify 14 novel loci (P < 5 × 10⁻⁸) in or near ENSA, RNU5F-1, KCNS3, AK097794, ASTN2, LHX3, CCDC91, TBX3, TRIP11, RIN3, TEKT5, LTBP4, MN1 and AP1S2, and two novel signals at known loci NPNT and GPR126, providing a basis for new understanding of the genetic determinants of these traits and pulmonary diseases in which they are altered.
Collapse Key Words Collapse MESH Headings Adult Aged Aged, 80 and over Female Forced Expiratory Volume Genome-Wide Association Study Humans Lung/physiopathology Lung Diseases/genetics Lung Diseases/physiopathology Male Middle Aged Polymorphism, Single Nucleotide White People/genetics Young Adult Collapse Grants 092731 Wellcome Trust R01 EY018246 NEI NIH HHS R01 HL087679 NHLBI NIH HHS 1R01EY018246 NEI NIH HHS 5R01MH63706:02 NIMH NIH HHS MR/L01341X/1 Medical Research Council G0500539 Medical Research Council U01 DK062418 NIDDK NIH HHS WT 084703MA Wellcome Trust G0501942 Medical Research Council MC_PC_U127561128 Medical Research Council WT090532 Wellcome Trust BB/F019394/1 Biotechnology and Biological Sciences Research Council G1002319 Medical Research Council CZB/4/505 Chief Scientist Office 098017 Wellcome Trust R01 MH063706 NIMH NIH HHS G1000861 Medical Research Council MC_UU_12015/1 Medical Research Council 079895 Wellcome Trust G1001799 Medical Research Council MC_PC_12010 Medical Research Council ETM/55 Chief Scientist Office 1RL1MH083268-01 NIMH NIH HHS MR/N01104X/1 Medical Research Council CZD/16/6/4 Chief Scientist Office MC_PC_U127592696 Medical Research Council G0902313 Medical Research Council SRF/01/010 Department of Health Wellcome Trust 068545/Z/02 Wellcome Trust 5R01HL087679-02 NHLBI NIH HHS MC_UU_12013/4 Medical Research Council RL1 MH083268 NIMH NIH HHS WT064890 Wellcome Trust 076113/B/04/Z Wellcome Trust MC_U106179471 Medical Research Council G0000934 Medical Research Council G0600705 Medical Research Council MR/K026992/1 Medical Research Council Collapse
50	affy2sv: an R package to pre-process Affymetrix CytoScan HD and 750K arrays for SNP, CNV, inversion and mosaicism calling. BMC Bioinformatics 2015;16:167. [PMID: 25991004 PMCID: PMC4438530 DOI: 10.1186/s12859-015-0608-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/24/2014] [Accepted: 04/30/2015] [Indexed: 12/02/2022] Abstract Background Genome-Wide Association Studies (GWAS) have led to many scientific discoveries using SNP data. Even so, they have not been able to explain the full heritability of complex diseases. Now, other structural variants such as copy number variants or DNA inversions, either germ-line or in mosaicism events, are being studied. We present the R package affy2sv to pre-process Affymetrix CytoScan HD/750k arrays (and also Genome-Wide SNP 5.0/6.0 and Axiom arrays) in structural variant studies. Results We illustrate the capabilities of affy2sv using two different complete pipelines on real data. The first performs a GWAS and a mosaic alteration detection study; the second detects CNVs and performs inversion calling. Conclusion Both examples presented in the article show how affy2sv can be used as part of more complex pipelines aimed at analyzing Affymetrix SNP array data in genetic association studies where different types of structural variants are considered.