1
SUMMIT-FA: a new resource for improved transcriptome imputation using functional annotations. Hum Mol Genet 2024; 33:624-635. [PMID: 38129112 PMCID: PMC10954367 DOI: 10.1093/hmg/ddad205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 10/24/2023] [Accepted: 11/30/2023] [Indexed: 12/23/2023] Open
Abstract
Transcriptome-wide association studies (TWAS) integrate gene expression prediction models and genome-wide association studies (GWAS) to identify gene-trait associations. The power of TWAS is determined by the sample size of GWAS and the accuracy of the expression prediction model. Here, we present a new method, the Summary-level Unified Method for Modeling Integrated Transcriptome using Functional Annotations (SUMMIT-FA), which improves gene expression prediction accuracy by leveraging functional annotation resources and a large expression quantitative trait loci (eQTL) summary-level dataset. We build gene expression prediction models in whole blood using SUMMIT-FA with the comprehensive functional database MACIE and eQTL summary-level data from the eQTLGen consortium. We apply these models to GWAS for 24 complex traits and show that SUMMIT-FA identifies significantly more gene-trait associations and improves predictive power for identifying "silver standard" genes compared to several benchmark methods. We further conduct a simulation study to demonstrate the effectiveness of SUMMIT-FA.
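The association step that TWAS performs once prediction weights exist can be sketched with summary statistics alone. The snippet below is a minimal illustration, assuming the widely used summary-level form Z = w'z / sqrt(w'Rw), where w holds per-SNP expression prediction weights, z holds GWAS z-scores, and R is the SNP correlation (LD) matrix. It does not reproduce how SUMMIT-FA estimates its weights; the function name and the toy numbers are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-level TWAS statistic: w'z / sqrt(w' R w).

    weights : per-SNP expression prediction weights (hypothetical values)
    gwas_z  : per-SNP GWAS z-scores, in the same SNP order
    ld      : SNP-by-SNP correlation (LD) matrix as a list of lists
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n:
        raise ValueError("weights, gwas_z and ld must cover the same SNPs")
    # Weighted sum of GWAS z-scores (numerator).
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of that weighted sum under the null, given LD.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w' R w must be positive")
    return numerator / math.sqrt(variance)


# Toy example with three SNPs (all numbers are made up for illustration).
weights = [0.5, -0.2, 0.1]
gwas_z = [3.0, -1.0, 0.5]
ld = [
    [1.0, 0.2, 0.0],
    [0.2, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
z = twas_z(weights, gwas_z, ld)
print(round(z, 3))
```

More accurate weights raise the correlation between predicted and true expression, which is why prediction accuracy feeds directly into TWAS power.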
2
NCAD v1.0: a database for non-coding variant annotation and interpretation. J Genet Genomics 2024; 51:230-242. [PMID: 38142743 DOI: 10.1016/j.jgg.2023.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 12/15/2023] [Accepted: 12/18/2023] [Indexed: 12/26/2023]
Abstract
The application of whole genome sequencing is expanding in clinical diagnostics across various genetic disorders, and the significance of non-coding variants in penetrant diseases is increasingly being demonstrated. Therefore, it is urgent to improve the diagnostic yield by exploring the pathogenic mechanisms of variants in non-coding regions. However, the interpretation of non-coding variants remains a significant challenge, due to the complex functional regulatory mechanisms of non-coding regions and the current limitations of available databases and tools. Hence, we develop the non-coding variant annotation database (NCAD, http://www.ncawdb.net/), encompassing comprehensive insights into 665,679,194 variants, regulatory elements, and element interaction details. Integrating data from 96 sources, spanning both GRCh37 and GRCh38 versions, NCAD v1.0 provides vital information to support the genetic diagnosis of non-coding variants, including allele frequencies of 12 diverse populations, with a particular focus on the population frequency information for 230,235,698 variants in 20,964 Chinese individuals. Moreover, it offers prediction scores for variant functionality, five categories of regulatory elements, and four types of non-coding RNAs. With its rich data and comprehensive coverage, NCAD serves as a valuable platform, empowering researchers and clinicians with profound insights into non-coding regulatory mechanisms while facilitating the interpretation of non-coding variants.
3
A statistical framework for powerful multi-trait rare variant analysis in large-scale whole-genome sequencing studies. bioRxiv 2023:2023.10.30.564764. [PMID: 37961350 PMCID: PMC10634938 DOI: 10.1101/2023.10.30.564764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally-scalable analytical pipeline for functionally-informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits (low-density lipoprotein cholesterol, high-density lipoprotein cholesterol and triglycerides) in 61,861 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered new associations with lipid traits missed by single-trait analysis, including rare variants within an enhancer of NIPSNAP3A and an intergenic region on chromosome 1.
4
Gene-based burden scores identify rare variant associations for 28 blood biomarkers. BMC Genom Data 2023; 24:50. [PMID: 37667186 PMCID: PMC10476296 DOI: 10.1186/s12863-023-01155-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 08/28/2023] [Indexed: 09/06/2023] Open
Abstract
BACKGROUND A relevant part of the genetic architecture of complex traits is still unknown, despite the discovery of many disease-associated common variants. Polygenic risk score (PRS) models are based on the evaluation of the additive effects attributable to common variants and have been successfully implemented to assess the genetic susceptibility for many phenotypes. In contrast, burden tests are often used to identify an enrichment of rare deleterious variants in specific genes. Both kinds of genetic contributions are typically analyzed independently. Many studies suggest that complex phenotypes are influenced by both low-effect common variants and high-effect rare deleterious variants. The aim of this paper is to integrate the effect of both common and rare functional variants for more comprehensive genetic risk modeling.
METHODS We developed a framework combining gene-based scores based on the enrichment of rare functionally relevant variants with genome-wide PRS based on common variants for association analysis and prediction models. We applied our framework to the UK Biobank dataset with genotyping and exome data and considered 28 blood biomarker levels as target phenotypes. For each biomarker, an association analysis was performed on the full cohort using gene-based scores (GBS). The cohort was then split into 3 subsets for PRS construction and feature selection, predictive model training, and independent evaluation, respectively. Prediction models were generated including either PRS, GBS or both (combined).
RESULTS Association analyses of the cohort were able to detect significant genes that were previously known to be associated with different biomarkers. Interestingly, the analyses also revealed heterogeneous effect sizes and directionality, highlighting the complexity of blood biomarker regulation. However, the combined models for many biomarkers show little or no improvement in prediction accuracy compared to the PRS models.
CONCLUSION This study shows that rare variants play an important role in the genetic architecture of complex multifactorial traits such as blood biomarkers. However, while rare deleterious variants play a strong role at an individual level, our results indicate that classical common variant based PRS might be more informative to predict the genetic susceptibility at the population level.
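The idea of a gene-based score built from rare, functionally relevant variants can be sketched as a simple per-gene allele count. This is a minimal illustration and not the scoring scheme of the paper: the 1% allele-frequency cutoff, the boolean "deleterious" flag, the field names, and the gene labels are all assumptions chosen for the example.

```python
def gene_burden_scores(genotypes, variants, maf_threshold=0.01):
    """Sum alternate-allele counts over rare, deleterious variants per gene.

    genotypes     : dict mapping sample id -> list of allele counts (0, 1 or 2),
                    aligned with `variants`
    variants      : list of dicts with keys 'gene', 'maf', 'deleterious'
                    (hypothetical annotation fields)
    maf_threshold : variants with minor allele frequency below this are "rare"
                    (assumed cutoff, not taken from the paper)
    """
    scores = {}
    for sample, dosages in genotypes.items():
        per_gene = {}
        for dosage, var in zip(dosages, variants):
            # Keep only rare variants flagged as functionally relevant.
            if var["maf"] < maf_threshold and var["deleterious"]:
                per_gene[var["gene"]] = per_gene.get(var["gene"], 0) + dosage
        scores[sample] = per_gene
    return scores


# Toy data: two samples, four variants in two hypothetical genes.
variants = [
    {"gene": "GENE_A", "maf": 0.001, "deleterious": True},
    {"gene": "GENE_A", "maf": 0.20, "deleterious": True},    # common: excluded
    {"gene": "GENE_B", "maf": 0.005, "deleterious": True},
    {"gene": "GENE_B", "maf": 0.002, "deleterious": False},  # benign: excluded
]
genotypes = {"s1": [1, 0, 2, 1], "s2": [0, 0, 0, 1]}
scores = gene_burden_scores(genotypes, variants)
print(scores)
```

In a combined model, such per-gene scores would enter a regression alongside the PRS as additional predictors.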
5
The sequence kernel association test for multicategorical outcomes. Genet Epidemiol 2023; 47:432-449. [PMID: 37078108 DOI: 10.1002/gepi.22527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 03/29/2023] [Accepted: 03/30/2023] [Indexed: 04/21/2023]
Abstract
Disease heterogeneity is ubiquitous in biomedical and clinical studies. In genetic studies, researchers are increasingly interested in understanding the distinct genetic underpinning of subtypes of diseases. However, existing set-based analysis methods for genome-wide association studies are either inadequate or inefficient to handle such multicategorical outcomes. In this paper, we proposed a novel set-based association analysis method, sequence kernel association test (SKAT)-MC, the sequence kernel association test for multicategorical outcomes (nominal or ordinal), which jointly evaluates the relationship between a set of variants (common and rare) and disease subtypes. Through comprehensive simulation studies, we showed that SKAT-MC effectively preserves the nominal type I error rate while substantially increasing the statistical power compared to existing methods under various scenarios. We applied SKAT-MC to the Polish breast cancer study (PBCS), and identified that the gene FGFR2 was significantly associated with estrogen receptor (ER)+ and ER- breast cancer subtypes. We also investigated educational attainment using UK Biobank data (N = 127,127) with SKAT-MC, and identified 21 significant genes in the genome. Consequently, SKAT-MC is a powerful and efficient analysis tool for genetic association studies with multicategorical outcomes. A freely distributed R package SKAT-MC can be accessed at https://github.com/Zhiwen-Owen-Jiang/SKATMC.
6
RegVar: Tissue-specific Prioritization of Non-coding Regulatory Variants. Genomics Proteomics Bioinformatics 2023; 21:385-395. [PMID: 34973416 PMCID: PMC10626172 DOI: 10.1016/j.gpb.2021.08.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2021] [Revised: 06/11/2021] [Accepted: 09/27/2021] [Indexed: 06/14/2023]
Abstract
Non-coding genomic variants constitute the majority of trait-associated genome variations; however, the identification of functional non-coding variants is still a challenge in human genetics, and a method for systematically assessing the impact of regulatory variants on gene expression and linking these regulatory variants to potential target genes is still lacking. Here, we introduce a deep neural network (DNN)-based computational framework, RegVar, which can accurately predict the tissue-specific impact of non-coding regulatory variants on target genes. We show that by robustly learning the genomic characteristics of massive variant-gene expression associations in a variety of human tissues, RegVar vastly surpasses all current non-coding variant prioritization methods in predicting regulatory variants under different circumstances. The unique features of RegVar make it an excellent framework for assessing the regulatory impact of any variant on its putative target genes in a variety of tissues. RegVar is available as a web server at https://regvar.omic.tech/.
7
Disease-associated non-coding variants alter NKX2-5 DNA-binding affinity. Biochim Biophys Acta Gene Regul Mech 2023; 1866:194906. [PMID: 36690178 PMCID: PMC10013089 DOI: 10.1016/j.bbagrm.2023.194906] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/12/2022] [Revised: 12/30/2022] [Accepted: 01/02/2023] [Indexed: 01/22/2023]
Abstract
Genome-wide association studies (GWAS) have mapped over 90 % of disease- or trait-associated variants within the non-coding genome, like cis-regulatory elements (CREs). Non-coding single nucleotide polymorphisms (SNPs) are genomic variants that can change how DNA-binding regulatory proteins, like transcription factors (TFs), interact with the genome and regulate gene expression. NKX2-5 is a TF essential for proper heart development, and mutations affecting its function have been associated with congenital heart diseases (CHDs). However, establishing a causal mechanism between non-coding genomic variants and human disease remains challenging. To address this challenge, we identified 8475 SNPs predicted to alter NKX2-5 DNA-binding using a position weight matrix (PWM)-based predictive model. Five variants were prioritized for in vitro validation; four of them are associated with traits and diseases that impact cardiovascular health. The impact of these variants on NKX2-5 binding was evaluated with electrophoretic mobility shift assay (EMSA) using purified recombinant NKX2-5 homeodomain. Binding curves were constructed to determine changes in binding between variant and reference alleles. Variants rs7350789, rs7719885, rs747334, and rs3892630 increased binding affinity, whereas rs61216514 decreased binding by NKX2-5 when compared to the reference genome. Our findings suggest that differential TF-DNA binding affinity can be key in establishing a causal mechanism of pathogenic variants.
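How a position weight matrix (PWM) can flag a SNP as likely to change transcription factor binding is sketched below. This is a generic log-odds PWM comparison between reference and alternate alleles, not the authors' predictive model; the three-position motif, the uniform background, and the sequences are invented for illustration and do not represent the NKX2-5 motif.

```python
import math

BASES = "ACGT"


def log_odds_pwm(prob_matrix, background=0.25):
    """Convert per-position base probabilities to log2-odds scores.

    Assumes every probability is non-zero (e.g. pseudocounts already added).
    """
    return [{b: math.log2(col[b] / background) for b in BASES} for col in prob_matrix]


def window_score(pwm, seq):
    """Score one sequence window of the same length as the PWM."""
    return sum(col[base] for col, base in zip(pwm, seq))


def best_score(pwm, seq):
    """Best PWM score over all windows of the sequence."""
    k = len(pwm)
    return max(window_score(pwm, seq[i:i + k]) for i in range(len(seq) - k + 1))


def binding_delta(pwm, ref_seq, alt_seq):
    """Alt-minus-ref change in best score; negative suggests weaker binding."""
    return best_score(pwm, alt_seq) - best_score(pwm, ref_seq)


# Hypothetical 3-bp motif preferring A-G-T.
probs = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
pwm = log_odds_pwm(probs)

# The SNP changes the central G to C (made-up flanking sequence).
delta = binding_delta(pwm, ref_seq="CAGTC", alt_seq="CACTC")
print(round(delta, 3))
```

A score difference of this kind is only a prediction; the abstract's EMSA experiments are what measure the actual change in binding.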
8
TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions. Bioinformatics 2023; 39:btad060. [PMID: 36707993 PMCID: PMC9900211 DOI: 10.1093/bioinformatics/btad060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2022] [Revised: 01/20/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
MOTIVATION Small insertions and deletions (sindels) in the human genome have important implications for human disease. One important mechanism by which non-coding sindels (nc-sindels) affect human diseases and phenotypes is the regulation of gene expression. Nevertheless, current sequencing experiments may lack the statistical power and resolution to pinpoint functional sindels because of low minor allele frequency or small effect size. As an alternative strategy, a supervised machine learning method can identify the otherwise masked functional sindels by predicting their regulatory potential directly. However, computational methods for annotating and predicting regulatory sindels, especially in non-coding regions, are underdeveloped.
RESULTS By leveraging labeled nc-sindels identified by cis-expression quantitative trait loci analyses across 44 tissues in Genotype-Tissue Expression (GTEx), and a compilation of both generic functional annotations and large-scale epigenomic profiles, we develop TIssue-specific Variant Annotation for Non-coding indel (TIVAN-indel), which is a supervised computational framework for predicting non-coding regulatory sindels. We demonstrate that TIVAN-indel achieves the best prediction performance in both within-tissue prediction and cross-tissue prediction. As an independent evaluation, we train TIVAN-indel on the 'Whole Blood' tissue in GTEx and test the model using 15 immune cell types from an independent study named Database of Immune Cell Expression. Lastly, we perform an enrichment analysis for both true and predicted sindels in key regulatory regions such as chromatin interactions, open chromatin regions and histone modification sites, and find biologically meaningful enrichment patterns.
AVAILABILITY AND IMPLEMENTATION https://github.com/lichen-lab/TIVAN-indel.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
9
FAVOR: functional annotation of variants online resource and annotator for variation across the human genome. Nucleic Acids Res 2023; 51:D1300-D1311. [PMID: 36350676 PMCID: PMC9825437 DOI: 10.1093/nar/gkac966] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Received: 08/15/2022] [Revised: 09/25/2022] [Accepted: 10/14/2022] [Indexed: 11/11/2022] Open
Abstract
Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants. Existing functional annotation databases have limited scope to perform online queries and functionally annotate the genotype data of large biobank-scale WGS studies. We develop the Functional Annotation of Variants Online Resources (FAVOR) to meet these pressing needs. FAVOR provides a comprehensive multi-faceted variant functional annotation online portal that summarizes and visualizes findings of all possible nine billion single nucleotide variants (SNVs) across the genome. It allows for rapid variant-, gene- and region-level queries of variant functional annotations. FAVOR integrates variant functional information from multiple sources to describe the functional characteristics of variants and facilitates prioritizing plausible causal variants influencing human phenotypes. Furthermore, we provide a scalable annotation tool, FAVORannotator, to functionally annotate large-scale WGS studies and efficiently store the genotype and their variant functional annotation data in a single file using the annotated Genomic Data Structure (aGDS) format, making downstream analysis more convenient. FAVOR and FAVORannotator are available at https://favor.genohub.org.
10
Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies. Nat Genet 2023; 55:154-164. [PMID: 36564505 PMCID: PMC10084891 DOI: 10.1038/s41588-022-01225-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 08/16/2021] [Accepted: 10/13/2022] [Indexed: 12/24/2022]
Abstract
Meta-analysis of whole genome sequencing/whole exome sequencing (WGS/WES) studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Existing rare variant meta-analysis approaches are not scalable to biobank-scale WGS data. Here we present MetaSTAAR, a powerful and resource-efficient rare variant meta-analysis framework for large-scale WGS/WES studies. MetaSTAAR accounts for relatedness and population structure, can analyze both quantitative and dichotomous traits and boosts the power of rare variant tests by incorporating multiple variant functional annotations. Through meta-analysis of four lipid traits in 30,138 ancestrally diverse samples from 14 studies of the Trans Omics for Precision Medicine (TOPMed) Program, we show that MetaSTAAR performs rare variant meta-analysis at scale and produces results comparable to using pooled data. Additionally, we identified several conditionally significant rare variant associations with lipid traits. We further demonstrate that MetaSTAAR is scalable to biobank-scale cohorts through meta-analysis of TOPMed WGS data and UK Biobank WES data of ~200,000 samples.
11
A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nat Methods 2022; 19:1599-1611. [PMID: 36303018 PMCID: PMC10008172 DOI: 10.1038/s41592-022-01640-x] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 11/06/2021] [Accepted: 09/06/2022] [Indexed: 02/07/2023]
Abstract
Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.
Grants
- R01 DK078616 NIDDK NIH HHS
- U01 HG007417 NHGRI NIH HHS
- KL2 TR001100 NCATS NIH HHS
- R01 HL112064 NHLBI NIH HHS
- N01-HC-95160 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R35 HG010692 NHGRI NIH HHS
- U01-HL054472 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-HL142711 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-DK071891 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- F30 HL149180 NHLBI NIH HHS
- R01 NR019628 NINR NIH HHS
- R01 HL113323 NHLBI NIH HHS
- N01-HC-95166 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1RR033176 U.S. Department of Health & Human Services | NIH | National Center for Research Resources (NCRR)
- R01 HL132947 NHLBI NIH HHS
- P30 DK040561 NIDDK NIH HHS
- U01 HL137183 NHLBI NIH HHS
- R01-HL127564 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P30 CA016672 NCI NIH HHS
- R01-HL071051 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL104135 NHLBI NIH HHS
- T32 HL144442 NHLBI NIH HHS
- R35 CA197449 NCI NIH HHS
- P30 ES010126 NIEHS NIH HHS
- DP5 OD029586 NIH HHS
- R01-NS058700 U.S. Department of Health & Human Services | NIH | National Institute of Neurological Disorders and Stroke (NINDS)
- R01 HL123915 NHLBI NIH HHS
- R01 HL120393 NHLBI NIH HHS
- R01HL071259 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL046380 NHLBI NIH HHS
- R01HL071251, R01HL071258, R01HL071259 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U54 HG003067 NHGRI NIH HHS
- 75N92020D00003 NHLBI NIH HHS
- K01 AG059898 NIA NIH HHS
- U01 DK085524 NIDDK NIH HHS
- KL2 TR002542 NCATS NIH HHS
- R01-HL055673-18S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R03 HL141439 NHLBI NIH HHS
- HHSN268201500001I NHLBI NIH HHS
- R01-MH078143, R01-MH078111, R01-MH083824 U.S. Department of Health & Human Services | NIH | National Institute of Mental Health (NIMH)
- U01 DK062413 NIDDK NIH HHS
- R01 HL109946 NHLBI NIH HHS
- U01-HL054495 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- K01 HL136700 NHLBI NIH HHS
- U19 CA203654 NCI NIH HHS
- R01-DK078616 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- U01 HL080295 NHLBI NIH HHS
- NO1-HC-25195 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HG006703 NHGRI NIH HHS
- UL1-TR-001420 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- U01 HG012064 NHGRI NIH HHS
- R35-CA197449 U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)
- P30 ES005605 NIEHS NIH HHS
- R01 AR042742 NIAMS NIH HHS
- R21 HL140385 NHLBI NIH HHS
- HHSN268201800015I NHLBI NIH HHS
- U01 HL130114 NHLBI NIH HHS
- R01 HL117191 NHLBI NIH HHS
- R01 HG009974 NHGRI NIH HHS
- U01-HL054473 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 DK113003 NIDDK NIH HHS
- UL1RR033176 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL059367 NHLBI NIH HHS
- R24 AG047115 NIA NIH HHS
- U01-HL137181 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P01 HL107202 NHLBI NIH HHS
- NR0224103 U.S. Department of Health & Human Services | NIH | National Institute of Nursing Research (NINR)
- P50 HL118006 NHLBI NIH HHS
- U01-HL72518, HL087698, HL49762, HL59684, HL58625, HL071025, HL112064 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01 HL120393 NHLBI NIH HHS
- R01 DK117445 NIDDK NIH HHS
- R01-AG058921 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- R03-HL154284 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1-TR-001881 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- R01 AG058921 NIA NIH HHS
- R01 HL129132 NHLBI NIH HHS
- R01 HL113338 NHLBI NIH HHS
- HHSN268201800012I NHLBI NIH HHS
- R01 HL153805 NHLBI NIH HHS
- R01 DK072193 NIDDK NIH HHS
- R01 HL137922 NHLBI NIH HHS
- R01 AI079139 NIAID NIH HHS
- N01-HC-95164 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-DK085524 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- U19 AI111224 NIAID NIH HHS
- R35 HL135824 NHLBI NIH HHS
- 75N92019D00031 NHLBI NIH HHS
- R01 DK110113 NIDDK NIH HHS
- N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- N01-HC-95165 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL138737 NHLBI NIH HHS
- P30 DK079626 NIDDK NIH HHS
- R01 NS058700 NINDS NIH HHS
- R01 HL127564 NHLBI NIH HHS
- T32 HG000040 NHGRI NIH HHS
- DK063491 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- R01 HL141845 NHLBI NIH HHS
- R01 DK075787 NIDDK NIH HHS
- R01 AR072199 NIAMS NIH HHS
- R01 HL120854 NHLBI NIH HHS
- R01 HL163560 NHLBI NIH HHS
- R01HL071258 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-HG009088 U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
- R01 HL163972 NHLBI NIH HHS
- K23 HL123778 NHLBI NIH HHS
- U01 HL137181 NHLBI NIH HHS
- R01 MH078111 NIMH NIH HHS
- HHSN268201700005I NHLBI NIH HHS
- N01-HC-95159 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01-HL113323 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL141944 NHLBI NIH HHS
- R01 HL119443 NHLBI NIH HHS
- R01-HL071051, R01-HL071205, R01HL071250 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P60-AG10484 U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- 75N92020D00007 NHLBI NIH HHS
- UM1 AI068634 NIAID NIH HHS
- HHSN268201500003I NHLBI NIH HHS
- HHSN268201700004I NHLBI NIH HHS
- N01-HC-95163 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01-HL071205 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- F30 HL107066 NHLBI NIH HHS
- R01-HL153805 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL105756 NHLBI NIH HHS
- K01 HL125751 NHLBI NIH HHS
- R01 HL067348 NHLBI NIH HHS
- T32 HL007208 NHLBI NIH HHS
- R01 HL142711 NHLBI NIH HHS
- R35 HL135818 NHLBI NIH HHS
- R01-HL92301 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- T32 GM074897 NIGMS NIH HHS
- I01 BX005295 BLRD VA
- 75N92020D00001 NHLBI NIH HHS
- R01 HL113326 NHLBI NIH HHS
- R00 HL129045 NHLBI NIH HHS
- UL1-TR-000040 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- UL1-TR-001079 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- U01 HL072524 NHLBI NIH HHS
- R35-HL135818 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- K08 HL140203 NHLBI NIH HHS
- N01-HC-95162 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- K08 HL141601 NHLBI NIH HHS
- 75N92020D00005 NHLBI NIH HHS
- R01-DK117445 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- R01-AR48797 U.S. Department of Health & Human Services | NIH | National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS)
- R56 AG058543 NIA NIH HHS
- U19 AI077439 NIAID NIH HHS
- R01 HL142028 NHLBI NIH HHS
- 75N92020D00004 NHLBI NIH HHS
- HHSN268201800011I NHLBI NIH HHS
- R35 GM127131 NIGMS NIH HHS
- U01 HL137880 NHLBI NIH HHS
- R01 HG010869 NHGRI NIH HHS
- R01-HL133040 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- HHSN268201700003I NHLBI NIH HHS
- R01HL071250 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- N01-HC-95168 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL148239 NHLBI NIH HHS
- U01-HL137162 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 AI132476 NIAID NIH HHS
- T32 GM007205 NIGMS NIH HHS
- HHSN268201800010I NHLBI NIH HHS
- R01-HL092577-06S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UL1-TR-001881 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- R01-HL104135-04S1 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL132320 NHLBI NIH HHS
- U01 DK078616 NIDDK NIH HHS
- HHSN268201700001I NHLBI NIH HHS
- R01-HL141944 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01 HL137162 NHLBI NIH HHS
- R01 HG005701 NHGRI NIH HHS
- 75N92020D00001, 75N92020D00002, 75N92020D00003, 75N92020D00004 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 HL143221 NHLBI NIH HHS
- R01 HL142992 NHLBI NIH HHS
- K01 HL129039 NHLBI NIH HHS
- R01 HL133870 NHLBI NIH HHS
- R01 DA037904 NIDA NIH HHS
- R21 HL123677 NHLBI NIH HHS
- R01 DK071891 NIDDK NIH HHS
- HHSN268201800001I U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- 75N92020D00002 NHLBI NIH HHS
- K01 HL130609 NHLBI NIH HHS
- N01-HC-95167 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- T32 HL007374 NHLBI NIH HHS
- N01-HC-95169 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U01-DK078616 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- R01 AR063611 NIAMS NIH HHS
- KL2TR002490 U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- R03 HL154284 NHLBI NIH HHS
- M01-RR000052 U.S. Department of Health & Human Services | NIH | National Center for Research Resources (NCRR)
- 75N92020D00006 NHLBI NIH HHS
- S10 OD020069 NIH HHS
- R01 MD012765 NIMHD NIH HHS
- N01-HC-95161 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- HHSN268201700002I NHLBI NIH HHS
- R01 HL151855 NHLBI NIH HHS
- K23 HL138461 NHLBI NIH HHS
- U01 CA182913 NCI NIH HHS
- UG3 HL151865 NHLBI NIH HHS
- F32 HL150992 NHLBI NIH HHS
- R01-MD012765 U.S. Department of Health & Human Services | NIH | National Institute on Minority Health and Health Disparities (NIMHD)
- 75N92020D00005, 75N92020D00006, 75N92020D00007 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
- R01 MH101244 NIMH NIH HHS
- U01 HG009088 NHGRI NIH HHS
- N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P42 ES016454 NIEHS NIH HHS
- UM1 DK078616 NIDDK NIH HHS
- U01-HL054509 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R35-HL135824 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- M01-RR07122 U.S. Department of Health & Human Services | NIH | National Center for Research Resources (NCRR)
- U01 DK105561 NIDDK NIH HHS
- U01-HL072524 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P20 GM121334 NIGMS NIH HHS
- N01-HC-95167, N01-HC-95168, N01-HC-95169 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R01 HL131565 NHLBI NIH HHS
- R01HL071251 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R13 CA124365 NCI NIH HHS
- R01-HL045522 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- P01 HL132825 NHLBI NIH HHS
- R01 HL118267 NHLBI NIH HHS
- HHSN268201800013I NIMHD NIH HHS
- R01-HL67348 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- U54 GM115428 NIGMS NIH HHS
- R01 HL055673 NHLBI NIH HHS
- HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C, and HHSN268201600004C U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UM1-DK078616 U.S. Department of Health & Human Services | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (National Institute of Diabetes & Digestive & Kidney Diseases)
- R01 HL149683 NHLBI NIH HHS
- R01 HL092301 NHLBI NIH HHS
- P30 DK020595 NIDDK NIH HHS
- R01 HL149836 NHLBI NIH HHS
- K08 HL145095 NHLBI NIH HHS
- K01 HL135405 NHLBI NIH HHS
- R03 OD030608 NIH HHS
- HHSN268201800014I NHLBI NIH HHS
- R01-HL113338 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- F32-HL085989 U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- UM1 AI068636 NIAID NIH HHS
- R01 AG057381 NIA NIH HHS
- U19-CA203654 U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)
|
12
|
The Role of Long Noncoding RNAs on Male Infertility: A Systematic Review and In Silico Analysis. Biology 2022; 11:biology11101510. [PMID: 36290414 PMCID: PMC9598197 DOI: 10.3390/biology11101510] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/18/2022] [Revised: 10/08/2022] [Accepted: 10/13/2022] [Indexed: 11/16/2022]
Abstract
Male infertility is a complex disorder affecting many couples worldwide. Long noncoding RNAs (lncRNAs) regulate important cellular processes; however, a comprehensive understanding of their role in male infertility is limited. This systematic review investigates the differential expression of lncRNAs in male infertility and variations in lncRNA regions associated with it. The PRISMA guidelines were used to search PubMed and Web of Science (1 June 2022). Inclusion criteria were human participants, patients diagnosed with male infertility, and English language speakers. We also performed an in silico analysis investigating lncRNAs that are reported in many subtypes of male infertility. A total of 625 articles were found, and after the screening and eligibility stages, 20 studies were included in the final sample. Many lncRNAs are deregulated in male infertility, and interactions between lncRNAs and miRNAs play an important role. However, there is a knowledge gap regarding the impact of variants found in lncRNA regions. Furthermore, eight lncRNAs were identified as differentially expressed in many subtypes of male infertility. In the in silico analysis, gene ontology (GO) and KEGG enrichment analysis of the genes targeted by these lncRNAs revealed an association with bladder and prostate cancer. However, pathways generally involved in tumorigenesis and cancer development, such as p53 pathways, apoptosis, and cell death, were also enriched, indicating a link between cancer and male infertility. This evidence, however, is preliminary. Future research is needed to explore the exact mechanism of action of the identified lncRNAs and investigate the association between male infertility and cancer.
|
13
|
Natural and Experimental Rewiring of Gene Regulatory Regions. Annu Rev Genomics Hum Genet 2022; 23:73-97. [PMID: 35472292 DOI: 10.1146/annurev-genom-112921-010715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The successful development and ongoing functioning of complex organisms depend on the faithful execution of the genetic code. A critical step in this process is the correct spatial and temporal expression of genes. The highly orchestrated transcription of genes is controlled primarily by cis-regulatory elements: promoters, enhancers, and insulators. The medical importance of this key biological process can be seen by the frequency with which mutations and inherited variants that alter cis-regulatory elements lead to monogenic and complex diseases and cancer. Here, we provide an overview of the methods available to characterize and perturb gene regulatory circuits. We then highlight mechanisms through which regulatory rewiring contributes to disease, and conclude with a perspective on how our understanding of gene regulation can be used to improve human health.
|
14
|
3DFAACTS-SNP: using regulatory T cell-specific epigenomics data to uncover candidate mechanisms of type 1 diabetes (T1D) risk. Epigenetics Chromatin 2022; 15:24. [PMID: 35773720 PMCID: PMC9244893 DOI: 10.1186/s13072-022-00456-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Accepted: 06/06/2022] [Indexed: 11/26/2022] Open
Abstract
Background: Genome-wide association studies (GWAS) have enabled the discovery of single nucleotide polymorphisms (SNPs) that are significantly associated with many autoimmune diseases including type 1 diabetes (T1D). However, many of the identified variants lie in non-coding regions, limiting the identification of mechanisms that contribute to autoimmune disease progression. To address this problem, we developed a variant filtering workflow called 3DFAACTS-SNP to link genetic variants to target genes in a cell-specific manner. Here, we use 3DFAACTS-SNP to identify candidate SNPs and target genes associated with the loss of immune tolerance in regulatory T cells (Treg) in T1D. Results: Using 3DFAACTS-SNP, we identified 36 SNPs with plausible Treg-specific mechanisms of action from a list of 1228 previously fine-mapped variants. The integration of cell type-specific chromosome conformation capture data in 3DFAACTS-SNP identified 266 regulatory regions and 47 candidate target genes that interact with these variant-containing regions in Treg cells. We further demonstrated the utility of the workflow by applying it to three other autoimmune SNP datasets, identifying 16 Treg-centric candidate variants and 60 interacting genes. Finally, we demonstrate the broad utility of 3DFAACTS-SNP for functional annotation of all known common (> 10% allele frequency) variants from the Genome Aggregation Database (gnomAD). We identified 9376 candidate variants and 4968 candidate target genes, generating a list of potential sites for future T1D or other autoimmune disease research. Conclusions: We demonstrate that it is possible to further prioritise variants that contribute to T1D based on regulatory function, and illustrate the power of using cell type-specific multi-omics datasets to determine disease mechanisms. Our workflow can be customised to any cell type for which the individual datasets for functional annotation have been generated, giving broad applicability and utility.
Supplementary Information The online version contains supplementary material available at 10.1186/s13072-022-00456-5.
|
15
|
Whole exome sequencing identifies novel germline variants of SLC15A4 gene as potentially cancer predisposing in familial colorectal cancer. Mol Genet Genomics 2022; 297:965-979. [PMID: 35562597 PMCID: PMC9250485 DOI: 10.1007/s00438-022-01896-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2021] [Accepted: 04/02/2022] [Indexed: 11/25/2022]
Abstract
About 15% of colorectal cancer (CRC) patients have first-degree relatives affected by the same malignancy. However, for most families the cause of familial aggregation of CRC is unknown. To identify novel high-to-moderate-penetrance germline variants underlying CRC susceptibility, we performed whole exome sequencing (WES) on four CRC cases and two unaffected members of a Polish family without any mutation in known CRC predisposition genes. After WES, we used our in-house developed Familial Cancer Variant Prioritization Pipeline and identified two novel variants in the solute carrier family 15 member 4 (SLC15A4) gene. The heterozygous missense variant, p.Y444C, was predicted to affect the phylogenetically conserved PTR2/POT domain and to have a deleterious effect on the function of the encoded peptide/histidine transporter. The other variant was located in the upstream region of the same gene (GRCh37.p13, 12_129308531_C_T; 43 bp upstream of the transcription start site, ENST00000266771.5) and was annotated as affecting the promoter region of SLC15A4 as well as binding sites of 17 different transcription factors. Our findings of two distinct variants in the same gene may indicate a synergistic up-regulation of SLC15A4 as the underlying genetic cause and implicate this gene for the first time in the genetic inheritance of familial CRC.
|
16
|
A multi-dimensional integrative scoring framework for predicting functional variants in the human genome. Am J Hum Genet 2022; 109:446-456. [PMID: 35216679 PMCID: PMC8948160 DOI: 10.1016/j.ajhg.2022.01.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 05/04/2021] [Accepted: 01/26/2022] [Indexed: 12/26/2022] Open
Abstract
Attempts to identify and prioritize functional DNA elements in coding and non-coding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles can vary widely from one variant to another, making it challenging to summarize different aspects of variant function with a one-dimensional rating. Here we propose multi-dimensional annotation-class integrative estimation (MACIE), an unsupervised multivariate mixed-model framework capable of integrating annotations of diverse origin to assess multi-dimensional functional roles for both coding and non-coding variants. Unlike existing one-dimensional scoring methods, MACIE views variant functionality as a composite attribute encompassing multiple characteristics and estimates the joint posterior functional probabilities of each genomic position. This estimate offers more comprehensive and interpretable information in the presence of multiple aspects of functionality. Applied to a variety of independent coding and non-coding datasets, MACIE demonstrates powerful and robust performance in discriminating between functional and non-functional variants. We also show an application of MACIE to fine-mapping and heritability enrichment analysis by using the lipids GWAS summary statistics data from the European Network for Genetic and Genomic Epidemiology Consortium.
|
17
|
Abstract
Non-coding variants have long been recognized as important contributors to common disease risks, but with the expansion of clinical whole genome sequencing, examples of rare, high-impact non-coding variants are also accumulating. Despite recent advances in the study of regulatory elements and the availability of specialized data collections, the systematic annotation of non-coding variants from genome sequencing remains challenging. Here, we propose a new framework for the prioritization of non-coding regulatory variants that integrates information about regulatory regions with prediction scores and HPO-based prioritization. Firstly, we created a comprehensive collection of annotations for regulatory regions including a database of 2.4 million regulatory elements (GREEN-DB) annotated with controlled gene(s), tissue(s) and associated phenotype(s) where available. Secondly, we calculated a variation constraint metric and showed that constrained regulatory regions associate with disease-associated genes and essential genes from mouse knock-outs. Thirdly, we compared 19 non-coding impact prediction scores providing suggestions for variant prioritization. Finally, we developed a VCF annotation tool (GREEN-VARAN) that can integrate all these elements to annotate variants for their potential regulatory impact. In our evaluation, we show that GREEN-DB can capture previously published disease-associated non-coding variants as well as identify additional candidate disease genes in trio analyses.
|
18
|
Advancing drug discovery using the power of the human genome. J Pathol 2021; 254:418-429. [PMID: 33748968 PMCID: PMC8251523 DOI: 10.1002/path.5664] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/26/2020] [Revised: 03/11/2021] [Accepted: 03/16/2021] [Indexed: 12/31/2022]
Abstract
Human genetics plays an increasingly important role in drug development and population health. Here we review the history of human genetics in the context of accelerating the discovery of therapies, present examples of how human genetics evidence supports successful drug targets, and discuss how polygenic risk scores could be beneficial in various clinical settings. We highlight the value of direct-to-consumer platforms in the era of fast-paced big data biotechnology, and how diverse genetic and health data can benefit society. © 2021 23andMe, Inc. The Journal of Pathology published by John Wiley & Sons, Ltd. on behalf of The Pathological Society of Great Britain and Ireland.
|
19
|
Cell-type-specific effects of genetic variation on chromatin accessibility during human neuronal differentiation. Nat Neurosci 2021; 24:941-953. [PMID: 34017130 PMCID: PMC8254789 DOI: 10.1038/s41593-021-00858-w] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 02/04/2020] [Accepted: 04/15/2021] [Indexed: 02/03/2023]
Abstract
Common genetic risk for neuropsychiatric disorders is enriched in regulatory elements active during cortical neurogenesis. However, how these variants influence gene regulation remains poorly understood. To model the functional impact of common genetic variation on the noncoding genome during human cortical development, we performed the assay for transposase-accessible chromatin using sequencing (ATAC-seq) and analyzed chromatin accessibility quantitative trait loci (QTL) in cultured human neural progenitor cells and their differentiated neuronal progeny from 87 donors. We identified significant genetic effects on 988/1,839 neuron/progenitor regulatory elements, with highly cell-type and temporally specific effects. A subset (roughly 30%) of chromatin accessibility-QTL were also associated with changes in gene expression. Motif-disrupting alleles of transcriptional activators generally led to decreases in chromatin accessibility, whereas motif-disrupting alleles of repressors led to increases in chromatin accessibility. By integrating cell-type-specific chromatin accessibility-QTL and brain-relevant genome-wide association data, we were able to fine-map and identify regulatory mechanisms underlying noncoding neuropsychiatric disorder risk loci.
|
20
|
Abstract
Myxomatous mitral valve disease (MMVD) is the most common heart disease and cause of cardiac death in domestic dogs. MMVD is characterised by slow progressive myxomatous degeneration from the tips of the mitral valves onwards with subsequent mitral valve regurgitation, and left atrial and ventricular dilatation. Although the disease usually has a long asymptomatic period, in dogs with severe disease, mortality is typically secondary to left-sided congestive heart failure. Although it is not uncommon for dogs to survive long enough in the asymptomatic period to die from unrelated causes, a proportion of dogs rapidly advance into congestive heart failure. Heightened prevalence in certain breeds, such as the Cavalier King Charles Spaniel, has indicated that MMVD is under genetic influence. The genetic characterisation of the factors that underlie the difference in progression of disease is of strong interest to those concerned with dog longevity and welfare. Advanced genomic technologies have the potential to provide information that may impact treatment, prevalence, or severity of MMVD through the elucidation of pathogenic mechanisms and the detection of predisposing genetic loci of major effect. Here we briefly describe the clinical nature of the disorder and consider the physiological mechanisms that might impact its occurrence in the domestic dog. Using results from comparative genomics, we suggest possible genetic approaches for identifying genetic risk factors within breeds. The Cavalier King Charles Spaniel breed represents a robust resource for uncovering the genetic basis of MMVD.
|
21
|
Pleiotropy and Cross-Disorder Genetics Among Psychiatric Disorders. Biol Psychiatry 2021; 89:20-31. [PMID: 33131714 PMCID: PMC7898275 DOI: 10.1016/j.biopsych.2020.09.026] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Received: 05/14/2020] [Revised: 08/28/2020] [Accepted: 09/30/2020] [Indexed: 12/20/2022]
Abstract
Genome-wide analyses of common and rare genetic variations have documented the heritability of major psychiatric disorders, established their highly polygenic genetic architecture, and identified hundreds of contributing variants. In recent years, these studies have illuminated another key feature of the genetic basis of psychiatric disorders: the important role and pervasive nature of pleiotropy. It is now clear that a substantial fraction of genetic influences on psychopathology transcend clinical diagnostic boundaries. In this review, we summarize evidence in psychiatry for pleiotropy at multiple levels of analysis: from overall genome-wide correlation to biological pathways and down to the level of individual loci. We examine underlying mechanisms of observed pleiotropy, including genetic effects on neurodevelopment, diverse actions of regulatory elements, mediated effects, and spurious associations of genomic variation with multiple phenotypes. We conclude with an exploration of the implications of pleiotropy for understanding the genetic basis of psychiatric disorders, informing nosology, and advancing the aims of precision psychiatry and genomic medicine.
|
22
|
Involvement of lncRNAs in celiac disease pathogenesis. International Review of Cell and Molecular Biology 2020. [PMID: 33707056 DOI: 10.1016/bs.ircmb.2020.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2023]
Abstract
Celiac disease (CD) is an immune-mediated disease that develops in genetically susceptible individuals upon gluten exposure. Human Leukocyte Antigen (HLA) genes in the Major Histocompatibility Complex (MHC) have been reported to account for 40% of the genetic risk of developing CD. To better understand the genetic involvement in CD, high-throughput studies have been performed, revealing that many CD-associated variants are located in non-coding regions, which hinders the study of the functional implications of these single nucleotide polymorphisms (SNPs). In the last decade, long non-coding RNAs (lncRNAs) have been shown to be influenced by disease-associated SNPs and to drive many important mechanisms involved in the development of inflammatory diseases. Here we describe the lncRNAs identified and characterized in the context of celiac disease and highlight the importance of studying these molecules in inflammatory and autoimmune disorders.
|
23
|
Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat Genet 2020; 52:969-983. [PMID: 32839606 PMCID: PMC7483769 DOI: 10.1038/s41588-020-0676-4] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 07/17/2019] [Accepted: 07/02/2020] [Indexed: 12/13/2022]
Abstract
Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.
|
24
|
Accuracy of a machine learning muscle MRI-based tool for the diagnosis of muscular dystrophies. Neurology 2020; 94:e1094-e1102. [PMID: 32029545 DOI: 10.1212/wnl.0000000000009068] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 05/16/2019] [Accepted: 10/03/2019] [Indexed: 12/11/2022] Open
Abstract
OBJECTIVE: Genetic diagnosis of muscular dystrophies (MDs) has classically been guided by clinical presentation, muscle biopsy, and muscle MRI data. Muscle MRI suggests diagnosis based on the pattern of muscle fatty replacement. However, patterns overlap between different disorders and knowledge about disease-specific patterns is limited. Our aim was to develop a software-based tool that can recognize muscle MRI patterns and thus aid diagnosis of MDs. METHODS: We collected 976 pelvic and lower limb T1-weighted muscle MRIs from 10 different MDs. Fatty replacement was quantified using the Mercuri score, and files containing the numeric data were generated. Random forest supervised machine learning was applied to develop a model to identify the correct diagnosis. Two thousand different models were generated and the one with the highest accuracy was selected. A new set of 20 MRIs was used to test the accuracy of the model, and the results were compared with diagnoses proposed by 4 specialists in the field. RESULTS: A total of 976 lower limb MRIs from 10 different MDs were used. The best model obtained had 95.7% accuracy, with 92.1% sensitivity and 99.4% specificity. When compared with experts in the field, the diagnostic accuracy of the model was significantly higher in a new set of 20 MRIs. CONCLUSION: Machine learning can help doctors in the diagnosis of muscular dystrophies by analyzing patterns of muscle fatty replacement in muscle MRI. This tool can be helpful in daily clinical practice and in the interpretation of the results of next-generation sequencing tests. CLASSIFICATION OF EVIDENCE: This study provides Class II evidence that a muscle MRI-based artificial intelligence tool accurately diagnoses muscular dystrophies.
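For readers unfamiliar with the three performance figures quoted in the abstract above, the following minimal Python sketch shows how accuracy, sensitivity, and specificity are conventionally derived from confusion-matrix counts for one diagnosis against all others. The function name `diagnostic_metrics` and the counts are hypothetical illustrations; they are not taken from the study, whose evaluation procedure is not described in enough detail here to reproduce.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: cases with the diagnosis that the model did / did not flag.
    tn/fp: cases without the diagnosis that the model did not / did flag.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = diagnostic_metrics(tp=45, fp=2, tn=48, fn=5)
print(metrics)
```

With ten diagnostic classes, as in the abstract, such figures would typically be computed per class in a one-vs-rest fashion and then averaged; whether the authors did so is an assumption.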
|
25
|
GWAS4D: multidimensional analysis of context-specific regulatory variant for human complex diseases and traits. Nucleic Acids Res 2019; 46:W114-W120. [PMID: 29771388 PMCID: PMC6030885 DOI: 10.1093/nar/gky407] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 02/18/2018] [Accepted: 05/03/2018] [Indexed: 01/04/2023] Open
Abstract
Genome-wide association studies have identified thousands of susceptibility loci for many human complex traits, and yet for most of these associations the true causal variants remain unknown. Tissue/cell type-specific prediction and prioritization of non-coding regulatory variants will facilitate the identification of causal variants and underlying pathogenic mechanisms for particular complex diseases and traits. By leveraging recent large-scale functional genomics/epigenomics data, we develop an intuitive web server, GWAS4D (http://mulinlab.tmu.edu.cn/gwas4d or http://mulinlab.org/gwas4d), that systematically evaluates GWAS signals and identifies context-specific regulatory variants. The updated web server includes six major features: (i) updates the regulatory variant prioritization method with our new algorithm; (ii) incorporates 127 tissue/cell type-specific epigenomes; (iii) integrates motifs of 1480 transcriptional regulators from 13 public resources; (iv) uniformly processes Hi-C data and generates significant interactions at 5 kb resolution across 60 tissues/cell types; (v) adds comprehensive non-coding variant functional annotations; (vi) provides a highly interactive visualization function for SNP-target interaction. Using a GWAS fine-mapped set for 161 coronary artery disease risk loci, we demonstrate that GWAS4D is able to efficiently prioritize disease-causal regulatory variants.
|
26
|
Association between MBL2 haplotypes and dengue severity in children from Rio de Janeiro, Brazil. Mem Inst Oswaldo Cruz 2019; 114:e190004. [PMID: 31141020 PMCID: PMC6534340 DOI: 10.1590/0074-02760190004] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/08/2019] [Accepted: 04/11/2019] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Dengue is an arthropod-borne viral disease with a majority of asymptomatic individuals and clinical manifestations varying from mild fever to severe and potentially lethal forms. An increasing number of genetic studies have outlined the association between host genetic variations and dengue severity. Genes associated with viral recognition and entry, as well as those encoding mediators of the immune response against infection, are strong candidates for association studies. OBJECTIVES: The aim of this study was to investigate the association between MBL2, CLEC5A, ITGB3 and CCR5 genes and dengue severity in children. METHODS: A matched case-control study was conducted and 19 single nucleotide polymorphisms (SNPs) were investigated. FINDINGS: No associations were observed in single SNP analysis. However, when MBL2 SNPs were combined in haplotypes, the allele rs7095891G/rs1800450C/rs1800451C/rs4935047A/rs930509G/rs2120131G/rs2099902C was significantly associated with risk of severe dengue under α = 0.05 (aOR = 4.02; p = 0.02). A second haplotype carrying rs4935047G and rs7095891G alleles was also associated with risk (aOR = 1.91; p = 0.04). MAIN CONCLUSIONS: This is the first study to demonstrate the association between MBL2 haplotypes and dengue severity in Brazilians including adjustment for genetic ancestry. These results reinforce the role of mannose binding lectin in immune response to DENV.
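The abstract above reports adjusted odds ratios (aOR) from a matched case-control design, which would normally come from a regression model. As a simpler point of reference, the sketch below computes a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval from a 2x2 table. This is a swapped-in textbook calculation, not the authors' analysis; the function name `odds_ratio` and the counts are hypothetical.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple[float, float, float]:
    """Crude odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: cases carrying the haplotype      b: cases not carrying it
    c: controls carrying the haplotype   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (not data from the study).
estimate, lower, upper = odds_ratio(a=20, b=80, c=10, d=90)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An adjusted estimate, as reported in the abstract, would additionally account for matching and covariates such as genetic ancestry, so crude and adjusted values generally differ.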
|
27
|
SSS-test: a novel test for detecting positive selection on RNA secondary structure. BMC Bioinformatics 2019; 20:151. [PMID: 30898084 PMCID: PMC6429701 DOI: 10.1186/s12859-019-2711-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/19/2018] [Accepted: 03/03/2019] [Indexed: 12/23/2022] Open
Abstract
Background: Long non-coding RNAs (lncRNAs) play an important role in regulating gene expression and are thus important for determining phenotypes. Most attempts to measure selection in lncRNAs have focused on the primary sequence. The majority of small RNAs and at least some parts of lncRNAs must fold into specific structures to perform their biological function. Comprehensive assessments of selection acting on RNAs therefore must also encompass structure. Selection pressures acting on the structure of non-coding genes can be detected within multiple sequence alignments. Approaches of this type, however, have so far focused on negative selection. Thus, a computational method for identifying ncRNAs under positive selection is needed. Results: We introduce the SSS-test (test for Selection on Secondary Structure) to identify positive selection and thus adaptive evolution. Benchmarks with biological as well as synthetic controls yield coherent signals for both negative and positive selection, demonstrating the functionality of the test. A survey of a lncRNA collection comprising 15,443 families resulted in 110 candidates that appear to be under positive selection in human. In 26 lncRNAs that have been associated with psychiatric disorders, we identified local structures that show signs of positive selection in the human lineage. Conclusions: It is feasible to assay positive selection acting on RNA secondary structures on a genome-wide scale. The detection of human-specific positive selection in lncRNAs associated with cognitive disorder provides a set of candidate genes for further experimental testing and may provide insights into the evolution of cognitive abilities in humans. Availability: The SSS-test and related software are available at: https://github.com/waltercostamb/SSS-test. The databases used in this work are available at: http://www.bioinf.uni-leipzig.de/Software/SSS-test/.
|
28
|
Estimating contribution of rare non-coding variants to neuropsychiatric disorders. Psychiatry Clin Neurosci 2019; 73:2-10. [PMID: 30293238 DOI: 10.1111/pcn.12774] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Accepted: 07/24/2018] [Indexed: 12/21/2022]
Abstract
Owing to recent advances in DNA sequencing technology, a number of large-scale comprehensive analyses of genetic variation in protein-coding regions (i.e., whole-exome sequencing studies) have been conducted for neuropsychiatric and neurodevelopmental disorders, such as autism spectrum disorders, intellectual disability, and schizophrenia. These studies, especially those focusing on de novo (newly arising) mutations and extremely rare variants, have successfully identified previously unrecognized disease genes and mutations with large effect sizes and have deepened our understanding of the biology of neuropsychiatric diseases. As sequencing costs continue to fall, the target of sequencing studies is now expanding from the exome to the whole human genome. Several pioneering works have provided important insights into the contribution of rare non-coding variants to neuropsychiatric diseases. At the same time, these studies highlight the need for larger sample sizes and for improved annotation of non-coding regulatory variants. This review summarizes key findings from recent studies as well as likely future directions.
|
29
|
Functional implication of celiac disease associated lncRNAs in disease pathogenesis. Comput Biol Med 2018; 102:369-375. [DOI: 10.1016/j.compbiomed.2018.08.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/17/2018] [Revised: 08/09/2018] [Accepted: 08/09/2018] [Indexed: 12/11/2022]
|