1
Prokop JW, Jdanov V, Savage L, Morris M, Lamb N, VanSickle E, Stenger CL, Rajasekaran S, Bupp CP. Computational and Experimental Analysis of Genetic Variants. Compr Physiol 2022; 12:3303-3336. [PMID: 35578967 DOI: 10.1002/cphy.c210012] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/10/2022]
Abstract
Genomics has grown exponentially over the last decade. Common variants are associated with physiological changes through statistical strategies such as Genome-Wide Association Studies (GWAS) and quantitative trait loci (QTL). Rare variants are associated with diseases through extensive filtering tools, including population genomics and trio-based sequencing (parents and probands). However, the genomic associations require follow-up analyses to narrow causal variants, identify genes that are influenced, and determine the physiological changes. Large quantities of data exist that can be used to connect variants to gene changes, cell types, protein pathways, clinical phenotypes, and animal models that establish physiological genomics. This data combined with bioinformatics including evolutionary analysis, structural insights, and gene regulation can yield testable hypotheses for mechanisms of genomic variants. Molecular biology, biochemistry, cell culture, CRISPR editing, and animal models can test the hypotheses to give molecular variant mechanisms. Variant characterizations can be a significant component of educating future professionals in undergraduate, graduate, or medical training programs through teaching the basic concepts and terminology of genetics while learning independent research hypothesis design. This article goes through the computational and experimental analysis strategies of variant characterization and provides examples of these tools applied in publications. © 2022 American Physiological Society. Compr Physiol 12:3303-3336, 2022.
Affiliation(s)
- Jeremy W Prokop
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, Michigan, USA
  - Department of Pharmacology and Toxicology, Michigan State University, East Lansing, Michigan, USA
- Vladislav Jdanov
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, Michigan, USA
- Lane Savage
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, Michigan, USA
- Michele Morris
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, USA
- Neil Lamb
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, USA
- Cynthia L Stenger
  - Department of Mathematics, University of North Alabama, Florence, Alabama, USA
- Surender Rajasekaran
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, Michigan, USA
  - Pediatric Intensive Care Unit, Helen DeVos Children's Hospital, Grand Rapids, Michigan, USA
  - Office of Research, Spectrum Health, Grand Rapids, Michigan, USA
- Caleb P Bupp
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, Michigan, USA
  - Medical Genetics, Spectrum Health, Grand Rapids, Michigan, USA
2
Nonsense-mediated mRNA decay immunity can help identify human polycistronic transcripts. PLoS One 2014; 9:e91535. [PMID: 24621851 PMCID: PMC3951408 DOI: 10.1371/journal.pone.0091535] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/21/2012] [Accepted: 02/13/2014] [Indexed: 11/19/2022]
Abstract
Eukaryotic polycistronic transcription units are rare and only a few examples are known, mostly being the outcome of serendipitous discovery. We claim that nonsense-mediated mRNA decay (NMD) immune structure is a common characteristic of polycistronic transcripts, and that this immunity is an emergent property derived from all functional CDSs. The human RefSeq transcriptome was computationally screened for transcripts capable of eliciting NMD, and which contain an additional ORF(s) potentially capable of rescuing the transcript from NMD. Transcripts were further analyzed implementing domain-based strategies in order to estimate the potential of the candidate ORF to encode a functional protein. Consequently, we predict the existence of forty-nine novel polycistronic transcripts. Experimental verification was carried out utilizing two different types of analyses. First, five Gene Expression Omnibus (GEO) datasets from published NMD-inhibition studies were used, aiming to explore whether a given mRNA is indeed insensitive to NMD. All known bicistronic transcripts and eleven out of the twelve predicted genes that were analyzed displayed NMD insensitivity using various NMD inhibitors. For three genes, a mixed expression pattern was observed presenting both NMD sensitivity and insensitivity in different cell types. Second, we used published global translation initiation sequencing data from HEK293 cells to verify the existence of translation initiation sites in our predicted polycistronic genes. In five of our genes, the predicted rescuing uORFs are indeed identified as translation initiation sites, and in two additional genes, one of two predicted rescuing uORFs is verified. These results validate our computational analysis and reinforce the possibility that NMD-immune architecture is a parameter by which polycistronic genes can be identified. Moreover, we present evidence for NMD-mediated regulation controlling the production of one or more proteins encoded in the polycistronic transcript.
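The screening step described in this abstract depends on locating candidate ORFs within a transcript. A minimal Python sketch of ORF enumeration follows. It is not the authors' pipeline: the function name `find_orfs`, the `min_codons` threshold, and the toy sequence are illustrative assumptions, and a real screen would also need exon-junction positions to judge NMD sensitivity.

```python
# Illustrative sketch only: enumerate ATG-to-stop ORFs in the three forward frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs, 0-based and end-exclusive, stop codon included.

    min_codons counts codons from the ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF in this frame
    return sorted(orfs)

# Toy transcript (hypothetical) carrying two short ORFs in different frames.
toy = "CCATGAAATAGGATGCCCTGA"
print(find_orfs(toy))
```

A transcript that yields more than one ORF under such a scan is only a candidate; the abstract's domain-based filtering and expression evidence are what support a functional call.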
3
Gardner LB. Nonsense-mediated RNA decay regulation by cellular stress: implications for tumorigenesis. Mol Cancer Res 2010; 8:295-308. [PMID: 20179151 DOI: 10.1158/1541-7786.mcr-09-0502] [Citation(s) in RCA: 113] [Impact Index Per Article: 8.1] [Indexed: 12/18/2022]
Abstract
Nonsense-mediated RNA decay (NMD) has long been viewed as an important constitutive mechanism to rapidly eliminate mutated mRNAs. More recently, it has been appreciated that NMD also degrades multiple nonmutated transcripts and that NMD can be regulated by a wide variety of cellular stresses. Many of the stresses that inhibit NMD, including cellular hypoxia and amino acid deprivation, are experienced in cells exposed to hostile microenvironments, and several NMD-targeted transcripts promote cellular adaptation in response to these environmental stresses. Because adaptation to the microenvironment is crucial in tumorigenesis, and because NMD targets many mutated tumor suppressor gene transcripts, the regulation of NMD may have particularly important implications in cancer. This review briefly outlines the mechanisms by which transcripts are identified and targeted by NMD and reviews the evidence showing that NMD is a regulated process that can dynamically alter gene expression. Although much of the focus in NMD research has been in identifying the proteins that play a role in NMD and identifying NMD-targeted transcripts, recent data about the potential functional significance of NMD regulation, including the stabilization of alternatively spliced mRNA isoforms, the validation of mRNAs as bona fide NMD targets, and the role of NMD in tumorigenesis, are explored.
Affiliation(s)
- Lawrence B Gardner
  - Division of Hematology, Department of Medicine, New York University School of Medicine, New York, NY 10016, USA
4
Ren J, Jiang C, Gao X, Liu Z, Yuan Z, Jin C, Wen L, Zhang Z, Xue Y, Yao X. PhosSNP for systematic analysis of genetic polymorphisms that influence protein phosphorylation. Mol Cell Proteomics 2009; 9:623-34. [PMID: 19995808 DOI: 10.1074/mcp.m900273-mcp200] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.3] [Indexed: 12/12/2022]
Abstract
We are entering the era of personalized genomics as breakthroughs in sequencing technology have made it possible to sequence or genotype an individual person in an efficient and accurate manner. Preliminary results from HapMap and other similar projects have revealed the existence of tremendous genetic variations among world populations and among individuals. It is important to delineate the functional implication of such variations, i.e. whether they affect the stability and biochemical properties of proteins. It is also generally believed that genetic variation is the main cause of different susceptibility to certain diseases or different responses to therapeutic treatments. Understanding genetic variation in the context of human diseases thus holds the promise for "personalized medicine." In this work, we carried out a genome-wide analysis of single nucleotide polymorphisms (SNPs) that could potentially influence protein phosphorylation characteristics in humans. Here, we defined a phosphorylation-related SNP (phosSNP) as a non-synonymous SNP (nsSNP) that affects the protein phosphorylation status. Using an in-house developed kinase-specific phosphorylation site predictor (GPS 2.0), we computationally detected that approximately 70% of the reported nsSNPs are potential phosSNPs. More interestingly, approximately 74.6% of these potential phosSNPs might also induce changes in protein kinase types in adjacent phosphorylation sites rather than creating or removing phosphorylation sites directly. Taken together, we proposed that a large proportion of the nsSNPs might affect protein phosphorylation characteristics and play important roles in rewiring biological pathways. Finally, all phosSNPs were integrated into the PhosSNP 1.0 database, which was implemented in JAVA 1.5 (J2SE 5.0). The PhosSNP 1.0 database is freely available for academic researchers.
Affiliation(s)
- Jian Ren
  - Department of Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
5
Abstract
Rare, high-penetrance genetic variations account for a small portion of genetic cancer syndromes. In contrast, most cancers develop from a combination of minor genetic influences and environmental factors. There are numerous publications on cancer susceptibility. In contrast, genetic studies in treatment response and outcome analyses are a rapidly emerging field. Approaches used in disease susceptibility can be adapted for genetic outcome studies. In this review, we summarize the current knowledge on how candidate genes and genetic variations are selected to evaluate gene-outcome, gene-prognosis, and gene-treatment response relationships as applicable to the practicing oncologist.
Affiliation(s)
- Sevtap Savas
  - Division of Applied Molecular Oncology, Department of Medical Biophysics, Ontario Cancer Institute, Toronto, Canada
6
Yngvadottir B, Xue Y, Searle S, Hunt S, Delgado M, Morrison J, Whittaker P, Deloukas P, Tyler-Smith C. A genome-wide survey of the prevalence and evolutionary forces acting on human nonsense SNPs. Am J Hum Genet 2009; 84:224-34. [PMID: 19200524 DOI: 10.1016/j.ajhg.2009.01.008] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Received: 11/28/2008] [Revised: 01/10/2009] [Accepted: 01/14/2009] [Indexed: 12/17/2022]
Abstract
Nonsense SNPs introduce premature termination codons into genes and can result in the absence of a gene product or in a truncated and potentially harmful protein, so they are often considered disadvantageous and are associated with disease susceptibility. As such, we might expect the disrupted allele to be rare and, in healthy people, observed only in a heterozygous state. However, some, like those in the CASP12 and ACTN3 genes, are known to be present at high frequencies and to occur often in a homozygous state and seem to have been advantageous in recent human evolution. To evaluate the selective forces acting on nonsense SNPs as a class, we have carried out a large-scale experimental survey of nonsense SNPs in the human genome by genotyping 805 of them (plus control synonymous SNPs) in 1,151 individuals from 56 worldwide populations. We identified 169 genes containing nonsense SNPs that were variable in our samples, of which 99 were found with both copies inactivated in at least one individual. We found that the sampled humans differ on average by 24 genes (out of about 20,000) because of these nonsense SNPs alone. As might be expected, nonsense SNPs as a class were found to be slightly disadvantageous over evolutionary timescales, but a few nevertheless showed signs of being possibly advantageous, as indicated by unusually high levels of population differentiation, long haplotypes, and/or high frequencies of derived alleles. This study underlines the extent of variation in gene content within humans and emphasizes the importance of understanding this type of variation.
7
Yamaguchi-Kabata Y, Shimada MK, Hayakawa Y, Minoshima S, Chakraborty R, Gojobori T, Imanishi T. Distribution and effects of nonsense polymorphisms in human genes. PLoS One 2008; 3:e3393. [PMID: 18852891 PMCID: PMC2561068 DOI: 10.1371/journal.pone.0003393] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 03/06/2008] [Accepted: 09/03/2008] [Indexed: 11/20/2022]
Abstract
Background: A great amount of data has been accumulated on genetic variations in the human genome, but we still do not know much about how the genetic variations affect gene function. In particular, little is known about the distribution of nonsense polymorphisms in human genes despite their drastic effects on gene products. Methodology/Principal Findings: To detect polymorphisms affecting gene function, we analyzed all publicly available polymorphisms in a database for single nucleotide polymorphisms (dbSNP build 125) located in the exons of 36,712 known and predicted protein-coding genes that were defined in an annotation project of all human genes and transcripts (H-InvDB ver3.8). We found a total of 252,555 single nucleotide polymorphisms (SNPs) and 8,479 insertions and deletions in the representative transcripts in these genes. The SNPs located in ORFs include 40,484 synonymous and 53,754 nonsynonymous SNPs, and 1,258 SNPs that were predicted to be nonsense SNPs or read-through SNPs. We estimated the density of nonsense SNPs to be 0.85×10⁻³ per site, which is lower than that of nonsynonymous SNPs (2.1×10⁻³ per site). On average, nonsense SNPs were located 250 codons upstream of the original termination codon, with the substitution occurring most frequently at the first codon position. Of the nonsense SNPs, 581 were predicted to cause nonsense-mediated decay (NMD) of transcripts that would prevent translation. We found that nonsense SNPs causing NMD were more common in genes involving kinase activity and transport. The remaining 602 nonsense SNPs are predicted to produce truncated polypeptides, with an average truncation of 75 amino acids. In addition, 110 read-through SNPs at termination codons were detected. Conclusion/Significance: Our comprehensive exploration of nonsense polymorphisms showed that nonsense SNPs exist at a lower density than nonsynonymous SNPs, suggesting that nonsense mutations have more severe effects than amino acid changes. The correspondence of nonsense SNPs to known pathological variants suggests that phenotypic effects of nonsense SNPs have been reported for only a small fraction of nonsense SNPs, and that nonsense SNPs causing NMD are more likely to be involved in phenotypic variations. These nonsense SNPs may include pathological variants that have not yet been reported. These data are available from Transcript View of H-InvDB and VarySysDB (http://h-invitational.jp/varygene/).
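The four SNP classes counted in this abstract (synonymous, nonsynonymous, nonsense, read-through) can be told apart by translating the reference and variant codons. The Python sketch below shows that comparison under stated assumptions: it uses the standard genetic code, takes a single codon rather than a full transcript, and the name `classify_snp` is hypothetical rather than anything from H-InvDB or dbSNP.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_snp(ref_codon, codon_pos, alt_base):
    """Classify a single-base substitution inside one codon (codon_pos is 0, 1, or 2)."""
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:codon_pos] + alt_base.upper() + ref_codon[codon_pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"        # includes stop-to-stop changes such as TAA > TAG
    if alt_aa == "*":
        return "nonsense"          # sense codon becomes a premature stop
    if ref_aa == "*":
        return "read-through"      # stop codon becomes a sense codon
    return "nonsynonymous"

print(classify_snp("CAG", 0, "T"))  # CAG (Gln) > TAG (stop)
```

Deciding whether a nonsense SNP triggers NMD or yields a truncated polypeptide additionally requires the position of the new stop relative to the transcript's exon structure, which this sketch does not model.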
Affiliation(s)
- Yumi Yamaguchi-Kabata
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Makoto K. Shimada
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
  - Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Yosuke Hayakawa
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
  - Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Ranajit Chakraborty
  - Center for Genome Information, University of Cincinnati, Cincinnati, Ohio, United States of America
- Takashi Gojobori
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
  - Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima, Shizuoka, Japan
- Tadashi Imanishi
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
8
Kim BC, Kim WY, Park D, Chung WH, Shin KS, Bhak J. SNP@Promoter: a database of human SNPs (single nucleotide polymorphisms) within the putative promoter regions. BMC Bioinformatics 2008; 9 Suppl 1:S2. [PMID: 18315851 PMCID: PMC2259403 DOI: 10.1186/1471-2105-9-s1-s2] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Analysis of single nucleotide polymorphisms (SNPs) is becoming a key research area in genomics. Many functional analyses of SNPs have been carried out for coding regions and splicing sites that can alter proteins and mRNA splicing. However, SNPs in non-coding regulatory regions can also influence important biological regulation. Presently, there are few databases for SNPs in non-coding regulatory regions. DESCRIPTION: We identified 488,452 human SNPs in the putative promoter regions that extended from the +5000 bp to -500 bp region of the transcription start sites. Some SNPs occurring in transcription factor (TF) binding sites were also predicted (47,832 SNPs; 9.8%). The result is stored in a database: SNP@Promoter. Users can search the SNP@Promoter database using three entries: 1) by SNP identifier (rs number from dbSNP), 2) by gene (gene name, gene symbol, refSeq ID), and 3) by disease term. The SNP@Promoter database provides extensive genetic information and graphical views of queried terms. CONCLUSION: We present the SNP@Promoter database. It was created in order to predict functional SNPs in putative promoter regions and predicted transcription factor binding sites. SNP@Promoter will help researchers to identify functional SNPs in non-coding regions.
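Assigning a SNP to a putative promoter region comes down to a strand-aware distance check against the transcription start site (TSS). The Python sketch below illustrates that check; it is not the SNP@Promoter implementation. The abstract states the bounds as +5000 bp to -500 bp without defining the sign convention, so the window sizes are passed explicitly, and the function name `in_promoter_window` and the example coordinates are assumptions.

```python
def in_promoter_window(snp_pos, tss, strand, upstream, downstream):
    """Return True if snp_pos lies within [TSS - upstream, TSS + downstream].

    Distances are measured in the direction of transcription, so for a
    minus-strand gene "upstream" means larger genomic coordinates.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Signed offset from the TSS along the direction of transcription.
    offset = snp_pos - tss if strand == "+" else tss - snp_pos
    return -upstream <= offset <= downstream

# Assumed reading for illustration: 5000 bp upstream and 500 bp downstream of the TSS.
print(in_promoter_window(9_600, 10_000, "+", upstream=5000, downstream=500))
```

Keeping the two window sizes as parameters makes it easy to rerun the assignment under the opposite reading of the published bounds.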
Affiliation(s)
- Byoung-Chul Kim
  - Korean BioInformation Center (KOBIC), KRIBB, Daejeon 305-806, Korea
9
Wilming LG, Gilbert JGR, Howe K, Trevanion S, Hubbard T, Harrow JL. The vertebrate genome annotation (Vega) database. Nucleic Acids Res 2007; 36:D753-60. [PMID: 18003653 PMCID: PMC2238886 DOI: 10.1093/nar/gkm987] [Citation(s) in RCA: 183] [Impact Index Per Article: 10.8] [Indexed: 11/13/2022]
Abstract
The Vertebrate Genome Annotation (Vega) database (http://vega.sanger.ac.uk) was first made public in 2004 and has been designed to view manual annotation of human, mouse and zebrafish genomic sequences produced at the Wellcome Trust Sanger Institute. Since its initial release, the number of human annotated loci has more than doubled to close to 33 000 and now contains comprehensive annotation on 20 of the 24 human chromosomes, four whole mouse chromosomes and around 40% of the zebrafish Danio rerio genome. In addition, we offer manual annotation of a number of haplotype regions in mouse and human and regions of comparative interest in pig and dog that are unique to Vega.
Affiliation(s)
- L G Wilming
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK