401
Bayesian systems-based genetic association analysis with effect strength estimation and omic wide interpretation: a case study in rheumatoid arthritis. Methods in Molecular Biology (Clifton, N.J.) 2014; 1142:143-76. [PMID: 24706282 DOI: 10.1007/978-1-4939-0404-4_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
Rich dependency structures are often formed in genetic association studies between the phenotypic, clinical, and environmental descriptors. These descriptors may not be standardized, and may encompass various disease definitions and clinical endpoints which are only weakly influenced by various (e.g., genetic) factors. Such loosely defined complex intermediate clinical phenotypes are typically used in follow-up candidate gene association studies, e.g., after genome-wide analysis, to deepen the understanding of the associations and to estimate effect strength. This chapter discusses a solid methodology, which is useful in such a scenario, by using probabilistic graphical models, namely, Bayesian networks in the Bayesian statistical framework. This method offers systematically scalable, comprehensive hierarchical hypotheses about multivariate relevance. We discuss its workflow: from data engineering to semantic publication of the results. We overview the construction, visualization, and interpretation of complex hypotheses related to the structural analysis of relevance. Furthermore, we illustrate the use of a dependency model-based relevance measure, which takes into account the structural properties of the model, for quantifying the effect strength. Finally, we discuss the "interpretational" or translational challenge of a genetic association study, with a focus on the fusion of heterogeneous omic knowledge to reintegrate the results into a genome-wide context.
402
Ioannidis JPA. To replicate or not to replicate: the case of pharmacogenetic studies: Have pharmacogenomics failed, or do they just need larger-scale evidence and more replication? Circ Cardiovasc Genet 2014; 6:413-8; discussion 418. [PMID: 23963161 DOI: 10.1161/circgenetics.113.000106] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 12/16/2022]
Affiliation(s)
- John P A Ioannidis
- Stanford Prevention Research Center, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA.
403
Gomez-Cabrero D, Abugessaisa I, Maier D, Teschendorff A, Merkenschlager M, Gisel A, Ballestar E, Bongcam-Rudloff E, Conesa A, Tegnér J. Data integration in the era of omics: current and future challenges. BMC Systems Biology 2014; 8 Suppl 2:I1. [PMID: 25032990 PMCID: PMC4101704 DOI: 10.1186/1752-0509-8-s2-i1] [Citation(s) in RCA: 209] [Impact Index Per Article: 19.0] [Indexed: 12/13/2022]
Abstract
To integrate heterogeneous and large omics data constitutes not only a conceptual challenge but a practical hurdle in the daily analysis of omics data. With the rise of novel omics technologies and through large-scale consortia projects, biological systems are being further investigated at an unprecedented scale generating heterogeneous and often large data sets. These data-sets encourage researchers to develop novel data integration methodologies. In this introduction we review the definition and characterize current efforts on data integration in the life sciences. We have used a web-survey to assess current research projects on data-integration to tap into the views, needs and challenges as currently perceived by parts of the research community.
404
Pei YF, Zhang L, Papasian CJ, Wang YP, Deng HW. On individual genome-wide association studies and their meta-analysis. Hum Genet 2014; 133:265-279. [PMID: 24114349 PMCID: PMC4127980 DOI: 10.1007/s00439-013-1366-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 02/27/2013] [Accepted: 09/22/2013] [Indexed: 01/07/2023]
Abstract
Individual genome-wide association (GWA) studies and their meta-analyses represent two approaches for identifying genetic loci associated with complex diseases/traits. Inconsistent findings and non-replicability between individual GWA studies and meta-analyses are commonly observed, which raises the critical question of how to interpret their respective results properly. In this study, we performed a series of simulation studies to investigate and compare the statistical properties of the two approaches. Our results show that (1) as expected, meta-analysis with a larger sample size is more powerful than individual GWA studies under the ideal setting of population homogeneity among individual studies; (2) under the realistic setting of heterogeneity among individual studies, detection of heterogeneity is usually difficult and meta-analysis (even with the random-effects model) may introduce elevated false-positive and/or false-negative rates; (3) despite its relatively small sample size, a well-designed individual GWA study has the capacity to identify novel loci for complex traits; (4) replicability between meta-analysis and independent individual studies or between independent meta-analyses is limited, and thus inconsistent findings are not unexpected.
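The contrast between fixed-effect and random-effects pooling mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' simulation code: it assumes inverse-variance weighting and the DerSimonian-Laird estimator of between-study variance, and the per-study effect sizes are made-up values used only for illustration.

```python
import math

def fixed_effect(betas, ses):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

def random_effects(betas, ses):
    """DerSimonian-Laird random-effects estimate, SE, and tau^2.

    Requires at least two studies (otherwise the scaling term c is zero).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled_fe, _ = fixed_effect(betas, ses)
    # Cochran's Q: weighted dispersion of study effects around the pooled value.
    q = sum(w * (b - pooled_fe) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * b for w, b in zip(re_weights, betas)) / sum(re_weights)
    return pooled, math.sqrt(1.0 / sum(re_weights)), tau2

# Hypothetical per-study log odds ratios and standard errors (illustrative only).
betas = [0.10, 0.30, -0.05, 0.25]
ses = [0.08, 0.10, 0.09, 0.12]
fe_beta, fe_se = fixed_effect(betas, ses)
re_beta, re_se, tau2 = random_effects(betas, ses)
```

Because the estimated between-study variance is never negative, the random-effects standard error in this sketch is always at least as large as the fixed-effect one, which is why heterogeneity across studies widens the pooled confidence interval.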
Affiliation(s)
- Yu-Fang Pei
- Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, 200093, People's Republic of China,
405
Abstract
Genome-wide association studies (GWAS) are designed to identify the portion of single-nucleotide polymorphisms (SNPs) in genome sequences associated with a complex trait. Strategies based on the gene list enrichment concept are currently applied for the functional analysis of GWAS, according to which a significant overrepresentation of candidate genes associated with a biological pathway is used as a proxy to infer overrepresentation of candidate SNPs in the pathway. Here we show that such inference is not always valid and introduce the program SNP2GO, which implements a new method to properly test for the overrepresentation of candidate SNPs in biological pathways.
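Testing overrepresentation at the SNP level rather than the gene level can be sketched with a one-sided hypergeometric test. This is a generic illustration, not SNP2GO's implementation; the function name and the counts in the usage line are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(total, in_pathway, candidates, overlap):
    """P(X >= overlap) when `candidates` SNPs are drawn without replacement
    from `total` SNPs, of which `in_pathway` fall inside the pathway."""
    max_overlap = min(in_pathway, candidates)
    denom = comb(total, candidates)
    numer = sum(
        # math.comb returns 0 when k > n, so impossible splits contribute nothing.
        comb(in_pathway, k) * comb(total - in_pathway, candidates - k)
        for k in range(overlap, max_overlap + 1)
    )
    return numer / denom

# Hypothetical counts: 10,000 tested SNPs, 200 in the pathway,
# 100 candidate SNPs, 8 of which fall in the pathway.
p_value = hypergeom_upper_tail(total=10000, in_pathway=200, candidates=100, overlap=8)
```

Counting SNPs directly, instead of the genes that contain them, avoids treating a gene with many SNPs and a gene with few as equally likely to be hit by chance.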
406
Sayols-Baixeras S, Lluís-Ganella C, Lucas G, Elosua R. Pathogenesis of coronary artery disease: focus on genetic risk factors and identification of genetic variants. Appl Clin Genet 2014; 7:15-32. [PMID: 24520200 PMCID: PMC3920464 DOI: 10.2147/tacg.s35301] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Indexed: 12/11/2022]
Abstract
Coronary artery disease (CAD) is the leading cause of death and disability worldwide, and its prevalence is expected to increase in the coming years. CAD events are caused by the interplay of genetic and environmental factors, the effects of which are mainly mediated through cardiovascular risk factors. The techniques used to study the genetic basis of these diseases have evolved from linkage studies to candidate gene studies and genome-wide association studies. Linkage studies have been able to identify genetic variants associated with monogenic diseases, whereas genome-wide association studies have been more successful in determining genetic variants associated with complex diseases. Currently, genome-wide association studies have identified approximately 40 loci that explain 6% of the heritability of CAD. The application of this knowledge to clinical practice is challenging, but can be achieved using various strategies, such as genetic variants to identify new therapeutic targets, personal genetic information to improve disease risk prediction, and pharmacogenomics. The main aim of this narrative review is to provide a general overview of our current understanding of the genetics of coronary artery disease and its potential clinical utility.
Affiliation(s)
- Sergi Sayols-Baixeras
- Cardiovascular epidemiology and Genetics Research Group, Institut Hospital del Mar d’Investigacions Mèdiques, Barcelona, Spain
- Carla Lluís-Ganella
- Cardiovascular epidemiology and Genetics Research Group, Institut Hospital del Mar d’Investigacions Mèdiques, Barcelona, Spain
- Gavin Lucas
- Cardiovascular epidemiology and Genetics Research Group, Institut Hospital del Mar d’Investigacions Mèdiques, Barcelona, Spain
- Roberto Elosua
- Cardiovascular epidemiology and Genetics Research Group, Institut Hospital del Mar d’Investigacions Mèdiques, Barcelona, Spain
407
Hu Y, Yu CY, Wang JL, Guan J, Chen HY, Fang JY. MicroRNA sequence polymorphisms and the risk of different types of cancer. Sci Rep 2014; 4:3648. [PMID: 24413317 PMCID: PMC5379157 DOI: 10.1038/srep03648] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 07/18/2013] [Accepted: 12/10/2013] [Indexed: 01/05/2023]
Abstract
MicroRNAs (miRNAs) participate in diverse biological pathways and may act as oncogenes or tumor suppressors. Single nucleotide polymorphisms (SNPs) in miRNAs (MirSNPs) might promote carcinogenesis by affecting miRNA function and/or maturation; however, the reported associations between MirSNPs and cancer risk remain inconsistent. Here, we investigated the association between nine common MirSNPs and cancer risk using data from large-scale case-control studies. Eight precursor-miRNA (pre-miRNA) SNPs (rs2043556/miR-605, rs3746444/miR-499a/b, rs4919510/miR-608, rs2910164/miR-146a, rs11614913/miR-196a2, rs895819/miR-27a, rs2292832/miR-149, rs6505162/miR-423) and one primary-miRNA (pri-miRNA) SNP (rs1834306/miR-100) were analyzed in 16399 cases and 21779 controls from seven published studies in eight common cancers. Analysis with a novel statistic, cross-phenotype meta-analysis (CPMA), of the association of MirSNPs with multiple phenotypes indicated that rs2910164 C (P = 1.11E-03), rs2043556 C (P = 0.0165), rs6505162 C (P = 2.05E-03) and rs895819 (P = 0.0284) were associated with a significant overall risk of cancer. In conclusion, MirSNPs might affect an individual's susceptibility to various types of cancer.
Affiliation(s)
- Ye Hu
- [1] Division of Gastroenterology and Hepatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai Institution of Digestive Disease; Key Laboratory of Gastroenterology & Hepatology, Ministry of Health; State Key Laboratory of Oncogene and Related Genes. 145 Middle Shandong Rd, Shanghai 200001, China [2]
- Chen-Yang Yu
- [1] Division of Gastroenterology and Hepatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai Institution of Digestive Disease; Key Laboratory of Gastroenterology & Hepatology, Ministry of Health; State Key Laboratory of Oncogene and Related Genes. 145 Middle Shandong Rd, Shanghai 200001, China [2]
- Ji-Lin Wang
- Division of Gastroenterology and Hepatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai Institution of Digestive Disease; Key Laboratory of Gastroenterology & Hepatology, Ministry of Health; State Key Laboratory of Oncogene and Related Genes. 145 Middle Shandong Rd, Shanghai 200001, China
- Jian Guan
- Department of Otolaryngology, The Affiliated Sixth People's Hospital, Otolaryngology Institute of Shanghai Jiao Tong University, Shanghai 200233, China
- Hao-Yan Chen
- Division of Gastroenterology and Hepatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai Institution of Digestive Disease; Key Laboratory of Gastroenterology & Hepatology, Ministry of Health; State Key Laboratory of Oncogene and Related Genes. 145 Middle Shandong Rd, Shanghai 200001, China
- Jing-Yuan Fang
- Division of Gastroenterology and Hepatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai Institution of Digestive Disease; Key Laboratory of Gastroenterology & Hepatology, Ministry of Health; State Key Laboratory of Oncogene and Related Genes. 145 Middle Shandong Rd, Shanghai 200001, China
408
Abstract
Genome-wide association studies (GWAS) are a powerful tool for investigators to examine the human genome to detect genetic risk factors, reveal the genetic architecture of diseases and open up new opportunities for treatment and prevention. However, despite their successes, GWAS have not been able to identify genetic loci that are effective classifiers of disease, limiting their value for genetic testing. This chapter highlights the challenges that lie ahead for GWAS in better identifying disease risk predictors, and how we may address them. In this regard, we review basic concepts regarding GWAS, the technologies used for capturing genetic variation, the missing heritability problem, the need for efficient study design (especially for replication efforts), ways to reduce the bias introduced into a dataset, and how to utilize newly available resources such as electronic medical records. We also look to what lies ahead for the field, and the approaches that can be taken to realize the full potential of GWAS.
Affiliation(s)
- Rishika De
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA
409
Gupta PK, Kulwal PL, Jaiswal V. Association mapping in crop plants: opportunities and challenges. Advances in Genetics 2014; 85:109-47. [PMID: 24880734 DOI: 10.1016/b978-0-12-800271-1.00002-0] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Indexed: 01/22/2023]
Abstract
The research area of association mapping (AM) is currently receiving major attention for genetic studies of quantitative traits in all major crops. However, the level of success and utility of AM achieved for crop improvement is not comparable to that in the area of human health care for diagnosis of complex human diseases. These AM studies in plants, as in humans, became possible due to the availability of DNA-based molecular markers and a variety of sophisticated statistical tools that are evolving on a regular basis. In this chapter, we first briefly review the significance of a variety of populations that are used in AM studies, then briefly describe the molecular markers and high-throughput genotyping strategies, and finally describe the approaches used for AM studies. The major part of the chapter is, however, devoted to analysis of reasons why the results of AM have been underutilized in plant breeding. We also examine the opportunities available and challenges faced while using AM for crop improvement programs. This includes a detailed discussion of the issues that have plagued AM studies, and the solutions that have become available to deal with these issues, so that in future, the results of AM studies may prove increasingly fruitful for crop improvement programs.
Affiliation(s)
- Pushpendra K Gupta
- Department of Genetics and Plant Breeding, Ch. Charan Singh University, Meerut, UP, India
- Pawan L Kulwal
- State Level Biotechnology Centre, Mahatma Phule Agricultural University, Rahuri, MS, India
- Vandana Jaiswal
- Department of Genetics and Plant Breeding, Ch. Charan Singh University, Meerut, UP, India
410
Liang H, Deng X, Deng H. Response to “A closer look at FBXO41 as a Parkinson's disease risk factor”. Parkinsonism Relat Disord 2013; 19:1177-8. [DOI: 10.1016/j.parkreldis.2013.08.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2013] [Accepted: 08/23/2013] [Indexed: 11/26/2022]
411
Lin YC, Yu CS, Lin YJ. Enabling large-scale biomedical analysis in the cloud. BioMed Research International 2013; 2013:185679. [PMID: 24288665 PMCID: PMC3832998 DOI: 10.1155/2013/185679] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 08/06/2013] [Accepted: 09/22/2013] [Indexed: 01/02/2023]
Abstract
Recent progress in high-throughput instrumentation has led to an astonishing growth in both the volume and complexity of biomedical data collected from various sources. Data at this planetary scale pose serious challenges to storage and computing technologies. Cloud computing offers a way to address these challenges because it provides both storage and high-performance computing for large-scale data. This work briefly introduces data-intensive computing systems and summarizes existing cloud-based resources in bioinformatics. These developments and applications should help biomedical researchers make vast amounts of diverse data meaningful and usable.
Affiliation(s)
- Ying-Chih Lin
- Master's Program in Biomedical Informatics and Biomedical Engineering, Feng Chia University, No. 100 Wenhwa Road, Seatwen, Taichung 40724, Taiwan
- Department of Applied Mathematics, Feng Chia University, No. 100 Wenhwa Road, Seatwen, Taichung 40724, Taiwan
- Chin-Sheng Yu
- Master's Program in Biomedical Informatics and Biomedical Engineering, Feng Chia University, No. 100 Wenhwa Road, Seatwen, Taichung 40724, Taiwan
- Department of Information Engineering and Computer Science, Feng Chia University, No. 100 Wenhwa Road, Seatwen, Taichung 40724, Taiwan
- Yen-Jen Lin
- Department of Computer Science, National Tsing Hua University, No. 101, Section 2, Kuang-Fu Road, Hsinchu 30013, Taiwan
412
Yu X, Liu J, Zhu H, Xia Y, Gao L, Li Z, Jia N, Shen W, Yang Y, Niu W. An interactive association of advanced glycation end-product receptor gene four common polymorphisms with coronary artery disease in northeastern Han Chinese. PLoS One 2013; 8:e76966. [PMID: 24155913 PMCID: PMC3796558 DOI: 10.1371/journal.pone.0076966] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 06/19/2013] [Accepted: 08/26/2013] [Indexed: 01/23/2023]
Abstract
Background: Growing evidence indicates that advanced glycation end-product receptor (RAGE) might play a contributory role in the pathogenesis of coronary artery disease (CAD). To shed some light from a genetic perspective, we sought to investigate the interactive association of four common polymorphisms in the RAGE gene (rs1800625 or T-429C, rs1800624 or T-374A, rs2070600 or Gly82Ser, and rs184003 or G1704A) with the risk of developing CAD in a large northeastern Han Chinese population.
Methodology/Principal Findings: This was a hospital-based case-control study incorporating 1142 patients diagnosed with CAD and 1106 age- and gender-matched controls. All individuals were angiographically confirmed. Risk estimates were expressed as odds ratio (OR) and 95% confidence interval (CI). Overall there were significant differences in the genotype and allele distributions of rs1800625 and rs184003, even after the Bonferroni correction. Logistic regression analyses indicated that rs1800625 and rs184003 were associated with significant risk of CAD under both additive (OR = 1.20 and 1.23; 95% CI: 1.06–1.37 and 1.06–1.42; P = 0.006 and 0.008) and recessive (OR = 1.75 and 2.39; 95% CI: 1.28–2.40 and 1.47–3.87; P<0.001 and <0.001) models after adjusting for confounders. In haplotype analyses, haplotypes C-T-G-G and T-A-G-T (alleles in order of rs1800625, rs1800624, rs2070600 and rs184003), overrepresented in patients, were associated with 52% (95% CI: 1.19–1.87; P = 0.0052) and 63% (95% CI: 1.14–2.34; P = 0.0075) significant increases in adjusted risk for CAD. Further interactive analyses identified an overall best multifactor dimensionality reduction (MDR) model including rs1800625 and rs184003. This model had a maximal testing accuracy of 0.6856 and a cross-validation consistency of 10 out of 10 (P = 0.0016). The validity of this model was substantiated by classical logistic regression analysis.
Conclusions: Our findings provided strong evidence for the potentially contributory roles of multiple RAGE genetic polymorphisms, especially in the context of locus-to-locus interaction, in the pathogenesis of CAD among northeastern Han Chinese.
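The risk estimates quoted in this abstract are odds ratios with 95% confidence intervals, judged against a Bonferroni-corrected threshold. The sketch below shows how those quantities are commonly computed from a 2x2 table. It is an unadjusted Wald-type calculation under stated assumptions, not the covariate-adjusted logistic regression the authors report, and the function names are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls (all must be > 0).
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests

# Four polymorphisms tested at a family-wise alpha of 0.05 (assumed here).
threshold = bonferroni_threshold(0.05, 4)
```

With four tests and a family-wise alpha of 0.05, the per-test threshold is 0.0125. Whether the study applied exactly this alpha and test count is an assumption of the sketch.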
Affiliation(s)
- Xiaohong Yu
- Department of Cardiology, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
- Jun Liu
- Department of Cardiology, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
- Hao Zhu
- Department of Cardiology, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
- Yunlong Xia
- Department of Cardiology, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
- Lianjun Gao
- Department of Cardiology, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
- Zhen Li
- Department of Cardiology, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
- Nan Jia
- Department of Hypertension, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Weifeng Shen
- Department of Cardiology, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Yanzong Yang
- Department of Cardiology, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
- * E-mail: (YY); (WN)
- Wenquan Niu
- State Key Laboratory of Medical Genomics, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Institute of Hypertension, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- * E-mail: (YY); (WN)
413
Imputation-based genomic coverage assessments of current human genotyping arrays. G3: Genes, Genomes, Genetics 2013; 3:1795-807. [PMID: 23979933 PMCID: PMC3789804 DOI: 10.1534/g3.113.007161] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Abstract
Microarray single-nucleotide polymorphism genotyping, combined with imputation of untyped variants, has been widely adopted as an efficient means to interrogate variation across the human genome. “Genomic coverage” is the total proportion of genomic variation captured by an array, either by direct observation or through an indirect means such as linkage disequilibrium or imputation. We have performed imputation-based genomic coverage assessments of eight current genotyping arrays that assay from ~0.3 to ~5 million variants. Coverage was determined separately in each of the four continental ancestry groups in the 1000 Genomes Project phase 1 release. We used the subset of 1000 Genomes variants present on each array to impute the remaining variants and assessed coverage based on correlation between imputed and observed allelic dosages. More than 75% of common variants (minor allele frequency > 0.05) are covered by all arrays in all groups except for African ancestry, and up to ~90% in all ancestries for the highest density arrays. In contrast, less than 40% of less common variants (0.01 < minor allele frequency < 0.05) are covered by low density arrays in all ancestries and 50–80% in high density arrays, depending on ancestry. We also calculated genome-wide power to detect variant-trait association in a case-control design, across varying sample sizes, effect sizes, and minor allele frequency ranges, and compare these array-based power estimates with a hypothetical array that would type all variants in 1000 Genomes. These imputation-based genomic coverage and power analyses are intended as a practical guide to researchers planning genetic studies.
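The coverage measure described here, based on the correlation between imputed and observed allelic dosages, can be sketched as the fraction of variants whose squared correlation reaches a cutoff. This is an illustration only: the cutoff of 0.8 is an assumption rather than a value stated in the abstract, and the dosage vectors are invented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic variant is treated as not covered
    return sxy / math.sqrt(sxx * syy)

def coverage(imputed, observed, r2_threshold=0.8):
    """Fraction of variants whose imputed-vs-observed r^2 meets the threshold.

    `imputed` and `observed` are lists of per-variant dosage vectors,
    one value per sample. The default threshold is an assumption.
    """
    covered = sum(
        1 for imp, obs in zip(imputed, observed)
        if pearson_r(imp, obs) ** 2 >= r2_threshold
    )
    return covered / len(observed)

# Two hypothetical variants typed in four samples (dosages range from 0 to 2).
imputed = [[0.1, 0.9, 1.9, 1.0], [1.0, 1.0, 1.1, 0.9]]
observed = [[0, 1, 2, 1], [0, 2, 1, 1]]
fraction_covered = coverage(imputed, observed)
```

In this toy input the first variant is imputed well and the second is not, so half of the variants count as covered.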