1
Fishman CE, Mohebnasab M, van Setten J, Zanoni F, Wang C, Deaglio S, Amoroso A, Callans L, van Gelder T, Lee S, Kiryluk K, Lanktree MB, Keating BJ. Genome-Wide Study Updates in the International Genetics and Translational Research in Transplantation Network (iGeneTRAiN). Front Genet 2019; 10:1084. [PMID: 31803228 PMCID: PMC6873800 DOI: 10.3389/fgene.2019.01084] [Received: 03/26/2019] [Accepted: 10/09/2019] [Indexed: 12/14/2022]
Abstract
The prevalence of end-stage renal disease (ESRD) and the number of kidney transplants performed continue to rise every year, straining the procurement of deceased and living kidney allografts and health systems. Genome-wide genotyping and sequencing of diseased populations have uncovered genetic contributors in substantial proportions of ESRD patients. A number of these discoveries are beginning to be utilized in risk stratification and clinical management of patients. Specifically, genetics can provide insight into the primary cause of chronic kidney disease (CKD), the risk of progression to ESRD, and post-transplant outcomes, including various forms of allograft rejection. The International Genetics & Translational Research in Transplantation Network (iGeneTRAiN) is a multi-site consortium that encompasses >45 genetic studies with genome-wide genotyping from over 51,000 transplant samples, including genome-wide data from >30 kidney transplant cohorts (n = 28,015). iGeneTRAiN is statistically powered to capture both rare and common genetic contributions to ESRD and post-transplant outcomes. The primary cause of ESRD is often difficult to ascertain, especially where formal biopsy diagnosis is not performed, and is unavailable in ∼2% to >20% of kidney transplant recipients in iGeneTRAiN studies. We review our current copy number variant (CNV) screening approaches from genome-wide genotyping datasets in iGeneTRAiN, in an effort to discover and validate genetic contributors to CKD and ESRD. Greater aggregation and analyses of well-phenotyped patients with genome-wide datasets will undoubtedly yield insights into the underlying pathophysiological mechanisms of CKD, leading the way to improved diagnostic precision in nephrology.
Affiliation(s)
- Claire E Fishman
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
- Maede Mohebnasab
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
- Jessica van Setten
  - Department of Cardiology, University Medical Center Utrecht, University of Utrecht, Utrecht, Netherlands
- Francesca Zanoni
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, United States
- Chen Wang
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, United States
- Silvia Deaglio
  - Immunogenetics and Biology of Transplantation, Città della Salute e della Scienza, University Hospital of Turin, Turin, Italy
  - Medical Genetics, Department of Medical Sciences, University of Turin, Turin, Italy
- Antonio Amoroso
  - Immunogenetics and Biology of Transplantation, Città della Salute e della Scienza, University Hospital of Turin, Turin, Italy
  - Medical Genetics, Department of Medical Sciences, University of Turin, Turin, Italy
- Lauren Callans
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
- Teun van Gelder
  - Department of Hospital Pharmacy, University Medical Center Rotterdam, Rotterdam, Netherlands
- Sangho Lee
  - Department of Nephrology, Kyung Hee University, Seoul, South Korea
- Krzysztof Kiryluk
  - Department of Medicine, Division of Nephrology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, United States
- Matthew B Lanktree
  - Division of Nephrology, St. Joseph's Healthcare Hamilton, McMaster University, Hamilton, ON, Canada
- Brendan J Keating
  - Division of Transplantation, Department of Surgery, University of Pennsylvania, Philadelphia, PA, United States
2
Chakraborty C, George Priya Doss C, Zhu H, Agoramoorthy G. Rising Strengths Hong Kong SAR in Bioinformatics. Interdiscip Sci 2016; 9:224-236. [PMID: 26961385 PMCID: PMC7091071 DOI: 10.1007/s12539-016-0147-x] [Received: 09/13/2015] [Revised: 12/07/2015] [Accepted: 01/08/2016] [Indexed: 12/18/2022]
Abstract
Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we consider the educational measures, landmark research activities, the achievements of bioinformatics companies, and the role of the Hong Kong government in establishing bioinformatics as a strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.
Affiliation(s)
- Chiranjib Chakraborty
  - Department of Bio-informatics, School of Computer and Information Sciences, Galgotias University, Greater Noida, UP, 201306, India
  - Department of Computer Sciences, Hong Kong Baptist University, Kowloon Tong, Hong Kong
- C George Priya Doss
  - Medical Biotechnology Division, School of BioSciences and Technology, VIT University, Vellore, TN, 632014, India
- Hailong Zhu
  - Department of Computer Sciences, Hong Kong Baptist University, Kowloon Tong, Hong Kong
3
Baron RV, Conley YP, Gorin MB, Weeks DE. dbVOR: a database system for importing pedigree, phenotype and genotype data and exporting selected subsets. BMC Bioinformatics 2015; 16:91. [PMID: 25887129 PMCID: PMC4407391 DOI: 10.1186/s12859-015-0505-4] [Received: 06/14/2014] [Accepted: 02/20/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND: When studying the genetics of a human trait, we typically have to manage both genome-wide and targeted genotype data. There can be overlap of both people and markers from different genotyping experiments; the overlap can introduce several kinds of problems. Most times the overlapping genotypes are the same, but sometimes they are different. Occasionally, the lab will return genotypes using a different allele labeling scheme (for example 1/2 vs A/C). Sometimes, the genotype for a person/marker index is unreliable or missing. Further, over time some markers are merged and bad samples are re-run under a different sample name. We need a consistent picture of the subset of data we have chosen to work with even though there might be conflicting measurements from multiple data sources. RESULTS: We have developed the dbVOR database, which is designed to hold data efficiently for both genome-wide and targeted experiments. The data are indexed for fast retrieval by person and marker. In addition, we store pedigree and phenotype data for our subjects. The dbVOR database allows us to select subsets of the data by several different criteria and to merge their results into a coherent and consistent whole. Data may be filtered by: family, person, trait value, markers, chromosomes, and chromosome ranges. The results can be presented in columnar, Mega2, or PLINK format. CONCLUSIONS: dbVOR serves our needs well. It is freely available from https://watson.hgen.pitt.edu/register. Documentation for dbVOR can be found at https://watson.hgen.pitt.edu/register/docs/dbvor.html.
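The reconciliation problem described in this abstract (the same person/marker pair genotyped more than once, sometimes under a different allele labeling scheme) can be illustrated with a minimal Python sketch. This is not dbVOR's code or schema: the `ALLELE_MAP` table, the `normalize` and `merge_calls` helpers, and the rule of flagging any disagreement as a conflict are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Genotype = Tuple[str, ...]

# Hypothetical mapping from a lab's numeric allele codes to nucleotide labels
# for a single marker. Real mappings depend on the array annotation.
ALLELE_MAP: Dict[str, str] = {"1": "A", "2": "C"}


def normalize(call: Optional[str], allele_map: Dict[str, str]) -> Optional[Genotype]:
    """Turn a call such as '1/2' or 'C/A' into a sorted allele tuple.

    Returns None for a missing call (None, empty string, or '0/0').
    """
    if call is None or call in ("", "0/0"):
        return None
    alleles = [allele_map.get(a, a) for a in call.split("/")]
    return tuple(sorted(alleles))


def merge_calls(calls: List[Optional[Genotype]]) -> Tuple[Optional[Genotype], str]:
    """Merge repeated calls for one person/marker pair.

    Status is 'ok' when all non-missing calls agree, 'missing' when there is
    no usable call, and 'conflict' when sources disagree (assumed policy).
    """
    observed = {c for c in calls if c is not None}
    if not observed:
        return None, "missing"
    if len(observed) == 1:
        return observed.pop(), "ok"
    return None, "conflict"


# Two experiments report the same genotype under different labeling schemes.
raw = ["1/2", "C/A", None]
merged = merge_calls([normalize(r, ALLELE_MAP) for r in raw])
print(merged)
```

Sorting the alleles makes the comparison independent of allele order, so `A/C` and `C/A` count as the same call once labels are harmonized.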
Affiliation(s)
- Robert V Baron
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, 15261, USA
- Yvette P Conley
  - Department of Health Promotion and Development, School of Nursing, University of Pittsburgh, Pittsburgh, Pennsylvania, 15261, USA
- Michael B Gorin
  - Department of Ophthalmology, David Geffen School of Medicine, Stein Eye Institute, University of California Los Angeles, Los Angeles, California, 90095, USA
- Daniel E Weeks
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, 15261, USA
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, 15261, USA
4
Guzzi PH, Cannataro M. Micro-Analyzer: automatic preprocessing of Affymetrix microarray data. Comput Methods Programs Biomed 2013; 111:402-409. [PMID: 23731720 DOI: 10.1016/j.cmpb.2013.04.006] [Received: 01/04/2013] [Revised: 03/14/2013] [Accepted: 04/11/2013] [Indexed: 06/02/2023]
Abstract
A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate), regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open-source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays, which μ-CS did not support. Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power Tools), (ii) the manual loading of preprocessing libraries, and (iii) the management of intermediate files, such as results and metadata. Micro-Analyzer users can directly manage Affymetrix binary data without worrying about locating and invoking the proper preprocessing tools and chip-specific libraries. Moreover, users of the Micro-Analyzer tool can load the preprocessed data directly into the well-known TM4 platform, thereby also extending the TM4 capabilities. Consequently, Micro-Analyzer offers the following advantages: (i) it reduces possible errors in the preprocessing and further analysis phases, e.g. due to the incorrect choice of parameters or due to the use of old libraries, (ii) it enables the combined and centralized preprocessing of different arrays, (iii) it may enhance the quality of further analysis by storing the workflow, i.e. information about the preprocessing steps, and (iv) it is freely available as a standalone application at the project web site http://sourceforge.net/projects/microanalyzer/.
Affiliation(s)
- Pietro Hiram Guzzi
  - Bioinformatics Laboratory, Department of Surgical and Medical Sciences, Magna Graecia University, Catanzaro, Italy
5
Go MJ, Hwang JY, Kim DJ, Lee HJ, Jang HB, Park KH, Song J, Lee JY. Effect of genetic predisposition on blood lipid traits using cumulative risk assessment in the Korean population. Genomics Inform 2012; 10:99-105. [PMID: 23105936 PMCID: PMC3480684 DOI: 10.5808/gi.2012.10.2.99] [Received: 04/10/2012] [Revised: 05/18/2012] [Accepted: 05/22/2012] [Indexed: 12/27/2022]
Abstract
Dyslipidemia, mainly characterized by high triglyceride (TG) and low high-density lipoprotein cholesterol (HDL-C) levels, is an important etiological factor in the development of cardiovascular disease (CVD). Considering the relationship between childhood obesity and CVD risk, it would be worthwhile to evaluate whether lipid-related variants previously identified in adult subjects are associated with lipid variations in a childhood obesity study (n = 482). In an association analysis for 16 genome-wide association study (GWAS)-based candidate loci, we confirmed significant associations of a genetic predisposition to lipoprotein concentrations in a childhood obesity study. Using the two loci (rs10503669 at LPL and rs16940212 at LIPC) that showed the strongest association with blood levels of TG and HDL-C, we calculated a genetic risk score (GRS), representing the sum of the risk alleles. An increasing GRS was significantly associated with decreased HDL-C (effect size, -1.13 ± 0.07) compared to single nucleotide polymorphism combinations without two risk variants. In addition, a positive correlation was observed between allelic dosage score and risk allele (rs10503669 at LPL) on high TG levels (effect size, 10.89 ± 0.84). These two loci yielded consistent associations in our previous meta-analysis. Taken together, our findings demonstrate that the genetic architecture of circulating lipid levels (TG and HDL-C) overlaps to a large extent in childhood and adulthood. Post-GWAS functional characterization of these variants is further required to elucidate their pathophysiological roles and biological mechanisms.
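The abstract defines the genetic risk score as the sum of risk alleles across the two loci. A minimal Python sketch of that unweighted sum follows; the function name, the dictionary layout, and the example genotype are illustrative assumptions, not data or code from the study.

```python
from typing import Dict


def genetic_risk_score(risk_allele_counts: Dict[str, int]) -> int:
    """Unweighted GRS: total number of risk alleles carried across loci.

    Each value is the count of risk alleles at one SNP (0, 1, or 2).
    """
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1, or 2")
    return sum(risk_allele_counts.values())


# Hypothetical individual: one risk allele at the LPL SNP, two at the LIPC SNP.
person = {"rs10503669": 1, "rs16940212": 2}
score = genetic_risk_score(person)
print(score)
```

With two biallelic loci the score ranges from 0 to 4. A weighted score, in which each count is multiplied by a per-SNP effect size, is a common alternative, but the abstract describes only the plain sum.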
Affiliation(s)
- Min Jin Go
  - Center for Genome Science, National Institute of Health, Osong Health Technology Administration Complex, Cheongwon 363-951, Korea
6
Abstract
MOTIVATION: High-throughput single nucleotide polymorphism (SNP) arrays have become the standard platform for linkage and association analyses. The high SNP density of these platforms allows high-resolution identification of ancestral recombination events even for distant relatives many generations apart. However, such inference is sensitive to marker mistyping and current error detection methods rely on the genotyping of additional close relatives. Genotyping algorithms provide a confidence score for each marker call that is currently not integrated in existing methods. There is a need for a model that incorporates this prior information within the standard identical by descent (IBD) and association analyses. RESULTS: We propose a novel model that incorporates marker confidence scores within IBD methods based on the Lander-Green Hidden Markov Model. The novel parameter of this model is the joint distribution of confidence scores and error status per array. We estimate this probability distribution by applying a modified expectation-maximization (EM) procedure on data from nuclear families genotyped with Affymetrix 250K SNP arrays. The converged tables from two different genotyping algorithms are shown for a wide range of error rates. We demonstrate the efficacy of our method in refining the detection of IBD signals using nuclear pedigrees and distant relatives. AVAILABILITY: Plinke, a new version of Plink with an extended pairwise IBD inference model allowing per marker error probabilities, is freely available at: http://bioinfo.bgu.ac.il/bsu/software/plinke. CONTACT: obirk@bgu.ac.il; markusb@bgu.ac.il SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Barak Markus
  - The Morris Kahn Laboratory of Human Genetics, Department of Virology and Developmental Genetics, NIBN, Ben Gurion University, Israel
7
Garcia-Barceló MM, Yeung MY, Miao XP, Tang CSM, Cheng G, So MT, Ngan ESW, Lui VCH, Chen Y, Liu XL, Hui KJWS, Li L, Guo WH, Sun XB, Tou JF, Chan KW, Wu XZ, Song YQ, Chan D, Cheung K, Chung PHY, Wong KKY, Sham PC, Cherny SS, Tam PKH. Genome-wide association study identifies a susceptibility locus for biliary atresia on 10q24.2. Hum Mol Genet 2010; 19:2917-25. [PMID: 20460270 DOI: 10.1093/hmg/ddq196] [Indexed: 12/21/2022]
Abstract
Biliary atresia (BA) is characterized by the progressive fibrosclerosing obliteration of the extrahepatic biliary system during the first few weeks of life. Despite early diagnosis and prompt surgical intervention, the disease progresses to cirrhosis in many patients. The current theory for the pathogenesis of BA proposes that during the perinatal period, a still unknown exogenous factor meets the innate immune system of a genetically predisposed individual and induces an uncontrollable and potentially self-limiting immune response, which becomes manifest in liver fibrosis and atresia of the extrahepatic bile ducts. Genetic factors that could account for the disease, let alone for its high incidence in Chinese, remain to be investigated. To identify BA susceptibility loci, we carried out a genome-wide association study (GWAS) using the Affymetrix 5.0 and 500 K marker sets. We genotyped nearly 500 000 single-nucleotide polymorphisms (SNPs) in 200 Chinese BA patients and 481 ethnically matched control subjects. The 10 most BA-associated SNPs from the GWAS were genotyped in an independent set of 124 BA and 90 control subjects. The strongest overall association was found for rs17095355 on 10q24, downstream of XPNPEP1, a gene involved in the metabolism of inflammatory mediators. The allelic chi-square test P-value for the meta-analysis of the GWAS and replication results was 6.94 × 10^-9. The identification of putative BA susceptibility loci not only opens new fields of investigation into the mechanisms underlying BA but may also provide new clues for the development of preventive and curative strategies.
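The allelic chi-square test mentioned in this abstract compares allele counts (two per genotyped person) between cases and controls in a 2x2 table. The Python sketch below computes the Pearson statistic with 1 degree of freedom and its P-value for a single table. The allele counts are made up for illustration and are not taken from the study, and the sketch does not reproduce the paper's meta-analysis across the GWAS and replication sets.

```python
import math
from typing import Tuple


def allelic_chi_square(case_a: int, case_b: int,
                       ctrl_a: int, ctrl_b: int) -> Tuple[float, float]:
    """Pearson chi-square (1 df, no continuity correction) on allele counts.

    case_a/case_b are counts of the two alleles in cases, ctrl_a/ctrl_b in
    controls. Returns (chi-square statistic, P-value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    denominator = ((case_a + case_b) * (ctrl_a + ctrl_b)
                   * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denominator == 0:
        raise ValueError("every row and column of the table needs a nonzero total")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denominator
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: 30 vs 10 in cases, 10 vs 30 in controls.
chi2, p_value = allelic_chi_square(30, 10, 10, 30)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2e}")
```

The closed form for the P-value relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, which keeps the sketch within the standard library.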
8
Abstract
Genome-wide association studies, using hundreds of thousands of single-nucleotide polymorphism (SNP) markers, have become a standard approach for identifying disease susceptibility genes. The change in the technology poses substantial computational and statistical challenges that have been addressed in the quality control, imputation, and population-based measure groups of the Genetic Analysis Workshop 16. The computational challenges pertain to efficient memory management and computational speed of the statistical procedures, and we discuss an approach for efficient SNP storage. Accuracy and computational speed are relevant for genotype calling, and the results from a comparison of three calling algorithms are discussed. The first statistical challenge is related to statistical quality control, and we discuss two novel quality control procedures. These low-level analyses have an effect on subsequent preparatory steps for high-level analyses, e.g., the quality of genotype imputation approaches. After the conduct of a genome-wide association study with successful replication and/or validation, measures of diagnostic accuracy, including the area under the curve, are investigated. The area under the curve can be constructed from summary data in some situations. Finally, we discuss how the population-attributable risk of a genetic variant that is only measured in a reference data set can be determined.
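As a reminder of what the area under the curve measures, the Python sketch below computes it from individual-level scores as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. This is the standard rank-based definition, not the summary-data construction the abstract refers to; the function name and the example scores are illustrative assumptions.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0      # case correctly ranked above control
            elif case == control:
                wins += 0.5      # tie counts as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores for two cases and two controls.
example_auc = auc([0.9, 0.4], [0.5, 0.1])
print(example_auc)
```

The pairwise loop is quadratic in sample size, which is fine for a small illustration; sorting-based rank formulas give the same value more efficiently on large data sets.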
Affiliation(s)
- Andreas Ziegler
  - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Germany
9
Fong C, Ko DC, Wasnick M, Radey M, Miller SI, Brittnacher M. GWAS analyzer: integrating genotype, phenotype and public annotation data for genome-wide association study analysis. Bioinformatics 2010; 26:560-4. [PMID: 20053839 PMCID: PMC2820681 DOI: 10.1093/bioinformatics/btp714] [Indexed: 11/22/2022]
Abstract
Motivation: Genome-wide association studies are beginning to elucidate how our genetic differences contribute to susceptibility and severity of disease. While computational tools have previously been developed to support various aspects of genome-wide association studies, there is currently a need for informatics solutions that facilitate the integration of data from multiple sources. Results: Here we present GWAS Analyzer, a database-driven web-based tool that integrates genotype and phenotype data, association analysis results and genomic annotations from multiple public resources. GWAS Analyzer contains features for browsing these interrelated data, exploring phenotypic values by family or genotype, and filtering association results based on multiple criteria. The utility of the tool has been demonstrated by a genome-wide association study of human in vitro susceptibility to bacterial infection. GWAS Analyzer facilitated management of large sets of phenotype and genotype data, analysis of phenotypic variation and heritability, and most importantly, generation of a refined set of candidate single nucleotide polymorphisms (SNPs). The tool revealed a SNP that was experimentally validated to be associated with increased cell death among Salmonella-infected HapMap cell lines. Availability: http://www.nwrce.org/gwas-analyzer Contact: mbrittna@u.washington.edu Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christine Fong
  - Department of Immunology, University of Washington, Seattle, Washington 98195, USA