1
Thanh Nguyen D, Hoang Nguyen Q, Thuy Duong N, Vo NS. LmTag: functional-enrichment and imputation-aware tag SNP selection for population-specific genotyping arrays. Brief Bioinform 2022; 23:6627269. [PMID: 35780383 DOI: 10.1093/bib/bbac252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/06/2022] [Revised: 05/02/2022] [Accepted: 05/31/2022] [Indexed: 12/16/2022] Open
Abstract
Despite the rapid development of sequencing technology, single-nucleotide polymorphism (SNP) arrays are still the most cost-effective genotyping solutions for large-scale genomic research and applications. Recent years have witnessed the rapid development of numerous genotyping platforms of different sizes and designs, but population-specific platforms are still lacking, especially for those in developing countries. SNP arrays designed for these countries should be cost-effective (small size), yet incorporate key information needed to associate genotypes with traits. A key design principle for most current platforms is to improve genome-wide imputation so that more SNPs not included in the array (imputed SNPs) can be predicted. However, current tag SNP selection methods mostly focus on imputation accuracy and coverage, but not the functional content of the array. It is those functional SNPs that are most likely associated with traits. Here, we propose LmTag, a novel method for tag SNP selection that not only improves imputation performance but also prioritizes highly functional SNP markers. We apply LmTag to a wide range of populations using both public and in-house whole-genome sequencing databases. Our results show that LmTag improved both functional marker prioritization and genome-wide imputation accuracy compared to existing methods. This novel approach could contribute to the next-generation genotyping arrays that provide excellent imputation capability as well as facilitate array-based functional genetic studies. Such arrays are particularly suitable for under-represented populations in developing countries or non-model species, where little genomic data is available while investment in genome sequencing or high-density SNP arrays is limited. LmTag is available at: https://github.com/datngu/LmTag.
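The general idea of picking tag SNPs that cover their neighbours while favouring functional markers can be illustrated with a small greedy sketch. This is not LmTag's algorithm: it assumes pairwise r² values are already computed, uses a hypothetical 0.8 tagging threshold, and breaks ties with a hypothetical per-SNP functional score. All names and numbers below are illustrative.

```python
from typing import Dict, FrozenSet, List


def greedy_tag_snps(
    r2: Dict[FrozenSet[str], float],
    functional: Dict[str, float],
    threshold: float = 0.8,  # assumed r^2 cut-off, not taken from the abstract
) -> List[str]:
    """Greedily pick tags until every SNP is tagged by itself or by a tag in LD."""
    snps = sorted(functional)
    untagged = set(snps)
    tags: List[str] = []

    def covers(candidate: str) -> set:
        # Untagged SNPs that this candidate would tag (itself included).
        return {
            s for s in untagged
            if s == candidate or r2.get(frozenset((candidate, s)), 0.0) >= threshold
        }

    while untagged:
        candidates = [s for s in snps if s not in tags]
        # Primary criterion: coverage; tie-break: hypothetical functional score.
        best = max(candidates, key=lambda s: (len(covers(s)), functional[s]))
        untagged -= covers(best)
        tags.append(best)
    return tags


# Toy data (hypothetical): four SNPs, three pairwise r^2 values.
toy_r2 = {
    frozenset(("a", "b")): 0.90,
    frozenset(("b", "c")): 0.85,
    frozenset(("c", "d")): 0.20,
}
toy_scores = {"a": 0.1, "b": 0.5, "c": 0.9, "d": 0.3}
toy_tags = greedy_tag_snps(toy_r2, toy_scores)
```

In this toy input, SNP `b` covers `a`, `b` and `c`, so it is chosen first, and `d` must then tag itself.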
Affiliation(s)
- Dat Thanh Nguyen
- Center for Biomedical Informatics, Vingroup Big Data Institute, 458 Minh Khai, 10000, Hanoi, Vietnam
- Quan Hoang Nguyen
- Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD 4067, Brisbane, Australia
- Nguyen Thuy Duong
- Center for Biomedical Informatics, Vingroup Big Data Institute, 458 Minh Khai, 10000, Hanoi, Vietnam; Institute of Genome Research, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, 10000, Hanoi, Vietnam
- Nam S Vo
- Center for Biomedical Informatics, Vingroup Big Data Institute, 458 Minh Khai, 10000, Hanoi, Vietnam; College of Engineering and Computer Science, VinUniversity, Vinhomes Ocean Park, 10000, Hanoi, Vietnam
2
Furstenau TN, Cocking JH, Sahl JW, Fofanov VY. Variant site strain typer (VaST): efficient strain typing using a minimal number of variant genomic sites. BMC Bioinformatics 2018; 19:222. [PMID: 29890941 PMCID: PMC5996513 DOI: 10.1186/s12859-018-2225-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/06/2017] [Accepted: 05/30/2018] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Targeted PCR amplicon sequencing (TAS) techniques provide a sensitive, scalable, and cost-effective way to query and identify closely related bacterial species and strains. Typically, this is accomplished by targeting housekeeping genes that provide resolution down to the family, genus, and sometimes species level. Unfortunately, this level of resolution is not sufficient in many applications where strain-level identification of bacteria is required (biodefense, forensics, clinical diagnostics, and outbreak investigations). Adding more genomic targets will increase the resolution, but the challenge is identifying the appropriate targets. VaST was developed to address this challenge by finding the minimum number of targets that, in combination, achieve maximum strain-level resolution for any strain complex. The final combination of target regions identified by the algorithm produces a unique haplotype for each strain, which can be used as a fingerprint for identifying unknown samples in a TAS assay. VaST ensures that the targets have conserved primer regions so that the targets can be amplified in all of the known strains, and it also favors the inclusion of targets with basal variants, which makes the set more robust when identifying previously unseen strains. RESULTS We analyzed VaST's performance using a number of different pathogenic species that are relevant to human disease outbreaks and biodefense. The number of targets required to achieve full resolution ranged from 20 to 88% fewer sites than what would be required in the worst case and most of the resolution is achieved within the first 20 targets. We computationally and experimentally validated one of the VaST panels and found that the targets led to accurate phylogenetic placement of strains, even when the strains were not a part of the original panel design.
CONCLUSIONS VaST is an open source software that, when provided a set of variant sites, can find the minimum number of sites that will provide maximum resolution of a strain complex, and it has many different run-time options that can accommodate a wide range of applications. VaST can be an effective tool in the design of strain identification panels that, when combined with TAS technologies, offer an efficient and inexpensive strain typing protocol.
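The goal described above, a small set of variant sites whose combined alleles separate as many strains as possible, can be approximated with a simple greedy loop. The sketch below is an assumption-laden illustration, not VaST's implementation: it ignores primer conservation and basal-variant weighting, and the strain names and allele strings are hypothetical.

```python
from typing import Dict, List


def greedy_site_selection(alleles: Dict[str, str]) -> List[int]:
    """Return site indices chosen greedily to maximise the number of distinct haplotypes.

    `alleles` maps a strain name to a string with one allele character per site.
    """
    strains = list(alleles)
    n_sites = len(alleles[strains[0]])
    chosen: List[int] = []

    def n_groups(sites: List[int]) -> int:
        # Number of distinct haplotypes when only `sites` are observed.
        return len({tuple(alleles[s][i] for i in sites) for s in strains})

    current = n_groups(chosen)  # one group before any site is picked
    while True:
        best_site, best_groups = None, current
        for site in range(n_sites):
            if site in chosen:
                continue
            groups = n_groups(chosen + [site])
            if groups > best_groups:  # strict: ties keep the earliest site
                best_site, best_groups = site, groups
        if best_site is None:  # no remaining site improves resolution
            return chosen
        chosen.append(best_site)
        current = best_groups


# Hypothetical strain complex with three variant sites.
toy_strains = {"A": "AAC", "B": "AGC", "C": "GAC", "D": "GGT"}
toy_sites = greedy_site_selection(toy_strains)
```

Here two of the three sites already give every strain a unique haplotype, so the third site is never added. A greedy choice like this is not guaranteed to find the true minimum set.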
Affiliation(s)
- Tara N Furstenau
- The School of Informatics, Computing, and Cyber Systems, Northern Arizona University, 1295 S Knoles Dr., Flagstaff, Arizona, 86001, USA
- Jill H Cocking
- The School of Informatics, Computing, and Cyber Systems, Northern Arizona University, 1295 S Knoles Dr., Flagstaff, Arizona, 86001, USA
- Pathogen and Microbiome Institute, Northern Arizona University, 1395 S Knoles Dr., Flagstaff, Arizona, 86001, USA
- Jason W Sahl
- Pathogen and Microbiome Institute, Northern Arizona University, 1395 S Knoles Dr., Flagstaff, Arizona, 86001, USA
- Viacheslav Y Fofanov
- The School of Informatics, Computing, and Cyber Systems, Northern Arizona University, 1295 S Knoles Dr., Flagstaff, Arizona, 86001, USA
- Pathogen and Microbiome Institute, Northern Arizona University, 1395 S Knoles Dr., Flagstaff, Arizona, 86001, USA
3
Mahid S, Minor K, Brangers B, Cobbs G, Galandiuk S. SMAD2 and the Relationship of Colorectal Cancer to Inflammatory Bowel Disease. Int J Biol Markers 2018. [DOI: 10.1177/172460080802300306] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023]
Abstract
Inflammatory bowel diseases (IBDs) affecting the colon [Crohn's disease (CD) and ulcerative colitis (UC)] are associated with an increased risk of colorectal cancer (CRC). Our previous work using oligonucleotide array data indicated that SMAD2 was significantly underexpressed in UC dysplastic tissue compared to benign UC. The aim of this current study was to determine whether single nucleotide polymorphisms (SNPs) within the SMAD2 gene are associated with IBD dysplasia/cancer. We performed an SNP haplotype-based case-control association study. Leukocyte DNA was obtained from 489 unrelated Caucasians (158 UC, 175 CD, 71 CRC, 85 controls). Eleven SNPs were genotyped. All 11 SNPs were in Hardy-Weinberg equilibrium in the control population. Strong linkage disequilibrium was observed among nearly all SMAD2 SNPs. There were no significant associations between SMAD2 allele or haplotype frequencies. Power calculations indicated good power for single-marker analysis (>0.8) and reasonably good power against effects of 0.1–0.15 for haplotype analysis. SMAD2 SNPs were not associated with the development of IBD dysplasia/cancer. This incongruity between our previous microarray data and the findings from this genotype study may be attributed to mechanisms such as alternative splicing of pre-mRNA SMAD2 and/or cross talk with other cellular pathways.
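The Hardy-Weinberg check mentioned above is commonly done with a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The abstract does not say which test the authors used, so the following is only a minimal sketch of that common approach, with hypothetical counts.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square statistic and p-value (1 df) for departure from Hardy-Weinberg."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control sample whose counts match HWE exactly (p = q = 0.5).
chi2_eq, p_eq = hwe_chi_square(25, 50, 25)
# Hypothetical sample with no heterozygotes at all, a strong departure.
chi2_dev, p_dev = hwe_chi_square(50, 0, 50)
```

For small samples or rare alleles an exact test is usually preferred over this chi-square approximation.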
Affiliation(s)
- S.S. Mahid
- Price Institute of Surgical Research and the Section of Colorectal Surgery, Department of Surgery, University of Louisville School of Medicine, Louisville, Kentucky
- K.S. Minor
- Price Institute of Surgical Research and the Section of Colorectal Surgery, Department of Surgery, University of Louisville School of Medicine, Louisville, Kentucky
- B.C. Brangers
- Price Institute of Surgical Research and the Section of Colorectal Surgery, Department of Surgery, University of Louisville School of Medicine, Louisville, Kentucky
- G.A. Cobbs
- Department of Biology, University of Louisville, Kentucky, USA
- S. Galandiuk
- Price Institute of Surgical Research and the Section of Colorectal Surgery, Department of Surgery, University of Louisville School of Medicine, Louisville, Kentucky
4
Athari SS, Athari SM, Beyzay F, Movassaghi M, Mortaz E, Taghavi M. Critical role of Toll-like receptors in pathophysiology of allergic asthma. Eur J Pharmacol 2016; 808:21-27. [PMID: 27894811 DOI: 10.1016/j.ejphar.2016.11.047] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 01/30/2016] [Revised: 11/21/2016] [Accepted: 11/25/2016] [Indexed: 12/11/2022]
Abstract
Allergic asthma is an airway disease, characterized by reversible bronchoconstriction, chronic inflammation of the airway, and thickness of smooth muscle in the respiratory tract. Asthma is orchestrated by an excessive Th2-adaptive immune response, in which innate immunity plays a key role. Recently TLRs have received more and more attention as they are central to orchestrate the innate immune responses. TLRs are localized as integral membrane or intracellular glycoproteins with those on the cell surface sensing microbial antigens and the ones, localized in intracellular vesicles, sensing microbial nucleic acid species. Having recognized microbial antigens, TLRs conduct the immune response towards a pro- or anti-allergy response. As a double-edged sword, they could initiate either harmful or helpful responses by the immune system in case of allergic asthma. In the current review, we will describe the role of TLRs and their signaling pathways in allergic asthma.
Affiliation(s)
- Seyyed Shamsadin Athari
- Research Center for Food Hygiene and Safety, Shahid Sadoughi University of Medical Sciences, Yazd, Iran; Health policy Research Center, Shiraz University of Medical Sciences, Shiraz, Iran; Department of Immunology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Fateme Beyzay
- Department of Immunology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Masoud Movassaghi
- Department of Pathology and Laboratory Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA, USA
- Esmaeil Mortaz
- Clinical Tuberculosis and Epidemiology Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran; Department of Immunology, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mehdi Taghavi
- Mycology Research Center, Faculty of Veterinary Medicine, University of Tehran, Tehran, Iran
5
Abstract
In this review, we address how to best use data from the Human Genome Project to discover new drug targets for common disease. We focus on population genetic approaches to identify variants associated with disease and how these can illuminate new targets and pathways for intervention. We discuss new insights into patterns of human genetic variation, evolving strategies for genome-wide case-control design, and developments in bioinformatic technologies. Hypothesis-driven versus non-hypothesis-driven approaches to target identification are considered.
6
Woo KT, Lau YK, Choong HL, Tan HK, Foo MWY, Lee EJC, Anantharaman V, Lee GSL, Yap HK, Yi Z, Fook-Chong S, Wong KS, Chan CM. Genomics and Disease Progression in IgA Nephritis. Annals of the Academy of Medicine, Singapore 2013. [DOI: 10.47102/annals-acadmedsg.v42n12p674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Apart from clinical, histological and biochemical indices, genomics are now being employed to unravel the pathogenetic mechanisms in the disease progression of IgA nephritis (IgAN). The results of angiotensin converting enzyme (ACE) gene polymorphism have been controversial. Those patients with the DD genotype seem to have a poorer prognosis. However, with high dose angiotensin receptor blocker (ARB) therapy, the ACE gene polymorphism status of a patient may no longer be a matter for concern as those with the DD genotype would also respond favourably to high dose ARB therapy. Association studies with gene sequencing and haplotypes have suggested that multiple genes are involved in the pathogenesis of IgAN. Some workers have reported a synergistic effect in the combined analysis of AGT-M235T and ACE I/D polymorphism. With the use of deoxyribonucleic acid (DNA) microarray, tens of thousands of gene expressions genome-wide can be examined simultaneously. A locus of familial IgAN has been described with strong evidence of linkage to IgAN1 on chromosome 6q22-23. Two other loci were reported at 4q26-31 and 17q12-22. DNA microarray techniques could also help in the identification of specific pathogenic genes that are up- or down-regulated and this may allow genome-wide analyses of these genes and their role in the pathogenesis and progression of IgAN. Recently, using genome-wide association studies (GWAS), more loci for disease susceptibility for IgAN have been identified at 17p13, 8p23, 22q12, 1q32 and 6p21.
Key words: Gene sequencing, Haplotypes, Microarray, Single nucleotide polymorphism
Affiliation(s)
- Zhao Yi
- Singapore General Hospital, Singapore
7
Manku H, Langefeld CD, Guerra SG, Malik TH, Alarcon-Riquelme M, Anaya JM, Bae SC, Boackle SA, Brown EE, Criswell LA, Freedman BI, Gaffney PM, Gregersen PA, Guthridge JM, Han SH, Harley JB, Jacob CO, James JA, Kamen DL, Kaufman KM, Kelly JA, Martin J, Merrill JT, Moser KL, Niewold TB, Park SY, Pons-Estel BA, Sawalha AH, Scofield RH, Shen N, Stevens AM, Sun C, Gilkeson GS, Edberg JC, Kimberly RP, Nath SK, Tsao BP, Vyse TJ. Trans-ancestral studies fine map the SLE-susceptibility locus TNFSF4. PLoS Genet 2013; 9:e1003554. [PMID: 23874208 PMCID: PMC3715547 DOI: 10.1371/journal.pgen.1003554] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 10/22/2012] [Accepted: 04/23/2013] [Indexed: 12/01/2022] Open
Abstract
We previously established an 80 kb haplotype upstream of TNFSF4 as a susceptibility locus in the autoimmune disease SLE. SLE-associated alleles at this locus are associated with inflammatory disorders, including atherosclerosis and ischaemic stroke. In Europeans, the TNFSF4 causal variants have remained elusive due to strong linkage disequilibrium exhibited by alleles spanning the region. Using a trans-ancestral approach to fine-map the locus, utilising 17,900 SLE and control subjects including Amerindian/Hispanics (1348 cases, 717 controls), African-Americans (AA) (1529, 2048) and better powered cohorts of Europeans and East Asians, we find strong association of risk alleles in all ethnicities; the AA association replicates in African-American Gullah (152, 122). The best evidence of association comes from two adjacent markers: rs2205960-T (P = 1.71 × 10⁻³⁴, OR = 1.43 [1.26-1.60]) and rs1234317-T (P = 1.16 × 10⁻²⁸, OR = 1.38 [1.24-1.54]). Inference of fine-scale recombination rates for all populations tested finds the 80 kb risk and non-risk haplotypes in all except African-Americans. In this population the decay of recombination equates to an 11 kb risk haplotype, anchored in the 5' region proximal to TNFSF4 and tagged by rs2205960-T after 1000 Genomes phase 1 (v3) imputation. Conditional regression analyses delineate the 5' risk signal to rs2205960-T and the independent non-risk signal to rs1234314-C. Our case-only and SLE-control cohorts demonstrate robust association of rs2205960-T with autoantibody production. The rs2205960-T is predicted to form part of a decameric motif which binds NF-κBp65 with increased affinity compared to rs2205960-G. ChIP-seq data also indicate NF-κB interaction with the DNA sequence at this position in LCL cells. Our research suggests association of rs2205960-T with SLE across multiple groups and an independent non-risk signal at rs1234314-C. rs2205960-T is associated with autoantibody production and lymphopenia.
Our data confirm a global signal at TNFSF4 and a role for the expressed product at multiple stages of lymphocyte dysregulation during SLE pathogenesis. We confirm the validity of trans-ancestral mapping in a complex trait.
Affiliation(s)
- Harinder Manku
- Department of Medical & Molecular Genetics, King's College London School of Medicine, Guy's Hospital, London, United Kingdom
- Carl D. Langefeld
- Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Sandra G. Guerra
- Centre for Rheumatology & Connective Tissue Diseases, Royal Free & University College Medical School, London, United Kingdom
- Talat H. Malik
- Division of Immunology and Inflammation, Imperial College, London, United Kingdom
- Marta Alarcon-Riquelme
- Centro Pfizer-Universidad de Granada-Junta de Andalucía de Genómica e Investigaciones Oncológicas, Granada, Spain
- Juan-Manuel Anaya
- Center for Autoimmune Diseases Research, Universidad del Rosario, Bogota, Colombia
- Sang-Cheol Bae
- Hospital for Rheumatic Diseases, Hanyang University, Seoul, South Korea
- Susan A. Boackle
- Division of Rheumatology, University of Colorado Denver, Aurora, Colorado, United States of America
- Elizabeth E. Brown
- Department of Epidemiology, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Lindsey A. Criswell
- Rosalind Russell Medical Research Center for Arthritis, University of California San Francisco, San Francisco, California, United States of America
- Barry I. Freedman
- Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Patrick M. Gaffney
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Peter A. Gregersen
- The Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, New York, United States of America
- Joel M. Guthridge
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Sang-Hoon Han
- Hospital for Rheumatic Diseases, Hanyang University, Seoul, South Korea
- John B. Harley
- Division of Rheumatology, Cincinnati Children's Hospital Medical Centre, Cincinnati, Ohio, United States of America
- Chaim O. Jacob
- The Lupus Genetics Group, Department of Medicine, University of Southern California, Los Angeles, California, United States of America
- Judith A. James
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Department of Medicine, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma, United States of America
- Diane L. Kamen
- Division of Rheumatology, Medical University of South Carolina, Charleston, South Carolina, United States of America
- Kenneth M. Kaufman
- Division of Rheumatology, Cincinnati Children's Hospital Medical Centre, Cincinnati, Ohio, United States of America
- Jennifer A. Kelly
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Javier Martin
- Instituto de Parasitologia y Biomedicina Lopez-Neyra, Consejo Superior de Investigaciones Cientificas, Granada, Spain
- Joan T. Merrill
- Clinical Pharmacology, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Kathy L. Moser
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Timothy B. Niewold
- Divisions of Rheumatology and Immunology, Mayo Clinic, Rochester, Minnesota, United States of America
- So-Yeon Park
- Hospital for Rheumatic Diseases, Hanyang University, Seoul, South Korea
- Amr H. Sawalha
- Division of Rheumatology, Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America
- R. Hal Scofield
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Department of Medicine, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma, United States of America
- Nan Shen
- Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Anne M. Stevens
- Center for Immunity and Immunotherapies, Seattle Children's Research Institute, Seattle, Washington, United States of America
- Celi Sun
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Gary S. Gilkeson
- Division of Rheumatology and Immunology, Medical University of South Carolina, Charleston, South Carolina, United States of America
- Jeff C. Edberg
- Division of Rheumatology and Immunology, Medical University of South Carolina, Charleston, South Carolina, United States of America
- Robert P. Kimberly
- Division of Clinical Immunology and Rheumatology, Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Swapan K. Nath
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, United States of America
- Betty P. Tsao
- Division of Rheumatology, Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, California, United States of America
- Tim J. Vyse
- Department of Medical & Molecular Genetics, King's College London School of Medicine, Guy's Hospital, London, United Kingdom
8
Folci M, Meda F, Gershwin ME, Selmi C. Cutting-edge issues in primary biliary cirrhosis. Clin Rev Allergy Immunol 2012; 42:342-54. [PMID: 21243445 DOI: 10.1007/s12016-011-8253-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/15/2022]
Abstract
Several crucial issues remain open in our understanding of primary biliary cirrhosis (PBC), an autoimmune liver disease targeting the small- and medium-sized intrahepatic bile ducts. These issues include the high tissue specificity of the autoimmune injury despite the nontraditional autoantigens found in all mitochondria recognized by PBC-associated autoantibodies, the causes of the commonly observed pruritus, and the disease etiology per se. In all these fields, there has been recent interest secondary to the use of large-scale efforts (such as genome-wide association studies) that were previously considered poorly feasible in a rare disease such as PBC as well as other intuitions. Accordingly, there are now fascinating theories to explain the onset and severity of pruritus due to elevated autotaxin levels, the peculiar apoptotic features of bile duct cells to explain the tissue specificity, and genomic and epigenetic associations contributing to disease susceptibility. We have arbitrarily chosen these four aspects as the most promising in the PBC recent literature and will provide herein a discussion of the recent data and their potential implications.
Affiliation(s)
- Marco Folci
- Division of Internal Medicine, IRCCS Istituto Clinico Humanitas, via A. Manzoni 56, Rozzano, 20089, Milan, Italy
9
Kovacic MB, Myers JMB, Wang N, Martin LJ, Lindsey M, Ericksen MB, He H, Patterson TL, Baye TM, Torgerson D, Roth LA, Gupta J, Sivaprasad U, Gibson AM, Tsoras AM, Hu D, Eng C, Chapela R, Rodríguez-Santana JR, Rodríguez-Cintrón W, Avila PC, Beckman K, Seibold MA, Gignoux C, Musaad SM, Chen W, Burchard EG, Hershey GKK. Identification of KIF3A as a novel candidate gene for childhood asthma using RNA expression and population allelic frequencies differences. PLoS One 2011; 6:e23714. [PMID: 21912604 PMCID: PMC3166061 DOI: 10.1371/journal.pone.0023714] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 02/10/2011] [Accepted: 07/23/2011] [Indexed: 11/18/2022] Open
Abstract
Background Asthma is a chronic inflammatory disease with a strong genetic predisposition. A major challenge for candidate gene association studies in asthma is the selection of biologically relevant genes. Methodology/Principal Findings Using epithelial RNA expression arrays, HapMap allele frequency variation, and the literature, we identified six possible candidate susceptibility genes for childhood asthma including ADCY2, DNAH5, KIF3A, PDE4B, PLAU, SPRR2B. To evaluate these genes, we compared the genotypes of 194 predominantly tagging SNPs in 790 asthmatic, allergic and non-allergic children. We found that SNPs in all six genes were nominally associated with asthma (p<0.05) in our discovery cohort and in three independent cohorts at either the SNP or gene level (p<0.05). Further, we determined that our selection approach was superior to random selection of genes either differentially expressed in asthmatics compared to controls (p = 0.0049) or selected based on the literature alone (p = 0.0049), substantiating the validity of our gene selection approach. Importantly, we observed that 7 of 9 SNPs in the KIF3A gene more than doubled the odds of asthma (OR = 2.3, p<0.0001) and increased the odds of allergic disease (OR = 1.8, p<0.008). Our data indicate that KIF3A rs7737031 (T-allele) has an asthma population attributable risk of 18.5%. The association between KIF3A rs7737031 and asthma was validated in 3 independent populations, further substantiating the validity of our gene selection approach. Conclusions/Significance Our study demonstrates that KIF3A, a member of the kinesin superfamily of microtubule associated motors that are important in the transport of protein complexes within cilia, is a novel candidate gene for childhood asthma. Polymorphisms in KIF3A may in part be responsible for poor mucus and/or allergen clearance from the airways. 
Furthermore, our study provides a promising framework for the identification and evaluation of novel candidate susceptibility genes.
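The abstract reports a population attributable risk but does not state how it was computed. One widely used estimator is Levin's formula, PAR = p(RR - 1) / (1 + p(RR - 1)), where p is the frequency of the exposure (here, carrying the risk allele) in the population and RR is the relative risk. The sketch below substitutes an odds ratio for RR, which is only a reasonable approximation when the outcome is uncommon, and uses hypothetical inputs rather than values from the study.

```python
def levin_par(exposure_freq: float, relative_risk: float) -> float:
    """Population attributable risk by Levin's formula.

    exposure_freq: proportion of the population carrying the risk factor (0..1).
    relative_risk: relative risk (an odds ratio may stand in as an approximation).
    """
    if not 0.0 <= exposure_freq <= 1.0:
        raise ValueError("exposure_freq must lie in [0, 1]")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = exposure_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: 20% of the population exposed, relative risk of 2.
example_par = levin_par(0.20, 2.0)
print(f"PAR = {example_par:.1%}")
```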
Affiliation(s)
- Melinda Butsch Kovacic
- Division of Asthma Research, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America
10
Wu S, Yang J, Wang C, Wu R. A general quantitative genetic model for haplotyping a complex trait in humans. Curr Genomics 2011; 8:343-50. [PMID: 19384430 PMCID: PMC2652406 DOI: 10.2174/138920207782446179] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/12/2006] [Revised: 08/17/2007] [Accepted: 09/02/2007] [Indexed: 11/22/2022] Open
Abstract
Uncertainty about linkage phases of multiple single nucleotide polymorphisms (SNPs) in heterozygous diploids challenges the identification of specific DNA sequence variants that encode a complex trait. A statistical technique implemented with the EM algorithm has been developed to infer the effects of SNP haplotypes from genotypic data by assuming that one haplotype (called the risk haplotype) performs differently from the rest (called the non-risk haplotype). This assumption simplifies the definition and estimation of genotypic values of diplotypes for a complex trait, but will reduce the power to detect the risk haplotype when non-risk haplotypes contain substantial diversity. In this article, we incorporate general quantitative genetic theory to specify the differentiation of different haplotypes in terms of their genetic control of a complex trait. A model selection procedure is deployed to test the best number and combination of risk haplotypes, thus providing a precise and powerful test of genetic determination in association studies. Our method is derived on the maximum likelihood theory and has been shown through simulation studies to be powerful for the characterization of the genetic architecture of complex quantitative traits.
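The phase uncertainty described above arises because a double heterozygote at two SNPs can carry either of two haplotype pairs. The authors' model couples this with trait values; the sketch below shows only the simpler, classical building block, an EM estimate of two-SNP haplotype frequencies from unphased genotype counts. It is an illustration under that narrower assumption, with hypothetical data, and not the method of the paper.

```python
from typing import Dict, Tuple

Genotype = Tuple[int, int]   # copies of allele "1" at SNP 1 and SNP 2 (0, 1 or 2)
Haplotype = Tuple[int, int]  # allele (0 or 1) at SNP 1 and SNP 2


def em_haplotype_freqs(counts: Dict[Genotype, int], n_iter: int = 200) -> Dict[Haplotype, float]:
    """Estimate haplotype frequencies for two biallelic SNPs (non-empty sample assumed)."""
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freqs = {h: 0.25 for h in haps}        # uniform starting point
    total = 2 * sum(counts.values())       # two haplotypes per individual
    for _ in range(n_iter):
        expected = {h: 0.0 for h in haps}
        for (g1, g2), n in counts.items():
            if (g1, g2) == (1, 1):
                # E-step: split double heterozygotes between the two possible phases.
                cis = freqs[(0, 0)] * freqs[(1, 1)]
                trans = freqs[(0, 1)] * freqs[(1, 0)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                expected[(0, 0)] += n * w
                expected[(1, 1)] += n * w
                expected[(0, 1)] += n * (1 - w)
                expected[(1, 0)] += n * (1 - w)
            else:
                # At most one heterozygous site: the haplotype pair is unambiguous.
                a = (1 if g1 == 2 else 0, 1 if g1 >= 1 else 0)
                b = (1 if g2 == 2 else 0, 1 if g2 >= 1 else 0)
                expected[(a[0], b[0])] += n
                expected[(a[1], b[1])] += n
        # M-step: renormalise expected haplotype counts.
        freqs = {h: expected[h] / total for h in haps}
    return freqs


# Hypothetical sample: only double homozygotes, so no phase ambiguity.
freqs_simple = em_haplotype_freqs({(0, 0): 10, (2, 2): 10})
# Same sample plus ten double heterozygotes whose phase must be inferred.
freqs_mixed = em_haplotype_freqs({(0, 0): 10, (2, 2): 10, (1, 1): 10})
```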
Affiliation(s)
- Song Wu
- Department of Statistics, University of Florida, Gainesville, FL 32611, USA
11
ABCB1 (MDR1) polymorphisms and antidepressant response in geriatric depression. Pharmacogenet Genomics 2011; 20:467-75. [PMID: 20555295 DOI: 10.1097/fpc.0b013e32833b593a] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Indexed: 01/16/2023]
Abstract
OBJECTIVE Variation in the ATP-binding cassette, subfamily B, member 1 transporter (ABCB1) (multidrug-resistance gene 1) gene has been investigated as a predictor of response to treatment with a variety of medications such as antiarrhythmics, chemotherapeutic agents, anti-HIV medications, and some psychotropics. The ABCB1 gene product, P-glycoprotein, affects the transport of drugs out of many cell types, including endothelial cells at the blood-brain barrier. We sought to determine if ABCB1 polymorphisms predict response to antidepressant treatment in geriatric patients. METHODS We compared the effects of ABCB1 genetic variation on the therapeutic response to paroxetine, a P-glycoprotein substrate, and to mirtazapine, which is not thought to be transported by ABCB1, in a sample of 246 elderly patients with major depression treated in a clinical trial setting. A total of 15 single nucleotide polymorphisms in the ABCB1 gene were assessed in each patient. Two of these ABCB1 single nucleotide polymorphisms were earlier reported to predict treatment response in patients prescribed with P-glycoprotein substrate antidepressants. RESULTS The two earlier identified ABCB1 markers for antidepressant response predicted time to remission in our paroxetine-treated patients, but not in the mirtazapine-treated patients. These results replicate the published findings of others. If a Bonferroni correction for type I error is made, our results do not reach the criteria for statistical significance. However, the Bonferroni correction may be too conservative given the strong linkage disequilibrium among some of the markers and our aim to replicate the earlier published findings. CONCLUSION Our study provides confirmation that certain ABCB1 polymorphisms predict response to substrate medications in geriatric patients.
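The Bonferroni adjustment discussed above divides the significance level by the number of tests. With the 15 SNPs mentioned in the abstract and an assumed family-wise level of 0.05 (the abstract does not state the level used), the per-test threshold is about 0.0033. The sketch below uses hypothetical marker names and p-values.

```python
from typing import Dict


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, bool]:
    """Flag each test whose p-value passes the Bonferroni-adjusted threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p <= threshold for name, p in p_values.items()}


# Hypothetical p-values for 15 markers; only two are below the nominal 0.05.
toy_p_values = {f"snp{i}": 0.5 for i in range(13)}
toy_p_values["snp_a"] = 0.01
toy_p_values["snp_b"] = 0.001
toy_flags = significant_after_bonferroni(toy_p_values)
print(bonferroni_threshold(0.05, 15))
```

Because the correction treats all tests as independent, it is conservative when markers are in strong linkage disequilibrium, which is the caveat the abstract raises.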
|
12
|
Zhang X, Pan F, Xie Y, Zou F, Wang W. COE: a general approach for efficient genome-wide two-locus epistasis test in disease association study. J Comput Biol 2010; 17:401-15. [PMID: 20377453 DOI: 10.1089/cmb.2009.0155] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022] Open
Abstract
The availability of high-density single nucleotide polymorphisms (SNPs) data has made genome-wide association study computationally challenging. Two-locus epistasis (gene-gene interaction) detection has attracted great research interest as a promising method for genetic analysis of complex diseases. In this article, we propose a general approach, COE, for efficient large scale gene-gene interaction analysis, which supports a wide range of tests. In particular, we show that many commonly used statistics are convex functions. From the observed values of the events in two-locus association test, we can develop an upper bound of the test value. Such an upper bound only depends on single-locus test and the genotype of the SNP-pair. We thus group and index SNP-pairs by their genotypes. This indexing structure can benefit the computation of all convex statistics. Utilizing the upper bound and the indexing structure, we can prune most of the SNP-pairs without compromising the optimality of the result. Our approach is especially efficient for large permutation test. Extensive experiments demonstrate that our approach provides orders of magnitude performance improvement over the brute force approach.
Affiliation(s)
- Xiang Zhang
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
|
13
|
Mahdevar G, Zahiri J, Sadeghi M, Nowzari-Dalini A, Ahrabian H. Tag SNP selection via a genetic algorithm. J Biomed Inform 2010; 43:800-4. [PMID: 20546935 DOI: 10.1016/j.jbi.2010.05.011] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 12/30/2009] [Revised: 03/17/2010] [Accepted: 05/16/2010] [Indexed: 01/02/2023]
Abstract
Single Nucleotide Polymorphisms (SNPs) provide valuable information on human evolutionary history and may lead us to identify genetic variants responsible for human complex diseases. Unfortunately, molecular haplotyping methods are costly, laborious, and time consuming; therefore, algorithms for constructing full haplotype patterns from small available data through computational methods, Tag SNP selection problem, are convenient and attractive. This problem is proved to be an NP-hard problem, so heuristic methods may be useful. In this paper we present a heuristic method based on genetic algorithm to find reasonable solution within acceptable time. The algorithm was tested on a variety of simulated and experimental data. In comparison with the exact algorithm, based on brute force approach, results show that our method can obtain optimal solutions in almost all cases and runs much faster than exact algorithm when the number of SNP sites is large. Our software is available upon request to the corresponding author.
Affiliation(s)
- Ghasem Mahdevar
- Department of Bioinformatics, University of Tehran, Tehran, Iran.
|
14
|
Liu L, Wu Y, Lonardi S, Jiang T. Efficient genome-wide TagSNP selection across populations via the linkage disequilibrium criterion. J Comput Biol 2010; 17:21-37. [PMID: 20078395 DOI: 10.1089/cmb.2007.0228] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
In this article, we studied the tag single-nucleotide polymorphism (tagSNP) selection problem on multiple populations using the pairwise r² linkage disequilibrium criterion. We proposed a novel combinatorial optimization model for the tagSNP selection problem, called the minimum common tagSNP selection (MCTS) problem, and presented efficient solutions for MCTS. Our approach consists of the following three main steps: (i) partitioning the SNP markers into small disjoint components, (ii) applying some data reduction rules to simplify the problem, and (iii) applying either a fast greedy algorithm or a Lagrangian relaxation algorithm to solve the remaining (general) MCTS. These algorithms also provide lower bounds on tagging (i.e., the minimum number of tagSNPs needed). The lower bounds allow us to evaluate how far our solution is from the optimum. To the best of our knowledge, it is the first time the tagging lower bounds are discussed in the literature. We assessed the performance of our algorithms on real HapMap data for genome-wide tagging. The experiments demonstrated that our algorithms run 3-4 orders of magnitude faster than the existing single-population tagging programs such as FESTA, LD-Select, and the multiple-population tagging method MultiPop-TagSelect. Our method also greatly reduced the required tagSNPs compared with LD-Select on a single population and MultiPop-TagSelect on multiple populations. Moreover, the numbers of tagSNPs selected by our algorithms are almost optimal because they are very close to the corresponding lower bounds obtained by our method.
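To make the pairwise r² criterion concrete, the following is a minimal single-population sketch of greedy tag selection. It is not the MCTS algorithm of the paper: the function names, the 0/1 haplotype encoding, and the threshold of 0.8 are assumptions chosen for illustration, and the multi-population, partitioning, and Lagrangian relaxation steps are omitted.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length 0/1 allele vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic site: r^2 is undefined, treat as not in LD
    return cov * cov / (var_x * var_y)


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tag SNPs greedily (hypothetical helper, not the authors' code).

    haplotypes: list of rows, one per chromosome; each column is a SNP (0/1).
    threshold:  assumed r^2 cutoff above which one SNP is said to tag another.
    """
    n_snps = len(haplotypes[0])
    columns = [[row[j] for row in haplotypes] for j in range(n_snps)]

    # covers[j] = set of SNPs that SNP j can tag (always including itself).
    covers = {j: {j} for j in range(n_snps)}
    for i, j in combinations(range(n_snps), 2):
        if r_squared(columns[i], columns[j]) >= threshold:
            covers[i].add(j)
            covers[j].add(i)

    untagged = set(range(n_snps))
    tags = []
    while untagged:
        # Choose the untagged SNP covering the most untagged SNPs;
        # ties go to the lowest index because the candidates are sorted.
        best = max(sorted(untagged), key=lambda j: len(covers[j] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags
```

In a toy panel where the first three SNPs are identical and the fourth is uncorrelated with them, the sketch keeps one SNP from the correlated group plus the independent one.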
Affiliation(s)
- Lan Liu
- Department of Computer Science and Engineering, University of California, Riverside, California, USA.
|
15
|
Liu G, Wang Y, Wong L. FastTagger: an efficient algorithm for genome-wide tag SNP selection using multi-marker linkage disequilibrium. BMC Bioinformatics 2010; 11:66. [PMID: 20113476 PMCID: PMC3098109 DOI: 10.1186/1471-2105-11-66] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 08/18/2009] [Accepted: 01/29/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The human genome contains millions of common single nucleotide polymorphisms (SNPs), and these SNPs play an important role in understanding the association between genetic variations and human diseases. Many SNPs show correlated genotypes, or linkage disequilibrium (LD), so it is not necessary to genotype all SNPs for an association study. Many algorithms have been developed to find a small subset of SNPs, called tag SNPs, that are sufficient to infer all the other SNPs. Algorithms based on the r² LD statistic have gained popularity because r² is directly related to statistical power to detect disease associations. Most existing r²-based algorithms use pairwise LD. Recent studies show that multi-marker LD can help further reduce the number of tag SNPs. However, existing tag SNP selection algorithms based on multi-marker LD are both time-consuming and memory-consuming. They cannot work on chromosomes containing more than 100 k SNPs using length-3 tagging rules. RESULTS We propose an efficient algorithm called FastTagger to calculate multi-marker tagging rules and select tag SNPs based on multi-marker LD. FastTagger uses several techniques to reduce running time and memory consumption. Our experimental results show that FastTagger is several times faster than existing multi-marker based tag SNP selection algorithms, and it consumes much less memory at the same time. As a result, FastTagger can work on chromosomes containing more than 100 k SNPs using length-3 tagging rules. FastTagger also produces smaller sets of tag SNPs than existing multi-marker based algorithms, and the reduction ratio ranges from 3% to 9% when length-3 tagging rules are used. The generated tagging rules can also be used for genotype imputation. We studied the prediction accuracy of individual rules, and the average accuracy is above 96% when r² ≥ 0.9. CONCLUSIONS Generating multi-marker tagging rules is a computation-intensive task, and it is the bottleneck of existing multi-marker based tag SNP selection methods. FastTagger is a practical and scalable algorithm to solve this problem.
Affiliation(s)
- Guimei Liu
- Department of Computer Science, National University of Singapore, Singapore.
|
16
|
Himes BE, Wu AC, Duan QL, Klanderman B, Litonjua AA, Tantisira K, Ramoni MF, Weiss ST. Predicting response to short-acting bronchodilator medication using Bayesian networks. Pharmacogenomics 2009; 10:1393-412. [PMID: 19761364 DOI: 10.2217/pgs.09.93] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022] Open
Abstract
AIMS Bronchodilator response tests measure the effect of β2-agonists, the most commonly used short-acting reliever drugs for asthma. We sought to relate candidate gene SNP data with bronchodilator response and measure the predictive accuracy of a model constructed with genetic variants. MATERIALS & METHODS Bayesian networks, multivariate models that are able to account for simultaneous associations and interactions among variables, were used to create a predictive model of bronchodilator response using candidate gene SNP data from 308 Childhood Asthma Management Program Caucasian subjects. RESULTS The model found that 15 SNPs in 15 genes predict bronchodilator response with fair accuracy, as established by a fivefold cross-validation area under the receiver-operating characteristic curve of 0.75 (standard error: 0.03). CONCLUSION Bayesian networks are an attractive approach to analyze large-scale pharmacogenetic SNP data because of their ability to automatically learn complex models that can be used for the prediction and discovery of novel biological hypotheses.
Affiliation(s)
- Blanca E Himes
- Harvard-MIT Division of Health Sciences and Technology, MA, USA.
|
17
|
Polymorphisms in toll-like receptor 4 and toll-like receptor 9 influence viral load in a seroincident cohort of HIV-1-infected individuals. AIDS 2009; 23:2387-95. [PMID: 19855253 DOI: 10.1097/qad.0b013e328330b489] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Indexed: 12/18/2022]
Abstract
OBJECTIVES Toll-like receptors (TLRs) are innate immune sensors that are integral to resisting chronic and opportunistic infections. Mounting evidence implicates TLR polymorphisms in susceptibilities to various infectious diseases, including HIV-1. We investigated the impact of TLR single nucleotide polymorphisms (SNPs) on clinical outcome in a seroincident cohort of HIV-1-infected volunteers. DESIGN We analyzed TLR SNPs in 201 antiretroviral treatment-naive HIV-1-infected volunteers from a longitudinal seroincident cohort with regular follow-up intervals (median follow-up 4.2 years, interquartile range 4.4). Participants were stratified into two groups according to either disease progression, defined as peripheral blood CD4+ T-cell decline over time, or peak and setpoint viral load. METHODS Haplotype tagging SNPs from TLR2, TLR3, TLR4, and TLR9 were detected by mass array genotyping, and CD4+ T-cell counts and viral load measurements were determined prior to antiretroviral therapy initiation. The association of TLR haplotypes with viral load and rapid progression was assessed by multivariate regression models using age and sex as covariates. RESULTS Two TLR4 SNPs in strong linkage disequilibrium [1063 A/G (D299G) and 1363 C/T (T399I)] were more frequent among individuals with high peak viral load compared with low/moderate peak viral load (odds ratio 6.65, 95% confidence interval 2.19-20.46, P < 0.001; adjusted P = 0.002 for 1063 A/G). In addition, a TLR9 SNP previously associated with slow progression was found less frequently among individuals with high viral setpoint compared with low/moderate setpoint (odds ratio 0.29, 95% confidence interval 0.13-0.65, P = 0.003, adjusted P = 0.04). CONCLUSION This study suggests a potentially new role for TLR4 polymorphisms in HIV-1 peak viral load and confirms a role for TLR9 polymorphisms in disease progression.
|
18
|
Levesque MC, Hobbs MR, O'Loughlin CW, Chancellor JA, Chen Y, Tkachuk AN, Booth J, Patch KB, Allgood S, Pole AR, Fernandez CA, Mwaikambo ED, Mutabingwa TK, Fried M, Sorensen B, Duffy PE, Granger DL, Anstey NM, Weinberg JB. Malaria severity and human nitric oxide synthase type 2 (NOS2) promoter haplotypes. Hum Genet 2009; 127:163-82. [PMID: 19859740 DOI: 10.1007/s00439-009-0753-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 07/23/2009] [Accepted: 10/05/2009] [Indexed: 10/20/2022]
Abstract
Nitric oxide (NO) mediates host resistance to severe malaria and other infectious diseases. NO production and mononuclear cell expression of the NO producing enzyme-inducible nitric oxide synthase (NOS2) have been associated with protection from severe falciparum malaria. The purpose of this study was to identify single nucleotide polymorphisms (SNPs) and haplotypes in the NOS2 promoter, to identify associations of these haplotypes with malaria severity and to test the effects of these polymorphisms on promoter activity. We identified 34 SNPs in the proximal 7.3 kb region of the NOS2 promoter and inferred NOS2 promoter haplotypes based on genotyping 24 of these SNPs in a population of Tanzanian children with and without cerebral malaria. We identified 71 haplotypes; 24 of these haplotypes comprised 82% of the alleles. We determined whether NOS2 promoter haplotypes were associated with malaria severity in two groups of subjects from Dar es Salaam (N = 185 and N = 250) and in an inception cohort of children from Muheza-Tanga, Tanzania (N = 883). We did not find consistent associations of NOS2 promoter haplotypes with malaria severity or malarial anemia, although interpretation of these results was potentially limited by the sample size of each group. Furthermore, cytokine-induced NOS2 promoter activity determined using luciferase reporter constructs containing the proximal 7.3 kb region of the NOS2 promoter and the G-954C or C-1173T SNPs did not differ from NOS2 promoter constructs that lacked these polymorphisms. Taken together, these studies suggest that the relationship between NOS2 promoter polymorphisms and malaria severity is more complex than previously described.
|
19
|
Sebastiani P, Timofeev N, Dworkis DA, Perls TT, Steinberg MH. Genome-wide association studies and the genetic dissection of complex traits. Am J Hematol 2009; 84:504-15. [PMID: 19569043 PMCID: PMC2895326 DOI: 10.1002/ajh.21440] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Indexed: 01/11/2023]
Abstract
The availability of affordable high throughput technology for parallel genotyping has opened the field of genetics to genome-wide association studies (GWAS), and in the last few years hundreds of articles reporting results of GWAS for a variety of heritable traits have been published. What do these results tell us? Although GWAS have discovered a few hundred reproducible associations, this number is underwhelming in relation to the huge amount of data produced, and challenges the conjecture that common variants may be the genetic causes of common diseases. We argue that the massive amount of genetic data that result from these studies remains largely unexplored and unexploited because of the challenge of mining and modeling enormous data sets, the difficulty of using nontraditional computational techniques and the focus of accepted statistical analyses on controlling the false positive rate rather than limiting the false negative rate. In this article, we will review the common approach to analysis of GWAS data and then discuss options to learn more from these data. We will use examples from our ongoing studies of sickle cell anemia and also GWAS in multigenic traits.
Affiliation(s)
- Paola Sebastiani
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts 02118, USA.
|
20
|
Boyles AL, Wilcox AJ, Taylor JA, Shi M, Weinberg CR, Meyer K, Fredriksen A, Ueland PM, Johansen AMW, Drevon CA, Jugessur A, Trung TN, Gjessing HK, Vollset SE, Murray JC, Christensen K, Lie RT. Oral facial clefts and gene polymorphisms in metabolism of folate/one-carbon and vitamin A: a pathway-wide association study. Genet Epidemiol 2009; 33:247-55. [PMID: 19048631 DOI: 10.1002/gepi.20376] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 01/16/2023]
Abstract
An increased risk of facial clefts has been observed among mothers with lower intake of folic acid or vitamin A around conception. We hypothesized that the risk of clefts may be further moderated by genes involved in metabolizing folate or vitamin A. We included 425 case-parent triads in which the child had either cleft lip with or without cleft palate (CL/P) or cleft palate only (CPO), and no other major defects. We analyzed 108 SNPs and one insertion in 29 genes involved in folate/one-carbon metabolism and 68 SNPs from 16 genes involved in vitamin A metabolism. Using the Triad Multi-Marker (TRIMM) approach we performed SNP, gene, chromosomal region, and pathway-wide association tests of child or maternal genetic effects for both CL/P and CPO. We stratified these analyses on maternal intake of folic acid or vitamin A during the periconceptional period. As expected with this high number of statistical tests, there were many associations with P-values<0.05; although there were fewer than predicted by chance alone. The strongest association in our data (between fetal FOLH1 and CPO, P=0.0008) is not in agreement with epidemiologic evidence that folic acid reduces the risk of CL/P in these data, not CPO. Despite strong evidence for genetic causes of oral facial clefts and the protective effects of maternal vitamins, we found no convincing indication that polymorphisms in these vitamin metabolism genes play an etiologic role.
Affiliation(s)
- Abee L Boyles
- Epidemiology Branch, NIEHS/NIH, Durham, North Carolina 27709, USA.
|
21
|
Marazita ML, Lidral AC, Murray JC, Field LL, Maher BS, Goldstein McHenry T, Cooper ME, Govil M, Daack-Hirsch S, Riley B, Jugessur A, Felix T, Morene L, Mansilla MA, Vieira AR, Doheny K, Pugh E, Valencia-Ramirez C, Arcos-Burgos M. Genome scan, fine-mapping, and candidate gene analysis of non-syndromic cleft lip with or without cleft palate reveals phenotype-specific differences in linkage and association results. Hum Hered 2009; 68:151-70. [PMID: 19521098 DOI: 10.1159/000224636] [Citation(s) in RCA: 101] [Impact Index Per Article: 6.7] [Received: 12/02/2008] [Accepted: 02/12/2009] [Indexed: 01/11/2023] Open
Abstract
OBJECTIVES Non-syndromic orofacial clefts, i.e. cleft lip (CL) and cleft palate (CP), are among the most common birth defects. The goal of this study was to identify genomic regions and genes for CL with or without CP (CL/P). METHODS We performed linkage analyses of a 10 cM genome scan in 820 multiplex CL/P families (6,565 individuals). Significant linkage results were followed by association analyses of 1,476 SNPs in candidate genes and regions, utilizing a weighted false discovery rate (wFDR) approach to control for multiple testing and incorporate the genome scan results. RESULTS Significant (multipoint HLOD ≥ 3.2) or genome-wide-significant (HLOD ≥ 4.02) linkage results were found for regions 1q32, 2p13, 3q27-28, 9q21, 12p11, 14q21-24 and 16q24. SNPs in IRF6 (1q32) and in or near FOXE1 (9q21) reached formal genome-wide wFDR-adjusted significance. Further, results were phenotype dependent in that the IRF6 region results were most significant for families in which affected individuals have CL alone, and the FOXE1 region results were most significant in families in which some or all of the affected individuals have CL with CP. CONCLUSIONS These results highlight the importance of careful phenotypic delineation in large samples of families for genetic analyses of complex, heterogeneous traits such as CL/P.
Affiliation(s)
- Mary L Marazita
- Department of Oral Biology, Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA 15219, USA.
|
22
|
"PolyMin": software for identification of the minimum number of polymorphisms required for haplotype and genotype differentiation. BMC Bioinformatics 2009; 10:176. [PMID: 19515225 PMCID: PMC2707369 DOI: 10.1186/1471-2105-10-176] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2009] [Accepted: 06/10/2009] [Indexed: 11/10/2022] Open
Abstract
Background Analysis of allelic variation for relevant genes and monitoring chromosome segment transmission during selection are important approaches in plant breeding and ecology. Minimizing the number of molecular markers required for this purpose is crucial due to cost and time constraints. To date, software for identification of the minimum number of required markers has been optimized for human genetics and only partly matches the needs of plant scientists and breeders. In addition, different software packages with insufficient interoperability need to be combined to extract this information from available allele sequence data, resulting in an error-prone multi-step process of data handling. Results PolyMin, a computer program combining the detection of a minimum set of single nucleotide polymorphisms (SNPs) and/or insertions/deletions (INDELs) necessary for allele differentiation with the subsequent genotype differentiation in plant populations, has been developed. Its efficiency in finding minimum sets of polymorphisms is comparable to other available program packages. Conclusion A computer program detecting the minimum number of SNPs for haplotype discrimination and subsequent genotype differentiation has been developed, and its performance compared to other relevant software. The main advantage of PolyMin, especially for plant scientists, is the integration of procedures from sequence analysis to polymorphism selection within a single program, including both haplotype and genotype differentiation.
|
23
|
Wada M, Marusawa H, Yamada R, Nasu A, Osaki Y, Kudo M, Nabeshima M, Fukuda Y, Chiba T, Matsuda F. Association of genetic polymorphisms with interferon-induced haematologic adverse effects in chronic hepatitis C patients. J Viral Hepat 2009; 16:388-96. [PMID: 19200137 DOI: 10.1111/j.1365-2893.2009.01095.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 12/09/2022]
Abstract
Interferon (IFN)-based combination therapy with ribavirin has become the gold standard for the treatment of chronic hepatitis C virus infection. Haematologic toxicities, such as neutropenia, thrombocytopenia, and anaemia, however, frequently cause poor treatment tolerance, resulting in poor therapeutic efficacy. The aim of this study was to identify host genetic polymorphisms associated with the efficacy or haematologic toxicity of IFN-based combination therapy in chronic hepatitis C patients. We performed comprehensive single nucleotide polymorphism detection in all exonic regions of the 12 genes involved in the IFN signalling pathway in 32 healthy Japanese volunteers. Of 167 identified polymorphisms, 35 were genotyped and tested for an association with the efficacy or toxicity of IFN plus ribavirin therapy in 240 chronic hepatitis C patients. Multiple logistic regression analysis revealed that low viral load, viral genotypes 2 and 3, and a lower degree of liver fibrosis, but none of the genetic polymorphisms, were significantly associated with a sustained virologic response. In contrast to efficacy, multiple linear regression analyses demonstrated that two polymorphisms (IFNAR1 10848-A/G and STAT2 4757-G/T) were significantly associated with IFN-induced neutropenia (P = 0.013 and P = 0.011, respectively). Thrombocytopenia was associated with the IRF7 789-G/A (P = 0.031). In conclusion, genetic polymorphisms in IFN signalling pathway-related genes were associated with IFN-induced neutropenia and thrombocytopenia in chronic hepatitis C patients. In contrast to toxicity, the efficacy of IFN-based therapy was largely dependent on viral factors and degree of liver fibrosis.
Affiliation(s)
- M Wada
- Department of Gastroenterology and Hepatology, Graduate School of Medicine, Kyoto University, Sakyo-ku, Kyoto, Japan
|
24
|
Abstract
Although previous studies have revealed a great deal about the genetic basis of susceptibility and resistance to parasite infection, there is now an opportunity to considerably enhance understanding through genome-wide association mapping. The application of association mapping to complex inheritance has recently become achievable given reduced costs, sophisticated genotyping platforms and powerful statistical methods which build upon increased knowledge of the linkage disequilibrium structure of the human genome. Linkage mapping and related approaches remain useful for the localization of the rarer genetic variants and candidate region association studies can be a very cost-effective route to progress. However, genome-wide association offers the greatest promise, despite the challenges posed by phenotype complexity, ensuring genotype coverage/quality and robust statistical analysis. The available approaches for mapping genes underlying susceptibility are reviewed here, emphasizing their relative merits and drawbacks and highlighting specific software tools and resources that enable successful mapping.
Affiliation(s)
- A Collins
- Human Genetics Division, School of Medicine, Southampton General Hospital, University of Southampton, Southampton, UK.
|
25
|
Jugessur A, Shi M, Gjessing HK, Lie RT, Wilcox AJ, Weinberg CR, Christensen K, Boyles AL, Daack-Hirsch S, Trung TN, Bille C, Lidral AC, Murray JC. Genetic determinants of facial clefting: analysis of 357 candidate genes using two national cleft studies from Scandinavia. PLoS One 2009; 4:e5385. [PMID: 19401770 PMCID: PMC2671138 DOI: 10.1371/journal.pone.0005385] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.7] [Received: 01/12/2009] [Accepted: 03/20/2009] [Indexed: 11/28/2022] Open
Abstract
Background Facial clefts are common birth defects with a strong genetic component. To identify fetal genetic risk factors for clefting, 1536 SNPs in 357 candidate genes were genotyped in two population-based samples from Scandinavia (Norway: 562 case-parent and 592 control-parent triads; Denmark: 235 case-parent triads). Methodology/Principal Findings We used two complementary statistical methods, TRIMM and HAPLIN, to look for associations across these two national samples. TRIMM tests for association in each gene by using multi-SNP genotypes from case-parent triads directly without the need to infer haplotypes. HAPLIN on the other hand estimates the full haplotype distribution over a set of SNPs and estimates relative risks associated with each haplotype. For isolated cleft lip with or without cleft palate (I-CL/P), TRIMM and HAPLIN both identified significant associations with IRF6 and ADH1C in both populations, but only HAPLIN found an association with FGF12. For isolated cleft palate (I-CP), TRIMM found associations with ALX3, MKX, and PDGFC in both populations, but only the association with PDGFC was identified by HAPLIN. In addition, HAPLIN identified an association with ETV5 that was not detected by TRIMM. Conclusion/Significance Strong associations with seven genes were replicated in the Scandinavian samples and our approach effectively replicated the strongest previously known association in clefting—with IRF6. Based on two national cleft cohorts of similar ancestry, two robust statistical methods and a large panel of SNPs in the most promising cleft candidate genes to date, this study identified a previously unknown association with clefting for ADH1C and provides additional candidates and analytic approaches to advance the field.
Affiliation(s)
- Astanand Jugessur
- Craniofacial Development, Musculoskeletal Disorders, Murdoch Childrens Research Institute, Royal Children's Hospital, Parkville, Australia
- Min Shi
- Biostatistics Branch, National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, Durham, North Carolina, United States of America
- Håkon Kristian Gjessing
- Department of Epidemiology (EPAM), Norwegian Institute of Public Health, Oslo, Norway
- Section for Epidemiology and Medical Statistics, Department of Public Health and Primary Health Care, University of Bergen, Bergen, Norway
- Rolv Terje Lie
- Section for Epidemiology and Medical Statistics, Department of Public Health and Primary Health Care, University of Bergen, Bergen, Norway
- Medical Birth Registry of Norway, Norwegian Institute of Public Health, Bergen, Norway
- Allen James Wilcox
- Epidemiology Branch, National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, Durham, North Carolina, United States of America
- Clarice Ring Weinberg
- Biostatistics Branch, National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, Durham, North Carolina, United States of America
- Kaare Christensen
- Department of Epidemiology, University of Southern Denmark, Odense, Denmark
- Abee Lowman Boyles
- Epidemiology Branch, National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, Durham, North Carolina, United States of America
- Sandra Daack-Hirsch
- College of Nursing, University of Iowa, Iowa City, Iowa, United States of America
- Truc Nguyen Trung
- Medical Birth Registry of Norway, Norwegian Institute of Public Health, Bergen, Norway
- Camilla Bille
- Department of Epidemiology, University of Southern Denmark, Odense, Denmark
- Andrew Carl Lidral
- Departments of Pediatrics, Epidemiology and Biological Sciences, University of Iowa, Iowa City, Iowa, United States of America
- Jeffrey Clark Murray
- Department of Epidemiology, University of Southern Denmark, Odense, Denmark
- Departments of Pediatrics, Epidemiology and Biological Sciences, University of Iowa, Iowa City, Iowa, United States of America
|
26
|
Zhang X, Zou F, Wang W. FastChi: an efficient algorithm for analyzing gene-gene interactions. Pacific Symposium on Biocomputing 2009:528-539. [PMID: 19209728 PMCID: PMC2728448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
Recent advances in high-throughput genotyping have inspired increasing research interest in genome-wide association studies of diseases. To understand the underlying biological mechanisms of many diseases, we need to consider simultaneously the genetic effects across multiple loci. The large number of SNPs often makes multilocus association study very computationally challenging because it needs to explicitly enumerate all possible SNP combinations at the genome-wide scale. Moreover, because many SNPs are correlated, a permutation procedure is often needed to properly control family-wise error rates. This makes the problem even more computationally demanding, since the test procedure needs to be repeated for each permuted data set. In this paper, we present FastChi, an exhaustive yet efficient algorithm for the genome-wide two-locus chi-square test. FastChi utilizes an upper bound of the two-locus chi-square test, which can be expressed as the sum of two terms, both efficient to compute: the first term is based on the single-locus chi-square test for the given phenotype; and the second term only depends on the genotypes and is independent of the phenotype. This upper bound enables the algorithm to only perform the two-locus chi-square test on a small number of candidate SNP pairs without the risk of missing any significant ones. Since the second part of the upper bound only needs to be precomputed once and stored for subsequent use, the advantage is more prominent in large permutation tests. Extensive experimental results demonstrate that our method is an order of magnitude faster than the brute force alternative.
Affiliation(s)
- Xiang Zhang
- Department of Computer Science, University of North Carolina at Chapel Hill, USA
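The FastChi abstract above contrasts its pruning strategy with a brute-force scan that runs a two-locus chi-square test on every SNP pair. The following is a minimal sketch of that brute-force baseline only, not of FastChi or its upper bound, which the abstract does not specify in detail. The function names (`chi_square`, `two_locus_chi_square`, `brute_force_scan`) and the genotype encoding are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells whose row or column is empty
                stat += (observed - expected) ** 2 / expected
    return stat


def two_locus_chi_square(snp_a, snp_b, phenotype):
    """Chi-square of the joint genotype at two SNPs against a categorical phenotype."""
    counts = Counter(zip(zip(snp_a, snp_b), phenotype))
    combos = sorted({key[0] for key in counts})   # observed genotype combinations
    classes = sorted(set(phenotype))              # observed phenotype classes
    table = [[counts[(g, c)] for c in classes] for g in combos]
    return chi_square(table)


def brute_force_scan(genotypes, phenotype):
    """Test every SNP pair; this is the cost an upper-bound pruning scheme aims to avoid."""
    return {
        (i, j): two_locus_chi_square(genotypes[i], genotypes[j], phenotype)
        for i, j in combinations(range(len(genotypes)), 2)
    }
```

In a toy data set where the phenotype is the XOR of two binary SNPs, each single-locus statistic is zero while the two-locus statistic is not, which is the kind of interaction an exhaustive pairwise scan is meant to catch.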
|
27
|
Abstract
Different expression of maternally and paternally inherited alleles at certain genes is called genetic imprinting. Despite its great importance in trait formation, development, and evolution, it remains unclear how genetic imprinting operates in a complex network of interactive genes located throughout the genome. Genetic mapping has proven to be a powerful tool that can estimate the distribution and effects of imprinted genes. While traditional mapping models attempt to detect imprinted quantitative trait loci based on a linkage map constructed from molecular markers, we have developed a statistical model for estimating the imprinting effects of haplotypes composed of multiple sequenced single-nucleotide polymorphisms. The new model provides a characterization of the difference in the effect of maternally and paternally derived haplotypes, which can be used as a tool for genetic association studies at the candidate gene or genome-wide level. The model was used to map imprinted haplotype effects on body mass index in a random sample from a natural human population, leading to the detection of significant imprinted effects at the haplotype level. The new model will be useful for characterizing the genetic architecture of complex quantitative traits at the nucleotide level.
Affiliation(s)
- Yun Cheng
- Department of Statistics, University of Florida, Gainesville, FL, USA
|
28
|
COE: A General Approach for Efficient Genome-Wide Two-Locus Epistasis Test in Disease Association Study. LECTURE NOTES IN COMPUTER SCIENCE 2009. [DOI: 10.1007/978-3-642-02008-7_19] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/14/2022]
|
29
|
Applications of Linkage Disequilibrium and Association Mapping in Maize. MOLECULAR GENETIC APPROACHES TO MAIZE IMPROVEMENT 2008. [DOI: 10.1007/978-3-540-68922-5_13] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Indexed: 12/24/2022]
|
30
|
Knoll B, Goldammer M, Wojewoda A, Flügge J, Johne A, Mrozikiewicz PM, Roots I, Köpke K. An anomalous haplotype distribution of the arrestin domain-containing 4 gene (ARRDC4) haplotypes in Caucasians. Genet Test 2008; 12:147-52. [PMID: 18307387 DOI: 10.1089/gte.2007.0049] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022]
Abstract
Little was known about the sequence variability of the human Arrestin domain-containing 4 gene (ARRDC4). We sequenced its DNA from exon 2 to exon 8 in a sample of 92 Russians. Seven variants were identified; one of them has not been described yet. It causes an amino acid change from Thr to Met. Identified variants were genotyped in the complete sample of 253 unrelated men and women to analyze haplotype distribution. Fifteen haplotypes were inferred. Nine haplotypes had estimated frequencies > 1%. Ninety-five percent of all haplotypes were determined by five haplotype-tagging single nucleotide polymorphisms. Haplotypes form two clades. The two most common haplotypes cover 76% of all haplotypes. The certainty of the haplotype reconstruction does not depend on the haplotype-inferring algorithms, but is a result of the anomalous haplotype distribution of ARRDC4, which makes this gene a suitable candidate gene for haplotype association studies. Interestingly, there is a great evolutionary distance between the two most common haplotypes, which could suggest a more complicated coalescent process with either past gene flow, selections, or bottlenecks.
Affiliation(s)
- Bettina Knoll
- Institute of Clinical Pharmacology and Toxicology, Charité-Universitätsmedizin Berlin, Berlin, Germany
|
31
|
Liu Q, Yang J, Chen Z, Yang MQ, Sung AH, Huang X. Supervised learning-based tagSNP selection for genome-wide disease classifications. BMC Genomics 2008; 9 Suppl 1:S6. [PMID: 18366619 PMCID: PMC2386071 DOI: 10.1186/1471-2164-9-s1-s6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/16/2022] Open
Abstract
Background Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers. Results We have developed a feature selection method named Supervised Recursive Feature Addition (SRFA). This method combines supervised learning and statistical measures for the chosen candidate features/SNPs to reconcile the redundancy information and, in doing so, improve the classification performance in association studies. Additionally, we have proposed a Support Vector based Recursive Feature Addition (SVRFA) scheme in SNP-disease association analysis. Conclusions We have proposed using SRFA with different statistical learning classifiers and SVRFA for both SNP selection and disease classification and then applying them to two complex disease data sets. In general, our approaches outperform the well-known feature selection method of Support Vector Machine Recursive Feature Elimination and logic regression-based SNP selection for disease classification in genetic association studies. Our study further indicates that both genetic and environmental variables should be taken into account when doing disease predictions and classifications for the most complex human diseases that have gene-environment interactions.
Affiliation(s)
- Qingzhong Liu
- Department of Computer Science, New Mexico Institute of Mining and Technology, Socorro, NM 87801, USA.
|
32
|
Abstract
The question of tagging single nucleotide polymorphism (tagSNP) transferability is an important one because many ongoing and upcoming Genome-Wide Association studies rely critically upon the validity, and practical feasibility of using a universal core set of tagSNPs. A series of recent studies analyzed performance of tagSNPs selected based on the HapMap. While these studies showed largely satisfactory transferability of the tagSNPs, they also reported that the level of transferability varies, substantively sometimes, especially when tagSNPs selected in one population were used in another distant population. We present a review of the literature about where and why tagSNP transferability may become a problem and suggest research directions that may help the resolution.
Affiliation(s)
- C Charles Gu
- Division of Biostatistics, Washington University School of Medicine, St Louis, MO 63110, USA.
|
33
|
Kim S, Moon SM, Kim YS, Kim JJ, Ryu HJ, Kim YJ, Choi JW, Park HS, Kim DG, Shin HD, Rutherford MS, Oh B, Lee JK. TNFR1 promoter −329G/T polymorphism results in allele-specific repression of TNFR1 expression. Biochem Biophys Res Commun 2008; 368:395-401. [DOI: 10.1016/j.bbrc.2008.01.098] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 01/08/2008] [Accepted: 01/18/2008] [Indexed: 12/12/2022]
|
34
|
Liang Y, Kelemen A. Statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic study for complex diseases. STATISTICS SURVEYS 2008. [DOI: 10.1214/07-ss026] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
|
35
|
Zhang X, Zou F, Wang W. FastANOVA: an Efficient Algorithm for Genome-Wide Association Study. KDD : PROCEEDINGS. INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING 2008:821-829. [PMID: 20945829 PMCID: PMC2951741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]
Abstract
Studying the association between a quantitative phenotype (such as height or weight) and single nucleotide polymorphisms (SNPs) is an important problem in biology. To understand the underlying mechanisms of complex phenotypes, it is often necessary to consider joint genetic effects across multiple SNPs. The ANOVA (analysis of variance) test is routinely used in association studies. Important findings from studying gene-gene (SNP-pair) interactions are appearing in the literature. However, the number of SNPs can be up to millions. Evaluating joint effects of SNPs is a challenging task even for SNP-pairs. Moreover, because many SNPs are correlated, a permutation procedure is preferred over simple Bonferroni correction for properly controlling the family-wise error rate and retaining mapping power, which dramatically increases the computational cost of an association study. In this paper, we study the problem of finding SNP-pairs that have significant associations with a given quantitative phenotype. We propose an efficient algorithm, FastANOVA, for performing ANOVA tests on SNP-pairs in a batch mode, which also supports large permutation tests. We derive an upper bound of the SNP-pair ANOVA test, which can be expressed as the sum of two terms. The first term is based on the single-SNP ANOVA test. The second term is based on the SNPs and is independent of any phenotype permutation. Furthermore, SNP-pairs can be organized into groups, each of which shares a common upper bound. This allows for maximum reuse of intermediate computation, efficient upper bound estimation, and effective SNP-pair pruning. Consequently, FastANOVA only needs to perform the ANOVA test on a small number of candidate SNP-pairs without the risk of missing any significant ones. Extensive experiments demonstrate that FastANOVA is orders of magnitude faster than the brute-force implementation of ANOVA tests on all SNP pairs.
Affiliation(s)
- Xiang Zhang
- Department of Computer Science, University of North Carolina at Chapel Hill
|
36
|
Gu CC, Yu K, Rao DC. Characterization of LD structures and the utility of HapMap in genetic association studies. ADVANCES IN GENETICS 2008; 60:407-35. [PMID: 18358328 DOI: 10.1016/s0065-2660(07)00415-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/13/2022]
Abstract
The observed distribution of and variation in linkage disequilibrium (LD) with respect to the evolutionary history and disease transmission in a population is the driving force behind the current wave of genome-wide association (GWA) studies of complex human diseases. An extensive literature covers topics from haplotype analysis that utilizes local LD structures in candidate genes and regions to genome-wide organization of LD blocks (neighborhood) that led to the development of the International HapMap Project and panels of "tagSNPs" used by current GWA studies. In this chapter, we examine the scenarios where each of the major types of analysis methods may be applicable and where the currently popular genotyping platforms for GWA might fall short. We discuss current association analysis methods by emphasizing their reliance on the local LD structures or the global organization of the LD structures, and highlight the need to consider individual marker information content in large-scale association mapping.
Affiliation(s)
- C Charles Gu
- Division of Biostatistics and Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
|
37
|
|
38
|
Igo RP, Londono D, Miller K, Parrado AR, Quade SRE, Sinha M, Kim S, Won S, Li J, Goddard KAB. Density-based clustering in haplotype analysis for association mapping. BMC Proc 2007; 1 Suppl 1:S27. [PMID: 18466524 PMCID: PMC2367537 DOI: 10.1186/1753-6561-1-s1-s27] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/19/2023] Open
Abstract
Clustering of related haplotypes in haplotype-based association mapping has the potential to improve power by reducing the degrees of freedom without sacrificing important information about the underlying genetic structure. We have modified a generalized linear model approach for association analysis by incorporating a density-based clustering algorithm to reduce the number of coefficients in the model. Using the GAW 15 Problem 3 simulated data, we show that our novel method can substantially enhance power to detect association with the binary rheumatoid arthritis (RA) phenotype at the HLA-DRB1 locus on chromosome 6. In contrast, clustering did not appreciably improve performance at locus D, perhaps a consequence of a rare susceptibility allele and of the overwhelming effect of HLA-DRB1/locus C, 5 cM distal. Optimization of parameters governing the clustering algorithm identified a set of parameters that delivered nearly ideal performance in a variety of situations. The cluster-based score test was valid over a wide range of haplotype diversity, and was robust to severe departures from Hardy-Weinberg equilibrium encountered near HLA-DRB1 in RA case-control samples.
Affiliation(s)
- Robert P Igo
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Room 1300-C, Cleveland, Ohio 44106, USA.
|
39
|
Sipahimalani P, Spinelli JJ, MacArthur AC, Lai A, Leach SR, Janoo-Gilani RT, Palmquist DL, Connors JM, Gascoyne RD, Gallagher RP, Brooks-Wilson AR. A systematic evaluation of the ataxia telangiectasia mutated gene does not show an association with non-Hodgkin lymphoma. Int J Cancer 2007; 121:1967-1975. [PMID: 17640065 DOI: 10.1002/ijc.22888] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
The ataxia telangiectasia mutated (ATM) gene is critical for the detection and repair of DNA double-stranded breaks. Mutations in this gene cause the autosomal recessive syndrome ataxia telangiectasia (AT), an attribute of which is an increased risk of cancer, particularly lymphoma. We have undertaken a population-based case/control study to assess the influence of genetic variation in ATM on the risk of non-Hodgkin lymphoma (NHL). A number of the subtypes that constitute NHL have in common the occurrence of specific somatic translocations that contribute to lymphomagenesis. We hypothesize that ATM function is slightly attenuated by some variants, which could reduce double-stranded break repair capacity, contributing to the occurrence of translocations and subsequent lymphomas. We sequenced the promoter and all exons of ATM in the germline DNA of 86 NHL patients and identified 79 variants. Eighteen of these variants correspond to nonsynonymous amino acid differences, 6 of which were predicted to be deleterious to protein function; these variants were all rare. Eleven common variants make up 10 haplotypes that are specified by 7 tagSNPs. Linkage disequilibrium across the ATM gene is high but incomplete. TagSNPs and the 6 putatively deleterious variants were genotyped in 798 NHL cases and 793 controls. Our results indicate that common variants of ATM do not significantly contribute to the risk of NHL in the general population. However, some rare, functionally deleterious variants may contribute to an increased risk of development of rare subtypes of the disease.
Affiliation(s)
- Payal Sipahimalani
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
- John J Spinelli
- Cancer Control Research Department, British Columbia Cancer Agency, Vancouver, BC, Canada
- Amy C MacArthur
- Cancer Control Research Department, British Columbia Cancer Agency, Vancouver, BC, Canada
- Agnes Lai
- Cancer Control Research Department, British Columbia Cancer Agency, Vancouver, BC, Canada
- Stephen R Leach
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
- Rozmin T Janoo-Gilani
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
- Diana L Palmquist
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
- Joseph M Connors
- Division of Medical Oncology, British Columbia Cancer Agency and the University of British Columbia, Vancouver, BC, Canada
- Randy D Gascoyne
- Department of Pathology, British Columbia Cancer Agency and the University of British Columbia, Vancouver, BC, Canada
- Richard P Gallagher
- Cancer Control Research Department, British Columbia Cancer Agency, Vancouver, BC, Canada
- Angela R Brooks-Wilson
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
|
40
|
|
41
|
Nam MH, Won HH, Lee KA, Kim JW. Effectiveness of in silico tagSNP selection methods: virtual analysis of the genotypes of pharmacogenetic genes. Pharmacogenomics 2007; 8:1347-57. [PMID: 17979509 DOI: 10.2217/14622416.8.10.1347] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022] Open
Abstract
INTRODUCTION SNP tagging has been recently introduced, and the use of this strategy reduces the dimension of disease association studies and eventually saves on genotyping costs. There is no single set of tagging SNPs (tagSNPs) that will satisfy every association study design; thus, many different methods have been introduced. We evaluated various tagSNP selection methods using known haplotype data of pharmacogenetic genes. We also compared the selected tagSNPs among different ethnic groups. METHODS We collected genotype data for the NAT2 and CYP2D6 genes from the previously published literature where the linkage phase was resolved directly through molecular haplotyping. Three computational tagSNP selection methods (ldSelect, Tagger and TagIT software) were evaluated with these data sets. RESULTS Tagging effectiveness and efficiency were variable in all three tagSNP selection methods. No tagSNP sets were identical among the different ethnic groups. The haplotype r²-based method was more effective in determining genotype-phenotype correlation than the other methods employed. CONCLUSION All of the three computational tagSNP selection methods showed acceptable efficiency and effectiveness. The selected tagSNPs were different from each other among the different ethnic groups.
Affiliation(s)
- Myung-Hyun Nam
- College of Medicine, Korea University, Department of Laboratory Medicine, Seoul 136-705, South Korea.
|
42
|
Chen YC, Giovannucci E, Kraft P, Lazarus R, Hunter DJ. Association between Toll-Like Receptor Gene Cluster (TLR6, TLR1, and TLR10) and Prostate Cancer. Cancer Epidemiol Biomarkers Prev 2007; 16:1982-9. [DOI: 10.1158/1055-9965.epi-07-0325] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 11/16/2022] Open
|
43
|
Yende S, Angus DC, Ding J, Newman AB, Kellum JA, Li R, Ferrell RE, Zmuda J, Kritchevsky SB, Harris TB, Garcia M, Yaffe K, Wunderink RG. 4G/5G plasminogen activator inhibitor-1 polymorphisms and haplotypes are associated with pneumonia. Am J Respir Crit Care Med 2007; 176:1129-37. [PMID: 17761618 PMCID: PMC2176102 DOI: 10.1164/rccm.200605-644oc] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 01/07/2023] Open
Abstract
RATIONALE Plasminogen activator inhibitor (PAI)-1 inhibits urokinase and tissue plasminogen activator, required for host response to infection. Whether variation within the PAI-1 gene is associated with increased susceptibility to infection is unknown. OBJECTIVES To ascertain the role of the 4G/5G polymorphism and other genetic variants within the PAI-1 gene. We hypothesized that variants associated with increased PAI-1 expression would be associated with an increased occurrence of community-acquired pneumonia (CAP). METHODS Longitudinal analysis (>12 yr) of the Health, Aging, and Body Composition cohort, aged 65-74 years at start of analysis. MEASUREMENTS AND MAIN RESULTS We genotyped the 4G/5G PAI-1 polymorphism and six additional single nucleotide polymorphisms. Of the 3,075 subjects, 272 (8.8%) had at least one hospitalization for CAP. Among whites, variants at the PAI4G,5G, PAI2846, and PAI7343 sites had higher risk of CAP (P = 0.018, 0.021, and 0.021, respectively). At these sites, variants associated with higher PAI-1 expression were associated with increased CAP susceptibility. Compared with the 5G/5G genotypes at PAI4G,5G site, the 4G/4G and 4G/5G genotypes were associated with a 1.98-fold increased risk of CAP (95% confidence interval, 1.2-3.2; P = 0.006). In whole blood stimulation assay, subjects with a 4G allele had 3.3- and 1.9-fold increased PAI-1 expression (P = 0.043 and 0.034, respectively). In haplotype analysis, the 4G/G/C/A haplotype at the PAI4G,5G, PAI2846, PAI4588, and PAI7343 single nucleotide polymorphisms was associated with higher CAP susceptibility, whereas the 5G/G/C/A haplotype was associated with lower CAP susceptibility. No associations were seen among blacks. CONCLUSIONS Genotypes associated with increased expression of PAI-1 were associated with increased susceptibility to CAP in elderly whites.
Affiliation(s)
- Sachin Yende
- CRISMA Laboratory (Clinical Research, Investigation, and Systems Modeling of Acute Illness), Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
|
44
|
Hou W, Yap JSF, Wu S, Liu T, Cheverud JM, Wu R. Haplotyping a quantitative trait with a high-density map in experimental crosses. PLoS One 2007; 2:e732. [PMID: 17710132 PMCID: PMC1940312 DOI: 10.1371/journal.pone.0000732] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/10/2007] [Accepted: 07/13/2007] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND The ultimate goal of genetic mapping of quantitative trait loci (QTL) is the positional cloning of genes involved in any agriculturally or medically important phenotype. However, only a small portion (≤1%) of the QTL detected have been characterized at the molecular level, despite the report of hundreds of thousands of QTL for different traits and populations. METHODS/RESULTS We develop a statistical model for detecting and characterizing the nucleotide structure and organization of haplotypes that underlie QTL responsible for a quantitative trait in an F2 pedigree. The discovery of such haplotypes by the new model will facilitate the molecular cloning of a QTL. Our model is founded on population genetic properties of genes that are segregating in a pedigree, constructed with the mixture-based maximum likelihood context and implemented with the EM algorithm. The closed forms have been derived to estimate the linkage and linkage disequilibria among different molecular markers, such as single nucleotide polymorphisms, and quantitative genetic effects of haplotypes constructed by non-alleles of these markers. Results from the analysis of a real example in mouse have validated the usefulness and utilization of the model proposed. CONCLUSION The model is flexible enough to be extended to model a complex network of genetic regulation that includes the interactions between different haplotypes and between haplotypes and environments.
Affiliation(s)
- Wei Hou
- Department of Epidemiology and Health Policy Research, University of Florida, Gainesville, Florida, United States of America
- John Stephen F. Yap
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
- Song Wu
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
- Tian Liu
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
- James M. Cheverud
- Department of Anatomy and Neurobiology, Washington University Medical School, St. Louis, Missouri, United States of America
- Rongling Wu
- Department of Statistics, University of Florida, Gainesville, Florida, United States of America
|
45
|
|
46
|
Phuong TM, Lin Z, Altman RB. Choosing SNPs using feature selection. PROCEEDINGS. IEEE COMPUTATIONAL SYSTEMS BIOINFORMATICS CONFERENCE 2007:301-9. [PMID: 16447987 DOI: 10.1109/csb.2005.22] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Indexed: 11/10/2022]
Abstract
A major challenge for genomewide disease association studies is the high cost of genotyping large numbers of single nucleotide polymorphisms (SNPs). The correlations between SNPs, however, make it possible to select a parsimonious set of informative SNPs, known as "tagging" SNPs, able to capture most variation in a population. Considerable research interest has recently focused on the development of methods for finding such SNPs. In this paper, we present an efficient method for finding tagging SNPs. The method does not involve a computation-intensive search for SNP subsets but discards redundant SNPs using a feature selection algorithm. In contrast to most existing methods, the method presented here does not limit itself to using only correlations between SNPs in local groups. By using correlations that occur across different chromosomal regions, the method can reduce the number of globally redundant SNPs. Experimental results show that the number of tagging SNPs selected by our method is smaller than by using block-based methods.
Affiliation(s)
- Tu Minh Phuong
- Department of Information Technology, Post & Telecom. Institute of Technology, Hanoi, Vietnam.
|
47
|
Abstract
The search for the association between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes has recently received great attention. For these studies, it is essential to use a small subset of informative SNPs, i.e., tag SNPs, accurately representing the rest of the SNPs. Tag SNP selection can achieve: 1) considerable budget savings by genotyping only a limited number of SNPs and computationally inferring all other SNPs or 2) necessary reduction of the huge SNP sets (obtained, e.g., from Affymetrix) for further fine haplotype analysis. In this paper, we show that tag SNP selection strongly depends on how the chosen tags will be used: the advantage of one tag set over another can only be considered with respect to a certain prediction method. We show how to separate tag selection from SNP prediction and propose greedy and local-minimization algorithms for tag SNP selection. We give two novel approaches to SNP prediction based on multiple linear regression (MLR) and support vector machines (SVMs). An extensive experimental study on various datasets including ten regions from the HapMap project shows that the MLR prediction combined with stepwise tag selection uses fewer tags than the state-of-the-art method of Halperin et al. The MLR-based method also uses on average 30% fewer tags than ldSelect for statistically covering all SNPs. The tag selection based on SVM SNP prediction uses fewer tags to achieve the same prediction accuracy as the methods of Halldorsson et al.
Affiliation(s)
- Jingwu He
- Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA.
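The abstract above mentions greedy tag SNP selection and "statistically covering" all SNPs. As a rough illustration of that idea only, the sketch below uses a simpler, swapped-in technique: greedy set cover over pairwise r² between genotype vectors, rather than the MLR- or SVM-based prediction the abstract describes. The function names, the 0.8 threshold, and the numeric genotype encoding are assumptions made for this example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # a monomorphic SNP carries no correlation signal
        return 0.0
    return sxy * sxy / (sxx * syy)


def greedy_tag_snps(snps, threshold=0.8):
    """Pick tags greedily until every SNP has r^2 >= threshold with some tag."""
    m = len(snps)
    # covers[i] = SNPs that tag i would represent (always includes i itself)
    covers = [
        {j for j in range(m) if j == i or r_squared(snps[i], snps[j]) >= threshold}
        for i in range(m)
    ]
    uncovered = set(range(m))
    tags = []
    while uncovered:
        # choose the SNP covering the most still-uncovered SNPs; lowest index wins ties
        best = max(range(m), key=lambda i: (len(covers[i] & uncovered), -i))
        tags.append(best)
        uncovered -= covers[best]
    return tags
```

For example, given three perfectly correlated (or anti-correlated) SNPs and one independent SNP, this sketch selects two tags: one for the correlated group and the independent SNP itself.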
|
48
|
Abstract
To identify the genetic etiology of a disease of interest, disease-related characteristics (phenotypes) are often tested for association with genetic variants (genotypes). Although genetic association studies of single genetic variants have been widely performed, there has been increasing interest in studies of multiple adjacent genetic variants on one chromosome, known as a haplotype. In this review, we will provide background about the origin of haplotypes and why they can be useful in genetic studies; we will discuss approaches to determining haplotypes and performing haplotype-based genetic association studies; and we will compare single variant and haplotype-based approaches.
Affiliation(s)
- Edwin K Silverman
- Channing Laboratory and Pulmonary and Critical Care Division, Brigham and Women's Hospital, Boston, Massachusetts, USA.
|
49
|
Wasmuth HE, Glantz A, Keppeler H, Simon E, Bartz C, Rath W, Mattsson LA, Marschall HU, Lammert F. Intrahepatic cholestasis of pregnancy: the severe form is associated with common variants of the hepatobiliary phospholipid transporter ABCB4 gene. Gut 2007; 56:265-70. [PMID: 16891356 PMCID: PMC1856745 DOI: 10.1136/gut.2006.092742] [Citation(s) in RCA: 110] [Impact Index Per Article: 6.5] [Indexed: 12/27/2022]
Abstract
BACKGROUND Intrahepatic cholestasis of pregnancy (ICP) is characterised by troublesome maternal pruritus, raised serum bile acid levels and increased fetal risk. Mutations of the ABCB4 gene encoding the hepatobiliary phospholipid transporter have been identified in a small proportion of patients with cholestasis of pregnancy. In a recent prospective study on 693 patients with cholestasis of pregnancy, a cut-off level for serum bile acid (≥40 µmol/l) was determined for increased risk of fetal complications. OBJECTIVES To investigate whether common combinations of polymorphic alleles (haplotypes) of the genes encoding the hepatobiliary ATP-binding cassette (ABC) transporters for phospholipids (ABCB4) and bile acids (ABCB11) were associated with this severe form of cholestasis of pregnancy. METHODS For genetic analysis, 52 women with bile acid levels ≥40 µmol/l (called cases) and 52 unaffected women (called controls) matched for age, parity and geographical residence were studied. Gene variants tagging common ABCB4 and ABCB11 haplotypes were genotyped and haplotype distributions were compared between cases and controls by permutation testing. RESULTS In contrast with ABCB11 haplotypes, ABCB4 haplotypes differed between the two groups (p = 0.019), showing that the severe form of cholestasis of pregnancy is associated with the ABCB4 gene variants. Specifically, haplotype ABCB4_5 occurred more often in cases, whereas haplotypes ABCB4_3 and ABCB4_7 were more common in controls. These associations were reflected by different frequencies of at-risk alleles of the two tagging polymorphisms (c.711A: odds ratio (OR) 2.27, p = 0.04; deletion intron 5: OR 14.68, p = 0.012). CONCLUSION Variants of ABCB4 represent genetic risk factors for the severe form of ICP in Sweden.
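The comparison of haplotype distributions between cases and controls by permutation testing can be sketched generically as follows. This is an illustrative assumption, not the procedure or software used in the study: the chi-square-style statistic, the integer coding of haplotypes, the toy data, and the function names are all hypothetical, and haplotype phase is taken as known.

```python
import numpy as np

def haplotype_table(labels, haplotypes, n_haps):
    """2 x n_haps count table: row 0 = controls, row 1 = cases."""
    table = np.zeros((2, n_haps))
    np.add.at(table, (np.asarray(labels), np.asarray(haplotypes)), 1)
    return table

def chi2_stat(table):
    """Pearson chi-square statistic, skipping cells with zero expectation."""
    expected = table.sum(1, keepdims=True) * table.sum(0, keepdims=True) / table.sum()
    mask = expected > 0
    return float((((table - expected) ** 2)[mask] / expected[mask]).sum())

def permutation_pvalue(labels, haplotypes, n_haps, n_perm=2000, seed=0):
    """Shuffle case/control labels to build the null distribution of the statistic."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = chi2_stat(haplotype_table(labels, haplotypes, n_haps))
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(labels)
        if chi2_stat(haplotype_table(perm, haplotypes, n_haps)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 20 control and 20 case chromosomes, two haplotypes.
labels = [0] * 20 + [1] * 20
separated = [0] * 20 + [1] * 20   # haplotype perfectly tracks case status
balanced = [0, 1] * 20            # identical distribution in both groups
print(permutation_pvalue(labels, separated, 2))
print(permutation_pvalue(labels, balanced, 2))
```

Permuting labels avoids relying on the asymptotic chi-square distribution, which is useful when some haplotypes are rare and the sample is small, as with 52 cases and 52 controls.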
Affiliation(s)
- H E Wasmuth
- Third Department of Medicine, University Hospital Aachen, Germany
|
50
|
Chi PB, Duggal P, Kao WHL, Mathias RA, Grant AV, Stockton ML, Garcia JGN, Ingersoll RG, Scott AF, Beaty TH, Barnes KC, Fallin MD. Comparison of SNP tagging methods using empirical data: association study of 713 SNPs on chromosome 12q14.3-12q24.21 for asthma and total serum IgE in an African Caribbean population. Genet Epidemiol 2007; 30:609-19. [PMID: 16830339 DOI: 10.1002/gepi.20172] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Indexed: 11/07/2022]
Abstract
Few comparison studies have been performed on single nucleotide polymorphism (SNP) tagging methods to examine their consistency and effectiveness in terms of inferences about association with disease. We applied several SNP tagging methods to SNPs on chromosome 12q (n=713) and compared the utility of these methods to detect association for asthma and serum IgE levels among a sample of African Caribbean families from Barbados selected through asthmatic probands. We found that a high level of information regarding association is retained in Clayton's htSNP, Stram's TagSNP, and de Bakker's Tagger. We also found a high degree of consistency between TagSNP and Tagger. Using this set of 713 SNPs on chromosome 12q, our study provides insight towards analytic strategies for future studies of complex traits.
Affiliation(s)
- Peter B Chi
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
|