Reference Citation Analysis

Cited by Other Article(s)

Thanh Nguyen D, Hoang Nguyen Q, Thuy Duong N, Vo NS. LmTag: functional-enrichment and imputation-aware tag SNP selection for population-specific genotyping arrays. Brief Bioinform 2022;23:6627269. [PMID: 35780383 DOI: 10.1093/bib/bbac252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/06/2022] [Revised: 05/02/2022] [Accepted: 05/31/2022] [Indexed: 12/16/2022]

Abstract

Despite the rapid development of sequencing technology, single-nucleotide polymorphism (SNP) arrays are still the most cost-effective genotyping solutions for large-scale genomic research and applications. Recent years have witnessed the rapid development of numerous genotyping platforms of different sizes and designs, but population-specific platforms are still lacking, especially for populations in developing countries. SNP arrays designed for these countries should be cost-effective (small size), yet incorporate key information needed to associate genotypes with traits. A key design principle for most current platforms is to improve genome-wide imputation so that more SNPs not included in the array (imputed SNPs) can be predicted. However, current tag SNP selection methods mostly focus on imputation accuracy and coverage, but not on the functional content of the array, even though functional SNPs are the ones most likely to be associated with traits. Here, we propose LmTag, a novel method for tag SNP selection that not only improves imputation performance but also prioritizes highly functional SNP markers. We apply LmTag to a wide range of populations using both public and in-house whole-genome sequencing databases. Our results show that LmTag improved both functional marker prioritization and genome-wide imputation accuracy compared to existing methods. This novel approach could contribute to next-generation genotyping arrays that provide excellent imputation capability and facilitate array-based functional genetic studies. Such arrays are particularly suitable for under-represented populations in developing countries or non-model species, where little genomic data is available and investment in genome sequencing or high-density SNP arrays is limited. LmTag is available at: https://github.com/datngu/LmTag.

Discovering Genome-Wide Tag SNPs Based on the Mutual Information of the Variants. PLoS One 2016;11:e0167994. [PMID: 27992465 PMCID: PMC5161470 DOI: 10.1371/journal.pone.0167994] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/12/2016] [Accepted: 11/23/2016] [Indexed: 01/01/2023]

Budhathoki S, Yamaji T, Iwasaki M, Sawada N, Shimazu T, Sasazuki S, Yoshida T, Tsugane S. Vitamin D Receptor Gene Polymorphism and the Risk of Colorectal Cancer: A Nested Case-Control Study. PLoS One 2016;11:e0164648. [PMID: 27736940 PMCID: PMC5063384 DOI: 10.1371/journal.pone.0164648] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/16/2016] [Accepted: 09/28/2016] [Indexed: 12/31/2022]

Liao B, Li X, Cai L, Cao Z, Chen H. A Hierarchical Clustering Method of Selecting Kernel SNP to Unify Informative SNP and Tag SNP. IEEE/ACM Trans Comput Biol Bioinform 2015;12:113-122. [PMID: 26357082 DOI: 10.1109/tcbb.2014.2351797] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]

Srivastava AK, Chopra R, Ali S, Aggarwal S, Vig L, Bamezai RNK. Inferring population structure and relationship using minimal independent evolutionary markers in Y-chromosome: a hybrid approach of recursive feature selection for hierarchical clustering. Nucleic Acids Res 2014;42:e122. [PMID: 25030906 PMCID: PMC4150763 DOI: 10.1093/nar/gku585] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/29/2022]

Demir HD, Ortak H, Şahin Ş, Ateş Ö, Benli İ, İnanır A. VKORC1 C1173T and VKORC1 G-1639A Gene Polymorphisms in Turkish Behçet’s Patients with Ocular and Non-ocular Involvement. Ophthalmic Genet 2014;35:7-11. [DOI: 10.3109/13816810.2013.763994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]

İlhan İ, Tezel G. How to Select Tag SNPs in Genetic Association Studies? The CLONTagger Method with Parameter Optimization. OMICS 2013;17:368-83. [DOI: 10.1089/omi.2012.0100] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]

İlhan İ, Tezel G. A genetic algorithm–support vector machine method with parameter optimization for selecting the tag SNPs. J Biomed Inform 2013;46:328-40. [DOI: 10.1016/j.jbi.2012.12.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 01/10/2012] [Revised: 10/13/2012] [Accepted: 12/11/2012] [Indexed: 01/06/2023]

Liao B, Li X, Zhu W, Cao Z. A novel method to select informative SNPs and their application in genetic association studies. IEEE/ACM Trans Comput Biol Bioinform 2012;9:1529-1534. [PMID: 22585142 DOI: 10.1109/tcbb.2012.70] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/31/2023]

Weersma RK, Crusius JBA, Roberts RL, Koeleman BPC, Palomino-Morales R, Wolfkamp S, Hollis-Moffatt JE, Festen EAM, Meisneris S, Heijmans R, Noble CL, Gearry RB, Barclay ML, Gómez-Garcia M, Lopez-Nevot MA, Nieto A, Rodrigo L, Radstake TRDJ, van Bodegraven AA, Wijmenga C, Merriman TR, Stokkers PCF, Peña AS, Martín J, Alizadeh BZ. Association of FcgR2a, but not FcgR3a, with inflammatory bowel diseases across three Caucasian populations. Inflamm Bowel Dis 2010;16:2080-9. [PMID: 20848524 DOI: 10.1002/ibd.21342] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/09/2022]

Liu G, Wang Y, Wong L. FastTagger: an efficient algorithm for genome-wide tag SNP selection using multi-marker linkage disequilibrium. BMC Bioinformatics 2010;11:66. [PMID: 20113476 PMCID: PMC3098109 DOI: 10.1186/1471-2105-11-66] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 08/18/2009] [Accepted: 01/29/2010] [Indexed: 11/10/2022]

Abstract

BACKGROUND

The human genome contains millions of common single nucleotide polymorphisms (SNPs), and these SNPs play an important role in understanding the association between genetic variation and human disease. Many SNPs show correlated genotypes, or linkage disequilibrium (LD), so it is not necessary to genotype all SNPs for an association study. Many algorithms have been developed to find a small subset of SNPs, called tag SNPs, that is sufficient to infer all the other SNPs. Algorithms based on the r² LD statistic have gained popularity because r² is directly related to the statistical power to detect disease associations. Most existing r²-based algorithms use pairwise LD. Recent studies show that multi-marker LD can help further reduce the number of tag SNPs. However, existing tag SNP selection algorithms based on multi-marker LD are both time-consuming and memory-consuming. They cannot work on chromosomes containing more than 100k SNPs using length-3 tagging rules.
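As a rough illustration of the pairwise r² tagging idea described above, the following Python sketch computes r² between two biallelic SNPs from phased 0/1 haplotypes and then selects tags greedily. It is not the FastTagger algorithm and does not use multi-marker rules; the function names, the toy data, and the 0.8 threshold are assumptions chosen only for this example.

```python
from itertools import combinations


def r_squared(a, b):
    """r^2 LD statistic between two SNPs given as equal-length 0/1 haplotype lists."""
    n = len(a)
    pa = sum(a) / n                                   # frequency of allele 1 at SNP a
    pb = sum(b) / n                                   # frequency of allele 1 at SNP b
    pab = sum(x & y for x, y in zip(a, b)) / n        # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # monomorphic SNP: r^2 is undefined, treated as 0 in this sketch
    d = pab - pa * pb                                 # LD coefficient D
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Greedy pairwise tagging: repeatedly pick the SNP that covers the most
    still-untagged SNPs at r^2 >= threshold (threshold value is an assumption)."""
    snps = list(haplotypes)
    covers = {s: {s} for s in snps}                   # every SNP tags itself
    for s, t in combinations(snps, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            covers[s].add(t)
            covers[t].add(s)
    untagged = set(snps)
    tags = []
    while untagged:
        best = max(snps, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy data (hypothetical SNP IDs): rs1, rs2 and rs3 are in perfect LD; rs4 is not.
haps = {
    "rs1": [0, 0, 1, 1, 0, 1, 0, 1],
    "rs2": [0, 0, 1, 1, 0, 1, 0, 1],
    "rs3": [1, 1, 0, 0, 1, 0, 1, 0],
    "rs4": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(greedy_tag_snps(haps))
```

On this toy input, one tag suffices for the three perfectly correlated SNPs, while the weakly correlated fourth SNP has to be genotyped directly. Multi-marker methods aim to shrink the tag set further by letting combinations of tags predict a SNP.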

RESULTS

We propose an efficient algorithm called FastTagger to calculate multi-marker tagging rules and select tag SNPs based on multi-marker LD. FastTagger uses several techniques to reduce running time and memory consumption. Our experimental results show that FastTagger is several times faster than existing multi-marker-based tag SNP selection algorithms, and it consumes much less memory at the same time. As a result, FastTagger can work on chromosomes containing more than 100k SNPs using length-3 tagging rules. FastTagger also produces smaller sets of tag SNPs than existing multi-marker-based algorithms, and the reduction ratio ranges from 3% to 9% when length-3 tagging rules are used. The generated tagging rules can also be used for genotype imputation. We studied the prediction accuracy of individual rules, and the average accuracy is above 96% when r² ≥ 0.9.

CONCLUSIONS

Generating multi-marker tagging rules is a computation-intensive task, and it is the bottleneck of existing multi-marker-based tag SNP selection methods. FastTagger is a practical and scalable algorithm to solve this problem.
