1
Bollas AE, Rajkovic A, Ceyhan D, Gaither JB, Mardis ER, White P. SNVstory: inferring genetic ancestry from genome sequencing data. BMC Bioinformatics 2024; 25:76. [PMID: 38378494 PMCID: PMC10877842 DOI: 10.1186/s12859-024-05703-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2023] [Accepted: 02/13/2024] [Indexed: 02/22/2024]
Abstract
BACKGROUND Genetic ancestry, inferred from genomic data, is a quantifiable biological parameter. While much of the human genome is identical across populations, it is estimated that as much as 0.4% of the genome can differ due to ancestry. This variation is primarily characterized by single nucleotide variants (SNVs), which are often unique to specific genetic populations. Knowledge of a patient's genetic ancestry can inform clinical decisions, from genetic testing and health screenings to medication dosages, based on ancestral disease predispositions. Nevertheless, the current reliance on self-reported ancestry can introduce subjectivity and exacerbate health disparities. While genomic sequencing data enables objective determination of a patient's genetic ancestry, existing approaches are limited to ancestry inference at the continental level.
RESULTS To address this challenge and create an objective, measurable metric of genetic ancestry, we present SNVstory, a method built upon three independent machine learning models for accurately inferring the sub-continental ancestry of individuals. We also introduce a novel method for simulating individual samples from aggregate allele frequencies from known populations. SNVstory includes a feature-importance scheme, unique among open-source ancestral tools, which allows the user to track the ancestral signal broadcast by a given gene or locus. We successfully evaluated SNVstory using a clinical exome sequencing dataset, comparing self-reported ethnicity and race to our inferred genetic ancestry, and demonstrate the capability of the algorithm to estimate ancestry from 36 different populations with high accuracy.
CONCLUSIONS SNVstory represents a significant advance in methods to assign genetic ancestry, opening the door to ancestry-informed care. SNVstory, an open-source model, is packaged as a Docker container for enhanced reliability and interoperability. It can be accessed from https://github.com/nch-igm/snvstory.
Affiliation(s)
- Audrey E Bollas
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Andrei Rajkovic
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
- Defne Ceyhan
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
- Jeffrey B Gaither
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
- Elaine R Mardis
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Peter White
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
2
Hateley S, Lopez-Izquierdo A, Jou CJ, Cho S, Schraiber JG, Song S, Maguire CT, Torres N, Riedel M, Bowles NE, Arrington CB, Kennedy BJ, Etheridge SP, Lai S, Pribble C, Meyers L, Lundahl D, Byrnes J, Granka JM, Kauffman CA, Lemmon G, Boyden S, Scott Watkins W, Karren MA, Knight S, Brent Muhlestein J, Carlquist JF, Anderson JL, Chahine KG, Shah KU, Ball CA, Benjamin IJ, Yandell M, Tristani-Firouzi M. The history and geographic distribution of a KCNQ1 atrial fibrillation risk allele. Nat Commun 2021; 12:6442. [PMID: 34750360 PMCID: PMC8575962 DOI: 10.1038/s41467-021-26741-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/13/2020] [Accepted: 10/20/2021] [Indexed: 11/08/2022]
Abstract
The genetic architecture of atrial fibrillation (AF) encompasses low-impact, common genetic variants and high-impact, rare variants. Here, we characterize a high-impact AF-susceptibility allele, KCNQ1 R231H, and describe its transcontinental geographic distribution and history. Induced pluripotent stem cell-derived cardiomyocytes procured from risk allele carriers exhibit abbreviated action potential duration, consistent with a gain-of-function effect. Using identity-by-descent (IBD) networks, we estimate the broad- and fine-scale population ancestry of risk allele carriers and their relatives. Analysis of ancestral migration routes reveals ancestors who inhabited Denmark in the 1700s, migrated to the Northeastern United States in the early 1800s, and traveled across the Midwest to arrive in Utah in the late 1800s. IBD/coalescent-based allele dating analysis reveals a relatively recent origin of the AF risk allele (~5000 years). Thus, our approach broadens the scope of study for disease susceptibility alleles to the context of human migration and ancestral origins.
Affiliation(s)
- Chuanchau J Jou
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Scott Cho
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Colin T Maguire
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Natalia Torres
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Michael Riedel
  - Cardiovascular Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Neil E Bowles
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Cammon B Arrington
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Brett J Kennedy
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Susan P Etheridge
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Shuping Lai
  - Cardiovascular Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Chase Pribble
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Lindsay Meyers
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Derek Lundahl
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Christopher A Kauffman
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Gordon Lemmon
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Steven Boyden
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- W Scott Watkins
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Mary Anne Karren
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Khushi U Shah
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Ivor J Benjamin
  - Cardiovascular Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Mark Yandell
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Martin Tristani-Firouzi
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA