1
Pfennig A, Lachance J. Challenges of accurately estimating sex-biased admixture from X chromosomal and autosomal ancestry proportions. Am J Hum Genet 2023; 110:359-367. [PMID: 36736293 PMCID: PMC9943719 DOI: 10.1016/j.ajhg.2022.12.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/20/2022] [Accepted: 12/20/2022] [Indexed: 02/05/2023]
Abstract
Sex-biased admixture can be inferred from ancestry-specific proportions of X chromosome and autosomes. In a paper published in the American Journal of Human Genetics, Micheletti et al.1 used this approach to quantify male and female contributions following the transatlantic slave trade. Using a large dataset from 23andMe, they concluded that African and European contributions to gene pools in the Americas were much more sex biased than previously thought. We show that the reported extreme sex-specific contributions can be attributed to unassigned genetic ancestry as well as the limitations of simple models of sex-biased admixture. Unassigned ancestry proportions in the study by Micheletti et al. ranged from ∼1% to 21%, depending on the type of chromosome and geographic region. A sensitivity analysis illustrates how this unassigned ancestry can create false patterns of sex bias and that mathematical models are highly sensitive to slight sampling errors when inferring mean ancestry proportions, making confidence intervals necessary. Thus, unassigned ancestry and the sensitivity of the models effectively prohibit the interpretation of estimated sex biases for many geographic regions in Micheletti et al. Furthermore, Micheletti et al. assumed models of a single admixture event. Using simulations, we find that violations of demographic assumptions, such as subsequent gene flow and/or sex-specific assortative mating, may have confounded the analyses of Micheletti et al., but unassigned ancestry was likely the more important confounding factor. Our findings underscore the importance of using complete ancestry information, sufficiently large sample sizes, and appropriate models when inferring sex-biased patterns of demography. This Matters Arising paper is in response to Micheletti et al.,1 published in American Journal of Human Genetics. See also the response by Micheletti et al.,2 published in this issue.
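The sensitivity described in this abstract can be illustrated with a small worked example. The sketch below assumes a commonly used single-pulse approximation with equal numbers of females and males, in which the autosomal ancestry proportion is H_A = (s_f + s_m) / 2 and the X chromosomal ancestry proportion is H_X = (2 s_f + s_m) / 3, where s_f and s_m are the female and male contributions from one source population. The equations, the function name `sex_specific_contributions`, and all input numbers are illustrative assumptions; they are not taken from Pfennig and Lachance or from Micheletti et al.

```python
def sex_specific_contributions(h_auto, h_x):
    """Solve H_A = (s_f + s_m) / 2 and H_X = (2 * s_f + s_m) / 3 for (s_f, s_m).

    h_auto: mean autosomal ancestry proportion from one source population
    h_x:    mean X chromosomal ancestry proportion from the same source
    Assumes a single admixture event and an equal sex ratio (hypothetical model).
    """
    s_f = 3.0 * h_x - 2.0 * h_auto   # female contribution
    s_m = 4.0 * h_auto - 3.0 * h_x   # male contribution
    return s_f, s_m


# Hypothetical inputs, chosen only to show how errors propagate.
no_bias = sex_specific_contributions(0.50, 0.50)      # equal X and autosomal ancestry
small_shift = sex_specific_contributions(0.50, 0.52)  # X estimate off by 0.02
unassigned = sex_specific_contributions(0.50, 0.45)   # 0.05 of X ancestry left unassigned

for label, (s_f, s_m) in [("no bias", no_bias),
                          ("0.02 shift in H_X", small_shift),
                          ("0.05 unassigned on X", unassigned)]:
    print(f"{label}: s_f = {s_f:.2f}, s_m = {s_m:.2f}")
```

Under these assumptions, an error in H_X is multiplied by 3 in both estimates, so a 0.02 shift moves s_f from 0.50 to 0.56, and leaving 0.05 of X chromosomal ancestry unassigned produces an apparent male bias (s_f = 0.35, s_m = 0.65) where none exists.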
Affiliation(s)
- Aaron Pfennig
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Joseph Lachance
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
2
Lee KK, Norris ET, Rishishwar L, Conley AB, Mariño-Ramírez L, McDonald JF, Jordan IK. Ethnic disparities in mortality and group-specific risk factors in the UK Biobank. PLOS Global Public Health 2023; 3:e0001560. [PMID: 36963080 PMCID: PMC10021328 DOI: 10.1371/journal.pgph.0001560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 01/09/2023] [Indexed: 02/25/2023]
Abstract
Despite a substantial overall decrease in mortality, disparities among ethnic minorities in developed countries persist. This study investigated mortality disparities and their associated risk factors for the three largest ethnic groups in the United Kingdom: Asian, Black, and White. Study participants were sampled from the UK Biobank (UKB), a prospective cohort enrolled between 2006 and 2010. Genetics, biological samples, and health information and outcomes data of UKB participants were downloaded and data-fields were prioritized based on participants with death registry records. Kaplan-Meier method was used to evaluate survival differences among ethnic groups; survival random forest feature selection followed by Cox proportional-hazard modeling was used to identify and estimate the effects of shared and ethnic group-specific mortality risk factors. The White ethnic group showed significantly worse survival probability than the Asian and Black groups. In all three ethnic groups, endoscopy and colonoscopy procedures showed significant protective effects on overall mortality. Asian and Black women show lower relative risk of mortality than men, whereas no significant effect of sex was seen for the White group. The strongest ethnic group-specific mortality associations were ischemic heart disease for Asians, COVID-19 for Blacks, and cancers of respiratory/intrathoracic organs for Whites. Mental health-related diagnoses, including substance abuse, anxiety, and depression, were a major risk factor for overall mortality in the Asian group. The effect of mental health on Asian mortality, particularly for digestive cancers, was exacerbated by an observed hesitance to answer mental health questions, possibly related to cultural stigma. 
C-reactive protein (CRP) serum levels were associated with both overall and cause-specific mortality due to COVID-19 and digestive cancers in the Black group, where elevated CRP has previously been linked to psychosocial stress due to discrimination. Our results point to mortality risk factors that are group-specific and modifiable, supporting targeted interventions towards greater health equity.
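The Kaplan-Meier method named in this abstract is the product-limit estimator of a survival curve. The following is a minimal sketch of that estimator on hypothetical toy data; the function name `kaplan_meier` and the input values are assumptions for illustration and do not reproduce the UK Biobank analysis, its random forest feature selection, or its Cox proportional-hazard models.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times:  follow-up durations (any consistent unit)
    events: 1 if death was observed at that time, 0 if the record was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all records that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored records both leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times (years) and event indicators.
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in toy_curve:
    print(t, round(s, 3))
```

Computing one such curve per ethnic group and comparing them is the kind of survival comparison the abstract refers to; a formal comparison would additionally need a significance test, which this sketch omits.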
Affiliation(s)
- Kara Keun Lee
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States of America
  - Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States of America
- Emily T Norris
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, United States of America
- Lavanya Rishishwar
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, United States of America
- Andrew B Conley
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, United States of America
- Leonardo Mariño-Ramírez
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, United States of America
- John F McDonald
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States of America
  - Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States of America
- I King Jordan
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States of America
  - Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States of America
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, United States of America
3
Mariño-Ramírez L, Sharma S, Rishishwar L, Conley AB, Nagar SD, Jordan IK. Effects of genetic ancestry and socioeconomic deprivation on ethnic differences in serum creatinine. Gene 2022; 837:146709. [PMID: 35772650 PMCID: PMC9288982 DOI: 10.1016/j.gene.2022.146709] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/17/2022] [Accepted: 06/24/2022] [Indexed: 11/18/2022]
Abstract
The inclusion of ethnicity in equations for estimating the glomerular filtration rate (eGFR) from serum creatinine levels has been challenged since ethnicity is socially defined and therefore a poor proxy for biological differences. We hypothesized that genetic ancestry (GA) would be more strongly associated with creatinine levels among healthy individuals than self-identified ethnicity. We studied a diverse cohort of 35,590 participants characterized as part of the UK Biobank, grouped by self-reported ethnicity: Black, East Asian, Mixed, Other, South Asian, and White. We used multivariable modeling to test for associations between ethnicity, GA, socioeconomic deprivation, and serum creatinine levels, including covariates for age, sex, height, and body mass index. Model fit comparisons and relative importance analysis were used to compare the effects of ethnicity and GA on creatinine levels. Black ethnicity shows a positive effect on participant serum creatinine levels (β = 9.36 ± 0.38), whereas East Asian (β = -1.80 ± 0.66) and South Asian (β = -0.28 ± 0.36) ethnicity show negative effects on creatinine. Male sex (β = 17.69 ± 0.34) and height (β = 0.13 ± 0.02) also show high positive associations with creatinine levels, while socioeconomic deprivation (β = -0.04 ± 0.04) shows no significant association. African ancestry has the highest association (β = 13.81 ± 0.52) with creatinine levels. Overall, GA (9.06%) explains significantly more of the variation in creatinine levels than ethnicity (4.96%), with African ancestry (6.36%) alone explaining more of the variation than ethnicity. We found that GA explains more of the variation in serum creatinine levels than socioeconomic deprivation, suggesting the possibility that ethnic differences in creatinine are shaped by genetic rather than social factors.
Affiliation(s)
- Leonardo Mariño-Ramírez
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
- Shivam Sharma
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Lavanya Rishishwar
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
- Andrew B Conley
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
- Shashwat Deepali Nagar
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- I King Jordan
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
4
Hatoum AS, Wendt FR, Galimberti M, Polimanti R, Neale B, Kranzler HR, Gelernter J, Edenberg HJ, Agrawal A. Ancestry may confound genetic machine learning: Candidate-gene prediction of opioid use disorder as an example. Drug Alcohol Depend 2021; 229:109115. [PMID: 34710714 PMCID: PMC9358969 DOI: 10.1016/j.drugalcdep.2021.109115] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/02/2021] [Revised: 07/29/2021] [Accepted: 10/04/2021] [Indexed: 11/18/2022]
Abstract
BACKGROUND: Machine learning (ML) models are beginning to proliferate in psychiatry; however, machine learning models in psychiatric genetics have not always accounted for ancestry. Using an empirical example of a proposed genetic test for OUD, and exploring a similar test for tobacco dependence and a simulated binary phenotype, we show that genetic prediction using ML is vulnerable to ancestral confounding. METHODS: We utilize five ML algorithms trained with 16 brain reward-derived "candidate" SNPs proposed for commercial use and examine their ability to predict OUD vs. ancestry in an out-of-sample test set (N = 1000, stratified into equal groups of n = 250 cases and controls each of European and African ancestry). We rerun analyses with 8 random sets of allele-frequency matched SNPs. We contrast findings with 11 genome-wide significant variants for tobacco smoking. To document generalizability, we generate and test a random phenotype. RESULTS: None of the 5 ML algorithms predict OUD better than chance when ancestry was balanced but were confounded with ancestry in an out-of-sample test. In addition, the algorithms preferentially predicted admixed subpopulations. Random sets of variants matched to the candidate SNPs by allele frequency produced similar bias. Genome-wide significant tobacco smoking variants were also confounded by ancestry. Finally, random SNPs predicting a random simulated phenotype show that the bias attributable to ancestral confounding could impact any ML-based genetic prediction. CONCLUSIONS: Researchers and clinicians are encouraged to be skeptical of claims of high prediction accuracy from ML-derived genetic algorithms for polygenic traits like addiction, particularly when using candidate variants.
Affiliation(s)
- Alexander S Hatoum
  - Washington University in St. Louis, School of Medicine, Department of Psychiatry, USA
- Frank R Wendt
  - Department of Psychiatry, Division of Human Genetics, Yale School of Medicine, New Haven, CT, USA
- Marco Galimberti
  - Department of Psychiatry, Division of Human Genetics, Yale School of Medicine, New Haven, CT, USA
- Renato Polimanti
  - Department of Psychiatry, Division of Human Genetics, Yale School of Medicine, New Haven, CT, USA
  - Veterans Affairs Connecticut Healthcare System, West Haven, CT, USA
- Benjamin Neale
  - Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Henry R Kranzler
  - Center for Studies of Addiction, Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
  - VISN 4 MIRECC, Crescenz VAMC, Philadelphia, PA, USA
- Joel Gelernter
  - Department of Psychiatry, Division of Human Genetics, Yale School of Medicine, New Haven, CT, USA
  - Veterans Affairs Connecticut Healthcare System, West Haven, CT, USA
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
- Howard J Edenberg
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
  - Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, USA
- Arpana Agrawal
  - Washington University in St. Louis, School of Medicine, Department of Psychiatry, USA
5
Gusev A, Groha S, Taraszka K, Semenov YR, Zaitlen N. Constructing germline research cohorts from the discarded reads of clinical tumor sequences. Genome Med 2021; 13:179. [PMID: 34749793 PMCID: PMC8576948 DOI: 10.1186/s13073-021-00999-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/08/2021] [Accepted: 10/28/2021] [Indexed: 12/02/2022]
Abstract
Background: Hundreds of thousands of cancer patients have had targeted (panel) tumor sequencing to identify clinically meaningful mutations. In addition to improving patient outcomes, this activity has led to significant discoveries in basic and translational domains. However, the targeted nature of clinical tumor sequencing has a limited scope, especially for germline genetics. In this work, we assess the utility of discarded, off-target reads from tumor-only panel sequencing for the recovery of genome-wide germline genotypes through imputation. Methods: We developed a framework for inference of germline variants from tumor panel sequencing, including imputation, quality control, inference of genetic ancestry, germline polygenic risk scores, and HLA alleles. We benchmarked our framework on 833 individuals with tumor sequencing and matched germline SNP array data. We then applied our approach to a prospectively collected panel sequencing cohort of 25,889 tumors. Results: We demonstrate high to moderate accuracy of each inferred feature relative to direct germline SNP array genotyping: individual common variants were imputed with a mean accuracy (correlation) of 0.86, genetic ancestry was inferred with a correlation of > 0.98, polygenic risk scores were inferred with a correlation of > 0.90, and individual HLA alleles were inferred with a correlation of > 0.80. We demonstrate a minimal influence on the accuracy of somatic copy number alterations and other tumor features. We showcase the feasibility and utility of our framework by analyzing 25,889 tumors and identifying the relationships between genetic ancestry, polygenic risk, and tumor characteristics that could not be studied with conventional on-target tumor data. Conclusions: We conclude that targeted tumor sequencing can be leveraged to build rich germline research cohorts from existing data and make our analysis pipeline publicly available to facilitate this effort.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13073-021-00999-4.
Affiliation(s)
- Alexander Gusev
  - Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of MIT & Harvard, Cambridge, MA, USA
- Stefan Groha
  - Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of MIT & Harvard, Cambridge, MA, USA
- Kodi Taraszka
  - Departments of Neurology and Computational Medicine, UCLA, Los Angeles, CA, USA
- Yevgeniy R Semenov
  - Department of Dermatology, Massachusetts General Hospital, Boston, MA, USA
- Noah Zaitlen
  - Departments of Neurology and Computational Medicine, UCLA, Los Angeles, CA, USA
6
Nagar SD, Conley AB, Sharma S, Rishishwar L, Jordan IK, Mariño-Ramírez L. Comparing Genetic and Socioenvironmental Contributions to Ethnic Differences in C-Reactive Protein. Front Genet 2021; 12:738485. [PMID: 34733313 PMCID: PMC8558394 DOI: 10.3389/fgene.2021.738485] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/08/2021] [Accepted: 10/05/2021] [Indexed: 02/03/2023]
Abstract
C-reactive protein (CRP) is a routinely measured blood biomarker for inflammation. Elevated levels of circulating CRP are associated with response to infection, risk for a number of complex common diseases, and psychosocial stress. The objective of this study was to compare the contributions of genetic ancestry, socioenvironmental factors, and inflammation-related health conditions to ethnic differences in C-reactive protein levels. We used multivariable regression to compare CRP blood serum levels between Black and White ethnic groups from the United Kingdom Biobank (UKBB) prospective cohort study. CRP serum levels are significantly associated with ethnicity in an age and sex adjusted model. Study participants who identify as Black have higher average CRP than those who identify as White, CRP increases with age, and females have higher average CRP than males. Ethnicity and sex show a significant interaction effect on CRP. Black females have higher average CRP levels than White females, whereas White males have higher average CRP than Black males. Significant associations between CRP, ethnicity, and genetic ancestry are almost completely attenuated in a fully adjusted model that includes socioenvironmental factors and inflammation-related health conditions. BMI, smoking, and socioeconomic deprivation all have high relative effects on CRP. These results indicate that socioenvironmental factors contribute more to CRP ethnic differences than genetics. Differences in CRP are associated with ethnic disparities for a number of chronic diseases, including type 2 diabetes, essential hypertension, sarcoidosis, and lupus erythematosus. Our results indicate that ethnic differences in CRP are linked to both socioenvironmental factors and numerous ethnic health disparities.
Affiliation(s)
- Shashwat Deepali Nagar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States
- Andrew B Conley
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, United States
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, United States
- Shivam Sharma
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, United States
- Lavanya Rishishwar
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, United States
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, United States
- I King Jordan
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, United States
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, United States
  - PanAmerican Bioinformatics Institute, Cali, Colombia
- Leonardo Mariño-Ramírez
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, United States
  - PanAmerican Bioinformatics Institute, Cali, Colombia
7
Chen C, Jin X, Zhang X, Zhang W, Guo Y, Tao R, Chen A, Xu Q, Li M, Yang Y, Zhu B. Comprehensive Insights Into Forensic Features and Genetic Background of Chinese Northwest Hui Group Using Six Distinct Categories of 231 Molecular Markers. Front Genet 2021; 12:705753. [PMID: 34721519 PMCID: PMC8555763 DOI: 10.3389/fgene.2021.705753] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/06/2021] [Accepted: 09/07/2021] [Indexed: 11/13/2022]
Abstract
The Hui minority is predominantly composed of Chinese-speaking Islamic adherents distributed throughout China, of which the individuals are mainly concentrated in Northwest China. In the present study, we employed the length and sequence polymorphisms-based typing system of 231 molecular markers, i.e., amelogenin, 22 phenotypic-informative single nucleotide polymorphisms (PISNPs), 94 identity-informative single nucleotide polymorphisms (IISNPs), 24 Y-chromosomal short tandem repeats (Y-STRs), 56 ancestry-informative single nucleotide polymorphisms (AISNPs), 7 X-chromosomal short tandem repeats (X-STRs), and 27 autosomal short tandem repeats (A-STRs), into 90 unrelated male individuals from the Chinese Northwest Hui group to comprehensively explore its forensic characteristics and genetic background. Total of 451 length-based and 652 sequence-based distinct alleles were identified from 58 short tandem repeats (STRs) in 90 unrelated Northwest Hui individuals, denoting that the sequence-based genetic markers could pronouncedly provide more genetic information than length-based markers. The forensic characteristics and efficiencies of STRs and IISNPs were estimated, both of which externalized high polymorphisms in the Northwest Hui group and could be further utilized in forensic investigations. No significant departure from the Hardy-Weinberg equilibrium (HWE) expectation was observed after the Bonferroni correction. Additionally, four group sets of reference population data were exploited to dissect the genetic background of the Northwest Hui group separately from different perspectives, which contained 26 populations for 93 IISNPs, 58 populations for 17 Y-STRs, 26 populations for 55 AISNPs (raw data), and 109 populations for 55 AISNPs (allele frequencies). 
As a result, the analyses based on the Y-STRs indicated that the Northwest Hui group primarily exhibited intimate genetic relationships with reference Hui groups from Chinese different regions except for the Sichuan Hui group and secondarily displayed close genetic relationships with populations from Central and West Asia, as well as several Chinese groups. However, the AISNP analyses demonstrated that the Northwest Hui group shared more intimate relationships with current East Asian populations apart from reference Hui group, harboring the large proportion of ancestral component contributed by East Asia.
Affiliation(s)
- Chong Chen
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Xiaoye Jin
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China
- Xingru Zhang
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Wenqing Zhang
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China
- Yuxin Guo
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China
- Ruiyang Tao
  - Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Ministry of Justice, Shanghai, China
- Anqi Chen
  - Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Ministry of Justice, Shanghai, China
  - Department of Forensic Medicine, Shanghai Medical College of Fudan University, Shanghai, China
- Qiannan Xu
  - Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Ministry of Justice, Shanghai, China
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu, China
- Min Li
  - Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Ministry of Justice, Shanghai, China
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences and Forensic Medicine, Sichuan University, Chengdu, China
- Yue Yang
  - Shanghai Key Laboratory of Forensic Medicine, Shanghai Forensic Service Platform, Academy of Forensic Sciences, Ministry of Justice, Shanghai, China
  - School of Basic Medicine, Inner Mongolia Medical University, Hohhot, China
- Bofeng Zhu
  - Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi'an Jiaotong University, Xi'an, China
  - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
  - Department of Forensic Genetics, Multi-Omics Innovative Research Center of Forensic Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
8
Nagar SD, Conley AB, Chande AT, Rishishwar L, Sharma S, Mariño-Ramírez L, Aguinaga-Romero G, González-Andrade F, Jordan IK. Genetic ancestry and ethnic identity in Ecuador. HGG Advances 2021; 2:100050. [PMID: 35047841 PMCID: PMC8756502 DOI: 10.1016/j.xhgg.2021.100050] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/19/2021] [Accepted: 08/09/2021] [Indexed: 02/05/2023]
Abstract
We investigated the ancestral origins of four Ecuadorian ethnic groups (Afro-Ecuadorian, Mestizo, Montubio, and the Indigenous Tsáchila) in an effort to gain insight on the relationship between ancestry, culture, and the formation of ethnic identities in Latin America. The observed patterns of genetic ancestry are largely concordant with ethnic identities and historical records of conquest and colonization in Ecuador. Nevertheless, a number of exceptional findings highlight the complex relationship between genetic ancestry and ethnicity in Ecuador. Afro-Ecuadorians show far less African ancestry, and the highest levels of Native American ancestry, seen for any Afro-descendant population in the Americas. Mestizos in Ecuador show high levels of Native American ancestry, with substantially less European ancestry, despite the relatively low Indigenous population in the country. The recently recognized Montubio ethnic group is highly admixed, with substantial contributions from all three continental ancestries. The Tsáchila show two distinct ancestry subgroups, with most individuals showing almost exclusively Native American ancestry and a smaller group showing a Mestizo characteristic pattern. Considered together with historical data and sociological studies, our results indicate the extent to which ancestry and culture interact, often in unexpected ways, to shape ethnic identity in Ecuador.
Affiliation(s)
- Shashwat Deepali Nagar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
- Andrew B Conley
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
- Aroon T Chande
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
- Lavanya Rishishwar
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
- Shivam Sharma
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
- Leonardo Mariño-Ramírez
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
  - National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD, USA
- I King Jordan
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
9
Chande AT, Rishishwar L, Ban D, Nagar SD, Conley AB, Rowell J, Valderrama-Aguirre AE, Medina-Rivas MA, Jordan IK. The Phenotypic Consequences of Genetic Divergence between Admixed Latin American Populations: Antioquia and Chocó, Colombia. Genome Biol Evol 2021; 12:1516-1527. [PMID: 32681795 PMCID: PMC7513793 DOI: 10.1093/gbe/evaa154] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 07/12/2020] [Indexed: 12/11/2022]
Abstract
Genome-wide association studies have uncovered thousands of genetic variants that are associated with a wide variety of human traits. Knowledge of how trait-associated variants are distributed within and between populations can provide insight into the genetic basis of group-specific phenotypic differences, particularly for health-related traits. We analyzed the genetic divergence levels for 1) individual trait-associated variants and 2) collections of variants that function together to encode polygenic traits, between two neighboring populations in Colombia that have distinct demographic profiles: Antioquia (Mestizo) and Chocó (Afro-Colombian). Genetic ancestry analysis showed 62% European, 32% Native American, and 6% African ancestry for Antioquia compared with 76% African, 10% European, and 14% Native American ancestry for Chocó, consistent with demography and previous results. Ancestry differences can confound cross-population comparison of polygenic risk scores (PRS); however, we did not find any systematic bias in PRS distributions for the two populations studied here, and population-specific differences in PRS were, for the most part, small and symmetrically distributed around zero. Both genetic differentiation at individual trait-associated single nucleotide polymorphisms and population-specific PRS differences between Antioquia and Chocó largely reflected anthropometric phenotypic differences that can be readily observed between the populations along with reported disease prevalence differences. Cases where population-specific differences in genetic risk did not align with observed trait (disease) prevalence point to the importance of environmental contributions to phenotypic variance, for both infectious and complex, common disease. The results reported here are distributed via a web-based platform for searching trait-associated variants and PRS divergence levels at http://map.chocogen.com (last accessed August 12, 2020).
Affiliation(s)
- Aroon T Chande
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, Georgia
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
- Lavanya Rishishwar
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, Georgia
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
- Dongjo Ban
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
- Shashwat D Nagar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
- Andrew B Conley
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, Georgia
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
- Jessica Rowell
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
- Augusto E Valderrama-Aguirre
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
  - Biomedical Research Institute (COL0082529), Cali, Colombia
  - Universidad Santiago de Cali, Colombia
- Miguel A Medina-Rivas
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
  - Centro de Investigación en Biodiversidad y Hábitat, Universidad Tecnológica del Chocó, Quibdó, Colombia
- I King Jordan
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, Georgia
  - PanAmerican Bioinformatics Institute, Valle del Cauca, Cali, Colombia
10
Monteiro B, Arenas M, Prata MJ, Amorim A. Evolutionary dynamics of the human pseudoautosomal regions. PLoS Genet 2021; 17:e1009532. [PMID: 33872316 PMCID: PMC8084340 DOI: 10.1371/journal.pgen.1009532] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/17/2020] [Revised: 04/29/2021] [Accepted: 04/06/2021] [Indexed: 01/19/2023] Open
Abstract
Recombination between the X and Y human sex chromosomes is limited to the two pseudoautosomal regions (PARs) that present quite distinct evolutionary origins. Despite the crucial importance for male meiosis, genetic diversity patterns and evolutionary dynamics of these regions are poorly understood. In the present study, we analyzed and compared the genetic diversity of the PAR regions using publicly available genomic sequences encompassing both PAR1 and PAR2. Comparisons were performed through allele diversities, linkage disequilibrium status and recombination frequencies within and between X and Y chromosomes. In agreement with previous studies, we confirmed the role of PAR1 as a male-specific recombination hotspot, but also observed similar characteristic patterns of diversity in both regions although male recombination occurs at PAR2 to a much lower extent (at least one recombination event at PAR1 and in ≈1% in normal male meioses at PAR2). Furthermore, we demonstrate that both PARs harbor significantly different allele frequencies between X and Y chromosomes, which could support that recombination is not sufficient to homogenize the pseudoautosomal gene pool or is counterbalanced by other evolutionary forces. Nevertheless, the observed patterns of diversity are not entirely explainable by sexually antagonistic selection. A better understanding of such processes requires new data from intergenerational transmission studies of PARs, which would be decisive on the elucidation of PARs evolution and their role in male-driven heterosomal aneuploidies.
Affiliation(s)
- Bruno Monteiro
  - Institute of Investigation and Innovation in Health (i3S), University of Porto, Porto, Portugal
  - Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, Porto, Portugal
- Miguel Arenas
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
  - CINBIO (Biomedical Research Centre), University of Vigo, Vigo, Spain
- Maria João Prata
  - Institute of Investigation and Innovation in Health (i3S), University of Porto, Porto, Portugal
  - Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, Porto, Portugal
  - Faculty of Sciences, University of Porto, Porto, Portugal
- António Amorim
  - Institute of Investigation and Innovation in Health (i3S), University of Porto, Porto, Portugal
  - Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, Porto, Portugal
  - Faculty of Sciences, University of Porto, Porto, Portugal
11
Gross JM, Edgar HJH. Geographic and temporal diversity in dental morphology reflects a history of admixture, isolation, and drift in African Americans. American Journal of Physical Anthropology 2021; 175:497-505. [PMID: 33704773 DOI: 10.1002/ajpa.24258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2020] [Revised: 01/28/2021] [Accepted: 02/05/2021] [Indexed: 11/10/2022]
Abstract
OBJECTIVES: While genetic studies have documented variation in admixture proportions in contemporary African Americans across the US, relatively little is known about the socio-historical roots of this variation. Our goal in this study is to use dental morphology to explore the socio-historical correlates of admixture, localized gene flow, and drift in African Americans. METHODS: Our data are ordinally-graded dental morphological traits scored in 196 Africans, 335 Europeans and European Americans, 291 pre-Spanish-contact Native Americans, and 722 African Americans. The African American data derived from contemporary and historic samples. We eliminated from analysis individuals and traits with greater than 20% missing data. We summarized the major axes of trait variation using principal component analysis (PCA), estimated biological distance, constructed multidimensional scaling (MDS) plots of the distances, and measured the correlation between geographic and biological distance. RESULTS: In the PCA, African American groups clustered between Africans and Europeans on PC 1, reflecting admixture between the groups. PC 2 separated African American samples, possibly reflecting movement, isolation, and drift. MDS analyses confirmed the existence of sizable biological distances between African American samples, especially between contemporary and past African American samples. We found no relationship between biological and geographic distances. DISCUSSION: We demonstrate that admixture and drift can be inferred from multi-variable analyses of patterns of dental morphology in admixed populations. Localized gene flow has not affected patterns of trait variation in African Americans, but long-range movement, isolation, and drift have. We connect patterns of dental trait variation to efforts to flee oppression during the Great Migration, and the repeal of anti-miscegenation laws.
Affiliation(s)
- Jessica M Gross
  - Department of Anthropology MSC01-1040, Anthropology 1, University of New Mexico, Albuquerque, New Mexico, USA
- Heather J H Edgar
  - Department of Anthropology MSC01-1040, Anthropology 1, University of New Mexico, Albuquerque, New Mexico, USA
12
Abstract
Throughout human history, large-scale migrations have facilitated the formation of populations with ancestry from multiple previously separated populations. This process leads to subsequent shuffling of genetic ancestry through recombination, producing variation in ancestry between populations, among individuals in a population, and along the genome within an individual. Recent methodological and empirical developments have elucidated the genomic signatures of this admixture process, bringing previously understudied admixed populations to the forefront of population and medical genetics. Under this theme, we present a collection of recent PLOS Genetics publications that exemplify recent progress in human genetic admixture studies, and we discuss potential areas for future work.
Affiliation(s)
- Katharine L. Korunes
  - Department of Evolutionary Anthropology, Duke University, Durham, North Carolina, United States of America
- Amy Goldberg
  - Department of Evolutionary Anthropology, Duke University, Durham, North Carolina, United States of America
13
Spear ML, Diaz-Papkovich A, Ziv E, Yracheta JM, Gravel S, Torgerson DG, Hernandez RD. Recent shifts in the genomic ancestry of Mexican Americans may alter the genetic architecture of biomedical traits. eLife 2020; 9:e56029. [PMID: 33372659 PMCID: PMC7771964 DOI: 10.7554/elife.56029] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/13/2020] [Accepted: 12/13/2020] [Indexed: 11/13/2022] Open
Abstract
People in the Americas represent a diverse continuum of populations with varying degrees of admixture among African, European, and Amerindigenous ancestries. In the United States, populations with non-European ancestry remain understudied, and thus little is known about the genetic architecture of phenotypic variation in these populations. Using genotype data from the Hispanic Community Health Study/Study of Latinos, we find that Amerindigenous ancestry increased by an average of ~20% spanning 1940s-1990s in Mexican Americans. These patterns result from complex interactions between several population and cultural factors which shaped patterns of genetic variation and influenced the genetic architecture of complex traits in Mexican Americans. We show for height how polygenic risk scores based on summary statistics from a European-based genome-wide association study perform poorly in Mexican Americans. Our findings reveal temporal changes in population structure within Hispanics/Latinos that may influence biomedical traits, demonstrating a need to improve our understanding of admixed populations.
Affiliation(s)
- Melissa L Spear
  - Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, United States
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
- Alex Diaz-Papkovich
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Quantitative Life Sciences Program, McGill University, Montreal, Canada
- Elad Ziv
  - Division of General Internal Medicine, University of California, San Francisco, San Francisco, United States
  - Department of Medicine, University of California, San Francisco, San Francisco, United States
  - Institute of Human Genetics, University of California, San Francisco, San Francisco, United States
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, United States
- Joseph M Yracheta
  - Native BioData Consortium, Eagle Butte, United States
  - Bloomberg School of Public Health, Johns Hopkins University, Baltimore, United States
- Simon Gravel
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
- Dara G Torgerson
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, United States
- Ryan D Hernandez
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
  - Institute of Human Genetics, University of California, San Francisco, San Francisco, United States
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, United States
  - Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, United States
14
Mario-Vásquez JE, Naranjo-González CA, Montiel J, Zuluaga LM, Vásquez AM, Tobón-Castaño A, Bedoya G, Segura C. Association of variants in IL1B, TLR9, TREM1, IL10RA, and CD3G and Native American ancestry on malaria susceptibility in Colombian populations. Infection, Genetics and Evolution 2020; 87:104675. [PMID: 33316430 DOI: 10.1016/j.meegid.2020.104675] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/18/2020] [Revised: 11/19/2020] [Accepted: 12/09/2020] [Indexed: 12/24/2022]
Abstract
Host genetics is an influencing factor in the manifestation of infectious diseases. In this study, the association of mild malaria with 28 variants in 16 genes previously reported in other populations and/or close to ancestry-informative markers (AIMs) selected was evaluated in an admixed 736 Colombian population sample. Additionally, the effect of genetic ancestry on phenotype expression was explored. For this purpose, the ancestral genetic composition of Turbo and El Bagre was determined. A higher Native American ancestry trend was found in the population with lower malaria susceptibility [odds ratio (OR) = 0.416, 95% confidence interval (95% CI) = 0.234-0.740, P = 0.003]. Three AIMs presented significant associations with the disease phenotype (MID1752, MID921, and MID1586). The first two were associated with greater malaria susceptibility (D/D, OR = 2.23, 95% CI = 1.06-4.69, P = 0.032 and I/D-I/I, OR = 2.14, 95% CI = 1.18-3.87, P = 0.011, respectively), and the latter has a protective effect on the appearance of malaria (I/I, OR = 0.18, 95% CI = 0.08-0.40, P < 0.0001). After adjustment by age, sex, municipality, and genetic ancestry, genotype association analysis showed evidence of association with malaria susceptibility for variants in or near IL1B, TLR9, TREM1, IL10RA, and CD3G genes: rs1143629-IL1B (G/A-A/A, OR = 0.41, 95% CI = 0.21-0.78, P = 0.0051), rs352139-TLR9 (T/T, OR = 0.28, 95% CI = 0.11-0.72, P = 0.0053), rs352140-TLR9 (C/C, OR = 0.41, 95% CI = 0.20-0.87, P = 0.019), rs2234237-TREM1 (T/A-A/A, OR = 0.43, 95% CI = 0.23-0.79, P = 0.0056), rs4252246-IL10RA (C/A-A/A, OR = 2.11, 95% CI = 1.18-3.75, P = 0.01), and rs1561966-CD3G (A/A, OR = 0.20, 95% CI = 0.06-0.69, P = 0.0058). The results showed the participation of genes involved in immunological processes and suggested an effect of ancestral genetic composition over the traits analyzed. 
Compared to the paisa population (Antioquia), Turbo and El Bagre showed a strong decrease in European ancestry and an increase in African and Native American ancestries. Also, a novel association of two single nucleotide polymorphisms with malaria susceptibility was identified in this study.
Affiliation(s)
- Jorge Eliécer Mario-Vásquez
  - Grupo Genética Molecular (GENMOL), Universidad de Antioquia, Carrera 53 No. 61-30, Lab 430, Medellín, Colombia
- Jehidys Montiel
  - Grupo Malaria-Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Lab 610, Medellín, Colombia
- Lina M Zuluaga
  - Grupo Malaria-Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Lab 610, Medellín, Colombia
- Ana M Vásquez
  - Grupo Malaria-Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Lab 610, Medellín, Colombia
- Alberto Tobón-Castaño
  - Grupo Malaria-Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Lab 610, Medellín, Colombia
- Gabriel Bedoya
  - Grupo Genética Molecular (GENMOL), Universidad de Antioquia, Carrera 53 No. 61-30, Lab 430, Medellín, Colombia
- Cesar Segura
  - Grupo Malaria-Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Lab 610, Medellín, Colombia
15
Nagar SD, Conley AB, Jordan IK. Population structure and pharmacogenomic risk stratification in the United States. BMC Biol 2020; 18:140. [PMID: 33050895 PMCID: PMC7557099 DOI: 10.1186/s12915-020-00875-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/17/2020] [Accepted: 09/22/2020] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: Pharmacogenomic (PGx) variants mediate how individuals respond to medication, and response differences among racial/ethnic groups have been attributed to patterns of PGx diversity. We hypothesized that genetic ancestry (GA) would provide higher resolution for stratifying PGx risk, since it serves as a more reliable surrogate for genetic diversity than self-identified race/ethnicity (SIRE), which includes a substantial social component. We analyzed a cohort of 8628 individuals from the United States (US), for whom we had both SIRE information and whole genome genotypes, with a focus on the three largest SIRE groups in the US: White, Black (African-American), and Hispanic (Latino). Our approach to the question of PGx risk stratification entailed the integration of two distinct methodologies: population genetics and evidence-based medicine. This integrated approach allowed us to consider the clinical implications for the observed patterns of PGx variation found within and between population groups. RESULTS: Whole genome genotypes were used to characterize individuals' continental ancestry fractions (European, African, and Native American), and individuals were grouped according to their GA profiles. SIRE and GA groups were found to be highly concordant. Continental ancestry predicts individuals' SIRE with >96% accuracy, and accordingly, GA provides only a marginal increase in resolution for PGx risk stratification. In light of the concordance between SIRE and GA, taken together with the fact that information on SIRE is readily available to clinicians, we evaluated PGx variation between SIRE groups to explore the potential clinical utility of race and ethnicity. PGx variants are highly diverged compared to the genomic background; 82 variants show significant frequency differences among SIRE groups, and genome-wide patterns of PGx variation are almost entirely concordant with SIRE. The vast majority of PGx variation is found within rather than between groups, a well-established fact for almost all genetic variants, which is often taken to argue against the clinical utility of population stratification. Nevertheless, analysis of highly differentiated PGx variants illustrates how SIRE partitions PGx variation based on groups' characteristic ancestry patterns. These cases underscore the extent to which SIRE carries clinically valuable information for stratifying PGx risk among populations, albeit with less utility for predicting individual-level PGx alleles (genotypes), supporting the concept of population pharmacogenomics. CONCLUSIONS: Perhaps most interestingly, we show that individuals who identify as Black or Hispanic stand to gain far more from the consideration of race/ethnicity in treatment decisions than individuals from the majority White population.
Affiliation(s)
- Shashwat Deepali Nagar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA USA
  - PanAmerican Bioinformatics Institute, Cali, Colombia
- Andrew B. Conley
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA USA
  - PanAmerican Bioinformatics Institute, Cali, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, 950 Atlantic Drive, Atlanta, GA 30332 USA
- I. King Jordan
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA USA
  - PanAmerican Bioinformatics Institute, Cali, Colombia
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, 950 Atlantic Drive, Atlanta, GA 30332 USA
16
Norris ET, Rishishwar L, Chande AT, Conley AB, Ye K, Valderrama-Aguirre A, Jordan IK. Admixture-enabled selection for rapid adaptive evolution in the Americas. Genome Biol 2020; 21:29. [PMID: 32028992 PMCID: PMC7006128 DOI: 10.1186/s13059-020-1946-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 09/25/2019] [Accepted: 01/24/2020] [Indexed: 02/08/2023] Open
Abstract
Background: Admixture occurs when previously isolated populations come together and exchange genetic material. We hypothesize that admixture can enable rapid adaptive evolution in human populations by introducing novel genetic variants (haplotypes) at intermediate frequencies, and we test this hypothesis through the analysis of whole genome sequences sampled from admixed Latin American populations in Colombia, Mexico, Peru, and Puerto Rico. Results: Our screen for admixture-enabled selection relies on the identification of loci that contain more or less ancestry from a given source population than would be expected given the genome-wide ancestry frequencies. We employ a combined evidence approach to evaluate levels of ancestry enrichment at single loci across multiple populations and multiple loci that function together to encode polygenic traits. We find cross-population signals of African ancestry enrichment at the major histocompatibility locus on chromosome 6, consistent with admixture-enabled selection for enhanced adaptive immune response. Several of the human leukocyte antigen genes at this locus, such as HLA-A, HLA-DRB51, and HLA-DRB5, show independent evidence of positive selection prior to admixture, based on extended haplotype homozygosity in African populations. A number of traits related to inflammation, blood metabolites, and both the innate and adaptive immune system show evidence of admixture-enabled polygenic selection in Latin American populations. Conclusions: The results reported here, considered together with the ubiquity of admixture in human evolution, suggest that admixture serves as a fundamental mechanism that drives rapid adaptive evolution in human populations.
Affiliation(s)
- Emily T Norris
  - School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, GA, 30332, USA
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
- Lavanya Rishishwar
  - School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, GA, 30332, USA
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
- Aroon T Chande
  - School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, GA, 30332, USA
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
- Andrew B Conley
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
- Kaixiong Ye
  - Department of Genetics, University of Georgia, Athens, GA, USA
  - Institute of Bioinformatics, University of Georgia, Athens, GA, USA
- Augusto Valderrama-Aguirre
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
  - Biomedical Research Institute (COL0082529), Cali, Colombia
  - Universidad Santiago de Cali, Cali, Colombia
- I King Jordan
  - School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, GA, 30332, USA
  - IHRC-Georgia Tech Applied Bioinformatics Laboratory, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Valle del Cauca, Colombia
17
UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts. PLoS Genet 2019; 15:e1008432. [PMID: 31675358 PMCID: PMC6853336 DOI: 10.1371/journal.pgen.1008432] [Citation(s) in RCA: 106] [Impact Index Per Article: 21.2] [Received: 07/15/2019] [Revised: 11/13/2019] [Accepted: 09/17/2019] [Indexed: 11/19/2022] Open
Abstract
Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets.
The demographic history of human populations features varying geographic and social barriers to mating. Over time, these barriers have led to varying levels of genetic relatedness among individuals. This population structure is informative about human history, and can have a significant impact on studies of medical genetics. Because population structure depends on myriad demographic, ecological, and social forces, a priori visualization is useful to identify subtle patterns of population structure. We use a dimension reduction method, UMAP, to visualize population structure in three genomic datasets and find previously unobserved patterns, revealing fine-scale population structure and illustrating differences between groups in traits such as white blood cell count, height, and FEV1, a measure of lung function. Using UMAP is computationally efficient and can identify fine-scale population structure in large population datasets. We find it particularly useful to reveal phenotypic variation among genetically related populations, and recommend it as a complement to principal component analysis in primary data visualization.