1
Haddad Y. The value-ladenness of ancestry. Studies in History and Philosophy of Science 2025; 112:23-32. [PMID: 40516384 DOI: 10.1016/j.shpsa.2025.06.001] [Received: 10/26/2023] [Revised: 10/15/2024] [Accepted: 06/04/2025] [Indexed: 06/16/2025]
Abstract
Clustering humans based on their genetic ancestry is a common practice in human genomics. Genetically similar populations can be seen as statistical constructs that are labeled by population descriptors such as "race," "ethnicity," and "genetic ancestry." Recently, there has been a shift towards replacing the descriptor "race" with "genetic ancestry" because the latter is considered more objective. A descriptor is deemed objective if it adequately captures an underlying feature of the biological world, such as genetic similarities or differences between human sub-populations. However, claims of objectivity do not sufficiently explain the rationale for the choice and use of population descriptors such as "ancestry." This paper proposes an axiological approach to capture the choice and use of population descriptors in human genomics, by showing that the population descriptor "ancestry" is value-laden and that there is a legitimate role for values in the choice and use of population descriptors in genomics.
Affiliation(s)
- Yasmin Haddad
  - Department of Philosophy, Université du Québec à Montréal, Pavillon Thérèse-Casgrain, 5e étage, 455 Boulevard René-Lévesque Est, Local W-5350, H2L 4Y2, Canada
2
Patel RA, Weiß CL, Zhu H, Mostafavi H, Simons YB, Spence JP, Pritchard JK. Characterizing selection on complex traits through conditional frequency spectra. Genetics 2025; 229:iyae210. [PMID: 39691067 PMCID: PMC12005249 DOI: 10.1093/genetics/iyae210] [Received: 09/24/2024] [Revised: 11/18/2024] [Accepted: 12/03/2024] [Indexed: 12/19/2024]
Abstract
Natural selection on complex traits is difficult to study in part due to the ascertainment inherent to genome-wide association studies (GWAS). The power to detect a trait-associated variant in GWAS is a function of its frequency and effect size - but for traits under selection, the effect size of a variant determines the strength of selection against it, constraining its frequency. Recognizing the biases inherent to GWAS ascertainment, we propose studying the joint distribution of allele frequencies across populations, conditional on the frequencies in the GWAS cohort. Before considering these conditional frequency spectra, we first characterized the impact of selection and non-equilibrium demography on allele frequency dynamics forwards and backwards in time. We then used these results to understand conditional frequency spectra under realistic human demography. Finally, we investigated empirical conditional frequency spectra for GWAS variants associated with 106 complex traits, finding compelling evidence for either stabilizing or purifying selection. Our results provide insights into polygenic score portability and other properties of variants ascertained with GWAS, highlighting the utility of conditional frequency spectra.
Affiliation(s)
- Roshni A Patel
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Clemens L Weiß
  - Stanford Cancer Institute Core, Stanford University, Stanford, CA 94305, USA
- Huisheng Zhu
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
- Hakhamanesh Mostafavi
  - Center for Human Genetics and Genomics, New York University School of Medicine, New York, NY 10016, USA
  - Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, NY 10016, USA
- Yuval B Simons
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Jeffrey P Spence
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Jonathan K Pritchard
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
3
Nguyen AK, Schall PZ, Kidd JM. A map of canine sequence variation relative to a Greenland wolf outgroup. Mamm Genome 2024; 35:565-576. [PMID: 39088040 DOI: 10.1007/s00335-024-10056-1] [Received: 06/03/2024] [Accepted: 07/25/2024] [Indexed: 08/02/2024]
Abstract
For over 15 years, canine genetics research relied on a reference assembly from a Boxer breed dog named Tasha (i.e., canFam3.1). Recent advances in long-read sequencing and genome assembly have led to the development of numerous high-quality assemblies from diverse canines. These assemblies represent notable improvements in completeness, contiguity, and the representation of gene promoters and gene models. Although genome graph and pan-genome approaches have promise, most genetic analyses in canines rely upon the mapping of Illumina sequencing reads to a single reference. The Dog10K consortium, and others, have generated deep catalogs of genetic variation through an alignment of Illumina sequencing reads to a reference genome obtained from a German Shepherd Dog named Mischka (i.e., canFam4, UU_Cfam_GSD_1.0). However, alignment to a breed-derived genome may introduce bias in genotype calling across samples. Since the use of an outgroup reference genome may remove this effect, we have reprocessed 1929 samples analyzed by the Dog10K consortium using a Greenland wolf (mCanLor1.2) as the reference. We efficiently performed remapping and variant calling using a GPU implementation of common analysis tools. The resulting call set removes the variability in genetic differences seen across samples, and breed relationships revealed by principal component analysis are not affected by the choice of reference genome. Using this sequence data, we inferred the history of population sizes and found that village dog populations experienced a 9-13 fold reduction in historic effective population size relative to wolves.
Affiliation(s)
- Anthony K Nguyen
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Peter Z Schall
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Jeffrey M Kidd
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
4
Funk MW, Kidd JM. A Variant-Centric Analysis of Allele Sharing in Dogs and Wolves. Genes (Basel) 2024; 15:1168. [PMID: 39336759 PMCID: PMC11431226 DOI: 10.3390/genes15091168] [Received: 08/08/2024] [Revised: 08/28/2024] [Accepted: 08/30/2024] [Indexed: 09/30/2024]
Abstract
Canines are an important model system for genetics and evolution. Recent advances in sequencing technologies have enabled the creation of large databases of genetic variation in canines, but analyses of allele sharing among canine groups have been limited. We applied GeoVar, an approach originally developed to study the sharing of single nucleotide polymorphisms across human populations, to assess the sharing of genetic variation among groups of wolves, village dogs, and breed dogs. Our analysis shows that wolves differ from each other at an average of approximately 2.3 million sites while dogs from the same breed differ at nearly 1 million sites. We found that 22% of the variants are common across wolves, village dogs, and breed dogs, that ~16% of variable sites are common across breed dogs, and that nearly half of the differences between two dogs of different breeds are due to sites that are common in all clades. These analyses represent a succinct summary of allele sharing across canines and illustrate the effects of canine history on the apportionment of genetic variation.
Affiliation(s)
- Matthew W. Funk
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Jeffrey M. Kidd
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
5
Patel RA, Weiß CL, Zhu H, Mostafavi H, Simons YB, Spence JP, Pritchard JK. Conditional frequency spectra as a tool for studying selection on complex traits in biobanks. bioRxiv: the preprint server for biology 2024:2024.06.15.599126. [PMID: 38948697 PMCID: PMC11212903 DOI: 10.1101/2024.06.15.599126] [Indexed: 07/02/2024]
Abstract
Natural selection on complex traits is difficult to study in part due to the ascertainment inherent to genome-wide association studies (GWAS). The power to detect a trait-associated variant in GWAS is a function of frequency and effect size - but for traits under selection, the effect size of a variant determines the strength of selection against it, constraining its frequency. To account for GWAS ascertainment, we propose studying the joint distribution of allele frequencies across populations, conditional on the frequencies in the GWAS cohort. Before considering these conditional frequency spectra, we first characterized the impact of selection and non-equilibrium demography on allele frequency dynamics forwards and backwards in time. We then used these results to understand conditional frequency spectra under realistic human demography. Finally, we investigated empirical conditional frequency spectra for GWAS variants associated with 106 complex traits, finding compelling evidence for either stabilizing or purifying selection. Our results provide insight into polygenic score portability and other properties of variants ascertained with GWAS, highlighting the utility of conditional frequency spectra.
Affiliation(s)
- Roshni A. Patel
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Clemens L. Weiß
  - Stanford Cancer Institute Core, Stanford University School of Medicine, Stanford, CA
- Huisheng Zhu
  - Department of Biology, Stanford University, Stanford, CA
- Hakhamanesh Mostafavi
  - Center for Human Genetics and Genomics, New York University School of Medicine, New York, NY
  - Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, NY
- Jeffrey P. Spence
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Jonathan K. Pritchard
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
  - Department of Biology, Stanford University, Stanford, CA
6
Du N, Wang X, Wang Z, Liu H, Liu H, Duan H, Zhao S, Banerjee S, Zhang X. Identification of a Novel Homozygous Mutation in MTMR2 Gene Causes Very Rare Charcot-Marie-Tooth Disease Type 4B1. Appl Clin Genet 2024; 17:71-84. [PMID: 38835974 PMCID: PMC11149649 DOI: 10.2147/tacg.s448084] [Received: 11/26/2023] [Accepted: 05/01/2024] [Indexed: 06/06/2024]
Abstract
Background: Charcot-Marie-Tooth disease (CMT) is a heterogeneous group of disorders involving the peripheral nervous system. Charcot-Marie-Tooth disease 4B1 (CMT4B1) is a rare subtype of CMT. CMT4B1 is an axonal demyelinating polyneuropathy with an autosomal recessive mode of inheritance. Patients with CMT4B1 usually manifest with dysfunction of the motor and sensory systems, which leads to gradual and progressive muscular weakness and atrophy, starting from the peroneal muscles and finally affecting the distal muscles. Germline mutations in the MTMR2 gene cause CMT4B1. Material and Methods: In this study, we investigated a 4-year-old Chinese boy with gradual and progressive weakness and atrophy of both proximal and distal muscles. The proband's parents did not show any abnormalities. Whole-exome sequencing and Sanger sequencing were performed. Results: Whole-exome sequencing identified a novel homozygous nonsense mutation (c.118A>T; p.Lys40*) in exon 2 of the MTMR2 gene in the proband. This novel mutation leads to the formation of a truncated MTMR2 protein of 39 amino acids instead of the wild-type MTMR2 protein of 643 amino acids. This mutation is predicted to cause the complete loss of the PH-GRAM domain, phosphatase domain, coiled-coil domain, and PDZ-binding motif of the MTMR2 protein. Sanger sequencing revealed that the proband's parents carried the mutation in a heterozygous state. This mutation was absent in 100 healthy control individuals. Conclusion: This study reports the first mutation in MTMR2 associated with CMT4B1 in a Chinese population. Our study also showed the importance of whole-exome sequencing in identifying candidate genes and disease-causing variants in patients with CMT4B1.
Affiliation(s)
- Nan Du
  - Department of Medical Genetics, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
- Xiaolei Wang
  - Department of Medical Genetics, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
- Zhaohui Wang
  - Center for Children Health Care, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
- Hongwei Liu
  - Department of Medical Genetics, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
- Hui Liu
  - Department of Medical Genetics, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
- Hongfang Duan
  - Department of Medical Genetics, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
- Shaozhi Zhao
  - Department of Medical Genetics, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
- Santasree Banerjee
  - Department of Genetics, College of Basic Medical Sciences, Jilin University, Changchun, Jilin, 130021, People's Republic of China
- Xinwen Zhang
  - Department of Medical Genetics, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, Shaanxi, 710004, People's Republic of China
7
Passmore S, Wood ALC, Barbieri C, Shilton D, Daikoku H, Atkinson QD, Savage PE. Global musical diversity is largely independent of linguistic and genetic histories. Nat Commun 2024; 15:3964. [PMID: 38729968 PMCID: PMC11087526 DOI: 10.1038/s41467-024-48113-7] [Received: 06/28/2023] [Accepted: 04/19/2024] [Indexed: 05/12/2024]
Abstract
Music is a universal yet diverse cultural trait transmitted between generations. The extent to which global musical diversity traces cultural and demographic history, however, is unresolved. Using a global musical dataset of 5242 songs from 719 societies, we identify five axes of musical diversity and show that music contains geographical and historical structures analogous to linguistic and genetic diversity. After creating a matched dataset of musical, genetic, and linguistic data spanning 121 societies containing 981 songs, 1296 individual genetic profiles, and 121 languages, we show that global musical similarities are only weakly and inconsistently related to linguistic or genetic histories, with some regional exceptions such as within Southeast Asia and sub-Saharan Africa. Our results suggest that global musical traditions are largely distinct from some non-musical aspects of human history.
Affiliation(s)
- Sam Passmore
  - Graduate School of Media and Governance, Keio University, Fujisawa, Japan
  - Evolution of Cultural Diversity Initiative (ECDI), Australian National University, Canberra, Australia
- Chiara Barbieri
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, 8057, Switzerland
  - Centre for the Interdisciplinary Study of Language Evolution (ISLE), University of Zurich, Zurich, 8050, Switzerland
  - Department of Life and Environmental Sciences, University of Cagliari, 09126, Cagliari, Italy
- Dor Shilton
  - Cohn Institute for the History and Philosophy of Science and Ideas, Tel Aviv University, Tel Aviv, Israel
  - Edelstein Centre for the History and Philosophy of Science, Technology, and Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Hideo Daikoku
  - Graduate School of Media and Governance, Keio University, Fujisawa, Japan
- Patrick E Savage
  - School of Psychology, University of Auckland, Auckland, New Zealand
  - Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan
8
Marks-Anglin AK, Barg FK, Ross M, Wiebe DJ, Hwang WT. Survival analysis under imperfect record linkage using historic census data. BMC Med Res Methodol 2024; 24:67. [PMID: 38481152 PMCID: PMC10935812 DOI: 10.1186/s12874-024-02194-6] [Received: 06/27/2023] [Accepted: 03/01/2024] [Indexed: 03/17/2024]
Abstract
BACKGROUND Advancements in linking publicly available census records with vital and administrative records have enabled novel investigations in epidemiology and social history. However, in the absence of unique identifiers, the linkage of the records may be uncertain or only be successful for a subset of the census cohort, resulting in missing data. For survival analysis, differential ascertainment of event times can impact inference on risk associations and median survival. METHODS We modify some existing approaches that are commonly used to handle missing survival times to accommodate this imperfect linkage situation including complete case analysis, censoring, weighting, and several multiple imputation methods. We then conduct simulation studies to compare the performance of the proposed approaches in estimating the associations of a risk factor or exposure in terms of hazard ratio (HR) and median survival times in the presence of missing survival times. The effects of different missing data mechanisms and exposure-survival associations on their performance are also explored. The approaches are applied to a historic cohort of residents in Ambler, PA, established using the 1930 US census, from which only 2,440 out of 4,514 individuals (54%) had death records retrievable from publicly available data sources and death certificates. Using this cohort, we examine the effects of occupational and paraoccupational asbestos exposure on survival and disparities in mortality by race and gender. RESULTS We show that imputation based on conditional survival results in less bias and greater efficiency relative to a complete case analysis when estimating log-hazard ratios and median survival times. When the approaches are applied to the Ambler cohort, we find a significant association between occupational exposure and mortality, particularly among black individuals and males, but not between paraoccupational exposure and mortality. 
DISCUSSION This investigation illustrates the strengths and weaknesses of different imputation methods for missing survival times due to imperfect linkage of the administrative or registry data. The performance of the methods may depend on the missingness process as well as the parameter being estimated and models of interest, and such factors should be considered when choosing the methods to address the missing event times.
Affiliation(s)
- Arielle K Marks-Anglin
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Frances K Barg
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Family Medicine and Community Health, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Michelle Ross
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Douglas J Wiebe
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Wei-Ting Hwang
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - 423 Guardian Drive, Blockley Hall Room 610, Philadelphia, PA, 19064, USA
9
Mao B, Yang J, Zhao X, Jia X, Shi X, Zhao L, Banerjee S, Zhang L, Ma X. Identification and functional characterization of a novel heterozygous splice‑site mutation in the calpain 3 gene causes rare autosomal dominant limb‑girdle muscular dystrophy. Exp Ther Med 2024; 27:97. [PMID: 38356676 PMCID: PMC10865457 DOI: 10.3892/etm.2024.12385] [Received: 08/30/2022] [Accepted: 11/03/2023] [Indexed: 02/16/2024]
Abstract
Limb-girdle muscular dystrophies are a group of extremely heterogenous neuromuscular disorders that manifest with gradual and progressive weakness of both proximal and distal muscles. Autosomal dominant limb-girdle muscular dystrophy (LGMDD4) or calpainopathy is a very rare form of myopathy characterized by weakness and atrophy of both proximal and distal muscles with a variable age of onset. LGMDD4 is caused by germline heterozygous mutations of the calpain 3 (CAPN3) gene. Patients with LGMDD4 often show extreme phenotypic heterogeneity; however, most patients present with gait difficulties, increased levels of serum creatine kinase, myalgia and back pain. In the present study, a 16-year-old male patient, clinically diagnosed with LGMDD4, was investigated. The proband had been suffering from weakness and atrophy of both of their proximal and distal muscles, and had difficulty walking and standing independently. The serum creatine kinase levels (4,754 IU/l; normal, 35-232 IU/l) of the patient were markedly elevated. The younger sister and mother of the proband were also clinically diagnosed with LGMDD4, while the father was phenotypically normal. Whole exome sequencing identified a heterozygous novel splice-site (c.2440-1G>A) mutation in intron 23 of the CAPN3 gene in the proband. Sanger sequencing confirmed that this mutation was also present in both the younger sister and mother of the proband, but the father was not a carrier of this mutation. This splice-site (c.2440-1G>A) mutation causes aberrant splicing of CAPN3 mRNA, leading to the skipping of the last exon (exon 24) of CAPN3 mRNA and resulting in the removal of eight amino acids from the C-terminal of domain IV of the CAPN3 protein. Hence, this splice site mutation causes the formation of a truncated CAPN3 protein (p.Trp814*) of 813 amino acids instead of the wild-type CAPN3 protein that consists of 821 amino acids. 
This mutation causes partial loss of domain IV (PEF domain) in the CAPN3 protein, which is involved in calcium binding and homodimerization; therefore, this is a loss-of-function mutation. Relative expression of the mutated CAPN3 mRNA was reduced in comparison with the wild-type CAPN3 mRNA in the proband, and their younger sister and mother. This mutation was also not present in 100 normal healthy control individuals of the same ethnicity. The present study reported the first case of CAPN3 gene-associated LGMDD4 in the Chinese population.
Affiliation(s)
- Bin Mao
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
- Jie Yang
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
- Xiaodong Zhao
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
- Xueling Jia
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
- Xin Shi
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
- Lihui Zhao
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
- Santasree Banerjee
  - Department of Genetics, College of Basic Medical Sciences, Jilin University, Changchun, Jilin 130021, P.R. China
- Lili Zhang
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
- Xiaoling Ma
  - The Reproductive Medicine Centre, The First Hospital of Lanzhou University, Lanzhou, Gansu 730000, P.R. China
10
Koganebuchi K, Matsunami M, Imamura M, Kawai Y, Hitomi Y, Tokunaga K, Maeda S, Ishida H, Kimura R. Demographic history of Ryukyu islanders at the southern part of the Japanese Archipelago inferred from whole-genome resequencing data. J Hum Genet 2023; 68:759-767. [PMID: 37468573 PMCID: PMC10597838 DOI: 10.1038/s10038-023-01180-y] [Received: 03/10/2023] [Revised: 05/29/2023] [Accepted: 06/17/2023] [Indexed: 07/21/2023]
Abstract
The Ryukyu Islands are located in the southernmost part of the Japanese Archipelago and consist of several island groups. Each island group has its own history and culture, which differ from those of mainland Japan. People of the Ryukyu Islands are genetically subdivided; however, their detailed demographic history remains unclear. We report the results of a whole-genome sequencing analysis of a total of 50 Ryukyu islanders, focusing on genetic differentiation between Miyako and Okinawa islanders. We confirmed that Miyako and Okinawa islanders cluster differently in principal component analysis and ADMIXTURE analysis and that there is a population structure among Miyako islanders. The present study supports the hypothesis that population differentiation is primarily caused by genetic drift rather than by differences in the rate of migration from surrounding regions, such as the Japanese main islands or Taiwan. In addition, the genetic cline observed among Miyako and Okinawa islanders can be explained by recurrent migration beyond the bounds of these islands. Our analysis also suggested that the presence of multiple subpopulations during the Neolithic Ryukyu Jomon period is not crucial to explain the modern Ryukyu populations. However, the assumption of multiple subpopulations during the time of admixture with mainland Japanese is necessary to explain the modern Ryukyu populations. Our findings add insights that could help clarify the complex history of populations in the Ryukyu Islands.
Affiliation(s)
- Kae Koganebuchi
  - Advanced Medical Research Center, Faculty of Medicine, University of the Ryukyus, Nishihara, 903-0215, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, 113-0033, Japan
- Masatoshi Matsunami
  - Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara, 903-0215, Japan
- Minako Imamura
  - Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara, 903-0215, Japan
  - Division of Clinical Laboratory and Blood Transfusion, University of the Ryukyus Hospital, Nishihara, 903-0215, Japan
- Yosuke Kawai
  - Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo, 162-8655, Japan
- Yuki Hitomi
  - Department of Microbiology, Hoshi University School of Pharmacy and Pharmaceutical Sciences, Tokyo, 142-8501, Japan
- Katsushi Tokunaga
  - Genome Medical Science Project, National Center for Global Health and Medicine, Tokyo, 162-8655, Japan
- Shiro Maeda
  - Department of Advanced Genomic and Laboratory Medicine, Graduate School of Medicine, University of the Ryukyus, Nishihara, 903-0215, Japan
  - Division of Clinical Laboratory and Blood Transfusion, University of the Ryukyus Hospital, Nishihara, 903-0215, Japan
- Hajime Ishida
  - Department of Human Biology and Anatomy, Graduate School of Medicine, University of the Ryukyus, Nishihara, 903-0215, Japan
  - Mt. Olive Hospital, Naha, 903-0804, Japan
- Ryosuke Kimura
  - Department of Human Biology and Anatomy, Graduate School of Medicine, University of the Ryukyus, Nishihara, 903-0215, Japan
11
Flegontov P, Işıldak U, Maier R, Yüncü E, Changmai P, Reich D. Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes. PLoS Genet 2023; 19:e1010931. [PMID: 37676865 PMCID: PMC10508636 DOI: 10.1371/journal.pgen.1010931] [Received: 01/22/2023] [Revised: 09/19/2023] [Accepted: 08/21/2023] [Indexed: 09/09/2023]
Abstract
f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. Not only are they guaranteed to allow robust tests of the fits of proposed models of population history to data when analyzing full genome sequencing data (that is, all single nucleotide polymorphisms (SNPs) in the individuals being analyzed), but they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is an outgroup in a phylogenetic sense to all groups being analyzed. True "outgroup ascertainment" is in practice impossible in humans because our species has arisen from a substructured ancestral population that does not descend from a homogeneous ancestral population going back many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup-ascertainment schemes might produce robust enough results using f-statistics, and that motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the "Affymetrix Human Origins array" which has been genotyped on thousands of modern individuals from hundreds of populations, or the "1240k" in-solution enrichment reagent which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that while analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup, when co-modeling more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans), fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories, and failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases. However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
Affiliation(s)
- Pavel Flegontov
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
  - Kalmyk Research Center of the Russian Academy of Sciences, Elista, Russia
- Ulaş Işıldak
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- Robert Maier
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Eren Yüncü
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- Piya Changmai
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- David Reich
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America

12
Arca M, Gouesnard B, Mary-Huard T, Le Paslier MC, Bauland C, Combes V, Madur D, Charcosset A, Nicolas SD. Genotyping of DNA pools identifies untapped landraces and genomic regions to develop next-generation varieties. PLANT BIOTECHNOLOGY JOURNAL 2023; 21:1123-1139. [PMID: 36740649 DOI: 10.1111/pbi.14022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2021] [Accepted: 01/18/2023] [Indexed: 05/27/2023]
Abstract
Landraces, that is, traditional varieties, have a large diversity that is underexploited in modern breeding. A novel DNA pooling strategy was implemented to identify promising landraces and genomic regions to enlarge the genetic diversity of modern varieties. As proof of concept, DNA pools from 156 American and European maize landraces representing 2340 individuals were genotyped with an SNP array to assess their genome-wide diversity. They were compared to elite cultivars produced across the 20th century, represented by 327 inbred lines. Detection of selective footprints between landraces of different geographic origin identified genes involved in environmental adaptation (flowering times, growth) and tolerance to abiotic and biotic stress (drought, cold, salinity). Promising landraces were identified by developing two novel indicators that estimate their contribution to the genome of inbred lines: (i) a modified Rogers' distance standardized by gene diversity and (ii) the assignment of lines to landraces using supervised analysis. This showed that most landraces do not have closely related lines and that only 10 landraces, including famous landraces such as Reid's Yellow Dent, Lancaster Surecrop and Lacaune, accounted for half of the total contribution to inbred lines. Comparison of ancestral lines directly derived from landraces with lines from more advanced breeding cycles showed a decrease in the number of landraces with a large contribution. New inbred lines derived from landraces with limited contributions enriched the haplotype diversity of reference inbred lines more than those derived from landraces with a high contribution. Our approach opens an avenue for the identification of promising landraces for pre-breeding.
Affiliation(s)
- Mariangela Arca
  - INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Université Paris-Saclay, Gif-sur-Yvette, France
- Brigitte Gouesnard
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Tristan Mary-Huard
  - INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Université Paris-Saclay, Gif-sur-Yvette, France
- Cyril Bauland
  - INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Université Paris-Saclay, Gif-sur-Yvette, France
- Valérie Combes
  - INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Université Paris-Saclay, Gif-sur-Yvette, France
- Delphine Madur
  - INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Université Paris-Saclay, Gif-sur-Yvette, France
- Alain Charcosset
  - INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Université Paris-Saclay, Gif-sur-Yvette, France
- Stéphane D Nicolas
  - INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Université Paris-Saclay, Gif-sur-Yvette, France

13
Flegontov P, Işıldak U, Maier R, Yüncü E, Changmai P, Reich D. Modeling of African population history using f-statistics can be highly biased and is not addressed by previously suggested SNP ascertainment schemes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.22.525077. [PMID: 36711923 PMCID: PMC9882349 DOI: 10.1101/2023.01.22.525077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. These statistics can provide strong evidence for either admixture or cladality, which can be robust to substantial rates of errors or missing data. f-statistics are guaranteed to be unbiased under "SNP ascertainment" (analyzing non-randomly chosen subsets of single nucleotide polymorphisms) only if it relies on a population that is an outgroup for all groups analyzed. However, ascertainment on a true outgroup that is not co-analyzed with other populations is often impractical and uncommon in the literature. In this study focused on practical rather than theoretical aspects of SNP ascertainment, we show that many non-outgroup ascertainment schemes lead to false rejection of true demographic histories, as well as to failure to reject incorrect models. But the bias introduced by common ascertainments such as the 1240K panel is mostly limited to situations when more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans) or non-human outgroups are co-modelled, for example, f4-statistics involving one non-African group, two African groups, and one archaic group. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, cannot fix all these problems since for some classes of f-statistics it is not a clean outgroup ascertainment, and in other cases it demonstrates relatively low power to reject incorrect demographic models since it provides a relatively small number of variants common in anatomically modern humans. And due to the paucity of high-coverage archaic genomes, archaic individuals used for ascertainment often act as sole representatives of the respective groups in an analysis, and we show that this approach is highly problematic.
By carrying out large numbers of simulations of diverse demographic histories, we find that bias in inferences based on f-statistics introduced by non-outgroup ascertainment can be minimized if the derived allele frequency spectrum in the population used for ascertainment approaches the spectrum that existed at the root of all groups being co-analyzed. Ascertaining on sites with variants common in a diverse group of African individuals provides a good approximation to such a set of SNPs, addressing the great majority of biases and also retaining high statistical power for studying population history. Such a "pan-African" ascertainment, although not completely problem-free, allows unbiased exploration of demographic models for the widest set of archaic and modern human populations, as compared to the other ascertainment schemes we explored.
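For readers unfamiliar with the statistic named in this abstract: an f4-statistic is the average over SNPs of the product of two allele-frequency differences, f4(A, B; C, D) = mean[(a - b)(c - d)]. The sketch below is a toy illustration on made-up allele frequencies, not the authors' pipeline. The helper names `f4`, `ascertain`, and `take`, and the `min_maf` threshold, are hypothetical; they only show how restricting the SNP set to sites polymorphic in one of the co-analyzed populations can shift the estimate.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Inputs are per-SNP population allele frequencies (equal-length lists).
    """
    if not a or not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("need equal-length, non-empty frequency lists")
    return sum((ai - bi) * (ci - di)
               for ai, bi, ci, di in zip(a, b, c, d)) / len(a)


def ascertain(panel, min_maf=0.05):
    """Toy ascertainment: keep SNP indices polymorphic in `panel`
    (minor allele frequency >= min_maf). `min_maf` is an assumed value."""
    return [i for i, p in enumerate(panel) if min(p, 1 - p) >= min_maf]


def take(freqs, idx):
    """Subset a frequency list to the ascertained SNP indices."""
    return [freqs[i] for i in idx]


# Made-up frequencies for four populations at four SNPs.
a = [0.10, 0.50, 0.90, 0.00]
b = [0.20, 0.40, 0.70, 0.00]
c = [0.30, 0.60, 0.80, 0.10]
d = [0.10, 0.50, 0.90, 0.30]

f4_all = f4(a, b, c, d)                        # all SNPs
kept = ascertain(a)                            # ascertain on population A itself
f4_asc = f4(*(take(x, kept) for x in (a, b, c, d)))
```

In this toy case the fourth SNP is monomorphic in A and is dropped, so the ascertained estimate differs from the all-SNP estimate even though no frequency changed.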
Affiliation(s)
- Pavel Flegontov
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
  - Kalmyk Research Center of the Russian Academy of Sciences, Elista, Russia
- Ulaş Işıldak
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- Robert Maier
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Eren Yüncü
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- Piya Changmai
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- David Reich
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA

14
Koptekin D, Yüncü E, Rodríguez-Varela R, Altınışık NE, Psonis N, Kashuba N, Yorulmaz S, George R, Kazancı DD, Kaptan D, Gürün K, Vural KB, Gemici HC, Vassou D, Daskalaki E, Karamurat C, Lagerholm VK, Erdal ÖD, Kırdök E, Marangoni A, Schachner A, Üstündağ H, Shengelia R, Bitadze L, Elashvili M, Stravopodi E, Özbaşaran M, Duru G, Nafplioti A, Rose CB, Gencer T, Darbyshire G, Gavashelishvili A, Pitskhelauri K, Çevik Ö, Vuruşkan O, Kyparissi-Apostolika N, Büyükkarakaya AM, Oğuzhanoğlu U, Günel S, Tabakaki E, Aliev A, Ibrahimov A, Shadlinski V, Sampson A, Kılınç GM, Atakuman Ç, Stamatakis A, Poulakakis N, Erdal YS, Pavlidis P, Storå J, Özer F, Götherström A, Somel M. Spatial and temporal heterogeneity in human mobility patterns in Holocene Southwest Asia and the East Mediterranean. Curr Biol 2023; 33:41-57.e15. [PMID: 36493775 PMCID: PMC9839366 DOI: 10.1016/j.cub.2022.11.034] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/29/2022] [Revised: 08/13/2022] [Accepted: 11/15/2022] [Indexed: 12/13/2022]
Abstract
We present a spatiotemporal picture of human genetic diversity in Anatolia, Iran, the Levant, the South Caucasus, and the Aegean, a broad region that experienced the earliest Neolithic transition and the emergence of complex hierarchical societies. Combining 35 new ancient shotgun genomes with 382 ancient and 23 present-day published genomes, we found that genetic diversity within each region steadily increased through the Holocene. We further observed that the inferred sources of gene flow shifted in time. In the first half of the Holocene, Southwest Asian and East Mediterranean populations homogenized among themselves. Starting with the Bronze Age, however, regional populations diverged from each other, most likely driven by gene flow from external sources, which we term "the expanding mobility model." Interestingly, this increase in inter-regional divergence can be captured by outgroup-f3-based genetic distances, but not by the commonly used FST statistic, due to the sensitivity of FST, but not outgroup-f3, to within-population diversity. Finally, we report a temporal trend of increasing male bias in admixture events through the Holocene.
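The contrast drawn in this abstract between FST and outgroup-f3 can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration rather than the authors' method: it works on made-up population allele frequencies, uses a Hudson-style ratio for FST without finite-sample corrections, and the function names `fst_ratio` and `outgroup_f3` are hypothetical. Shifting every frequency toward 0.5 raises within-population heterozygosity; outgroup-f3 depends only on frequency differences and stays put, while FST, which is normalized by heterozygosity, drops.

```python
def fst_ratio(p1, p2):
    """Hudson-style FST from allele frequencies (ratio of sums over SNPs).

    Numerator: squared frequency difference.
    Denominator: p1*(1 - p2) + p2*(1 - p1), which grows with heterozygosity.
    Finite-sample corrections are deliberately omitted in this sketch.
    """
    num = sum((x - y) ** 2 for x, y in zip(p1, p2))
    den = sum(x * (1 - y) + y * (1 - x) for x, y in zip(p1, p2))
    return num / den


def outgroup_f3(o, a, b):
    """f3(O; A, B): mean over SNPs of (o - a) * (o - b)."""
    return sum((oi - ai) * (oi - bi) for oi, ai, bi in zip(o, a, b)) / len(o)


# Low-diversity scenario: alleles are rare in all populations (made-up values).
o_low, a_low, b_low = [0.02, 0.05], [0.12, 0.10], [0.22, 0.20]

# High-diversity scenario: same frequency differences, shifted toward 0.5.
shift = 0.3
o_high = [x + shift for x in o_low]
a_high = [x + shift for x in a_low]
b_high = [x + shift for x in b_low]

f3_low, f3_high = outgroup_f3(o_low, a_low, b_low), outgroup_f3(o_high, a_high, b_high)
fst_low, fst_high = fst_ratio(a_low, b_low), fst_ratio(a_high, b_high)
```

Here `f3_low` and `f3_high` agree up to floating-point error, while `fst_low` is markedly larger than `fst_high`, although the frequency differences between A and B are identical in both scenarios.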
Affiliation(s)
- Dilek Koptekin (corresponding author)
  - Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, 06800 Ankara, Turkey
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey
- Eren Yüncü
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey
- Ricardo Rodríguez-Varela
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Archaeology and Classical Studies, Stockholm University, 10691 Stockholm, Sweden
- N. Ezgi Altınışık
  - Human-G Laboratory, Department of Anthropology, Hacettepe University, Beytepe 06800, Ankara, Turkey
- Nikolaos Psonis
  - Ancient DNA Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology – Hellas (FORTH), N. Plastira 100, Vassilika Vouton, GR-70013 Irakleio, Greece
- Natalia Kashuba
  - Department of Archaeology and Ancient History, Archaeology, Uppsala University, Uppsala, Sweden
- Sevgi Yorulmaz
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey
- Robert George
  - Centre for Palaeogenetics, Stockholm, Sweden
  - School of Medicine, University of Notre Dame, Sydney, Australia
- Duygu Deniz Kazancı
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey
  - Human-G Laboratory, Department of Anthropology, Hacettepe University, Beytepe 06800, Ankara, Turkey
- Damla Kaptan
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey
- Kanat Gürün
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey
- Kıvılcım Başak Vural
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey
- Hasan Can Gemici
  - Department of Settlement Archaeology, Middle East Technical University, 06800 Ankara, Turkey
- Despoina Vassou
  - Ancient DNA Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology – Hellas (FORTH), N. Plastira 100, Vassilika Vouton, GR-70013 Irakleio, Greece
- Evangelia Daskalaki
  - Department of Archaeology and Classical Studies, Stockholm University, 10691 Stockholm, Sweden
- Cansu Karamurat
  - Department of Settlement Archaeology, Middle East Technical University, 06800 Ankara, Turkey
- Vendela K. Lagerholm
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Archaeology and Classical Studies, Stockholm University, 10691 Stockholm, Sweden
- Ömür Dilek Erdal
  - Husbio-L Laboratory, Department of Anthropology, Hacettepe University, 06800 Beytepe, Ankara, Turkey
- Emrah Kırdök
  - Department of Biotechnology, Mersin University, 33343 Yenişehir, Mersin, Turkey
- Andreas Schachner
  - Deutsches Archäologisches Institut, Inönü Cad. 10, Gümüşsuyu, 34437 İstanbul, Turkey
- Handan Üstündağ
  - Department of Archaeology, Anadolu University, 26470 Eskişehir, Turkey
- Ramaz Shengelia
  - Department of the History of Medicine and Bioethics, Tbilisi State Medical University, Tbilisi 0162, Georgia
- Liana Bitadze
  - Institute of History and Ethnology, Tbilisi State University, Tbilisi, Georgia
- Mikheil Elashvili
  - Cultural Heritage and Environment Research Center, School of Natural Sciences and Medicine, Ilia State University, Tbilisi, Georgia
- Eleni Stravopodi
  - Ephorate of Palaeoanthropology and Speleology, Ministry of Culture and Sports, 11636 Athens, Greece
- Güneş Duru
  - Mimar Sinan Fine Arts University, 34134 Istanbul, Turkey
- Argyro Nafplioti
  - Ancient DNA Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology – Hellas (FORTH), N. Plastira 100, Vassilika Vouton, GR-70013 Irakleio, Greece
- C. Brian Rose
  - Department of Classical Studies, University of Pennsylvania, Philadelphia, PA, USA
- Tuğba Gencer
  - Department of History of Medicine and Ethics, Cerrahpasa Faculty of Medicine, Istanbul University, Istanbul, Turkey
- Alexander Gavashelishvili
  - Center of Biodiversity Studies, Institute of Ecology, Ilia State University, Cholokashvili Str. 5, Tbilisi 0162, Georgia
- Özlem Çevik
  - Department of Archaeology, Trakya University, Edirne, Turkey
- Osman Vuruşkan
  - Department of Archaeology, Trakya University, Edirne, Turkey
- Ali Metin Büyükkarakaya
  - Department of Anthropology, Hacettepe University, 06800 Beytepe, Ankara, Turkey
  - Human Behavioral Ecology and Archaeometry Laboratory (IDEA Lab), Hacettepe University, Ankara, Turkey
- Umay Oğuzhanoğlu
  - Department of Archaeology, Pamukkale University, Denizli, Turkey
- Sevinç Günel
  - Department of Archaeology, Hacettepe University, 06800 Beytepe, Ankara, Turkey
- Eugenia Tabakaki
  - Ancient DNA Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology – Hellas (FORTH), N. Plastira 100, Vassilika Vouton, GR-70013 Irakleio, Greece
- Akper Aliev
  - Azerbaijan DNA Project, Family Tree DNA, Houston, TX, USA
- Adamantios Sampson
  - Department of Mediterranean Studies, University of Aegean, Dimokratias st., 85100 Rhodes, Greece
- Gülşah Merve Kılınç
  - Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, 06100 Ankara, Turkey
- Çiğdem Atakuman
  - Institute of Social Sciences, Middle East Technical University, 06800 Ankara, Turkey
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, 69118 Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
- Nikos Poulakakis
  - Ancient DNA Lab, Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology – Hellas (FORTH), N. Plastira 100, Vassilika Vouton, GR-70013 Irakleio, Greece
  - Natural History Museum of Crete, School of Sciences and Engineering, University of Crete, Knossos Avenue, 71409 Irakleio, Greece
  - Department of Biology, School of Sciences and Engineering, University of Crete, Vassilika Vouton, 70013 Irakleio, Greece
- Yılmaz Selim Erdal
  - Human-G Laboratory, Department of Anthropology, Hacettepe University, Beytepe 06800, Ankara, Turkey
  - Husbio-L Laboratory, Department of Anthropology, Hacettepe University, 06800 Beytepe, Ankara, Turkey
- Pavlos Pavlidis
  - Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), 70013 Heraklion, Greece
- Jan Storå
  - Osteoarchaeological Research Laboratory, Department of Archaeology and Classical Studies, Stockholm University, 10691 Stockholm, Sweden
- Füsun Özer
  - Human-G Laboratory, Department of Anthropology, Hacettepe University, Beytepe 06800, Ankara, Turkey
- Anders Götherström (corresponding author)
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Archaeology and Classical Studies, Stockholm University, 10691 Stockholm, Sweden
- Mehmet Somel (corresponding author)
  - Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, 06800 Ankara, Turkey
  - Department of Biological Sciences, Middle East Technical University, 06800 Ankara, Turkey

15
Söylev A, Çokoglu SS, Koptekin D, Alkan C, Somel M. CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data. PLoS Comput Biol 2022; 18:e1010788. [PMID: 36516232 PMCID: PMC9873172 DOI: 10.1371/journal.pcbi.1010788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2022] [Revised: 01/24/2023] [Accepted: 12/03/2022] [Indexed: 12/15/2022] Open
Abstract
To date, ancient genome analyses have been largely confined to the study of single nucleotide polymorphisms (SNPs). Copy number variants (CNVs) are a major contributor to disease and to evolutionary adaptation, but identifying CNVs in ancient shotgun-sequenced genomes is hampered by typically low genome coverage (<1×) and short fragments (<80 bps), which prevent standard CNV detection software from being applied effectively to ancient genomes. Here we present CONGA, tailored for genotyping CNVs at low coverage. Simulations and down-sampling experiments suggest that CONGA can genotype deletions >1 kbps with F-scores >0.75 at ≥1×, and distinguish between heterozygous and homozygous states. We used CONGA to genotype 10,002 outgroup-ascertained deletions across a heterogeneous set of 71 ancient human genomes spanning the last 50,000 years, produced using variable experimental protocols. A fraction of these (21/71) display divergent deletion profiles unrelated to their population origin, but attributable to technical factors such as coverage and read length. The majority of the sample (50/71), despite originating from nine different laboratories and having coverages ranging from 0.44× to 26× (median 4×) and average read lengths of 52-121 bps (median 69), exhibit coherent deletion frequencies. Across these 50 genomes, inter-individual genetic diversity measured using SNPs and CONGA-genotyped deletions are highly correlated. CONGA-genotyped deletions also display purifying selection signatures, as expected. CONGA thus paves the way for systematic CNV analyses in ancient genomes, despite the technical challenges posed by low and variable genome coverage.
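The abstract reports genotyping accuracy as F-scores. Assuming the usual balanced F1 definition (the abstract does not state which variant was used), the score is the harmonic mean of precision and recall. The sketch below uses a hypothetical function name and made-up counts of true-positive, false-positive, and false-negative deletion calls; it is not taken from the CONGA software.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from counts of true positives, false positives, and
    false negatives. beta=1.0 gives the balanced F1 score (assumed here)."""
    if tp == 0:
        return 0.0  # no correct calls: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up example: 80 deletions called correctly, 20 spurious calls, 30 missed.
score = f_score(tp=80, fp=20, fn=30)
passes_threshold = score > 0.75  # the threshold quoted in the abstract
```

With these illustrative counts, precision is 0.8 and recall is about 0.73, so the F1 score lands just above the 0.75 threshold.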
Affiliation(s)
- Arda Söylev (corresponding author)
  - Department of Computer Engineering, Konya Food and Agriculture University, Konya, Turkey
  - Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Dilek Koptekin
  - Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
- Can Alkan
  - Department of Computer Engineering, Bilkent University, Ankara, Turkey
- Mehmet Somel (corresponding author)
  - Department of Biology, Middle East Technical University, Ankara, Turkey

16
Baccichet I, Chiozzotto R, Scaglione D, Bassi D, Rossini L, Cirilli M. Genetic dissection of fruit maturity date in apricot (P. armeniaca L.) through a Single Primer Enrichment Technology (SPET) approach. BMC Genomics 2022; 23:712. [PMID: 36258163 PMCID: PMC9580121 DOI: 10.1186/s12864-022-08901-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Accepted: 09/08/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Single primer enrichment technology (SPET) is an emerging and increasingly popular solution for high-throughput targeted genotyping in plants. Although SPET requires a priori identification of polymorphisms for probe design, this technology has potentially higher reproducibility and transferability compared to other reduced representation sequencing (RRS) approaches, also enabling the discovery of closely linked polymorphisms surrounding the target one. Results: The potential for SPET application in fruit trees was evaluated by developing a 25K target SNPs assay to genotype a panel of apricot accessions and progenies. A total of 32,492 polymorphic sites were genotyped in 128 accessions (including 8,188 accessory non-target SNPs) with extremely low levels of missing data and a significant correlation of allelic frequencies compared to whole-genome sequencing data used for array design. Assay performance was further validated by estimating genotyping errors in two biparental progenies, resulting in an overall 1.8% rate. SPET genotyping data were used to infer population structure and to dissect the architecture of fruit maturity date (MD), a quantitative reproductive phenological trait of great agronomical interest in apricot species. Depending on the year, GWAS revealed loci associated with MD on several chromosomes. The QTLs on chromosomes 1 and 4 (the latter explaining most of the phenotypic variability in the panel) were the most consistent over years and were further confirmed by linkage mapping in two segregating progenies. Conclusions: Besides the utility for marker-assisted selection and for paving the way to in-depth studies to clarify the molecular bases of MD trait variation in apricot, the results provide an overview of the performance and reliability of SPET for fruit tree genetics.
Affiliation(s)
- Daniele Bassi
  - Università degli Studi di Milan - DiSAA, Milano, Italy
- Laura Rossini
  - Università degli Studi di Milan - DiSAA, Milano, Italy
- Marco Cirilli
  - Università degli Studi di Milan - DiSAA, Milano, Italy

17
Yang Z, Chen H, Lu Y, Gao Y, Sun H, Wang J, Jin L, Chu J, Xu S. Genetic evidence of tri-genealogy hypothesis on the origin of ethnic minorities in Yunnan. BMC Biol 2022; 20:166. [PMID: 35864541 PMCID: PMC9306206 DOI: 10.1186/s12915-022-01367-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Accepted: 07/05/2022] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND: Yunnan is located in Southwest China and is home to great cultural, linguistic, and genetic diversity. However, the genomic diversity of ethnic minorities in Yunnan is largely under-investigated. To gain insights into population history and local adaptation of Yunnan minorities, we analyzed 242 whole-exome sequencing datasets with high coverage (~100-150×) of Yunnan minorities representing Achang, Jingpo, Dai, and Deang, who were linguistically assumed to be derived from three ancient lineages (the tri-genealogy hypothesis), i.e., Di-Qiang, Bai-Yue, and Bai-Pu. RESULTS: Yunnan minorities show considerable genetic differences. Di-Qiang populations likely migrated from the Tibetan area about 6700 years ago. Genetic divergence between Bai-Yue and Di-Qiang was estimated to be 7000 years, and that between Bai-Yue and Bai-Pu was estimated to be 5500 years. Bai-Pu is relatively isolated, but gene flow from surrounding Di-Qiang and Bai-Yue populations was also found. Furthermore, we identified genetic variants that are differentiated within Yunnan minorities possibly due to the living circumstances and habits. Notably, we found that adaptive variants related to malaria and glucose metabolism suggest the adaptation to thalassemia and G6PD deficiency resulting from malaria resistance in the Dai population. CONCLUSIONS: We provided genetic evidence of the tri-genealogy hypothesis as well as new insights into the genetic history and local adaptation of the Yunnan minorities.
Affiliation(s)
- Zhaoqing Yang
  - Department of Medical Genetics, Institute of Medical Biology, Chinese Academy of Medical Sciences, Kunming, 650118, China
- Hao Chen
  - Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- Yan Lu
  - State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai, 200438, China
- Yang Gao
  - Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, 201203, China
- Hao Sun
  - Department of Medical Genetics, Institute of Medical Biology, Chinese Academy of Medical Sciences, Kunming, 650118, China
- Jiucun Wang
  - State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, 201203, China
- Li Jin
  - State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, 201203, China
- Jiayou Chu
  - Department of Medical Genetics, Institute of Medical Biology, Chinese Academy of Medical Sciences, Kunming, 650118, China
- Shuhua Xu
  - State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, Center for Evolutionary Biology, School of Life Sciences, Fudan University, Shanghai, 200438, China
  - Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, 201203, China
  - Department of Liver Surgery and Transplantation Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, 200032, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China

18
Li Y, Ruperao P, Batley J, Edwards D, Martin W, Hobson K, Sutton T. Genomic prediction of preliminary yield trials in chickpea: Effect of functional annotation of SNPs and environment. THE PLANT GENOME 2022; 15:e20166. [PMID: 34786880 DOI: 10.1002/tpg2.20166] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/26/2021] [Accepted: 09/14/2021] [Indexed: 06/13/2023]
Abstract
Achieving yield potential in chickpea (Cicer arietinum L.) is limited by many constraints that include biotic and abiotic stresses. Combining next-generation sequencing technology with advanced statistical modeling has the potential to increase genetic gain efficiently. Whole genome resequencing data were obtained from 315 advanced chickpea breeding lines from the Australian chickpea breeding program, resulting in more than 298,000 single nucleotide polymorphisms (SNPs) discovered. Analysis of population structure revealed a distinct group of breeding lines with many alleles that are absent from recently released Australian cultivars. Genome-wide association studies (GWAS) using these Australian breeding lines identified 20 SNPs significantly associated with grain yield in multiple field environments. A reduced level of nucleotide diversity and extended linkage disequilibrium suggested that some regions in these chickpea genomes may have been through selective breeding for yield or other traits. A large introgression segment that was introduced from C. echinospermum for phytophthora root rot resistance was identified on chromosome 6, yet it also has the unintended consequence of reducing yield due to linkage drag. We further investigated the effect of genotype by environment interaction on genomic prediction of yield. We found that the training set had better prediction accuracy when phenotyped under conditions relevant to the targeted environments. We also investigated the effect of SNP functional annotation on prediction accuracy using different subsets of SNPs based on their genomic locations: regulatory regions, exome, and alternative splice sites. Compared with the whole SNP dataset, a subset of SNPs did not significantly decrease prediction accuracy for grain yield despite consisting of a smaller number of SNPs.
Collapse
Affiliation(s)
- Yongle Li
- School of Agriculture, Food and Wine, The Univ. of Adelaide, Adelaide, SA, 5064, Australia
| | - Pradeep Ruperao
- Statistics, Bioinformatics and Data Management, ICRISAT, Hyderabad, 502324, India
| | - Jacqueline Batley
- School of Biological Sciences, The Univ. of Western Australia, Perth, WA, 6001, Australia
| | - David Edwards
- School of Biological Sciences, The Univ. of Western Australia, Perth, WA, 6001, Australia
| | - William Martin
- Dep. of Agriculture and Fisheries, Warwick, Qld, 4370, Australia
| | - Kristy Hobson
- NSW Dep. of Primary Industries, Tamworth, NSW, 2340, Australia
| | - Tim Sutton
- School of Agriculture, Food and Wine, The Univ. of Adelaide, Adelaide, SA, 5064, Australia
- South Australian Research and Development Institute, Adelaide, SA, 5064, Australia
| |
|
19
|
Kratochwil CF, Kautt AF, Rometsch SJ, Meyer A. Benefits and limitations of a new genome-based PCR-RFLP genotyping assay (GB-RFLP): A SNP-based detection method for identification of species in extremely young adaptive radiations. Ecol Evol 2022; 12:e8751. [PMID: 35356554 PMCID: PMC8941502 DOI: 10.1002/ece3.8751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Accepted: 03/02/2022] [Indexed: 11/18/2022] Open
Abstract
High-throughput DNA sequencing technologies make it possible now to sequence entire genomes relatively easily. Complete genomic information obtained by whole-genome resequencing (WGS) can aid in identifying and delineating species even if they are extremely young, cryptic, or morphologically difficult to discern and closely related. Yet, for taxonomic or conservation biology purposes, WGS can remain cost-prohibitive, too time-consuming, and often constitute a "data overkill." Rapid and reliable identification of species (and populations) that is also cost-effective is made possible by species-specific markers that can be discovered by WGS. Based on WGS data, we designed a PCR restriction fragment length polymorphism (PCR-RFLP) assay for 19 Neotropical Midas cichlid populations (Amphilophus cf. citrinellus), that includes all 13 described species of this species complex. Our work illustrates that identification of species and populations (i.e., fish from different lakes) can be greatly improved by designing genetic markers using available "high resolution" genomic information. Yet, our work also shows that even in the best-case scenario, when whole-genome resequencing information is available, unequivocal assignments remain challenging when species or populations diverged very recently, or gene flow persists. In summary, we provide a comprehensive workflow on how to design RFLP markers based on genome resequencing data, how to test and evaluate their reliability, and discuss the benefits and pitfalls of our approach.
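The core idea behind an RFLP marker, that a SNP creates or destroys a restriction site so the two alleles yield different fragment patterns, can be illustrated with a minimal sketch. This is not the authors' workflow: the enzyme panel is limited to three well-known palindromic recognition sites, and the two allele sequences are hypothetical examples chosen only to show the check.

```python
# Minimal sketch: find enzymes whose recognition site is present in one
# allele but not the other (hypothetical sequences, not data from the paper).

# Recognition sites of three common restriction enzymes. All three are
# palindromic, so scanning one strand is sufficient for this illustration.
ENZYMES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    seq = seq.upper()
    k = len(site)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == site)


def diagnostic_enzymes(allele_a, allele_b, enzymes=ENZYMES):
    """Return enzymes that would cut the two alleles a different number of times."""
    return [
        name
        for name, site in enzymes.items()
        if count_sites(allele_a, site) != count_sites(allele_b, site)
    ]


if __name__ == "__main__":
    # Hypothetical amplicon: a C/G SNP sits inside an EcoRI site.
    allele_a = "ACGTGAATTCTTGA"
    allele_b = "ACGTGAATTGTTGA"
    print(diagnostic_enzymes(allele_a, allele_b))
```

A real design would also need to consider primer placement, fragment sizes that can be resolved on a gel, and non-palindromic enzymes, none of which this sketch covers.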
Affiliation(s)
- Claudius F. Kratochwil
- Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, Konstanz, Germany
- Present address: Institute of Biotechnology, HiLIFE, University of Helsinki, Helsinki, Finland
| | - Andreas F. Kautt
- Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, Konstanz, Germany
- Present address: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
| | - Sina J. Rometsch
- Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, Konstanz, Germany
| | - Axel Meyer
- Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, Konstanz, Germany
| |
|
20
|
Kastally C, Niskanen AK, Perry A, Kujala ST, Avia K, Cervantes S, Haapanen M, Kesälahti R, Kumpula TA, Mattila TM, Ojeda DI, Tyrmi JS, Wachowiak W, Cavers S, Kärkkäinen K, Savolainen O, Pyhäjärvi T. Taming the massive genome of Scots pine with PiSy50k, a new genotyping array for conifer research. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2022; 109:1337-1350. [PMID: 34897859 PMCID: PMC9303803 DOI: 10.1111/tpj.15628] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/29/2021] [Revised: 11/05/2021] [Accepted: 12/02/2021] [Indexed: 06/14/2023]
Abstract
Pinus sylvestris (Scots pine) is the most widespread coniferous tree in the boreal forests of Eurasia, with major economic and ecological importance. However, its large and repetitive genome presents a challenge for conducting genome-wide analyses such as association studies, genetic mapping and genomic selection. We present a new 50K single-nucleotide polymorphism (SNP) genotyping array for Scots pine research, breeding and other applications. To select the SNP set, we first genotyped 480 Scots pine samples on a 407 540 SNP screening array and identified 47 712 high-quality SNPs for the final array (called 'PiSy50k'). Here, we provide details of the design and testing, as well as allele frequency estimates from the discovery panel, functional annotation, tissue-specific expression patterns and expression level information for the SNPs or corresponding genes, when available. We validated the performance of the PiSy50k array using samples from Finland and Scotland. Overall, 39 678 (83.2%) SNPs showed low error rates (mean = 0.9%). Relatedness estimates based on array genotypes were consistent with the expected pedigrees, and the level of Mendelian error was negligible. In addition, array genotypes successfully discriminate between Scots pine populations of Finnish and Scottish origins. The PiSy50k SNP array will be a valuable tool for a wide variety of future genetic studies and forestry applications.
Affiliation(s)
- Chedly Kastally
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
| | - Alina K. Niskanen
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
| | - Annika Perry
- UK Centre for Ecology & Hydrology, Bush Estate, Penicuik, Midlothian, EH26 0QB, UK
| | - Sonja T. Kujala
- Natural Resources Institute Finland (Luke), Paavo Havaksen tie 3, 90570, Oulu, Finland
| | - Komlan Avia
- Université de Strasbourg, INRAE, SVQV UMR‐A 1131, F‐68000, Colmar, France
| | - Sandra Cervantes
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
| | - Matti Haapanen
- Natural Resources Institute Finland (Luke), Latokartanonkaari 9, FI‐00790, Helsinki, Finland
| | - Robert Kesälahti
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
| | - Timo A. Kumpula
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
| | - Tiina M. Mattila
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
- Department of Organismal Biology, EBC, Uppsala University, Norbyvägen 18 A, Uppsala, 752 36, Sweden
| | - Dario I. Ojeda
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
- Norwegian Institute of Bioeconomy Research, P.O. Box 115, Ås, 1431, Norway
| | - Jaakko S. Tyrmi
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
| | - Witold Wachowiak
- Institute of Environmental Biology, Faculty of Biology, Adam Mickiewicz University in Poznań, Uniwersytetu Poznańskiego 6, 61‐614, Poznań, Poland
| | - Stephen Cavers
- UK Centre for Ecology & Hydrology, Bush Estate, Penicuik, Midlothian, EH26 0QB, UK
| | - Katri Kärkkäinen
- Natural Resources Institute Finland (Luke), Paavo Havaksen tie 3, 90570, Oulu, Finland
| | - Outi Savolainen
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
| | - Tanja Pyhäjärvi
- Department of Ecology and Genetics, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
- Department of Forest Sciences, University of Helsinki, P.O. Box 27, 00014, Helsinki, Finland
| |
|
21
|
Keeble-Gagnère G, Pasam R, Forrest KL, Wong D, Robinson H, Godoy J, Rattey A, Moody D, Mullan D, Walmsley T, Daetwyler HD, Tibbits J, Hayden MJ. Novel Design of Imputation-Enabled SNP Arrays for Breeding and Research Applications Supporting Multi-Species Hybridization. FRONTIERS IN PLANT SCIENCE 2021; 12:756877. [PMID: 35003156 PMCID: PMC8728019 DOI: 10.3389/fpls.2021.756877] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/11/2021] [Accepted: 10/27/2021] [Indexed: 05/26/2023]
Abstract
Array-based single nucleotide polymorphism (SNP) genotyping platforms have low genotype error and missing data rates compared to genotyping-by-sequencing technologies. However, design decisions used to create array-based SNP genotyping assays for both research and breeding applications are critical to their success. We describe a novel approach applicable to any animal or plant species for the design of cost-effective imputation-enabled SNP genotyping arrays with broad utility and demonstrate its application through the development of the Illumina Infinium Wheat Barley 40K SNP array Version 1.0. We show that the approach delivers high quality and high resolution data for wheat and barley, including when samples are jointly hybridised. The new array aims to maximally capture haplotypic diversity in globally diverse wheat and barley germplasm while minimizing ascertainment bias. Comprising mostly biallelic markers that were designed to be species-specific and single-copy, the array permits highly accurate imputation in diverse germplasm to improve the statistical power of genome-wide association studies (GWAS) and genomic selection. The SNP content captures tetraploid wheat (A- and B-genome) and Aegilops tauschii Coss. (D-genome) diversity and delineates synthetic and tetraploid wheat from other wheat, as well as tetraploid species and subgroups. The content includes SNPs tagging key trait loci in wheat and barley, as well as direct connections to other genotyping platforms and legacy datasets. The utility of the array is enhanced through the web-based tool Pretzel (https://plantinformatics.io/), which enables the content of the array to be visualized and interrogated interactively in the context of numerous genetic and genomic resources, so that it can be connected more seamlessly to research and breeding. The array is available for use by the international wheat and barley community.
Affiliation(s)
| | - Raj Pasam
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
| | - Kerrie L. Forrest
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
| | - Debbie Wong
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
| | | | | | | | | | | | | | - Hans D. Daetwyler
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
| | - Josquin Tibbits
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
| | - Matthew J. Hayden
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC, Australia
| |
|
22
|
Bird KA, Hardigan MA, Ragsdale AP, Knapp SJ, VanBuren R, Edger PP. Diversification, spread, and admixture of octoploid strawberry in the Western Hemisphere. AMERICAN JOURNAL OF BOTANY 2021; 108:2269-2281. [PMID: 34636416 PMCID: PMC9299191 DOI: 10.1002/ajb2.1776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Revised: 08/13/2021] [Accepted: 08/18/2021] [Indexed: 05/11/2023]
Abstract
PREMISE: Polyploid species often have complex evolutionary histories that have, until recently, been intractable due to limitations of genomic resources. While recent work has further uncovered the evolutionary history of the octoploid strawberry (Fragaria L.), there are still open questions. Much is unknown about the evolutionary relationship of the wild octoploid species, Fragaria virginiana and Fragaria chiloensis, and gene flow within and among species after the formation of the octoploid genome. METHODS: We leveraged a collection of wild octoploid ecotypes of strawberry representing the recognized subspecies and ranging from Alaska to southern Chile, and a high-density SNP array to investigate wild octoploid strawberry evolution. Evolutionary relationships were interrogated with phylogenetic analysis and genetic clustering algorithms. Additionally, admixture among and within species was assessed with model-based and tree-based approaches. RESULTS: Phylogenetic analysis revealed that the two octoploid strawberry species are monophyletic sister lineages. The genetic clustering results show substructure between North and South American F. chiloensis populations. Additionally, model-based and tree-based methods support gene flow within and among the two octoploid species, including newly identified admixture in the Hawaiian F. chiloensis subsp. sandwicensis population. CONCLUSIONS: F. virginiana and F. chiloensis are supported as monophyletic and sister lineages. All but one of the subspecies show extensive paraphyly. Furthermore, phylogenetic relationships among F. chiloensis populations support a single population range expansion southward from North America. The inter- and intraspecific relationships of octoploid strawberry are complex and suggest substantial gene flow between sympatric populations among and within species.
Affiliation(s)
- Kevin A. Bird
- Department of Horticulture, Michigan State University, East Lansing, Michigan, 48823, USA
- Ecology, Evolution and Behavior Program, Michigan State University, East Lansing, Michigan, 48823, USA
| | | | - Aaron P. Ragsdale
- National Laboratory of Genomics for Biodiversity (LANGEBIO), Unit of Advanced Genomics, CINVESTAV, Irapuato, Mexico
| | - Steven J. Knapp
- Department of Plant Sciences, University of California, Davis, California, 95616, USA
| | - Robert VanBuren
- Department of Horticulture, Michigan State University, East Lansing, Michigan, 48823, USA
- Plant Resilience Institute, Michigan State University, East Lansing, Michigan, 48824, USA
| | - Patrick P. Edger
- Department of Horticulture, Michigan State University, East Lansing, Michigan, 48823, USA
- Ecology, Evolution and Behavior Program, Michigan State University, East Lansing, Michigan, 48823, USA
| |
|
23
|
Andrews AJ, Puncher GN, Bernal-Casasola D, Di Natale A, Massari F, Onar V, Toker NY, Hanke A, Pavey SA, Savojardo C, Martelli PL, Casadio R, Cilli E, Morales-Muñiz A, Mantovani B, Tinti F, Cariani A. Ancient DNA SNP-panel data suggests stability in bluefin tuna genetic diversity despite centuries of fluctuating catches in the eastern Atlantic and Mediterranean. Sci Rep 2021; 11:20744. [PMID: 34671077 PMCID: PMC8528830 DOI: 10.1038/s41598-021-99708-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/07/2021] [Accepted: 09/25/2021] [Indexed: 11/10/2022] Open
Abstract
Atlantic bluefin tuna (Thunnus thynnus; BFT) abundance was depleted in the late 20th and early 21st century due to overfishing. Historical catch records further indicate that the abundance of BFT in the Mediterranean has been fluctuating since at least the 16th century. Here we build upon previous work on ancient DNA of BFT in the Mediterranean by comparing contemporary (2009–2012) specimens with archival (1911–1926) and archaeological (2nd century BCE–15th century CE) specimens that represent population states prior to these two major periods of exploitation, respectively. We successfully genotyped and analysed 259 contemporary and 123 historical (91 archival and 32 archaeological) specimens at 92 SNP loci that were selected for their ability to differentiate contemporary populations or their association with core biological functions. We found no evidence of genetic bottlenecks, inbreeding or population restructuring between temporal sample groups that might explain what has driven catch fluctuations since the 16th century. We also detected a putative adaptive response, involving the cytoskeletal protein synemin which may be related to muscle stress. However, these results require further investigation with more extensive genome-wide data to rule out demographic changes due to overfishing, and other natural and anthropogenic factors, in addition to elucidating the adaptive drivers related to these.
Affiliation(s)
- Adam J Andrews
- Department of Biological, Geological and Environmental Sciences, University of Bologna, Ravenna, Italy; Department of Cultural Heritage, University of Bologna, Ravenna, Italy.
| | - Gregory N Puncher
- Department of Biological, Geological and Environmental Sciences, University of Bologna, Ravenna, Italy; Department of Biological Sciences, Canadian Rivers Institute, University of New Brunswick, Saint John, NB, Canada.
| | - Darío Bernal-Casasola
- Department of History, Geography and Philosophy, Faculty of Philosophy and Letters, University of Cádiz, Cádiz, Spain
| | | | - Francesco Massari
- Department of Biological, Geological and Environmental Sciences, University of Bologna, Ravenna, Italy
| | - Vedat Onar
- Osteoarcheology Practice and Research Centre and Faculty of Veterinary Medicine, Istanbul University-Cerrahpaşa, Avcılar, Istanbul, Turkey
| | - Nezir Yaşar Toker
- Osteoarcheology Practice and Research Centre and Faculty of Veterinary Medicine, Istanbul University-Cerrahpaşa, Avcılar, Istanbul, Turkey
| | - Alex Hanke
- St. Andrews Biological Station, Fisheries and Oceans Canada, St. Andrews, NB, Canada
| | - Scott A Pavey
- Department of Biological Sciences, Canadian Rivers Institute, University of New Brunswick, Saint John, NB, Canada
| | | | | | - Rita Casadio
- Biocomputing Group, University of Bologna, Bologna, Italy
| | - Elisabetta Cilli
- Department of Cultural Heritage, University of Bologna, Ravenna, Italy
| | | | - Barbara Mantovani
- Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
| | - Fausto Tinti
- Department of Biological, Geological and Environmental Sciences, University of Bologna, Ravenna, Italy
| | - Alessia Cariani
- Department of Biological, Geological and Environmental Sciences, University of Bologna, Ravenna, Italy
| |
|
24
|
Dokan K, Kawamura S, Teshima KM. Effects of single nucleotide polymorphism ascertainment on population structure inferences. G3-GENES GENOMES GENETICS 2021; 11:6237890. [PMID: 33871576 PMCID: PMC8496283 DOI: 10.1093/g3journal/jkab128] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/28/2021] [Accepted: 04/08/2021] [Indexed: 11/14/2022]
Abstract
Single nucleotide polymorphism (SNP) data are widely used in research on natural populations. Although they are useful, SNP genotyping data are known to contain bias, normally referred to as ascertainment bias, because they are conditioned by already confirmed variants. This bias is introduced during the genotyping process, including the selection of populations for novel SNP discovery and the number of individuals involved in the discovery panel and selection of SNP markers. It is widely recognized that ascertainment bias can cause inaccurate inferences in population genetics and several methods to address these bias issues have been proposed. However, especially in natural populations, it is not always possible to apply an ideal ascertainment scheme because natural populations tend to have complex structures and histories. In addition, it has not been fully assessed whether ascertainment bias has the same effect on different types of population structure. Here, we examine the effects of bias produced during the selection of populations for SNP discovery and the consequent SNP marker selection processes under three demographic models: the island, stepping-stone, and population split models. Results show that site frequency spectra and summary statistics contain biases that depend on the joint effect of population structure and ascertainment schemes. Additionally, population structure inferences are also affected by ascertainment bias. Based on these results, it is recommended to evaluate the validity of the ascertainment strategy prior to the actual typing process because the direction and extent of ascertainment bias vary depending on several factors.
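The basic mechanism of ascertainment bias described above can be shown with a small numerical sketch. It is deliberately much simpler than the island, stepping-stone, and population split models examined in the study: it assumes a single panmictic population, the standard neutral expectation that the number of sites with derived-allele count i is proportional to 1/i, and a hypothetical discovery panel of d chromosomes in which a SNP is kept only if it is polymorphic. All function names are illustrative.

```python
# Minimal sketch (assumed single-population neutral model, hypothetical panel size):
# conditioning on polymorphism in a small discovery panel shifts the site
# frequency spectrum toward intermediate-frequency SNPs.


def neutral_spectrum(n):
    """Allele frequencies i/n (i = 1..n-1) with weights proportional to 1/i."""
    weights = {i / n: 1.0 / i for i in range(1, n)}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


def ascertainment_prob(x, d):
    """Probability that a SNP at frequency x is polymorphic among d sampled chromosomes."""
    return 1.0 - x ** d - (1.0 - x) ** d


def ascertain(spectrum, d):
    """Reweight a spectrum by the chance each SNP is discovered, then renormalise."""
    reweighted = {x: w * ascertainment_prob(x, d) for x, w in spectrum.items()}
    total = sum(reweighted.values())
    return {x: w / total for x, w in reweighted.items()}


def mean_heterozygosity(spectrum):
    """Average of 2x(1-x) over the SNPs in the spectrum."""
    return sum(w * 2.0 * x * (1.0 - x) for x, w in spectrum.items())


if __name__ == "__main__":
    full = neutral_spectrum(100)
    biased = ascertain(full, d=4)  # d=4 is an arbitrary illustrative panel size
    print("all SNPs:        ", round(mean_heterozygosity(full), 4))
    print("ascertained SNPs:", round(mean_heterozygosity(biased), 4))
```

Because rare variants are unlikely to be polymorphic in a small panel, the ascertained set over-represents common SNPs and per-SNP heterozygosity is inflated relative to the unascertained spectrum.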
Affiliation(s)
- Kotaro Dokan
- Graduate School of System Life Science, Kyushu University, Fukuoka 819-0395, Japan
| | - Sayu Kawamura
- Graduate School of System Life Science, Kyushu University, Fukuoka 819-0395, Japan
| | - Kosuke M Teshima
- Department of Biology, Kyushu University, Fukuoka 819-0395, Japan
| |
|
25
|
Carress H, Lawson DJ, Elhaik E. Population genetic considerations for using biobanks as international resources in the pandemic era and beyond. BMC Genomics 2021; 22:351. [PMID: 34001009 PMCID: PMC8127217 DOI: 10.1186/s12864-021-07618-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/16/2020] [Accepted: 04/14/2021] [Indexed: 12/11/2022] Open
Abstract
The past years have seen the rise of genomic biobanks and mega-scale meta-analysis of genomic data, which promises to reveal the genetic underpinnings of health and disease. However, the over-representation of Europeans in genomic studies not only limits the global understanding of disease risk but also inhibits viable research into the genomic differences between carriers and patients. Whilst the community has agreed that more diverse samples are required, it is not enough to blindly increase diversity; the diversity must be quantified, compared and annotated to lead to insight. Genetic annotations from separate biobanks need to be comparable and computable and to operate without access to raw data due to privacy concerns. Comparability is key both for regular research and to allow international comparison in response to pandemics. Here, we evaluate the appropriateness of the most common genomic tools used to depict population structure in a standardized and comparable manner. The end goal is to reduce the effects of confounding and learn from genuine variation in genetic effects on phenotypes across populations, which will improve the value of biobanks (locally and internationally), increase the accuracy of association analyses and inform developmental efforts.
Affiliation(s)
- Hannah Carress
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
| | - Daniel John Lawson
- School of Mathematics and Integrative Epidemiology Unit, University of Bristol, Bristol, UK
| | - Eran Elhaik
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK; Department of Biology, Lund University, Lund, Sweden.
| |
|
26
|
Clemente F, Unterländer M, Dolgova O, Amorim CEG, Coroado-Santos F, Neuenschwander S, Ganiatsou E, Cruz Dávalos DI, Anchieri L, Michaud F, Winkelbach L, Blöcher J, Arizmendi Cárdenas YO, Sousa da Mota B, Kalliga E, Souleles A, Kontopoulos I, Karamitrou-Mentessidi G, Philaniotou O, Sampson A, Theodorou D, Tsipopoulou M, Akamatis I, Halstead P, Kotsakis K, Urem-Kotsou D, Panagiotopoulos D, Ziota C, Triantaphyllou S, Delaneau O, Jensen JD, Moreno-Mayar JV, Burger J, Sousa VC, Lao O, Malaspinas AS, Papageorgopoulou C. The genomic history of the Aegean palatial civilizations. Cell 2021; 184:2565-2586.e21. [PMID: 33930288 PMCID: PMC8127963 DOI: 10.1016/j.cell.2021.03.039] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 04/17/2020] [Revised: 09/17/2020] [Accepted: 03/18/2021] [Indexed: 12/30/2022]
Abstract
The Cycladic, the Minoan, and the Helladic (Mycenaean) cultures define the Bronze Age (BA) of Greece. Urbanism, complex social structures, craft and agricultural specialization, and the earliest forms of writing characterize this iconic period. We sequenced six Early to Middle BA whole genomes, along with 11 mitochondrial genomes, sampled from the three BA cultures of the Aegean Sea. The Early BA (EBA) genomes are homogeneous and derive most of their ancestry from Neolithic Aegeans, contrary to earlier hypotheses that the Neolithic-EBA cultural transition was due to massive population turnover. EBA Aegeans were shaped by relatively small-scale migration from East of the Aegean, as evidenced by the Caucasus-related ancestry also detected in Anatolians. In contrast, Middle BA (MBA) individuals of northern Greece differ from EBA populations in showing ∼50% Pontic-Caspian Steppe-related ancestry, dated at ca. 2,600-2,000 BCE. Such gene flow events during the MBA contributed toward shaping present-day Greek genomes.
Affiliation(s)
- Florian Clemente
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Martina Unterländer
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece; Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Olga Dolgova
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028 Barcelona, Spain
| | - Carlos Eduardo G Amorim
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Francisco Coroado-Santos
- CE3C, Centre for Ecology, Evolution and Environmental Changes, Faculty of Sciences of the University of Lisbon, 1749-016 Lisbon, Portugal
| | - Samuel Neuenschwander
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Vital-IT, Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Elissavet Ganiatsou
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Diana I Cruz Dávalos
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Lucas Anchieri
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Frédéric Michaud
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Laura Winkelbach
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Jens Blöcher
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Yami Ommar Arizmendi Cárdenas
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Bárbara Sousa da Mota
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Eleni Kalliga
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Angelos Souleles
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Ioannis Kontopoulos
- Center for GeoGenetics, GLOBE Institute, University of Copenhagen, 1350 Copenhagen, Denmark
| | | | - Olga Philaniotou
- Ephor Emerita of Antiquities, Hellenic Ministry of Culture and Sports, 10682 Athens, Greece
| | - Adamantios Sampson
- Department of Mediterranean Studies, University of the Aegean, 85132 Rhodes, Greece
| | - Dimitra Theodorou
- Ephorate of Antiquities of Kozani, Hellenic Ministry of Culture and Sports, 50004 Kozani, Greece
| | - Metaxia Tsipopoulou
- Ephor Emerita of Antiquities, Hellenic Ministry of Culture and Sports, 10682 Athens, Greece
| | - Ioannis Akamatis
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Paul Halstead
- Department of Archaeology, University of Sheffield, Minalloy House, 10-16 Regent St., Sheffield S1 3NJ, UK
| | - Kostas Kotsakis
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Dushka Urem-Kotsou
- Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Diamantis Panagiotopoulos
- Institute of Classical Archaeology, University of Heidelberg, Marstallhof 4, 69117 Heidelberg, Germany
| | - Christina Ziota
- Ephorate of Antiquities of Florina, Hellenic Ministry of Culture and Sports, 53100 Florina, Greece
| | - Sevasti Triantaphyllou
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
| | - J Víctor Moreno-Mayar
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Center for GeoGenetics, GLOBE Institute, University of Copenhagen, 1350 Copenhagen, Denmark; National Institute of Genomic Medicine (INMEGEN), 14610 Mexico City, Mexico
| | - Joachim Burger
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Vitor C Sousa
- CE3C, Centre for Ecology, Evolution and Environmental Changes, Faculty of Sciences of the University of Lisbon, 1749-016 Lisbon, Portugal
| | - Oscar Lao
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Anna-Sapfo Malaspinas
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
| | - Christina Papageorgopoulou
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece.
| |
|
27
|
Geibel J, Reimer C, Pook T, Weigend S, Weigend A, Simianer H. How imputation can mitigate SNP ascertainment Bias. BMC Genomics 2021; 22:340. [PMID: 33980139 PMCID: PMC8114708 DOI: 10.1186/s12864-021-07663-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/20/2021] [Accepted: 04/28/2021] [Indexed: 12/30/2022] Open
Abstract
Background: Population genetic studies based on genotyped single nucleotide polymorphisms (SNPs) are influenced by a non-random selection of the SNPs included in the used genotyping arrays. The resulting bias in the estimation of allele frequency spectra and population genetics parameters like heterozygosity and genetic distances relative to whole genome sequencing (WGS) data is known as SNP ascertainment bias. Full correction for this bias requires detailed knowledge of the array design process, which is often not available in practice. This study suggests an alternative approach to mitigate ascertainment bias of a large set of genotyped individuals by using information of a small set of sequenced individuals via imputation without the need for prior knowledge on the array design. Results: The strategy was first tested by simulating additional ascertainment bias with a set of 1566 chickens from 74 populations that were genotyped for the positions of the Affymetrix Axiom™ 580 k Genome-Wide Chicken Array. Imputation accuracy was shown to be consistently higher for populations used for SNP discovery during the simulated array design process. Reference sets of at least one individual per population in the study set led to a strong correction of ascertainment bias for estimates of expected and observed heterozygosity, Wright’s Fixation Index and Nei’s Standard Genetic Distance. In contrast, unbalanced reference sets (overrepresentation of populations compared to the study set) introduced a new bias towards the reference populations. Finally, the array genotypes were imputed to WGS by utilization of reference sets of 74 individuals (one per population) to 98 individuals (additional commercial chickens) and compared with a mixture of individually and pooled sequenced populations. The imputation reduced the slope between heterozygosity estimates of array data and WGS data from 1.94 to 1.26 when using the smaller balanced reference panel and to 1.44 when using the larger but unbalanced reference panel. This generally supported the results from simulation but was less favorable, advocating for a larger reference panel when imputing to WGS. Conclusions: The results highlight the potential of using imputation for mitigation of SNP ascertainment bias but also underline the need for unbiased reference sets. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07663-6.
Affiliation(s)
- Johannes Geibel
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany; Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany
- Christian Reimer
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany; Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany
- Torsten Pook
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany; Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany
- Steffen Weigend
- Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany; Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Höltystrasse 10, 31535, Neustadt-Mariensee, Germany
- Annett Weigend
- Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Höltystrasse 10, 31535, Neustadt-Mariensee, Germany
- Henner Simianer
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany; Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Göttingen, Germany
28
Sample identification and pedigree reconstruction in Wolverine (Gulo gulo) using SNP genotyping of non-invasive samples. Conserv Genet Resour 2021. [DOI: 10.1007/s12686-021-01208-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Abstract
For conservation genetic studies using non-invasively collected samples, genome-wide data may be hard to acquire. Until now, such studies have instead mostly relied on analyses of traditional genetic markers such as microsatellites (SSRs). Recently, high throughput genotyping of single nucleotide polymorphisms (SNPs) has become available, expanding the use of genomic methods to include non-model species of conservation concern. We have developed a 96-marker SNP array for use in applied conservation monitoring of the Scandinavian wolverine (Gulo gulo) population. By genotyping more than a thousand non-invasively collected samples, we were able to obtain precise estimates of different types of genotyping errors and sample dropout rates. The SNP panel significantly outperforms the SSR markers (and DBY intron markers for sexing) both in terms of precision in genotyping, sex assignment and individual identification, as well as in the proportion of samples successfully genotyped. Furthermore, SNP genotyping offers a simplified laboratory and analysis pipeline with fewer samples needed to be repeatedly genotyped in order to obtain reliable consensus data. In addition, we utilised a unique opportunity to successfully demonstrate the application of SNP genotype data for reconstructing pedigrees in wild populations, by validating the method with samples from wild individuals with known relatedness. By offering a simplified workflow with improved performance, we anticipate this methodology will facilitate the use of non-invasive samples to improve genetic management of many different types of populations that have previously been challenging to survey.
29
Nosková A, Bhati M, Kadri NK, Crysnanto D, Neuenschwander S, Hofer A, Pausch H. Characterization of a haplotype-reference panel for genotyping by low-pass sequencing in Swiss Large White pigs. BMC Genomics 2021; 22:290. [PMID: 33882824 PMCID: PMC8061004 DOI: 10.1186/s12864-021-07610-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/11/2021] [Accepted: 04/13/2021] [Indexed: 12/30/2022] Open
Abstract
Background: The key-ancestor approach has been frequently applied to prioritize individuals for whole-genome sequencing based on their marginal genetic contribution to current populations. Using this approach, we selected 70 key ancestors from two lines of the Swiss Large White breed that have been selected divergently for fertility and fattening traits and sequenced their genomes with short paired-end reads.
Results: Using pedigree records, we estimated the effective population size of the dam and sire line at 72 and 44, respectively. In order to assess sequence variation in both lines, we sequenced the genomes of 70 boars at an average coverage of 16.69-fold. The boars explained 87.95 and 95.35% of the genetic diversity of the breeding populations of the dam and sire line, respectively. Reference-guided variant discovery using the GATK revealed 26,862,369 polymorphic sites. Principal component, admixture and fixation index (FST) analyses indicated considerable genetic differentiation between the lines. Genomic inbreeding quantified using runs of homozygosity was higher in the sire than dam line (0.28 vs 0.26). Using two complementary approaches, we detected 51 signatures of selection. However, only six signatures of selection overlapped between both lines. We used the sequenced haplotypes of the 70 key ancestors as a reference panel to call 22,618,811 genotypes in 175 pigs that had been sequenced at very low coverage (1.11-fold) using the GLIMPSE software. The genotype concordance, non-reference sensitivity and non-reference discrepancy between the genotypes thus inferred and the Illumina PorcineSNP60 BeadChip-called genotypes were 97.60, 98.73 and 3.24%, respectively. The low-pass sequencing-derived genomic relationship coefficients were highly correlated (r > 0.99) with those obtained from microarray genotyping.
Conclusions: We assessed genetic diversity within and between two lines of the Swiss Large White pig breed. Our analyses revealed considerable differentiation, even though the split into two populations occurred only a few generations ago. The sequenced haplotypes of the key ancestor animals enabled us to implement genotyping by low-pass sequencing, which offers an intriguing cost-effective approach to increase the variant density over current array-based genotyping by more than 350-fold.
Affiliation(s)
- Adéla Nosková
- Animal Genomics, ETH Zürich, Eschikon 27, 8315, Lindau, Switzerland
- Meenu Bhati
- Animal Genomics, ETH Zürich, Eschikon 27, 8315, Lindau, Switzerland
- Danang Crysnanto
- Animal Genomics, ETH Zürich, Eschikon 27, 8315, Lindau, Switzerland
- Hubert Pausch
- Animal Genomics, ETH Zürich, Eschikon 27, 8315, Lindau, Switzerland
30
Geibel J, Reimer C, Weigend S, Weigend A, Pook T, Simianer H. How array design creates SNP ascertainment bias. PLoS One 2021; 16:e0245178. [PMID: 33784304 PMCID: PMC8009414 DOI: 10.1371/journal.pone.0245178] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 09/09/2020] [Accepted: 12/22/2020] [Indexed: 12/30/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs), genotyped with arrays, have become a widely used marker type in population genetic analyses over the last 10 years. However, compared to whole genome re-sequencing data, arrays are known to lack a substantial proportion of globally rare variants and tend to be biased towards variants present in populations involved in the development process of the respective array. This affects population genetic estimators and is known as SNP ascertainment bias. We investigated factors contributing to ascertainment bias in array development by redesigning the Axiom™ Genome-Wide Chicken Array in silico and evaluating changes in allele frequency spectra and heterozygosity estimates in a stepwise manner. A sequential reduction of rare alleles during the development process was shown. This was mainly caused by the identification of SNPs in a limited set of populations and a within-population selection of common SNPs when aiming for equidistant spacing. These effects were shown to be less severe with a larger discovery panel. Additionally, a generally massive overestimation of expected heterozygosity for the ascertained SNP sets was shown. For the original array, this overestimation was 24% higher for populations involved in the discovery process than for populations not involved. The same was observed after the SNP discovery step in the redesign. However, an unequal contribution of populations during the SNP selection can mask this effect but also adds uncertainty. Finally, we make suggestions for the design of specialized arrays for large scale projects where whole genome re-sequencing techniques are still too expensive.
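The mechanism summarised in this abstract (loss of rare alleles leading to inflated expected heterozygosity) can be illustrated with a small Python sketch. It is not the authors' in silico redesign: the beta-distributed allele frequencies, the `min_maf` threshold and the function names are all assumptions chosen only to show the direction of the effect.

```python
import random


def expected_heterozygosity(freqs):
    """Mean expected heterozygosity 2p(1-p) over biallelic SNPs."""
    return sum(2 * p * (1 - p) for p in freqs) / len(freqs)


def ascertain(freqs, min_maf=0.1):
    """Keep only SNPs whose minor allele frequency is at least min_maf.

    This crude filter stands in for an array design that favours common SNPs.
    """
    return [p for p in freqs if min(p, 1 - p) >= min_maf]


random.seed(1)
# Hypothetical allele frequencies, skewed towards rare variants as in sequence data.
all_snps = [random.betavariate(0.3, 2.0) for _ in range(10_000)]
array_snps = ascertain(all_snps)

he_all = expected_heterozygosity(all_snps)
he_array = expected_heterozygosity(array_snps)
print(f"all SNPs: {he_all:.3f}  ascertained SNPs: {he_array:.3f}")
```

Because every retained SNP has a minor allele frequency of at least 0.1, its contribution 2p(1-p) is at least 0.18, so the ascertained set yields a higher mean than the full, rare-variant-rich set.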
Affiliation(s)
- Johannes Geibel
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Göttingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Göttingen, Germany
- Christian Reimer
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Göttingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Göttingen, Germany
- Steffen Weigend
- Center for Integrated Breeding Research, University of Goettingen, Göttingen, Germany
- Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Neustadt-Mariensee, Germany
- Annett Weigend
- Institute of Farm Animal Genetics, Friedrich-Loeffler-Institut, Neustadt-Mariensee, Germany
- Torsten Pook
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Göttingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Göttingen, Germany
- Henner Simianer
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, Göttingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Göttingen, Germany
31
Gebrehiwot NZ, Strucken EM, Marshall K, Aliloo H, Gibson JP. SNP panels for the estimation of dairy breed proportion and parentage assignment in African crossbred dairy cattle. Genet Sel Evol 2021; 53:21. [PMID: 33653262 PMCID: PMC7923343 DOI: 10.1186/s12711-021-00615-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/23/2020] [Accepted: 02/17/2021] [Indexed: 01/10/2023] Open
Abstract
Background: Understanding the relationship between genetic admixture and phenotypic performance is crucial for the optimization of crossbreeding programs. The use of small sets of informative ancestry markers can be a cost-effective option for the estimation of breed composition and for parentage assignment in situations where pedigree recording is difficult. The objectives of this study were to develop small single nucleotide polymorphism (SNP) panels that can accurately estimate the total dairy proportion and assign parentage in both West and East African crossbred dairy cows.
Methods: Medium- and high-density SNP genotype data (Illumina BovineSNP50 and BovineHD Beadchip) for 4231 animals sampled from African crossbreds, African Bos taurus, European Bos taurus, Bos indicus, and African indigenous populations were used. For estimating breed composition, the absolute differences in allele frequency were calculated between pure ancestral breeds to identify SNPs with the highest discriminating power, and different combinations of SNPs weighted by ancestral origin were tested against estimates based on all available SNPs. For parentage assignment, informative SNPs were selected based on the highest minor allele frequency (MAF) in African crossbred populations assuming two scenarios: (1) parents were selected among all the animals with known genotypes, and (2) parents were selected only among the animals known to be a parent of at least one progeny.
Results: For the medium-density genotype data, SNPs selected for the largest differences in allele frequency between West African indigenous and European Bos taurus breeds performed best for most African crossbred populations and achieved a prediction accuracy (r²) for breed composition of 0.926 to 0.961 with 200 SNPs. For the high-density dataset, a panel with 70% of the SNPs selected on their largest difference in allele frequency between African and European Bos taurus performed best or very near best across all crossbred populations with r² ranging from 0.978 to 0.984 with 200 SNPs. In all African crossbred populations, unambiguous parentage assignment was possible with ≥ 300 SNPs for the majority of the panels for Scenario 1 and ≥ 200 SNPs for Scenario 2.
Conclusions: The identified low-cost SNP assays could overcome incomplete or inaccurate pedigree records in African smallholder systems and allow effective breeding decisions to produce progeny of desired breed composition.
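The marker-selection step described in the methods (ranking SNPs by the absolute allele frequency difference between ancestral breeds) can be sketched in Python as follows. The ranking mirrors the abstract; the ancestry estimator is a simple least-squares stand-in rather than the authors' procedure, and the function names, SNP ids and frequencies are hypothetical.

```python
def top_informative_snps(freq_a, freq_b, n):
    """Return the n SNP ids with the largest absolute allele frequency difference."""
    shared = [snp for snp in freq_a if snp in freq_b]
    return sorted(shared, key=lambda s: abs(freq_a[s] - freq_b[s]), reverse=True)[:n]


def breed_a_proportion(genotypes, freq_a, freq_b, panel):
    """Least-squares estimate of the breed A ancestry fraction q for one animal.

    genotypes maps SNP id -> count of the reference allele (0, 1 or 2).
    Assumed model: E[g / 2] = q * p_A + (1 - q) * p_B at every SNP.
    """
    num = den = 0.0
    for snp in panel:
        if snp not in genotypes:
            continue  # skip missing genotype calls
        d = freq_a[snp] - freq_b[snp]
        num += (genotypes[snp] / 2 - freq_b[snp]) * d
        den += d * d
    if den == 0:
        raise ValueError("panel carries no ancestry information")
    return min(1.0, max(0.0, num / den))  # clip to the valid range


# Hypothetical reference allele frequencies in two ancestral breeds.
freq_dairy = {"s1": 0.9, "s2": 0.5, "s3": 0.1, "s4": 0.8}
freq_local = {"s1": 0.1, "s2": 0.5, "s3": 0.7, "s4": 0.7}

panel = top_informative_snps(freq_dairy, freq_local, n=2)
animal = {"s1": 1, "s3": 1}  # heterozygous at both panel SNPs
q = breed_a_proportion(animal, freq_dairy, freq_local, panel)
print(panel, round(q, 2))
```

SNPs with identical frequencies in both breeds (such as `s2` here) contribute nothing to the estimate, which is why a panel of a few hundred well-chosen markers can approach the accuracy of a full array.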
Affiliation(s)
- Netsanet Z. Gebrehiwot
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
- Eva M. Strucken
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
- Karen Marshall
- International Livestock Research Institute and Centre for Tropical Livestock Genetics and Health, Nairobi, Kenya
- Hassan Aliloo
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
- John P. Gibson
- Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351 Australia
32
Cao X, Liu WP, Cheng LG, Li HJ, Wu H, Liu YH, Chen C, Xiao X, Li M, Wang GD, Zhang YP. Whole genome analyses reveal significant convergence in obsessive-compulsive disorder between humans and dogs. Sci Bull (Beijing) 2021; 66:187-196. [PMID: 36654227 DOI: 10.1016/j.scib.2020.09.021] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/17/2020] [Revised: 08/20/2020] [Accepted: 08/31/2020] [Indexed: 01/20/2023]
Abstract
Obsessive-compulsive disorder (OCD) represents a heterogeneous collection of diseases with diverse levels of phenotypic, genetic, and etiologic variability, making it difficult to identify the underlying genetic and biological mechanisms in humans. Domestic dogs exhibit several OCD-like behaviors. Using continuous circling as a representative phenotype for OCD, we screened two independent dog breeds, the Belgian Malinois and Kunming Dog and subsequently sequenced ten circling dogs and ten unaffected dogs for each breed. Using population differentiation analyses, we identified 11 candidate genes in the extreme tail of the differentiated regions between cases and controls. These genes overlap significantly with genes identified in a genome wide association study (GWAS) of human OCD, indicating strong convergence between humans and dogs. Through gene expressional analysis and functional exploration, we found that two candidate OCD risk genes, PPP2R2B and ADAMTSL3, affected the density and morphology of dendritic spines. Therefore, changes in dendritic spine may underlie some common biological and physiological pathways shared between humans and dogs. Our study revealed an unprecedented level of convergence in OCD shared between humans and dogs, and highlighted the importance of using domestic dogs as a model species for many human diseases including OCD.
Affiliation(s)
- Xue Cao
- State Key Laboratory of Genetic Resources and Evolution and Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Department of Laboratory Animal Science, Kunming Medical University, Kunming 650500, China
- Wei-Peng Liu
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming 650223, China
- Lu-Guang Cheng
- Kunming Police Dog Base, Ministry of Public Security, Kunming 650204, China
- Hui-Juan Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming 650223, China
- Hong Wu
- Laboratory for Conservation and Utilization of Bio-resource & Key Laboratory for Microbial Resources of the Ministry of Education, Yunnan University, Kunming 650091, China
- Yan-Hu Liu
- State Key Laboratory of Genetic Resources and Evolution and Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Chao Chen
- Kunming Police Dog Base, Ministry of Public Security, Kunming 650204, China
- Xiao Xiao
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Ming Li
- Key Laboratory of Animal Models and Human Disease Mechanisms of the Chinese Academy of Sciences and Yunnan Province, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming 650223, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China; KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Guo-Dong Wang
- State Key Laboratory of Genetic Resources and Evolution and Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
- Ya-Ping Zhang
- State Key Laboratory of Genetic Resources and Evolution and Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
33
Arca M, Mary-Huard T, Gouesnard B, Bérard A, Bauland C, Combes V, Madur D, Charcosset A, Nicolas SD. Deciphering the Genetic Diversity of Landraces With High-Throughput SNP Genotyping of DNA Bulks: Methodology and Application to the Maize 50k Array. Frontiers in Plant Science 2021; 11:568699. [PMID: 33488638 PMCID: PMC7817617 DOI: 10.3389/fpls.2020.568699] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/01/2020] [Accepted: 11/12/2020] [Indexed: 05/13/2023]
Abstract
Genebanks harbor original landraces carrying many original favorable alleles for mitigating biotic and abiotic stresses. Their genetic diversity remains, however, poorly characterized due to their large within-landrace genetic diversity. We developed a high-throughput, cheap and labor-saving DNA bulk approach based on a single-nucleotide polymorphism (SNP) Illumina Infinium HD array to genotype landraces. Samples were gathered for each landrace by mixing equal weights from young leaves, from which DNA was extracted. We then estimated allelic frequencies in each DNA bulk based on the fluorescent intensity ratio (FIR) between two alleles at each SNP using a two-step approach. We first tested whether the DNA bulk was monomorphic or polymorphic according to the two FIR distributions of individuals homozygous for allele A or B, respectively. If the DNA bulk was polymorphic, we estimated its allelic frequency by using a predictive equation calibrated on FIR from DNA bulks with known allelic frequencies. Our approach (i) gives accurate allelic frequency estimations that are highly reproducible across laboratories and (ii) protects against false detection of allele fixation within landraces. We estimated allelic frequencies of 23,412 SNPs in 156 landraces representing American and European maize diversity. Modified Rogers' genetic distances between the 156 landraces estimated from 23,412 SNPs and from 17 simple sequence repeats using the same DNA bulks were highly correlated, suggesting that the ascertainment bias is low. Our approach is affordable, easy to implement and does not require specific bioinformatics support and laboratory equipment, and therefore should be highly relevant for large-scale characterization of genebanks for a wide range of species.
Affiliation(s)
- Mariangela Arca
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Tristan Mary-Huard
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Brigitte Gouesnard
- AGAP, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Aurélie Bérard
- Université Paris-Saclay, INRAE, Etude du Polymorphisme des Génomes Végétaux, Evry-Courcouronnes, France
- Cyril Bauland
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Valérie Combes
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Delphine Madur
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Alain Charcosset
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Stéphane D. Nicolas
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
34
Biddanda A, Rice DP, Novembre J. A variant-centric perspective on geographic patterns of human allele frequency variation. eLife 2020; 9:60107. [PMID: 33350384 PMCID: PMC7755386 DOI: 10.7554/elife.60107] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 06/16/2020] [Accepted: 11/12/2020] [Indexed: 12/14/2022] Open
Abstract
A key challenge in human genetics is to understand the geographic distribution of human genetic variation. Often genetic variation is described by showing relationships among populations or individuals, drawing inferences over many variants. Here, we introduce an alternative representation of genetic variation that reveals the relative abundance of different allele frequency patterns. This approach allows viewers to easily see several features of human genetic structure: (1) most variants are rare and geographically localized, (2) variants that are common in a single geographic region are more likely to be shared across the globe than to be private to that region, and (3) where two individuals differ, it is most often due to variants that are found globally, regardless of whether the individuals are from the same region or different regions. Our variant-centric visualization clarifies the geographic patterns of human variation and can help address misconceptions about genetic differentiation among populations.
Affiliation(s)
- Arjun Biddanda
- Department of Human Genetics, University of Chicago, Chicago, United States
- Daniel P Rice
- Department of Human Genetics, University of Chicago, Chicago, United States
- John Novembre
- Department of Human Genetics, University of Chicago, Chicago, United States
35
Riazi S, Kraeva N, Girard T. Perioperative genetic screening: entering a new era. Br J Anaesth 2020; 125:859-862. [DOI: 10.1016/j.bja.2020.08.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2020] [Revised: 08/28/2020] [Accepted: 08/29/2020] [Indexed: 11/15/2022] Open
36
37
Camacho-Sanchez M, Velo-Antón G, Hanson JO, Veríssimo A, Martínez-Solano Í, Marques A, Moritz C, Carvalho SB. Comparative assessment of range-wide patterns of genetic diversity and structure with SNPs and microsatellites: A case study with Iberian amphibians. Ecol Evol 2020; 10:10353-10363. [PMID: 33072264 PMCID: PMC7548196 DOI: 10.1002/ece3.6670] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/21/2020] [Accepted: 07/22/2020] [Indexed: 11/11/2022] Open
Abstract
Reduced representation genome sequencing has popularized the application of single nucleotide polymorphisms (SNPs) to address evolutionary and conservation questions in nonmodel organisms. Patterns of genetic structure and diversity based on SNPs often diverge from those obtained with microsatellites to different degrees, but few studies have explicitly compared their performance under similar sampling regimes in a shared analytical framework. We compared range‐wide patterns of genetic structure and diversity in two amphibians endemic to the Iberian Peninsula: Hyla molleri and Pelobates cultripes, based on microsatellite (18 and 14 loci) and SNP (15,412 and 33,140 loci) datasets of comparable sample size and spatial extent. Model‐based clustering analyses with STRUCTURE revealed minor differences in genetic structure between marker types, but inconsistent values of the optimal number of populations (K) inferred. SNPs yielded more repeatable and less admixed ancestries with increasing K compared to microsatellites. Genetic diversity was weakly correlated between marker types, with SNPs providing a better representation of southern refugia and of gradients of genetic diversity congruent with the demographic history of both species. Our results suggest that the larger number of loci in a SNP dataset can provide more reliable inferences of patterns of genetic structure and diversity than a typical microsatellite dataset, at least at the spatial and temporal scales investigated.
Affiliation(s)
- Miguel Camacho-Sanchez
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Guillermo Velo-Antón
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Jeffrey O Hanson
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Ana Veríssimo
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Adam Marques
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
- Craig Moritz
- Centre for Biodiversity Analysis and Research School of Biology The Australian National University Canberra ACT Australia
- Sílvia B Carvalho
- CIBIO/InBIO Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto Vairão Portugal
38
Peripolli E, Reimer C, Ha NT, Geibel J, Machado MA, Panetto JCDC, do Egito AA, Baldi F, Simianer H, da Silva MVGB. Genome-wide detection of signatures of selection in indicine and Brazilian locally adapted taurine cattle breeds using whole-genome re-sequencing data. BMC Genomics 2020; 21:624. [PMID: 32917133 PMCID: PMC7488563 DOI: 10.1186/s12864-020-07035-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/30/2020] [Accepted: 08/27/2020] [Indexed: 12/30/2022] Open
Abstract
Background: The cattle introduced by European conquerors during the Brazilian colonization period were exposed to a process of natural selection in different types of biomes throughout the country, leading to the development of locally adapted cattle breeds. In this study, whole-genome re-sequencing data from indicine and Brazilian locally adapted taurine cattle breeds were used to detect genomic regions under selective pressure. Within-population and cross-population statistics were combined separately in a single score using the de-correlated composite of multiple signals (DCMS) method. Putative sweep regions were revealed by assessing the top 1% of the empirical distribution generated by the DCMS statistics.
Results: A total of 33,328,447 biallelic SNPs with an average read depth of 12.4X passed the hard filtering process and were used to assess putative sweep regions. Admixture has occurred in some locally adapted taurine populations due to the introgression of exotic breeds. The genomic inbreeding coefficient based on runs of homozygosity (ROH) concurred with the populations' historical background. Signatures of selection retrieved from the DCMS statistics provided a comprehensive set of putative candidate genes and revealed QTLs disclosing cattle production traits and adaptation to the challenging environments. Additionally, several candidate regions overlapped with previous regions under selection described in the literature for other cattle breeds.
Conclusion: The current study reported putative sweep regions that can provide important insights to better understand the selective forces shaping the genome of the indicine and Brazilian locally adapted taurine cattle breeds. Such regions likely harbor traces of the natural selection pressures to which these populations have been exposed and may elucidate footprints for adaptation to the challenging climatic conditions.
Collapse
Affiliation(s)
- Elisa Peripolli
- São Paulo State University (Unesp), School of Agricultural and Veterinarian Sciences, Jaboticabal, 14884-900, Brazil
| | - Christian Reimer
- Animal Breeding and Genetics Group, Department of Animal Sciences, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
| | - Ngoc-Thuy Ha
- Animal Breeding and Genetics Group, Department of Animal Sciences, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
| | - Johannes Geibel
- Animal Breeding and Genetics Group, Department of Animal Sciences, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
| | - Marco Antonio Machado
- National Council for Scientific and Technological Development (CNPq), Lago Sul, 71605-001, Brazil
- Embrapa Dairy Cattle, Juiz de Fora, 36038-330, Brazil
| | | | | | - Fernando Baldi
- São Paulo State University (Unesp), School of Agricultural and Veterinarian Sciences, Jaboticabal, 14884-900, Brazil
| | - Henner Simianer
- Animal Breeding and Genetics Group, Department of Animal Sciences, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
- Center for Integrated Breeding Research, University of Goettingen, Albrecht-Thaer-Weg 3, 37075, Goettingen, Germany
| | | |
|
39
|
Genome-wide identification and characterization of novel non-coding RNA-derived SSRs in wheat. Mol Biol Rep 2020; 47:6111-6125. [PMID: 32794134 DOI: 10.1007/s11033-020-05687-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/22/2020] [Accepted: 07/26/2020] [Indexed: 02/02/2023]
Abstract
Expression of eukaryotic genes is largely regulated by non-coding RNAs (ncRNA). Sequence variations in the regulatory RNAs may have critical biological consequences including transcriptional and post-transcriptional gene regulation. ncRNA-derived markers can thus prove useful in molecular breeding, QTL mapping and association studies for trait dissection. In the present study, we identified a total of 661 SSRs dwelling in pre-miRNA (15), small nuclear RNA (25) and lncRNA (621). Of these, 46 were validated and 100% amplification success was observed in selected wheat genotypes. A set of 36 ncRNA-SSR markers was utilized for genetic variability assessment in forty-eight Indian wheat genotypes (which include bread wheat, durum wheat and relatives). The number of alleles ranged from 1 to 4, with an average of two alleles per SSR locus. Mean PIC, observed heterozygosity and Shannon information index were found to be 0.258, 0.37 and 0.476, respectively, which suggests that ncRNA-SSRs show higher polymorphism than genic SSRs but lower polymorphism than genomic SSRs. Thirty-six ncRNA-SSRs showed transferability ranging from 42.1% to 100%. Average genetic dissimilarity among wheat genotypes was found to be 0.29 based on Jaccard's dissimilarity. This is the first report of ncRNA-SSRs in wheat, which will be useful for molecular breeding and genetic improvement of wheat.
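The diversity statistics named in this abstract (PIC, Shannon information index, Jaccard's dissimilarity) can be illustrated with a minimal sketch. The abstract does not say which software or exact formulas the authors used, so the functions below are assumptions based on the commonly used textbook definitions, and all function names are hypothetical.

```python
import math


def pic(freqs):
    """Polymorphism information content for one locus.

    Assumed textbook form: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2,
    where freqs are the allele frequencies at the locus (summing to 1).
    """
    p = list(freqs)
    sum_sq = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - sum_sq - cross


def shannon_index(freqs):
    """Shannon information index: -sum(p_i * ln p_i) over observed alleles."""
    return -sum(x * math.log(x) for x in freqs if x > 0)


def jaccard_dissimilarity(alleles_a, alleles_b):
    """1 - |A intersect B| / |A union B| for two genotypes given as allele sets."""
    union = alleles_a | alleles_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1 - len(alleles_a & alleles_b) / len(union)


# Illustrative values only, not data from the study.
two_equal_alleles = [0.5, 0.5]
locus_pic = pic(two_equal_alleles)
locus_shannon = shannon_index(two_equal_alleles)
pair_distance = jaccard_dissimilarity({"a1", "a2", "a3"}, {"a2", "a3", "a4"})
```

Per-locus values computed this way would then be averaged over loci (and, for Jaccard, over genotype pairs) to obtain summary figures of the kind reported above.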
|
40
|
Lang PLM, Weiß CL, Kersten S, Latorre SM, Nagel S, Nickel B, Meyer M, Burbano HA. Hybridization ddRAD-sequencing for population genomics of nonmodel plants using highly degraded historical specimen DNA. Mol Ecol Resour 2020; 20:1228-1247. [PMID: 32306514 DOI: 10.1111/1755-0998.13168] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/02/2019] [Revised: 03/06/2020] [Accepted: 03/30/2020] [Indexed: 12/29/2022]
Abstract
Species' responses at the genetic level are key to understanding the long-term consequences of anthropogenic global change. Herbaria document such responses, and, with contemporary sampling, provide high-resolution time-series of plant evolutionary change. Characterizing genetic diversity is straightforward for model species with small genomes and a reference sequence. For nonmodel species-with small or large genomes-diversity is traditionally assessed using restriction-enzyme-based sequencing. However, age-related DNA damage and fragmentation preclude the use of this approach for ancient herbarium DNA. Here, we combine reduced-representation sequencing and hybridization-capture to overcome this challenge and efficiently compare contemporary and historical specimens. Specifically, we describe how homemade DNA baits can be produced from reduced-representation libraries of fresh samples, and used to efficiently enrich historical libraries for the same fraction of the genome to produce compatible sets of sequence data from both types of material. Applying this approach to both Arabidopsis thaliana and the nonmodel plant Cardamine bulbifera, we discovered polymorphisms de novo in an unbiased, reference-free manner. We show that the recovered genetic variation recapitulates known genetic diversity in A. thaliana, and recovers geographical origin in both species and over time, independent of bait diversity. Hence, our method enables fast, cost-efficient, large-scale integration of contemporary and historical specimens for assessment of genome-wide genetic trends over time, independent of genome size and presence of a reference genome.
Affiliation(s)
- Patricia L M Lang
- Research Group for Ancient Genomics and Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany.,Department of Biology, Stanford University, Stanford, CA, USA
| | - Clemens L Weiß
- Research Group for Ancient Genomics and Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany.,Department of Genetics, Stanford University, Stanford, CA, USA
| | - Sonja Kersten
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Sergio M Latorre
- Research Group for Ancient Genomics and Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Sarah Nagel
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Birgit Nickel
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Matthias Meyer
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Hernán A Burbano
- Research Group for Ancient Genomics and Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany.,Centre for Life's Origins and Evolution, Department of Genetics, Evolution, and Environment, University College London, London, UK
| |
|
41
|
Chu J, Zhao Y, Beier S, Schulthess AW, Stein N, Philipp N, Röder MS, Reif JC. Suitability of Single-Nucleotide Polymorphism Arrays Versus Genotyping-By-Sequencing for Genebank Genomics in Wheat. FRONTIERS IN PLANT SCIENCE 2020; 11:42. [PMID: 32117381 PMCID: PMC7033508 DOI: 10.3389/fpls.2020.00042] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/15/2019] [Accepted: 01/13/2020] [Indexed: 05/20/2023]
Abstract
Genebank genomics promises to unlock valuable diversity for plant breeding but first, one key question is which marker system is most suitable to fingerprint entire genebank collections. Using wheat as model species, we tested for the presence of an ascertainment bias and investigated its impact on estimates of genetic diversity and prediction ability obtained using three marker platforms: simple sequence repeat (SSR), genotyping-by-sequencing (GBS), and array-based SNP markers. We used a panel of 378 winter wheat genotypes including 190 elite lines and 188 plant genetic resources (PGR), which were phenotyped in multi-environmental trials for grain yield and plant height. We observed an ascertainment bias for the array-based SNP markers, which led to an underestimation of the molecular diversity within the population of PGR. In contrast, the marker system played only a minor role for the overall picture of the population structure and precision of genome-wide predictions. Interestingly, we found that rare markers contributed substantially to the prediction ability. This combined with the expectation that valuable novel diversity is most likely rare suggests that markers with minor allele frequency deserve careful consideration in the design of a pre-breeding program.
Affiliation(s)
- Jianting Chu
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
| | - Yusheng Zhao
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
| | - Sebastian Beier
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
| | - Albert W. Schulthess
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
| | - Nils Stein
- Department of Genebank, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
| | - Norman Philipp
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
| | - Marion S. Röder
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
| | - Jochen C. Reif
- Department of Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
- Faculty of Sciences III - Agricultural and Nutritional Sciences, Earth Sciences and Computer Science, Martin-Luther-University Halle-Wittenberg, Halle/Saale, Germany
| |
|
42
|
Sard NM, Smith SR, Homola JJ, Kanefsky J, Bravener G, Adams JV, Holbrook CM, Hrodey PJ, Tallon K, Scribner KT. RAPTURE (RAD capture) panel facilitates analyses characterizing sea lamprey reproductive ecology and movement dynamics. Ecol Evol 2020; 10:1469-1488. [PMID: 32076528 PMCID: PMC7029094 DOI: 10.1002/ece3.6001] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/01/2019] [Revised: 12/16/2019] [Accepted: 12/18/2019] [Indexed: 12/18/2022] Open
Abstract
Genomic tools are lacking for invasive and native populations of sea lamprey (Petromyzon marinus). Our objective was to discover single nucleotide polymorphism (SNP) loci to conduct pedigree analyses to quantify reproductive contributions of adult sea lampreys and dispersion of sibling larval sea lampreys of different ages in Great Lakes tributaries. Additional applications of data were explored using additional geographically expansive samples. We used restriction site-associated DNA sequencing (RAD-Seq) to discover genetic variation in Duffins Creek (DC), Ontario, Canada, and the St. Clair River (SCR), Michigan, USA. We subsequently developed RAD capture baits to genotype 3,446 RAD loci that contained 11,970 SNPs. Based on RAD capture assays, estimates of variance in SNP allele frequency among five Great Lakes tributary populations (mean F_ST = 0.008; range 0.00-0.018) were concordant with previous microsatellite-based studies; however, outlier loci were identified that contributed substantially to spatial population genetic structure. At finer scales within streams, simulations indicated that accuracy in genetic pedigree reconstruction was high when 200 or 500 independent loci were used, even in situations of high spawner abundance (e.g., 1,000 adults). Based on empirical collections of larval sea lamprey genotypes, we found that age-1 and age-2 families of full and half-siblings were widely but nonrandomly distributed within stream reaches sampled. Using the genomic-scale set of SNP loci developed in this study, biologists can rapidly genotype sea lamprey in non-native and native ranges to investigate questions pertaining to population structuring and reproductive ecology at previously unattainable scales.
Affiliation(s)
- Nicholas M. Sard
- Department of Fisheries and Wildlife, Michigan State University, East Lansing, Michigan
- Biology Department, SUNY Oswego, Oswego, New York
| | - Seth R. Smith
- Department of Fisheries and Wildlife, Michigan State University, East Lansing, Michigan
| | - Jared J. Homola
- Department of Fisheries and Wildlife, Michigan State University, East Lansing, Michigan
| | - Jeannette Kanefsky
- Department of Fisheries and Wildlife, Michigan State University, East Lansing, Michigan
| | | | - Jean V. Adams
- Great Lakes Science Center, U.S. Geological Survey, Ann Arbor, Michigan
| | - Christopher M. Holbrook
- Great Lakes Science Center, Hammond Bay Biological Station, U.S. Geological Survey, Millersburg, Michigan
| | | | - Kevin Tallon
- Fisheries and Oceans Canada, Sault Ste. Marie, ON, Canada
| | - Kim T. Scribner
- Department of Fisheries and Wildlife, Michigan State University, East Lansing, Michigan
- Department of Integrative Biology, State University, East Lansing, Michigan
| |
|
43
|
Getachew T, Haile A, Mészáros G, Rischkowsky B, Huson H, Gizaw S, Wurzinger M, Mwai A, Sölkner J. Genetic diversity, population structure and runs of homozygosity in Ethiopian short fat-tailed and Awassi sheep breeds using genome-wide 50k SNP markers. Livest Sci 2020. [DOI: 10.1016/j.livsci.2019.103899] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/28/2022]
|
44
|
Mabire C, Duarte J, Darracq A, Pirani A, Rimbert H, Madur D, Combes V, Vitte C, Praud S, Rivière N, Joets J, Pichon JP, Nicolas SD. High throughput genotyping of structural variations in a complex plant genome using an original Affymetrix® axiom® array. BMC Genomics 2019; 20:848. [PMID: 31722668 PMCID: PMC6854671 DOI: 10.1186/s12864-019-6136-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/01/2019] [Accepted: 09/23/2019] [Indexed: 12/19/2022] Open
Abstract
Background Insertions/deletions (InDels) and more specifically presence/absence variations (PAVs) are pervasive in several species and have strong functional and phenotypic effects by removing or drastically modifying genes. Genotyping of such variants on large panels remains poorly addressed, while necessary for approaches such as association mapping or genomic selection. Results We have developed, as a proof of concept, a new high-throughput and affordable approach to genotype InDels. We first identified 141,000 InDels by aligning reads from the B73 line against the genomes of three temperate maize inbred lines (F2, PH207, and C103) and reciprocally. Next, we designed an Affymetrix® Axiom® array to target these InDels, with a combination of probes selected at breakpoint sites (13%) or within the InDel sequence, either at polymorphic (25%) or non-polymorphic (63%) sites. The final array design is composed of 662,772 probes and targets 105,927 InDels, including PAVs ranging from 35 bp to 129 kbp. After Affymetrix® quality control, we successfully genotyped 86,648 polymorphic InDels (82% of all InDels interrogated by the array) on 445 maize DNA samples with 422,369 probes. Genotyping InDels using this approach produced a highly reliable dataset, with low genotyping error (~3%), high call rate (~98%), and high reproducibility (>95%). This reliability can be further increased by combining genotyping of several probes calling the same InDels (<0.1% error rate and >99.9% call rate for 5 probes). This "proof of concept" tool was used to estimate the kinship matrix between 362 maize lines with 57,824 polymorphic InDels. This InDel kinship matrix was highly correlated with kinship estimated using SNPs from Illumina 50 K SNP arrays. Conclusions We efficiently genotyped thousands of small to large InDels on a sizeable number of individuals using a new Affymetrix® Axiom® array. This powerful approach opens the way to studying the contribution of InDels to trait variation and heterosis in maize. The approach is easily extendable to other species and should help decipher the biological impact of InDels at a larger scale.
Affiliation(s)
- Clément Mabire
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
| | - Jorge Duarte
- Biogemma - Centre de Recherche de Chappes, CS 90126, 63720, Chappes, France
| | - Aude Darracq
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
| | - Ali Pirani
- Thermo Fisher Scientific, 3450 Central Expressway, Santa Clara, CA, 95051, USA
| | - Hélène Rimbert
- Biogemma - Centre de Recherche de Chappes, CS 90126, 63720, Chappes, France.,Present address: GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France
| | - Delphine Madur
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
| | - Valérie Combes
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
| | - Clémentine Vitte
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
| | - Sébastien Praud
- Biogemma - Centre de Recherche de Chappes, CS 90126, 63720, Chappes, France
| | - Nathalie Rivière
- Biogemma - Centre de Recherche de Chappes, CS 90126, 63720, Chappes, France
| | - Johann Joets
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
| | | | - Stéphane D Nicolas
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France.
| |
|
45
|
Benjelloun B, Boyer F, Streeter I, Zamani W, Engelen S, Alberti A, Alberto FJ, BenBati M, Ibnelbachyr M, Chentouf M, Bechchari A, Rezaei HR, Naderi S, Stella A, Chikhi A, Clarke L, Kijas J, Flicek P, Taberlet P, Pompanon F. An evaluation of sequencing coverage and genotyping strategies to assess neutral and adaptive diversity. Mol Ecol Resour 2019; 19:1497-1515. [PMID: 31359622 PMCID: PMC7115901 DOI: 10.1111/1755-0998.13070] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/27/2018] [Revised: 06/30/2019] [Accepted: 07/08/2019] [Indexed: 12/12/2022]
Abstract
Whole genome sequences (WGS) greatly increase our ability to precisely infer population genetic parameters, demographic processes, and selection signatures. However, WGS may still not be affordable for a representative number of individuals/populations. In this context, our goal was to assess the efficiency of several SNP genotyping strategies by testing their ability to accurately estimate parameters describing neutral diversity and to detect signatures of selection. We analysed 110 WGS at 12× coverage for four different species, i.e., sheep, goats and their wild counterparts. From these data we generated 946 data sets corresponding to random panels of 1K to 5M variants, commercial SNP chips and exome capture, for sample sizes of five to 48 individuals. We also extracted low-coverage genome resequencing of 1×, 2× and 5× by randomly subsampling reads from the 12× resequencing data. Globally, 5K to 10K random variants were enough for an accurate estimation of genome diversity. Conversely, commercial panels and exome capture displayed strong ascertainment biases. Besides the characterization of neutral diversity, the detection of signatures of selection and the accurate estimation of linkage disequilibrium (LD) required high-density panels of at least 1M variants. Finally, genotype likelihoods increased the quality of variant calling from low-coverage resequencing, but proportions of incorrect genotypes remained substantial, especially for heterozygous sites. Whole genome resequencing coverage of at least 5× appeared to be necessary for accurate assessment of genomic variations. These results have implications for studies seeking to deploy low-density SNP collections or genome scans across genetically diverse populations/species showing similar genetic characteristics and patterns of LD decay for a wide variety of purposes.
Affiliation(s)
- Badr Benjelloun
- Univ. Grenoble-Alpes, Univ. Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
- National Institute of Agronomic Research (INRA Maroc), Regional Centre of Agronomic Research, 23000 Beni-Mellal, Morocco
| | - Frédéric Boyer
- Univ. Grenoble-Alpes, Univ. Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
| | - Ian Streeter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Wahid Zamani
- Univ. Grenoble-Alpes, Univ. Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
- Department of Environmental Sciences, Faculty of Natural Resources and Marine Sciences, Tarbiat Modares University, 46417-76489 Noor, Mazandaran, Iran
| | - Stefan Engelen
- CEA - Institut de biologie François-Jacob, Genoscope, 2 Rue Gaston Cremieux 91057 Evry Cedex, France
| | - Adriana Alberti
- CEA - Institut de biologie François-Jacob, Genoscope, 2 Rue Gaston Cremieux 91057 Evry Cedex, France
| | - Florian J. Alberto
- Univ. Grenoble-Alpes, Univ. Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
| | - Mohamed BenBati
- National Institute of Agronomic Research (INRA Maroc), Regional Centre of Agronomic Research, 23000 Beni-Mellal, Morocco
| | - Mustapha Ibnelbachyr
- National Institute of Agronomic Research (INRA Maroc), CRRA Errachidia, 52000 Errachidia, Morocco
| | - Mouad Chentouf
- National Institute of Agronomic Research (INRA Maroc), CRRA Tangier, 90010 Tangier, Morocco
| | - Abdelmajid Bechchari
- National Institute of Agronomic Research (INRA Maroc), CRRA Oujda, 60000 Oujda, Morocco
| | - Hamid R. Rezaei
- Department of Environmental Sci, Gorgan University of Agricultural Sciences & Natural Resources, 41996-13776 Gorgan, Iran
| | - Saeid Naderi
- Environmental Sciences Department, Natural Resources Faculty, University of Guilan, 49138-15749 Guilan, Iran
| | - Alessandra Stella
- PTP Science Park, Bioinformatics Unit, Via Einstein-Loc. Cascina Codazza, 26900 Lodi, Italy
| | - Abdelkader Chikhi
- National Institute of Agronomic Research (INRA Maroc), CRRA Errachidia, 52000 Errachidia, Morocco
| | - Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - James Kijas
- Commonwealth Scientific and Industrial Research Organisation Animal Food and Health Sciences, St Lucia, QLD 4067, Australia
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Pierre Taberlet
- Univ. Grenoble-Alpes, Univ. Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
| | - François Pompanon
- Univ. Grenoble-Alpes, Univ. Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
| |
|
46
|
Drislane C, Irvine AD. The role of filaggrin in atopic dermatitis and allergic disease. Ann Allergy Asthma Immunol 2019; 124:36-43. [PMID: 31622670 DOI: 10.1016/j.anai.2019.10.008] [Citation(s) in RCA: 178] [Impact Index Per Article: 29.7] [Received: 10/02/2019] [Revised: 10/07/2019] [Accepted: 10/07/2019] [Indexed: 12/14/2022]
Abstract
OBJECTIVE To provide an overview of filaggrin biology and the role of filaggrin variants in atopic dermatitis (AD) and allergic disease. DATA SOURCES We performed a PubMed literature review consisting mainly of studies relating to filaggrin in the last 5 years. STUDY SELECTIONS We selected articles that were found in PubMed using the search terms filaggrin, atopic dermatitis, skin barrier, and atopy. RESULTS Filaggrin plays an important role in the development of AD and allergic disease. Novel methods in measuring filaggrin expression and identifying filaggrin mutations aid in stratifying this patient cohort. We review new insights into understanding the role of filaggrin in AD and allergic disease. CONCLUSION Filaggrin remains a very important player in the pathogenesis of atopic dermatitis and allergic disease. This review looks at recent studies that aid our understanding of this crucial epidermal protein.
Affiliation(s)
| | - Alan D Irvine
- Department of Paediatric Dermatology, Our Lady's Children's Hospital Crumlin, Dublin, National Children's Research Centre, Crumlin and Clinical Medicine, Trinity College Dublin, Ireland.
| |
|
47
|
do Amaral M, Barbosa de Paula MF, Ollitrault F, Rivallan R, de Andrade Silva EM, da Silva Gesteira A, Luro F, Garcia D, Ollitrault P, Micheli F. Phylogenetic Origin of Primary and Secondary Metabolic Pathway Genes Revealed by C. maxima and C. reticulata Diagnostic SNPs. FRONTIERS IN PLANT SCIENCE 2019; 10:1128. [PMID: 31608086 PMCID: PMC6771394 DOI: 10.3389/fpls.2019.01128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2019] [Accepted: 08/15/2019] [Indexed: 06/10/2023]
Abstract
Modern cultivated Citrus species and varieties result from interspecific hybridization between four ancestral taxa. Among them, Citrus maxima and Citrus reticulata, closely associated with the pummelo and mandarin horticultural groups, respectively, were particularly important as the progenitors of sour and sweet oranges (Citrus aurantium and Citrus sinensis), grapefruits (Citrus paradisi), and hybrid types resulting from modern breeding programs (tangors, tangelos, and orangelos). The differentiation between the four ancestral taxa and the phylogenomic structure of modern varieties largely drive the organization of phenotypic diversity. In particular, strong phenotypic differences exist in coloration and sweetness, which represent important criteria for breeders. In this context, focusing on the genes of the sugar, carotenoid, and chlorophyll biosynthesis pathways, the aim of this work was to develop a set of diagnostic single-nucleotide polymorphism (SNP) markers to distinguish the ancestral haplotypes of C. maxima and C. reticulata and to provide information at the intraspecific diversity level (within C. reticulata or C. maxima). In silico analysis allowed the identification of 3,347 SNPs from selected genes. Among them, 1,024 were detected as potential differentiation markers between C. reticulata and C. maxima. A total of 115 SNPs were successfully developed using a competitive PCR technology. Their transferability among all Citrus species and the true citrus genera was very good, with only 0.87% of missing data. The ancestral alleles of the SNPs were identified, and we validated the usefulness of the developed markers for tracing the ancestral haplotype in large germplasm collections and sexually recombined progeny derived from the C. reticulata/C. maxima admixture gene pool. These markers will pave the way for targeted association studies based on ancestral haplotypes.
Affiliation(s)
- Milena do Amaral
- Centro de Biotecnologia e Genética (CBG), Departamento de Ciências Biológicas (DCB), Universidade Estadual de Santa Cruz (UESC), Ilhéus, Brazil
| | - Marcia Fabiana Barbosa de Paula
- Centro de Biotecnologia e Genética (CBG), Departamento de Ciências Biológicas (DCB), Universidade Estadual de Santa Cruz (UESC), Ilhéus, Brazil
| | | | | | - Edson Mario de Andrade Silva
- Centro de Biotecnologia e Genética (CBG), Departamento de Ciências Biológicas (DCB), Universidade Estadual de Santa Cruz (UESC), Ilhéus, Brazil
| | | | | | | | | | - Fabienne Micheli
- Centro de Biotecnologia e Genética (CBG), Departamento de Ciências Biológicas (DCB), Universidade Estadual de Santa Cruz (UESC), Ilhéus, Brazil
- CIRAD, UMR AGAP, Montpellier, France
| |
|
48
|
Batcha AMN, Bamopoulos SA, Kerbs P, Kumar A, Jurinovic V, Rothenberg-Thurley M, Ksienzyk B, Philippou-Massier J, Krebs S, Blum H, Schneider S, Konstandin N, Bohlander SK, Heckman C, Kontro M, Hiddemann W, Spiekermann K, Braess J, Metzeler KH, Greif PA, Mansmann U, Herold T. Allelic Imbalance of Recurrently Mutated Genes in Acute Myeloid Leukaemia. Sci Rep 2019; 9:11796. [PMID: 31409822 PMCID: PMC6692371 DOI: 10.1038/s41598-019-48167-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/03/2019] [Accepted: 07/29/2019] [Indexed: 12/24/2022] Open
Abstract
The patho-mechanism of somatic driver mutations in cancer usually involves transcription, but the proportion of mutations and wild-type alleles transcribed from DNA to RNA is largely unknown. We systematically compared the variant allele frequencies of recurrently mutated genes in DNA and RNA sequencing data of 246 acute myeloid leukaemia (AML) patients. We observed that 95% of all detected variants were transcribed while the rest were not detectable in RNA sequencing with a minimum read-depth cut-off (10x). Our analysis focusing on 11 genes harbouring recurring mutations demonstrated allelic imbalance (AI) in most patients. GATA2, RUNX1, TET2, SRSF2, IDH2, PTPN11, WT1, NPM1 and CEBPA showed significant AIs. While the effect size was small in general, GATA2 exhibited the largest allelic imbalance. By pooling heterogeneous data from three independent AML cohorts with paired DNA and RNA sequencing (N = 253), we could validate the preferential transcription of GATA2-mutated alleles. Differential expression analysis of the genes with significant AI showed no significant differential gene and isoform expression for the mutated genes, between mutated and wild-type patients. In conclusion, our analyses identified AI in nine out of eleven recurrently mutated genes. AI might be a common phenomenon in AML which potentially contributes to leukaemogenesis.
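The comparison described in this abstract rests on the variant allele frequency (VAF) of the same variant in DNA and RNA sequencing, with a minimum read depth of 10x. A minimal sketch of that per-variant calculation follows. The function names and the use of a simple RNA-minus-DNA difference as the imbalance measure are assumptions for illustration; the abstract does not state the statistical test the authors applied.

```python
def vaf(alt_reads, ref_reads, min_depth=10):
    """Variant allele frequency, or None when depth is below the cut-off."""
    depth = alt_reads + ref_reads
    if depth < min_depth:
        return None  # variant not assessable at this depth
    return alt_reads / depth


def allelic_imbalance(dna_counts, rna_counts, min_depth=10):
    """RNA VAF minus DNA VAF for one variant.

    Each argument is an (alt_reads, ref_reads) pair. A positive value
    suggests the mutated allele is over-represented in the transcripts.
    Returns None if either assay fails the depth cut-off.
    """
    dna_vaf = vaf(dna_counts[0], dna_counts[1], min_depth)
    rna_vaf = vaf(rna_counts[0], rna_counts[1], min_depth)
    if dna_vaf is None or rna_vaf is None:
        return None
    return rna_vaf - dna_vaf


# Illustrative read counts only, not data from the study.
shift = allelic_imbalance(dna_counts=(40, 60), rna_counts=(30, 20))
low_depth = allelic_imbalance(dna_counts=(40, 60), rna_counts=(3, 4))
```

In a cohort analysis, per-patient differences of this kind would be collected per gene and tested for a systematic departure from zero.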
Affiliation(s)
- Aarif M N Batcha
- Institute of Medical Data Processing, Biometrics and Epidemiology (IBE), Faculty of Medicine, LMU Munich, Munich, Germany.,Data Integration for Future Medicine (DiFuture, www.difuture.de), LMU Munich, Munich, Germany.
| | - Stefanos A Bamopoulos
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Paul Kerbs
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Ashwini Kumar
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - Vindi Jurinovic
- Institute of Medical Data Processing, Biometrics and Epidemiology (IBE), Faculty of Medicine, LMU Munich, Munich, Germany.,Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Maja Rothenberg-Thurley
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Bianka Ksienzyk
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Julia Philippou-Massier
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, University of Munich, Munich, Germany
| | - Stefan Krebs
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, University of Munich, Munich, Germany
| | - Helmut Blum
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, University of Munich, Munich, Germany
| | - Stephanie Schneider
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany.,Institute of Human Genetics, University Hospital, LMU Munich, Munich, Germany
| | - Nikola Konstandin
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Stefan K Bohlander
- Leukaemia and Blood Cancer Research Unit, Department of Molecular Medicine and Pathology, University of Auckland, Auckland, New Zealand
| | - Caroline Heckman
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - Mika Kontro
- Department of Haematology, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
| | - Wolfgang Hiddemann
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany.,German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany.,German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Karsten Spiekermann
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany.,German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany.,German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Jan Braess
- Department of Oncology and Hematology, Hospital Barmherzige Brüder, Regensburg, Germany
| | - Klaus H Metzeler
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany.,German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany.,German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Philipp A Greif
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany.,German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany.,German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Ulrich Mansmann
- Institute of Medical Data Processing, Biometrics and Epidemiology (IBE), Faculty of Medicine, LMU Munich, Munich, Germany.,Data Integration for Future Medicine (DiFuture, www.difuture.de), LMU Munich, Munich, Germany.,German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany.,German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Tobias Herold
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany.,German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany.,German Cancer Research Center (DKFZ), Heidelberg, Germany.,Research Unit Apoptosis in Hematopoietic Stem Cells, Helmholtz Zentrum München, German Research Center for Environmental Health (HMGU), Munich, Germany.
| |
|
49
|
SNV discovery and functional candidate gene identification for milk composition based on whole genome resequencing of Holstein bulls with extremely high and low breeding values. PLoS One 2019; 14:e0220629. [PMID: 31369641 PMCID: PMC6675115 DOI: 10.1371/journal.pone.0220629] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/04/2019] [Accepted: 07/19/2019] [Indexed: 02/06/2023] Open
Abstract
We sequenced the whole genomes of eight proven Holstein bulls from four half-sib or full-sib families with extremely high and low estimated breeding values (EBV) for milk protein percentage (PP) and fat percentage (FP) using Illumina re-sequencing technology. In total, 2.3 billion raw reads were obtained, with an average effective depth of 8.1×. After single nucleotide variant (SNV) calling, a total of 10,961,243 SNVs were identified, and 57,451 of them showed opposite fixed sites between the bulls with high and low EBVs within each family (termed common differential SNVs). Next, we annotated the common differential SNVs based on the bovine reference genome and observed that 45,188 SNVs (78.70%) were located in intergenic regions and only 11,871 SNVs (20.67%) were located within protein-coding genes. Of the common differential SNVs, 13,099 that lay within protein-coding genes or less than 5 kb from them were chosen for identification of candidate genes for milk composition in dairy cattle. Through integrated analysis of the 2,657 genes with GO terms and pathways related to protein and fat metabolism, and of known quantitative trait loci (QTLs) for milk protein and fat traits, we identified 17 promising candidate genes: ALG14, ATP2C1, PLD1, C3H1orf85, SNX7, MTHFD2L, CDKN2D, COL5A3, FDX1L, PIN1, FIG4, EXOC7, LASP1, PGS1, SAO, GPLD1 and MGEA5. Our findings provide an important foundation for further study and for molecular breeding of dairy cattle.
|
50
|
Minias P, Dunn PO, Whittingham LA, Johnson JA, Oyler-McCance SJ. Evaluation of a Chicken 600K SNP genotyping array in non-model species of grouse. Sci Rep 2019; 9:6407. [PMID: 31015535 PMCID: PMC6478925 DOI: 10.1038/s41598-019-42885-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/19/2018] [Accepted: 04/11/2019] [Indexed: 12/30/2022] Open
Abstract
The use of single nucleotide polymorphism (SNP) arrays to generate large SNP datasets for comparison purposes has recently become an attractive alternative to other genotyping methods. Although most SNP arrays were originally developed for domestic organisms, they can be effectively applied to wild relatives to obtain large panels of SNPs. In this study, we tested the cross-species application of the Affymetrix 600K Chicken SNP array in five species of North American prairie grouse (Centrocercus and Tympanuchus genera). Two individuals were genotyped per species for a total of ten samples. A high proportion (91%) of the total 580 961 SNPs were genotyped in at least one individual (73–76% SNPs genotyped per species). Principal component analysis with autosomal SNPs separated the two genera, but failed to clearly distinguish species within genera. Gene ontology analysis identified a set of genes related to morphogenesis and development (including genes involved in feather development), which may be primarily responsible for large phenotypic differences between Centrocercus and Tympanuchus grouse. Our study provided evidence for successful cross-species application of the chicken SNP array in grouse which diverged ca. 37 mya from the chicken lineage. As far as we are aware, this is the first reported application of a SNP array in non-passerine birds, and it demonstrates the feasibility of using commercial SNP arrays in research on non-model bird species.
Affiliation(s)
- Piotr Minias
- Department of Biodiversity Studies and Bioeducation, Faculty of Biology and Environmental Protection, University of Łódź, Banacha 1/3, 90-237, Łódź, Poland.
| | - Peter O Dunn
- Department of Biodiversity Studies and Bioeducation, Faculty of Biology and Environmental Protection, University of Łódź, Banacha 1/3, 90-237, Łódź, Poland.,Behavioral and Molecular Ecology Group, Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
| | - Linda A Whittingham
- Behavioral and Molecular Ecology Group, Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
| | - Jeff A Johnson
- Department of Biological Sciences, Institute of Applied Sciences, University of North Texas, Denton, Texas, USA
| | | |
|