1
|
Ringbauer H, Huang Y, Akbari A, Mallick S, Olalde I, Patterson N, Reich D. Accurate detection of identity-by-descent segments in human ancient DNA. Nat Genet 2024; 56:143-151. [PMID: 38123640 PMCID: PMC10786714 DOI: 10.1038/s41588-023-01582-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Accepted: 10/20/2023] [Indexed: 12/23/2023]
Abstract
Long DNA segments shared between two individuals, known as identity-by-descent (IBD), reveal recent genealogical connections. Here we introduce ancIBD, a method for identifying IBD segments in ancient human DNA (aDNA) using a hidden Markov model and imputed genotype probabilities. We demonstrate that ancIBD accurately identifies IBD segments >8 cM for aDNA data with an average depth of >0.25× for whole-genome sequencing or >1× for 1240k single nucleotide polymorphism capture data. Applying ancIBD to 4,248 ancient Eurasian individuals, we identify relatives up to the sixth degree and genealogical connections between archaeological groups. Notably, we reveal long IBD sharing between Corded Ware and Yamnaya groups, indicating that the Yamnaya herders of the Pontic-Caspian Steppe and the Steppe-related ancestry in various European Corded Ware groups share substantial co-ancestry within only a few hundred years. These results show that detecting IBD segments can generate powerful insights into the growing aDNA record, both on a small scale relevant to life stories and on a large scale relevant to major cultural-historical events.
Collapse
Affiliation(s)
- Harald Ringbauer
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA.
| | - Yilei Huang
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Bioinformatics Group, Institute of Computer Science, Universität Leipzig, Leipzig, Germany
| | - Ali Akbari
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Swapan Mallick
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
| | - Iñigo Olalde
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- BIOMICs Research Group, University of the Basque Country, Vitoria-Gasteiz, Spain
- Ikerbasque-Basque Foundation of Science, Bilbao, Spain
| | - Nick Patterson
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - David Reich
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA.
- Department of Genetics, Harvard Medical School, Boston, MA, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA.
| |
Collapse
|
2
|
Sousa da Mota B, Rubinacci S, Cruz Dávalos DI, G Amorim CE, Sikora M, Johannsen NN, Szmyt MH, Włodarczak P, Szczepanek A, Przybyła MM, Schroeder H, Allentoft ME, Willerslev E, Malaspinas AS, Delaneau O. Imputation of ancient human genomes. Nat Commun 2023; 14:3660. [PMID: 37339987 PMCID: PMC10282092 DOI: 10.1038/s41467-023-39202-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2022] [Accepted: 06/02/2023] [Indexed: 06/22/2023] Open
Abstract
Due to postmortem DNA degradation and microbial colonization, most ancient genomes have low depth of coverage, hindering genotype calling. Genotype imputation can improve genotyping accuracy for low-coverage genomes. However, it is unknown how accurate ancient DNA imputation is and whether imputation introduces bias to downstream analyses. Here we re-sequence an ancient trio (mother, father, son) and downsample and impute a total of 43 ancient genomes, including 42 high-coverage (above 10x) genomes. We assess imputation accuracy across ancestries, time, depth of coverage, and sequencing technology. We find that ancient and modern DNA imputation accuracies are comparable. When downsampled at 1x, 36 of the 42 genomes are imputed with low error rates (below 5%) while African genomes have higher error rates. We validate imputation and phasing results using the ancient trio data and an orthogonal approach based on Mendel's rules of inheritance. We further compare the downstream analysis results between imputed and high-coverage genomes, notably principal component analysis, genetic clustering, and runs of homozygosity, observing similar results starting from 0.5x coverage, except for the African genomes. These results suggest that, for most populations and depths of coverage as low as 0.5x, imputation is a reliable method that can improve ancient DNA studies.
Collapse
Affiliation(s)
- Bárbara Sousa da Mota
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | - Simone Rubinacci
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | - Diana Ivette Cruz Dávalos
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | | | - Martin Sikora
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Niels N Johannsen
- Department of Archaeology and Heritage Studies, Aarhus University, Aarhus, Denmark
| | - Marzena H Szmyt
- Institute for Eastern Research, Adam Mickiewicz University in Poznań, Poznań, Poland
| | - Piotr Włodarczak
- Institute of Archaeology and Ethnology, Polish Academy of Sciences, Kraków, Poland
| | - Anita Szczepanek
- Institute of Archaeology and Ethnology, Polish Academy of Sciences, Kraków, Poland
- Department of Anatomy, Jagiellonian University, Medical College, Kraków, Poland
| | | | - Hannes Schroeder
- The Globe Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Morten E Allentoft
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Trace and Environmental DNA (TrEnD) Laboratory, School of Molecular and Life Science, Curtin University, Bentley, WA, Australia
| | - Eske Willerslev
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
- MARUM, University of Bremen, Bremen, Germany
| | - Anna-Sapfo Malaspinas
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland.
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
- Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland.
| |
Collapse
|
3
|
Ringbauer H, Huang Y, Akbari A, Mallick S, Patterson N, Reich D. ancIBD - Screening for identity by descent segments in human ancient DNA. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.08.531671. [PMID: 36945531 PMCID: PMC10028887 DOI: 10.1101/2023.03.08.531671] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]
Abstract
Long DNA sequences shared between two individuals, known as Identical by descent (IBD) segments, are a powerful signal for identifying close and distant biological relatives because they only arise when the pair shares a recent common ancestor. Existing methods to call IBD segments between present-day genomes cannot be straightforwardly applied to ancient DNA data (aDNA) due to typically low coverage and high genotyping error rates. We present ancIBD, a method to identify IBD segments for human aDNA data implemented as a Python package. Our approach is based on a Hidden Markov Model, using as input genotype probabilities imputed based on a modern reference panel of genomic variation. Through simulation and downsampling experiments, we demonstrate that ancIBD robustly identifies IBD segments longer than 8 centimorgan for aDNA data with at least either 0.25x average whole-genome sequencing (WGS) coverage depth or at least 1x average depth for in-solution enrichment experiments targeting a widely used aDNA SNP set ('1240k'). This application range allows us to screen a substantial fraction of the aDNA record for IBD segments and we showcase two downstream applications. First, leveraging the fact that biological relatives up to the sixth degree are expected to share multiple long IBD segments, we identify relatives between 10,156 ancient Eurasian individuals and document evidence of long-distance migration, for example by identifying a pair of two approximately fifth-degree relatives who were buried 1410km apart in Central Asia 5000 years ago. Second, by applying ancIBD, we reveal new details regarding the spread of ancestry related to Steppe pastoralists into Europe starting 5000 years ago. We find that the first individuals in Central and Northern Europe carrying high amounts of Steppe-ancestry, associated with the Corded Ware culture, share high rates of long IBD (12-25 cM) with Yamnaya herders of the Pontic-Caspian steppe, signaling a strong bottleneck and a recent biological connection on the order of only few hundred years, providing evidence that the Yamnaya themselves are a main source of Steppe ancestry in Corded Ware people. We also detect elevated sharing of long IBD segments between Corded Ware individuals and people associated with the Globular Amphora culture (GAC) from Poland and Ukraine, who were Copper Age farmers not yet carrying Steppe-like ancestry. These IBD links appear for all Corded Ware groups in our analysis, indicating that individuals related to GAC contexts must have had a major demographic impact early on in the genetic admixtures giving rise to various Corded Ware groups across Europe. These results show that detecting IBD segments in aDNA can generate new insights both on a small scale, relevant to understanding the life stories of people, and on the macroscale, relevant to large-scale cultural-historical events.
Collapse
Affiliation(s)
- Harald Ringbauer
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Yilei Huang
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Bioinformatics Group, Institute of Computer Science, Universität Leipzig, Leipzig, Germanÿ
| | - Ali Akbari
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Swapan Mallick
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
| | - Nick Patterson
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - David Reich
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
| |
Collapse
|