1
Versoza CJ, Lloret-Villas A, Jensen JD, Pfeifer SP. A Pedigree-Based Map of Crossovers and Noncrossovers in Aye-Ayes (Daubentonia madagascariensis). Genome Biol Evol 2025; 17:evaf072. [PMID: 40242950 PMCID: PMC12079367 DOI: 10.1093/gbe/evaf072] [Received: 03/24/2025] [Accepted: 04/10/2025] [Indexed: 04/18/2025]
Abstract
Gaining a better understanding of the rates and patterns of meiotic recombination is crucial for improving evolutionary genomic modeling, with applications ranging from demographic to selective inference. Although previous research has provided important insights into the landscape of crossovers in humans and other haplorrhines, our understanding of both the considerably more common outcome of recombination (i.e. noncrossovers) as well as the landscapes in more distantly related primates (i.e. strepsirrhines) remains limited owing to difficulties associated with both the identification of noncrossover tracts as well as species sampling. Thus, in order to elucidate recombination patterns in this understudied branch of the primate clade, we here characterize crossover and noncrossover landscapes in aye-ayes utilizing whole-genome sequencing data from six three-generation pedigrees and three two-generation multi-sibling families, and in so doing provide novel insights into this important evolutionary process shaping genomic diversity in one of the world's most critically endangered primate species.
Affiliation(s)
- Cyril J Versoza
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Audald Lloret-Villas
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Jeffrey D Jensen
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Susanne P Pfeifer
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
2
Porubsky D, Dashnow H, Sasani TA, Logsdon GA, Hallast P, Noyes MD, Kronenberg ZN, Mokveld T, Koundinya N, Nolan C, Steely CJ, Guarracino A, Dolzhenko E, Harvey WT, Rowell WJ, Grigorev K, Nicholas TJ, Goldberg ME, Oshima KK, Lin J, Ebert P, Watkins WS, Leung TY, Hanlon VCT, McGee S, Pedersen BS, Happ HC, Jeong H, Munson KM, Hoekzema K, Chan DD, Wang Y, Knuth J, Garcia GH, Fanslow C, Lambert C, Lee C, Smith JD, Levy S, Mason CE, Garrison E, Lansdorp PM, Neklason DW, Jorde LB, Quinlan AR, Eberle MA, Eichler EE. Human de novo mutation rates from a four-generation pedigree reference. Nature 2025:10.1038/s41586-025-08922-2. [PMID: 40269156 DOI: 10.1038/s41586-025-08922-2] [Received: 08/05/2024] [Accepted: 03/20/2025] [Indexed: 04/25/2025]
Abstract
Understanding the human de novo mutation (DNM) rate requires complete sequence information [1]. Here using five complementary short-read and long-read sequencing technologies, we phased and assembled more than 95% of each diploid human genome in a four-generation, twenty-eight-member family (CEPH 1463). We estimate 98-206 DNMs per transmission, including 74.5 de novo single-nucleotide variants, 7.4 non-tandem repeat indels, 65.3 de novo indels or structural variants originating from tandem repeats, and 4.4 centromeric DNMs. Among male individuals, we find 12.4 de novo Y chromosome events per generation. Short tandem repeats and variable-number tandem repeats are the most mutable, with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 16% of de novo single-nucleotide variants are postzygotic in origin with no paternal bias, including early germline mosaic mutations. We place all this variation in the context of a high-resolution recombination map (~3.4 kb breakpoint resolution) and find no correlation between meiotic crossover and de novo structural variants. These near-telomere-to-telomere familial genomes provide a truth set to understand the most fundamental processes underlying human genetic variation.
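As a quick arithmetic check on the figures quoted in this abstract, the four per-transmission DNM classes can be summed and compared with the reported 98-206 range. This is only a sketch: the dictionary keys are illustrative labels rather than names from the paper, and the Y-chromosome figure is left out on the assumption that it applies to male transmissions only.

```python
# Per-transmission DNM class means quoted in the abstract.
# The keys are illustrative labels, not identifiers from the paper.
dnm_classes = {
    "snv": 74.5,
    "non_tandem_repeat_indel": 7.4,
    "tandem_repeat_indel_or_sv": 65.3,
    "centromeric": 4.4,
}

# Sum the class means; round to one decimal to avoid floating-point noise.
total = round(sum(dnm_classes.values()), 1)

# Reported range of DNMs per transmission.
low, high = 98, 206
within_range = low <= total <= high

print(total, within_range)
```

The class means sum to 151.6, which falls inside the quoted range.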
Affiliation(s)
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Harriet Dashnow
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Thomas A Sasani
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Pille Hallast
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Michelle D Noyes
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Nidhi Koundinya
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Cody J Steely
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
- Andrea Guarracino
  - Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kirill Grigorev
  - Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
  - Blue Marble Space Institute of Science, Seattle, WA, USA
- Thomas J Nicholas
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Michael E Goldberg
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Keisuke K Oshima
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jiadong Lin
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Peter Ebert
  - Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
  - Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- W Scott Watkins
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Tiffany Y Leung
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Vincent C T Hanlon
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Sean McGee
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Brent S Pedersen
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Hannah C Happ
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Hyeonsoo Jeong
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Altos Labs, San Diego, CA, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Daniel D Chan
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Yanni Wang
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Jordan Knuth
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Gage H Garcia
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Joshua D Smith
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
  - The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
- Erik Garrison
  - Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Peter M Lansdorp
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Deborah W Neklason
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Lynn B Jorde
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Aaron R Quinlan
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
3
Li R, Zang Y, Liu J, Wu E, Wu R, Sun H. Inferring the Degree of Relatedness and Kinship Types Using an All-in-One Marker Set. Genes (Basel) 2025; 16:455. [PMID: 40282415 PMCID: PMC12026669 DOI: 10.3390/genes16040455] [Received: 03/18/2025] [Revised: 04/09/2025] [Accepted: 04/13/2025] [Indexed: 04/29/2025]
Abstract
BACKGROUND/OBJECTIVES: Kinship inference is commonly adopted in various forensic applications, but previous studies have often lacked precision. METHODS: In this study, a new method for the nomenclature of kinship types, i.e., kinship chain (KC), was proposed, and then, six types of identity by state (IBS) scores were calculated for simulated and real families using four types of markers. Finally, several Bayesian network (BN)-based classifiers were constructed to investigate the efficiency of the kinship inference. RESULTS: A total of 7, 22, 58, and 3 KCs were obtained for common first-, second-, and third-degree relatives and unrelated pairs, respectively. High accuracies could be achieved in distinguishing between related and unrelated pairs after combining the four types of genetic markers, with an accuracy of >99.99% for all 7 KCs of first-degree relationships and ~99% for 14 out of 22 KCs of second-degree relatives. When comparing relationships of the same degree, the accuracies were 99.28%, 42.31%, and 15.82% for first-, second-, and third-degree relationships, respectively. When it came to differentiating unspecific relationships, the overall accuracy was over 80%. All the results were validated on real family data. CONCLUSIONS: With the new nomenclature method of kinship types and the combination of autosomal and non-autosomal genetic markers, kinship inference can be realized with high accuracy and precision, which will be helpful in complex forensic cases, such as the identification of mass disaster victims.
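The abstract does not define its six IBS score types, so the following is only a generic sketch of the usual per-locus identity-by-state count (0, 1, or 2 shared alleles) summed over markers. The function names and the toy genotypes are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

Genotype = Tuple[str, str]  # unordered pair of alleles at one locus


def ibs_at_locus(g1: Genotype, g2: Genotype) -> int:
    """Return how many alleles (0, 1, or 2) two genotypes share by state."""
    a, b = g1
    c, d = g2
    # Try both ways of pairing the alleles and keep the better match.
    return max((a == c) + (b == d), (a == d) + (b == c))


def ibs_score(p1: Sequence[Genotype], p2: Sequence[Genotype]) -> int:
    """Sum the per-locus IBS counts over all markers typed in both people."""
    if len(p1) != len(p2):
        raise ValueError("both profiles must cover the same loci")
    return sum(ibs_at_locus(x, y) for x, y in zip(p1, p2))


# Toy three-locus profiles (hypothetical data).
person_a = [("A", "G"), ("C", "C"), ("T", "T")]
person_b = [("G", "A"), ("C", "T"), ("G", "G")]
print(ibs_score(person_a, person_b))
```

In this toy case the loci contribute 2, 1, and 0 shared alleles, for a total score of 3.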
Affiliation(s)
- Ran Li
  - Medical College, Jiaying University, Meizhou 514031, China
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Yu Zang
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Jiajun Liu
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Enlin Wu
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Riga Wu
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
- Hongyu Sun
  - Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
  - Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou 510080, China
4
Carioscia SA, Biddanda A, Starostik MR, Tang X, Hoffmann ER, Demko ZP, McCoy RC. Common variation in meiosis genes shapes human recombination phenotypes and aneuploidy risk. medRxiv 2025:2025.04.02.25325097. [PMID: 40321295 PMCID: PMC12047964 DOI: 10.1101/2025.04.02.25325097] [Indexed: 05/14/2025]
Abstract
The leading cause of human pregnancy loss is aneuploidy, often tracing to errors in chromosome segregation during female meiosis. While abnormal crossover recombination is known to confer risk for aneuploidy, limited data have hindered understanding of the potential shared genetic basis of these key molecular phenotypes. To address this gap, we performed retrospective analysis of preimplantation genetic testing data from 139,416 in vitro fertilized embryos from 22,850 sets of biological parents. By tracing transmission of haplotypes, we identified 3,656,198 crossovers, as well as 92,485 aneuploid chromosomes. Counts of crossovers were lower in aneuploid versus euploid embryos, consistent with their role in chromosome pairing and segregation. Our analyses further revealed that a common haplotype spanning the meiotic cohesin SMC1B is significantly associated with both crossover count and maternal meiotic aneuploidy, with evidence supporting a non-coding cis-regulatory mechanism. Transcriptome- and phenome-wide association tests also implicated variation in the synaptonemal complex component C14orf39 and crossover-regulating ubiquitin ligases CCNB1IP1 and RNF212 in meiotic aneuploidy risk. More broadly, recombination and aneuploidy possess a partially shared genetic basis that also overlaps with reproductive aging traits. Our findings highlight the dual role of recombination in generating genetic diversity, while ensuring meiotic fidelity.
Affiliation(s)
- Arjun Biddanda
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Xiaona Tang
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Eva R. Hoffmann
  - DNRF Center for Chromosome Stability, Department of Cellular and Molecular Medicine, University of Copenhagen, Copenhagen, Denmark
- Rajiv C. McCoy
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
5
Kling D, Jepsen AH, Kampmann ML, Jacobsen SB, Tillmar A, Børsting C, Andersen JD. Forensic investigative genetic genealogy using genotypes generated or imputed from transcriptomes. Forensic Sci Int Genet 2025; 78:103277. [PMID: 40121765 DOI: 10.1016/j.fsigen.2025.103277] [Received: 12/11/2024] [Revised: 02/04/2025] [Accepted: 03/18/2025] [Indexed: 03/25/2025]
Abstract
The utility of transcriptome analysis in forensic genetics is steadily increasing. The transcriptome, with its ability to reflect both transcript levels and their nucleotide sequences, is proving to be useful for a variety of different applications, including body fluid identification and donor assignment, thereby providing both genetic and contextual information. Furthermore, the substantial single nucleotide polymorphism (SNP) coverage obtainable with whole transcriptome sequencing may prove useful for additional applications. In this study, we expand the current knowledge of transcriptomics in forensic genetics by showing how RNA can be used for forensic investigative genetic genealogy (FIGG) purposes and inference of distant relationships. Genetic data was simulated for relationships ranging from full siblings (first-degree relatives) to third cousins (seventh-degree relatives). The sets of SNP genotypes were subsequently reduced to only include observed and imputed SNP genotypes at loci covered by transcriptome sequencing of whole blood. The relationships of relatives as distant as second cousins could be reliably classified based on an average of 99,548 SNPs. Appropriate thresholds for sequence quality parameters limited the rate of erroneous genotype calls, with the remaining errors proving to have little to no effect on relationship inference. In conclusion, we present a proof-of-concept study on how transcriptome-based genotypes, in combination with imputed genotypes, may be used for reliable relationship inference for FIGG purposes.
Affiliation(s)
- Daniel Kling
  - Department of Forensic Sciences, Oslo University Hospital, Oslo, Norway; Department of Forensic Genetics and Forensic Toxicology, National Board of Forensic Medicine, Linköping, Sweden; Biostatistics (BIAS), Norwegian University of Life Sciences, Ås, Norway
- Alberte Honoré Jepsen
  - Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Marie-Louise Kampmann
  - Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Stine Bøttcher Jacobsen
  - Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Andreas Tillmar
  - Department of Forensic Genetics and Forensic Toxicology, National Board of Forensic Medicine, Linköping, Sweden; Department of Biomedical and Clinical Sciences, Faculty of Health Sciences, Linköping University, Linköping, Sweden
- Claus Børsting
  - Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jeppe Dyrberg Andersen
  - Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
6
Kaiser VB, Semple CA. CTCF-anchored chromatin loop dynamics during human meiosis. BMC Biol 2025; 23:83. [PMID: 40114154 PMCID: PMC11927364 DOI: 10.1186/s12915-025-02181-3] [Received: 09/09/2024] [Accepted: 03/03/2025] [Indexed: 03/22/2025]
Abstract
BACKGROUND: During meiosis, the mammalian genome is organised within chromatin loops, which facilitate synapsis, crossing over and chromosome segregation, setting the stage for recombination events and the generation of genetic diversity. Chromatin looping is thought to play a major role in the establishment of cross overs during prophase I of meiosis, in diploid early primary spermatocytes. However, chromatin conformation dynamics during human meiosis are difficult to study experimentally, due to the transience of each cell division and the difficulty of obtaining stage-resolved cell populations. Here, we employed a machine learning framework trained on single cell ATAC-seq and RNA-seq data to predict CTCF-anchored looping during spermatogenesis, including cell types at different stages of meiosis. RESULTS: We find dramatic changes in genome-wide looping patterns throughout meiosis: compared to pre- and post-meiotic germline cell types, loops in meiotic early primary spermatocytes are more abundant, more variable between individual cells, and more evenly spread throughout the genome. In preparation for the first meiotic division, loops also include longer stretches of DNA, encompassing more than half of the total genome. These loop structures then influence the rate of recombination initiation and resolution as cross overs. In contrast, in later mature sperm stages, we find evidence of genome compaction, with loops being confined to the telomeric ends of the chromosomes. CONCLUSION: Overall, we find that chromatin loops do not orchestrate the gene expression dynamics seen during spermatogenesis, but loops do play important roles in recombination, influencing the positions of DNA breakage and cross over events.
Affiliation(s)
- Vera B Kaiser
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
- Colin A Semple
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK
7
Palsson G, Hardarson MT, Jonsson H, Steinthorsdottir V, Stefansson OA, Eggertsson HP, Gudjonsson SA, Olason PI, Gylfason A, Masson G, Thorsteinsdottir U, Sulem P, Helgason A, Gudbjartsson DF, Halldorsson BV, Stefansson K. Complete human recombination maps. Nature 2025; 639:700-707. [PMID: 39843742 PMCID: PMC11922761 DOI: 10.1038/s41586-024-08450-5] [Received: 06/06/2024] [Accepted: 11/25/2024] [Indexed: 01/24/2025]
Abstract
Human recombination maps are a valuable resource for association and linkage studies and crucial for many inferences of population history and natural selection. Existing maps [1-5] are based solely on cross-over (CO) recombination, omitting non-cross-overs (NCOs), the more common form of recombination [6], owing to the difficulty in detecting them. Using whole-genome sequence data in families, we estimate the number of NCOs transmitted from parent to offspring and derive complete, sex-specific recombination maps including both NCOs and COs. Mothers have fewer but longer NCOs than fathers, and oocytes accumulate NCOs in a non-regulated fashion with maternal age. Recombination, primarily NCO, is responsible for 1.8% (95% confidence interval: 1.3-2.3) and 11.3% (95% confidence interval: 9.0-13.6) of paternal and maternal de novo mutations, respectively, and may drive the increase in de novo mutations with maternal age. NCOs are substantially more prominent than COs in centromeres, possibly to avoid large-scale genomic changes that may cause aneuploidy. Our results demonstrate that NCOs highlight to a much greater extent than COs the differences in the meiotic process between the sexes, in which maternal NCOs may reflect the safeguarding of oocytes from infancy until ovulation.
Affiliation(s)
- Marteinn T Hardarson
  - deCODE genetics/Amgen Inc., Reykjavik, Iceland
  - School of Technology, Reykjavik University, Reykjavík, Iceland
- Unnur Thorsteinsdottir
  - deCODE genetics/Amgen Inc., Reykjavik, Iceland
  - Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Agnar Helgason
  - deCODE genetics/Amgen Inc., Reykjavik, Iceland
  - Department of Anthropology, University of Iceland, Reykjavik, Iceland
- Daniel F Gudbjartsson
  - deCODE genetics/Amgen Inc., Reykjavik, Iceland
  - School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Bjarni V Halldorsson
  - deCODE genetics/Amgen Inc., Reykjavik, Iceland
  - School of Technology, Reykjavik University, Reykjavík, Iceland
- Kari Stefansson
  - deCODE genetics/Amgen Inc., Reykjavik, Iceland
  - Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
8
Browning SR, Browning BL. Estimating gene conversion rates from population data using multi-individual identity by descent. bioRxiv 2025:2025.02.22.639693. [PMID: 40060563 PMCID: PMC11888280 DOI: 10.1101/2025.02.22.639693] [Indexed: 03/20/2025]
Abstract
In humans, homologous gene conversions occur at a higher rate than crossovers; however, gene conversion tracts are small and often unobservable. As a result, estimating gene conversion rates is more difficult than estimating crossover rates. We present a method for multi-individual identity-by-descent (IBD) inference that allows for mismatches due to genotype error and gene conversion. We use the inferred IBD to detect alleles that have changed due to gene conversion in the recent past. We analyze data from the TOPMed and UK Biobank studies to estimate autosome-wide maps of gene conversion rates. For 10 kb, 100 kb, and 1 Mb windows, the correlation between our TOPMed gene conversion map and the deCODE sex-averaged crossover map ranges from 0.56 to 0.67. We find that the strongest gene conversion hotspots typically die back to the baseline gene conversion rate within 1 kb. In 100 kb and 1 Mb windows, our estimated gene conversion map has higher correlation than the deCODE sex-averaged crossover map with PRDM9 binding enrichment (0.34 vs 0.29 for 100 kb windows and 0.52 vs 0.34 for 1 Mb windows), suggesting that the effect of PRDM9 is greater on gene conversion than on crossover recombination. Our TOPMed gene conversion maps are constructed from 55-fold more observed allele conversions than the recently published deCODE gene conversion maps. Our map provides sex-averaged estimates for 10 kb, 100 kb, and 1 Mb windows, whereas the deCODE gene conversion maps provide sex-specific estimates for 3 Mb windows.
Affiliation(s)
- Sharon R. Browning
  - Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
- Brian L. Browning
  - Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, 98195, USA
9
Temple SD, Browning SR. Multiple-testing corrections in selection scans using identity-by-descent segments. bioRxiv 2025:2025.01.29.635528. [PMID: 39975073 PMCID: PMC11838353 DOI: 10.1101/2025.01.29.635528] [Indexed: 02/21/2025]
Abstract
Failing to correct for multiple testing in selection scans can lead to false discoveries of recent genetic adaptations. The scanning statistics in selection studies are often too complicated to theoretically derive a genome-wide significance level or empirically validate control of the family-wise error rate (FWER). By modeling the autocorrelation of identity-by-descent (IBD) rates, we propose a computationally efficient method to determine genome-wide significance levels in an IBD-based scan for recent positive selection. In whole genome simulations, we show that our method has approximate control of the FWER and can adapt to the spacing of tests along the genome. We also show that these scans can have more than fifty percent power to reject the null model in hard sweeps with a selection coefficient s ≥ 0.01 and a sweeping allele frequency between twenty-five and seventy-five percent. A few human genes and gene complexes have statistically significant excesses of IBD segments in thousands of samples of African, European, and South Asian ancestry groups from the Trans-Omics for Precision Medicine project and the United Kingdom Biobank. Among the significant loci, many signals of recent selection are shared across ancestry groups. One shared selection signal at a skeletal cell development gene is extremely strong in African ancestry samples.
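To illustrate why an uncorrected scan inflates false discoveries, the sketch below computes the family-wise error rate for m independent tests, FWER = 1 - (1 - alpha)^m, alongside the Bonferroni per-test level alpha/m. This is a textbook simplification that assumes independent tests; it is not the autocorrelation-based method the abstract proposes, and the function names are hypothetical.

```python
def fwer_independent(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(alpha: float, m: int) -> float:
    """Per-test significance level that keeps the FWER at or below alpha."""
    return alpha / m


m_tests = 100
uncorrected = fwer_independent(0.05, m_tests)
corrected = fwer_independent(bonferroni_level(0.05, m_tests), m_tests)
print(f"uncorrected FWER: {uncorrected:.3f}")
print(f"Bonferroni FWER:  {corrected:.3f}")
```

Bonferroni is known to be conservative when neighbouring tests are correlated, as tests along a genome typically are, which is the kind of situation a correlation-aware threshold is meant to handle.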
Affiliation(s)
- Seth D. Temple
  - Department of Statistics, University of Washington, Seattle, Washington, USA
  - Department of Statistics, University of Michigan, Ann Arbor, Michigan, USA
  - Michigan Institute for Data Science, University of Michigan, Ann Arbor, Michigan, USA
- Sharon R. Browning
  - Department of Biostatistics, University of Washington, Seattle, Washington, USA
10
Tillmar A, Kling D. Comparative Study of Statistical Approaches and SNP Panels to Infer Distant Relationships in Forensic Genetics. Genes (Basel) 2025; 16:114. [PMID: 40004443 PMCID: PMC11855180 DOI: 10.3390/genes16020114] [Received: 12/18/2024] [Revised: 01/14/2025] [Accepted: 01/19/2025] [Indexed: 02/27/2025]
Abstract
Background/Objectives: Inferring genetic relationships based on genetic data has gained an increasing focus in recent years, in particular explained by the rise of forensic investigative genetic genealogy (FIGG) but also the introduction of expanded SNP panels in forensic genetics. A plethora of statistical methods are used throughout publications: in direct-to-consumer (DTC) testing, the shared segment approach is used; in screenings of relationships in medical genetic research, for instance, method-of-moments estimators, e.g., estimation of the kinship coefficient, are used; and in forensic genetics, the likelihood and the likelihood ratio are commonly used to evaluate the genetic data under competing hypotheses. This current study aims to compare and contrast examples of the aforementioned statistical methods to infer relationships from genetic data. Methods/Results: This study includes some historical and some recently published panels of SNP markers to illustrate the strength and caveats of the statistical methods on different marker sets and a selection of pre-defined pairwise relationships, 1st through 7th degree. Extensive simulations are performed and subsequently subsetted based on the marker panels alluded to above. As has been shown in previous research, the likelihood ratio is most powerful, i.e., yields high rates of correct classification, when SNP data are sparse, say below 20,000 markers, whereas the windowed kinships and segment approaches are equally powerful when very dense SNP data are available, say >20,000 markers. In between lie approaches using method-of-moments estimators, which perform well when the degree of relationship is below four but less so beyond, say, 4th degree relationships. The likelihood ratio is the only method that is easily adapted for non-pairwise tests and therefore has an additional depth not addressed in the current study.
We furthermore perform a study of genotyping error rates and their impact on the different statistical methods employed to infer relationships, where the results show that error rates below 1% seem to have low impact across all methods, in particular for errors yielding false heterozygote genotypes.
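As a rough illustration of why kinship-coefficient estimators lose resolution for distant relationships, the sketch below assumes the textbook expectation that a d-th degree relationship has kinship coefficient 2^-(d+1), and assigns an estimate to a degree using geometric-midpoint cut-offs at 2^-(d+1.5). The cut-off convention and the function names are assumptions made for illustration; they are not taken from the study.

```python
def expected_kinship(degree: int) -> float:
    """Expected kinship coefficient for a d-th degree relationship: 2^-(d+1)."""
    return 0.5 ** (degree + 1)


def classify_degree(phi: float, max_degree: int = 7):
    """Assign an estimated kinship coefficient to a relationship degree.

    Bin boundaries sit at the geometric midpoints 2^-(d+1.5) between the
    expected values of adjacent degrees (an assumed convention).
    Returns 0 for duplicate/identical-twin level sharing, 1..max_degree for
    relatives, and None when phi falls below the most distant bin.
    """
    if phi > 2 ** -1.5:
        return 0
    for degree in range(1, max_degree + 1):
        if phi > 2 ** -(degree + 1.5):
            return degree
    return None


if __name__ == "__main__":
    # The bins halve in width with every degree, so a fixed amount of
    # estimator noise blurs distant degrees far more than close ones.
    for d in range(1, 8):
        lower, upper = 2 ** -(d + 1.5), 2 ** -(d + 0.5)
        print(f"degree {d}: expected {expected_kinship(d):.5f}, "
              f"bin width {upper - lower:.5f}")
```

The shrinking bin width printed by the loop is one way to see why a noisy point estimate of the kinship coefficient separates 1st to 3rd degree pairs well but struggles beyond that.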
Affiliation(s)
- Andreas Tillmar
- Department of Forensic Genetics and Forensic Toxicology, National Board of Forensic Medicine, SE-587 58 Linköping, Sweden;
- Department of Clinical and Experimental Medicine, Faculty of Health Sciences, Linköping University, SE-582 25 Linköping, Sweden
| | - Daniel Kling
- Department of Forensic Genetics and Forensic Toxicology, National Board of Forensic Medicine, SE-587 58 Linköping, Sweden;
- Department of Forensic Sciences, Oslo University Hospital, NO-0450 Oslo, Norway
- Department of Biostatistics (BIAS), Norwegian University of Life Sciences, NO-1433 Aas, Norway
| |
|
11
|
Freudiger A, Jovanovic VM, Huang Y, Snyder-Mackler N, Conrad DF, Miller B, Montague MJ, Westphal H, Stadler PF, Bley S, Horvath JE, Brent LJN, Platt ML, Ruiz-Lambides A, Tung J, Nowick K, Ringbauer H, Widdig A. Estimating realized relatedness in free-ranging macaques by inferring identity-by-descent segments. Proc Natl Acad Sci U S A 2025; 122:e2401106122. [PMID: 39808663 PMCID: PMC11760927 DOI: 10.1073/pnas.2401106122] [Received: 01/26/2024] [Accepted: 12/04/2024] [Indexed: 01/16/2025] Open
Abstract
Biological relatedness is a key consideration in studies of behavior, population structure, and trait evolution. Except for parent-offspring dyads, pedigrees capture relatedness imperfectly. The number and length of identical-by-descent DNA segments (IBD) yield the most precise relatedness estimates. Here, we leverage different methods for estimating IBD segments from low-depth whole genome resequencing data to demonstrate the feasibility and value of resolving fine-scaled gradients of relatedness in free-living animals. Using primarily 4 to 6× depth data from a rhesus macaque (Macaca mulatta) population with long-term pedigree data, we show that we can infer the number and length of IBD segments across the genome with high accuracy even at 0.5× sequencing depth. In line with expectations based on simulation, the resulting estimates demonstrate substantial variation in genetic relatedness within kin classes, leading to overlapping distributions between kin classes. By comparing the IBD-based estimates with pedigree and short tandem repeat-based methods, we show that IBD estimates are more reliable and provide more detailed information on kinship. The inferred IBD segments also identify cryptic genetic relatives not represented in the pedigree and reveal elevated recombination rates in females relative to males, which enables the majority of close maternal and paternal kin to be distinguished with genotype data alone. Our findings represent a breakthrough in the ability to study the predictors and consequences of genetic relatedness in natural populations, contributing to our understanding of a fundamental component of population structure in the wild.
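To make the link between inferred IBD segments and a relatedness estimate concrete, here is a minimal sketch under stated assumptions: each segment is labelled as sharing one haplotype (IBD1) or two (IBD2), lengths are in centimorgans, and realized relatedness is computed as (0.5 × IBD1 length + IBD2 length) / genome length. The `IBDSegment` type, the function name, and the example values are hypothetical and are not drawn from the study's pipeline.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class IBDSegment:
    length_cm: float  # genetic length of the shared segment, in cM
    ibd_state: int    # 1 = one haplotype shared, 2 = both haplotypes shared


def realized_relatedness(segments: Iterable[IBDSegment],
                         genome_length_cm: float) -> float:
    """Fraction of the genome shared IBD, weighting IBD1 by 0.5 and IBD2 by 1."""
    if genome_length_cm <= 0:
        raise ValueError("genome_length_cm must be positive")
    segments = list(segments)
    ibd1 = sum(s.length_cm for s in segments if s.ibd_state == 1)
    ibd2 = sum(s.length_cm for s in segments if s.ibd_state == 2)
    return (0.5 * ibd1 + ibd2) / genome_length_cm


if __name__ == "__main__":
    # Hypothetical pair sharing 1,000 cM as IBD1 on a 2,000 cM map.
    pair = [IBDSegment(600.0, 1), IBDSegment(400.0, 1)]
    print(realized_relatedness(pair, genome_length_cm=2000.0))
```

Because the estimate is driven by the segments actually shared, two pairs from the same pedigree kin class can yield different values, which is the within-class variation the abstract describes.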
Affiliation(s)
- Annika Freudiger
- Department of Primate Behavioral Ecology, Institute of Biology, Leipzig University, Leipzig 04103, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Vladimir M. Jovanovic
- Department of Biology, Chemistry and Pharmacy, Human Biology and Primate Evolution, Freie Universität Berlin, Berlin 14195, Germany
- Department of Mathematics and Computer Science, Bioinformatics Solution Center, Freie Universität Berlin, Berlin 14195, Germany
| | - Yilei Huang
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig 04107, Germany
| | - Noah Snyder-Mackler
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281
| | - Donald F. Conrad
- Division of Genetics, Oregon National Primate Research Center, Portland, OR 97006
| | - Brian Miller
- Division of Genetics, Oregon National Primate Research Center, Portland, OR 97006
| | - Michael J. Montague
- Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
| | - Hendrikje Westphal
- Department of Primate Behavioral Ecology, Institute of Biology, Leipzig University, Leipzig 04103, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig 04107, Germany
| | - Peter F. Stadler
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig 04107, Germany
- Max Planck Institute for Mathematics in the Sciences, Leipzig 04103, Germany
- Institute for Theoretical Chemistry, University of Vienna, Vienna 1090, Austria
- Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá 111311, Colombia
- Santa Fe Institute, Santa Fe, NM 87501
| | - Stefanie Bley
- Department of Primate Behavioral Ecology, Institute of Biology, Leipzig University, Leipzig 04103, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Julie E. Horvath
- Research and Collections Section, North Carolina Museum of Natural Sciences, Raleigh, NC 27601
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27607
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27517
| | - Lauren J. N. Brent
- Centre for Research in Animal Behavior, University of Exeter, Exeter EX4 4QD, United Kingdom
| | - Michael L. Platt
- Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Marketing Department, Wharton School of Business, University of Pennsylvania, Philadelphia, PA 19104
- Department of Psychology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104
| | - Angelina Ruiz-Lambides
- Cayo Santiago Field Station, Caribbean Primate Research Center, University of Puerto Rico, Punta Santiago 00741, Puerto Rico
| | - Jenny Tung
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27710
- Department of Biology, Duke University, Durham, NC 27710
- Duke University Population Research Institute, Durham, NC 27710
| | - Katja Nowick
- Department of Biology, Chemistry and Pharmacy, Human Biology and Primate Evolution, Freie Universität Berlin, Berlin 14195, Germany
- Department of Mathematics and Computer Science, Bioinformatics Solution Center, Freie Universität Berlin, Berlin 14195, Germany
| | - Harald Ringbauer
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Anja Widdig
- Department of Primate Behavioral Ecology, Institute of Biology, Leipzig University, Leipzig 04103, Germany
- Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
- German Centre for Integrative Biodiversity Research, Leipzig 04103, Germany
| |
|
12
|
Versoza CJ, Lloret-Villas A, Jensen JD, Pfeifer SP. A pedigree-based map of crossovers and non-crossovers in aye-ayes (Daubentonia madagascariensis). bioRxiv 2024:2024.11.08.622675. [PMID: 39605366 PMCID: PMC11601232 DOI: 10.1101/2024.11.08.622675] [Indexed: 11/29/2024]
Abstract
Gaining a better understanding of rates and patterns of meiotic recombination is crucial for improving evolutionary genomic modelling, with applications ranging from demographic to selective inference. Although previous research has provided important insights into the landscape of crossovers in humans and other haplorrhines, our understanding of both the considerably more common outcome of recombination (i.e., non-crossovers) as well as the landscapes in more distantly-related primates (i.e., strepsirrhines) remains limited owing to difficulties associated with both the identification of non-crossover tracts as well as species sampling. Thus, in order to elucidate recombination patterns in this under-studied branch of the primate clade, we here characterize crossover and non-crossover landscapes in aye-ayes utilizing whole-genome sequencing data from six three-generation pedigrees as well as three two-generation multi-sibling families, and in so doing provide novel insights into this important evolutionary process shaping genomic diversity in one of the world's most critically endangered primate species.
Affiliation(s)
- Cyril J. Versoza
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Audald Lloret-Villas
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Jeffrey D. Jensen
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Susanne P. Pfeifer
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| |
|
13
|
Zang Y, Wu E, Li T, Liu J, Wu R, Li R, Sun H. Evaluation of Four Forensic Investigative Genetic Genealogy Analysis Approaches with Decreased Numbers of SNPs and Increased Genotyping Errors. Genes (Basel) 2024; 15:1329. [PMID: 39457453 PMCID: PMC11507463 DOI: 10.3390/genes15101329] [Received: 08/22/2024] [Revised: 09/27/2024] [Accepted: 10/12/2024] [Indexed: 10/28/2024] Open
Abstract
Background: Forensic investigative genetic genealogy (FIGG) has developed rapidly in recent years and is considered a novel tool for crime investigation. However, crime scene samples are often of low quality and quantity and are challenging to analyze. Deciding which approach should be used for kinship inference in forensic practice remains a troubling problem for investigators. Methods: In this study, we selected four popular approaches (KING, IBS, TRUFFLE, and GERMLINE), comprising one method-of-moments (MoM) estimator and three identical-by-descent (IBD) segment-based tools, and compared their performance at varying numbers of SNPs and levels of genotyping errors using both simulated and real family data. We also explored the possibility of making robust kinship inferences for samples with ultra-high genotyping errors by integrating MoM and the IBD segment-based methods. Results: The results showed that decreasing the number of SNPs had little effect on kinship inference when no fewer than 164 K SNPs were used for all four approaches. However, as the number decreased further, decreased efficiency was observed for the three IBD segment-based methods. Genotyping errors also had a significant effect on kinship inference, especially when they exceeded 1%. In contrast, MoM was much more robust to genotyping errors. Furthermore, the combination of the MoM and the IBD segment-based methods showed a higher overall accuracy, indicating its potential to improve the tolerance to genotyping errors. Conclusions: This study shows that different approaches have unique characteristics and should be selected for different scenarios. More importantly, the integration of the MoM and the IBD segment-based methods can improve the robustness of kinship inference and has great potential for applications in forensic practice.
Affiliation(s)
- Yu Zang
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; (Y.Z.); (E.W.); (T.L.); (J.L.); (R.W.)
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Enlin Wu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; (Y.Z.); (E.W.); (T.L.); (J.L.); (R.W.)
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Tingjun Li
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; (Y.Z.); (E.W.); (T.L.); (J.L.); (R.W.)
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Jiajun Liu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; (Y.Z.); (E.W.); (T.L.); (J.L.); (R.W.)
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Riga Wu
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; (Y.Z.); (E.W.); (T.L.); (J.L.); (R.W.)
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| | - Ran Li
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; (Y.Z.); (E.W.); (T.L.); (J.L.); (R.W.)
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
- School of Medicine, Jiaying University, Meizhou 514015, China
| | - Hongyu Sun
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou 510080, China; (Y.Z.); (E.W.); (T.L.); (J.L.); (R.W.)
- Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-Sen University, Guangzhou 510080, China
| |
|
14
|
de Lima LG, Guarracino A, Koren S, Potapova T, McKinney S, Rhie A, Solar SJ, Seidel C, Fagen B, Walenz BP, Bouffard GG, Brooks SY, Peterson M, Hall K, Crawford J, Young AC, Pickett BD, Garrison E, Phillippy AM, Gerton JL. The formation and propagation of human Robertsonian chromosomes. bioRxiv 2024:2024.09.24.614821. [PMID: 39386535 PMCID: PMC11463614 DOI: 10.1101/2024.09.24.614821] [Indexed: 10/12/2024]
Abstract
Robertsonian chromosomes are a type of variant chromosome found commonly in nature. Present in one in 800 humans, these chromosomes can underlie infertility, trisomies, and increased cancer incidence. Recognized cytogenetically for more than a century, their origins have remained mysterious. Recent advances in genomics allowed us to assemble three human Robertsonian chromosomes completely. We identify a common breakpoint and epigenetic changes in centromeres that provide insight into the formation and propagation of common Robertsonian translocations. Further investigation of the assembled genomes of chimpanzee and bonobo highlights the structural features of the human genome that uniquely enable the specific crossover event that creates these chromosomes. Resolving the structure and epigenetic features of human Robertsonian chromosomes at a molecular level paves the way to understanding how chromosomal structural variation occurs more generally, and how chromosomes evolve.
Affiliation(s)
| | - Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Sergey Koren
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Sean McKinney
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Arang Rhie
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Steven J Solar
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Chris Seidel
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Brandon Fagen
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Brian P Walenz
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Gerard G Bouffard
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Shelise Y Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Kate Hall
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Juyun Crawford
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Alice C Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Brandon D Pickett
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Erik Garrison
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Adam M Phillippy
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | |
|
15
|
Budowle B, Baker L, Sajantila A, Mittelman K, Mittelman D. Prioritizing privacy and presentation of supportable hypothesis testing in forensic genetic genealogy investigations. Biotechniques 2024; 76:425-431. [PMID: 39119680 DOI: 10.1080/07366205.2024.2386218] [Received: 06/12/2024] [Accepted: 07/25/2024] [Indexed: 08/10/2024] Open
Abstract
Investigative leads are not generated by traditional forensic DNA testing if the source of the forensic evidence or a 1st degree relative of unidentified human remains is not in the DNA database. In such cases, forensic genetic genealogy (FGG) can provide valuable leads. However, FGG-generated genetic data contain private and sensitive information. Therefore, it is essential to deploy approaches that minimize unnecessary disclosure of these data to mitigate potential risks to individual privacy. We recommend protective practices that need not impact effective reporting of relationship identifications. Examples include performing one-to-one comparisons of DNA profiles of third-party samples and evidence samples offline with an "air gap" to the internet and shielding the specific shared single nucleotide polymorphism (SNP) states and locations by binning adjacent SNPs in forensic reports. Such approaches reduce the risk of unwanted access to or reverse engineering of third-party individuals' genetic data and can give these donors greater confidence to support use of their DNA profiles in FGG investigation.
Affiliation(s)
- Bruce Budowle
- Othram Inc., The Woodlands, TX 77381, USA
- Department of Forensic Medicine, University of Helsinki, Finland
- Forensic Science Institute, Radford University, Radford, VA 24142, USA
| | - Lee Baker
- Othram Inc., The Woodlands, TX 77381, USA
| | - Antti Sajantila
- Department of Forensic Medicine, University of Helsinki, Finland
- Forensic Medicine Unit, Finnish Institute for Health & Welfare, Helsinki, Finland
| | | | | |
|
16
|
Schweiger R, Lee S, Zhou C, Yang TP, Smith K, Li S, Sanghvi R, Neville M, Mitchell E, Nessa A, Wadge S, Small KS, Campbell PJ, Sudmant PH, Rahbari R, Durbin R. Insights into non-crossover recombination from long-read sperm sequencing. bioRxiv 2024:2024.07.05.602249. [PMID: 39005338 PMCID: PMC11245106 DOI: 10.1101/2024.07.05.602249] [Indexed: 07/16/2024]
Abstract
Meiotic recombination is a fundamental process that generates genetic diversity by creating new combinations of existing alleles. Although human crossovers have been studied at the pedigree, population and single-cell level, the more frequent non-crossover events that lead to gene conversion are harder to study, particularly at the individual level. Here we show that single high-fidelity long sequencing reads from sperm can capture both crossovers and non-crossovers, allowing effectively arbitrary sample sizes for analysis from one male. Using fifteen sperm samples from thirteen donors, we demonstrate variation between and within donors for the rates of different types of recombination. Intriguingly, we observe a tendency for non-crossover gene conversions to occur upstream of nearby PRDM9 binding sites, whereas crossover locations have a slight downstream bias. We further provide evidence for two distinct non-crossover processes. One gives rise to the vast majority of non-crossovers, with a mean conversion tract length under 50 bp, which we suggest is an outcome of standard PRDM9-induced meiotic recombination. In contrast, ~2% of non-crossovers have a much longer mean tract length and potentially originate from the same process as complex events with more than two haplotype switches, which is not associated with PRDM9 binding sites and is also seen in somatic cells.
Affiliation(s)
- Regev Schweiger
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom
| | - Sangjin Lee
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Chenxi Zhou
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom
| | - Tsun-Po Yang
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Katie Smith
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Stacy Li
- Department of Integrative Biology, University of California Berkeley, Berkeley, USA
| | - Rashesh Sanghvi
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Matthew Neville
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Emily Mitchell
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Ayrun Nessa
- Kings College London, Department of Twin Research & Genetic Epidemiology, London, United Kingdom
| | - Sam Wadge
- Kings College London, Department of Twin Research & Genetic Epidemiology, London, United Kingdom
| | - Kerrin S Small
- Kings College London, Department of Twin Research & Genetic Epidemiology, London, United Kingdom
| | - Peter J Campbell
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
- Wellcome-MRC Cambridge Stem Cell Institute, Cambridge Biomedical Campus, Cambridge, UK
| | - Peter H Sudmant
- Department of Integrative Biology, University of California Berkeley, Berkeley, USA
- Center for Computational Biology, University of California Berkeley, Berkeley, USA
| | - Raheleh Rahbari
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Richard Durbin
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom
- Wellcome Sanger Institute, Cancer Ageing and Somatic Mutation, Hinxton, Cambridge CB10 1SA, United Kingdom
| |
|
17
|
Yang D, Ma SX, Zhao GL, Gao A, Xu ZK. Determining the effects of genetic linkage when using a combination of STR and SNP loci for kinship testing. Leg Med (Tokyo) 2024; 69:102441. [PMID: 38599008 DOI: 10.1016/j.legalmed.2024.102441] [Received: 02/04/2024] [Revised: 03/19/2024] [Accepted: 03/30/2024] [Indexed: 04/12/2024]
Abstract
The pedigree likelihood ratio (LR) can be used for determining kinship in forensic kinship testing. The LR can be obtained by analyzing DNA data from short tandem repeat (STR) and single nucleotide polymorphism (SNP) loci. With the advancement of biotechnology, an increasing number of genetic markers have been identified, thereby expanding the pedigree range of kinship testing. Moreover, some of the loci are physically close to each other, so genetic linkage between loci is inevitable. LRs can be calculated by accounting for linkage or by ignoring it (LRlinkage and LRignore, respectively). GeneVisa is a software for kinship testing (www.genevisa.net) that adopts the Lander-Green algorithm to deal with genetic linkage. Herein, we used the simulation program of GeneVisa to investigate the effects of genetic linkage on 1st-degree, 2nd-degree, and 3rd-degree kinship testing. We used this software to simulate LRlinkage and LRignore values based on 43 STRs and 134 SNPs in commercial kits by using the allele frequency and genetic distance data of the European population. The effects of linkage on LR distribution and on the LRs of routine cases were investigated by comparing the LRlinkage values with the LRignore values. Our results revealed that the linkage effect on LR distributions is small, but the effect on the LRs of routine cases may be large. Moreover, the results indicated that the discriminatory power of genetic markers for kinship testing can be improved by accounting for linkage.
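For orientation, the "ignore linkage" calculation amounts to treating loci as independent, so the combined LR is the product of per-locus LRs. The sketch below shows only that independence case, summed in log10 space for numerical stability; the function name and the per-locus values are hypothetical, and it does not attempt the Lander-Green computation that accounting for linkage requires.

```python
import math
from typing import Sequence


def combined_log10_lr_ignoring_linkage(per_locus_lrs: Sequence[float]) -> float:
    """Combine per-locus likelihood ratios under an independence assumption.

    Multiplying LRs across loci is only valid when the loci are unlinked;
    for linked markers this product can over- or understate the evidence.
    """
    if any(lr <= 0 for lr in per_locus_lrs):
        raise ValueError("likelihood ratios must be positive")
    return sum(math.log10(lr) for lr in per_locus_lrs)


if __name__ == "__main__":
    # Hypothetical per-locus LRs for one tested relationship.
    lrs = [2.0, 0.5, 10.0]
    log10_lr = combined_log10_lr_ignoring_linkage(lrs)
    print(f"log10(LR) = {log10_lr:.3f}, LR = {10 ** log10_lr:.3f}")
```

Comparing such a product with a linkage-aware LR for the same case is the kind of per-case difference the abstract reports as potentially large.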
Affiliation(s)
- Da Yang
- Institute of Forensic Medicine And Laboratory Medicine, Jining Medical University, Shandong province, P. R. China; Forensic Science Center of Jining Medical University, Shandong province, P. R. China.
| | - Sheng Xuan Ma
- Institute of Forensic Medicine And Laboratory Medicine, Jining Medical University, Shandong province, P. R. China
| | - Guo Liang Zhao
- Institute of Forensic Medicine And Laboratory Medicine, Jining Medical University, Shandong province, P. R. China
| | - Ao Gao
- Institute of Forensic Medicine And Laboratory Medicine, Jining Medical University, Shandong province, P. R. China
| | - Zhao Kun Xu
- Institute of Forensic Medicine And Laboratory Medicine, Jining Medical University, Shandong province, P. R. China
| |
|
18
|
Aktürk Ş, Mapelli I, Güler MN, Gürün K, Katırcıoğlu B, Vural KB, Sağlıcan E, Çetin M, Yaka R, Sürer E, Atağ G, Çokoğlu SS, Sevkar A, Altınışık NE, Koptekin D, Somel M. Benchmarking kinship estimation tools for ancient genomes using pedigree simulations. Mol Ecol Resour 2024; 24:e13960. [PMID: 38676702 DOI: 10.1111/1755-0998.13960] [Received: 11/25/2023] [Revised: 03/19/2024] [Accepted: 03/28/2024] [Indexed: 04/29/2024]
Abstract
There is growing interest in uncovering genetic kinship patterns in past societies using low-coverage palaeogenomes. Here, we benchmark four tools for kinship estimation with such data: lcMLkin, NgsRelate, KIN, and READ, which differ in their input, IBD estimation methods, and statistical approaches. We used pedigree and ancient genome sequence simulations to evaluate these tools when only a limited number (1 to 50 K, with minor allele frequency ≥0.01) of shared SNPs are available. The performance of all four tools was comparable using ≥20 K SNPs. We found that first-degree related pairs can be accurately classified even with 1 K SNPs, with 85% F1 scores using READ and 96% using NgsRelate or lcMLkin. Distinguishing third-degree relatives from unrelated pairs or second-degree relatives was also possible with high accuracy (F1 > 90%) with 5 K SNPs using NgsRelate and lcMLkin, while READ and KIN showed lower success (69 and 79% respectively). Meanwhile, noise in population allele frequencies and inbreeding (first-cousin mating) led to deviations in kinship coefficients, with different sensitivities across tools. We conclude that using multiple tools in parallel might be an effective approach to achieve robust estimates on ultra-low-coverage genomes.
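The F1 scores quoted above summarize classification performance as the harmonic mean of precision and recall. A minimal sketch of that calculation follows; the function name and the counts in the example are hypothetical and are not results from the benchmark.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for one relationship class."""
    if true_pos == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for pairs classified as first-degree relatives.
    print(f1_score(true_pos=8, false_pos=2, false_neg=2))
```

Because the score is computed per class, a tool can reach a high F1 for first-degree pairs while scoring much lower for third-degree pairs on the same SNP set.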
Affiliation(s)
- Şevval Aktürk
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Igor Mapelli
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Merve N Güler
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Kanat Gürün
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Büşra Katırcıoğlu
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Kıvılcım Başak Vural
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Ekin Sağlıcan
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Mehmet Çetin
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Reyhan Yaka
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
- Centre for Palaeogenetics, Stockholm, Sweden
- Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden
| | - Elif Sürer
- Department of Modeling and Simulation, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Gözde Atağ
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Sevim Seda Çokoğlu
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Arda Sevkar
- Department of Anthropology, Hacettepe University, Ankara, Turkey
| | - N Ezgi Altınışık
- Department of Anthropology, Hacettepe University, Ankara, Turkey
| | - Dilek Koptekin
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Mehmet Somel
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| |
|
19
|
Joseph J, Prentout D, Laverré A, Tricou T, Duret L. High prevalence of PRDM9-independent recombination hotspots in placental mammals. Proc Natl Acad Sci U S A 2024; 121:e2401973121. [PMID: 38809707 PMCID: PMC11161765 DOI: 10.1073/pnas.2401973121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 04/26/2024] [Indexed: 05/31/2024] Open
Abstract
In many mammals, recombination events are concentrated in hotspots directed by a sequence-specific DNA-binding protein named PRDM9. Intriguingly, PRDM9 has been lost several times in vertebrates, and notably among mammals, it has been pseudogenized in the ancestor of canids. In the absence of PRDM9, recombination hotspots tend to occur in promoter-like features such as CpG islands. It has thus been proposed that one role of PRDM9 could be to direct recombination away from PRDM9-independent hotspots. However, the ability of PRDM9 to direct recombination hotspots has been assessed in only a handful of species, and a clear picture of how much recombination occurs outside of PRDM9-directed hotspots in mammals is still lacking. In this study, we derived an estimator of past recombination activity based on signatures of GC-biased gene conversion in substitution patterns. We quantified recombination activity in PRDM9-independent hotspots in 52 species of boreoeutherian mammals. We observe a wide range of recombination rates at these loci: several species (such as mice, humans, some felids, or cetaceans) show a deficit of recombination, while a majority of mammals display a clear peak of recombination. Our results demonstrate that PRDM9-directed and PRDM9-independent hotspots can coexist in mammals and that their coexistence appears to be the rule rather than the exception. Additionally, we show that the location of PRDM9-independent hotspots is relatively more stable than that of PRDM9-directed hotspots, but that PRDM9-independent hotspots nevertheless evolve slowly in concert with DNA hypomethylation.
Affiliation(s)
- Julien Joseph
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
| | - Djivan Prentout
- Department of Biological Sciences, Columbia University, New York, NY 10027
| | - Alexandre Laverré
- Department of Ecology and Evolution, University of Lausanne, Lausanne CH-1015, Switzerland
- Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland
| | - Théo Tricou
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
| | - Laurent Duret
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
| |
|
20
|
Qiao Y, Jewett EM, McManus KF, Freyman WA, Curran JE, Williams-Blangero S, Blangero J, The 23andMe Research Team, Williams AL. Reconstructing parent genomes using siblings and other relatives. bioRxiv 2024:2024.05.10.593578. [PMID: 38798596 PMCID: PMC11118276 DOI: 10.1101/2024.05.10.593578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Reconstructing the DNA of ancestors from their descendants has the potential to empower phenotypic analyses (including association and genetic nurture studies), improve pedigree reconstruction, and shed light on the ancestral population and phenotypes of ancestors. We developed HAPI-RECAP, a method that reconstructs the DNA of parents from full siblings and their relatives. This tool leverages HAPI2's output, a new phasing approach that applies to siblings (and optionally one or both parents) and reliably infers parent haplotypes but does not link the ungenotyped parents' DNA across chromosomes or between segments flanking ambiguities. By combining IBD between the reconstructed parents and the relatives, HAPI-RECAP resolves the source parent of these segments. Moreover, the method exploits crossovers the children inherited and sex-specific genetic maps to infer the reconstructed parents' sexes. We validated these methods on research participants from both 23andMe, Inc. and the San Antonio Mexican American Family Studies. Given data for one parent, HAPI2 reconstructs large fractions of the missing parent's DNA, between 77.6% and 99.97% among all families, and 90.3% on average in three- and four-child families. When reconstructing both parents, HAPI-RECAP inferred between 33.2% and 96.6% of the parents' genotypes, averaging 70.6% in four-child families. Reconstructed genotypes have average error rates < 10⁻³, or comparable to those from direct genotyping. HAPI-RECAP inferred the parent sexes 100% correctly given IBD-linked segments and can also reconstruct parents without any IBD. As datasets grow in size, more families will be implicitly collected; HAPI-RECAP holds promise to enable high quality parent genotype reconstruction.
Affiliation(s)
- Ying Qiao
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
| | | | | | | | - Joanne E. Curran
- South Texas Diabetes and Obesity Institute and Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX 78520, USA
| | - Sarah Williams-Blangero
- South Texas Diabetes and Obesity Institute and Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX 78520, USA
| | - John Blangero
- South Texas Diabetes and Obesity Institute and Department of Human Genetics, University of Texas Rio Grande Valley School of Medicine, Brownsville, TX 78520, USA
| | | | - Amy L. Williams
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- 23andMe, Inc., Sunnyvale, CA 94086, USA
| |
|
21
|
Gerton JL. A working model for the formation of Robertsonian chromosomes. J Cell Sci 2024; 137:jcs261912. [PMID: 38606789 PMCID: PMC11057876 DOI: 10.1242/jcs.261912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2024] Open
Abstract
Robertsonian chromosomes form by fusion of two chromosomes that have centromeres located near their ends, known as acrocentric or telocentric chromosomes. This fusion creates a new metacentric chromosome and is a major mechanism of karyotype evolution and speciation. Robertsonian chromosomes are common in nature and were first described in grasshoppers by the zoologist W. R. B. Robertson more than 100 years ago. They have since been observed in many species, including catfish, sheep, butterflies, bats, bovids, rodents and humans, and are the most common chromosomal change in mammals. Robertsonian translocations are particularly rampant in the house mouse, Mus musculus domesticus, where they exhibit meiotic drive and create reproductive isolation. Recent progress has been made in understanding how Robertsonian chromosomes form in the human genome, highlighting some of the fundamental principles of how and why these types of fusion events occur so frequently. Consequences of these fusions include infertility and Down's syndrome. In this Hypothesis, I postulate that the conditions that allow these fusions to form are threefold: (1) sequence homology on non-homologous chromosomes, often in the form of repetitive DNA; (2) recombination initiation during meiosis; and (3) physical proximity of the homologous sequences in three-dimensional space. This Hypothesis highlights the latest progress in understanding human Robertsonian translocations within the context of the broader literature on Robertsonian chromosomes.
|
22
|
Woerner AE, Novroski NM, Mandape S, King JL, Crysup B, Coble MD. Identifying distant relatives using benchtop-scale sequencing. Forensic Sci Int Genet 2024; 69:103005. [PMID: 38171224 DOI: 10.1016/j.fsigen.2023.103005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 11/20/2023] [Accepted: 12/19/2023] [Indexed: 01/05/2024]
Abstract
The genetic component of forensic genetic genealogy (FGG) is an estimate of kinship, often conducted at genome scales between a great number of individuals. The promise of FGG is substantial: in concert with genealogical records and other nongenetic information, it can indirectly identify a person of interest. A downside of FGG is cost, as it is currently expensive and requires chemistries uncommon to forensic genetic laboratories (microarrays and high throughput sequencing). The more common benchtop sequencers can be coupled with a targeted PCR assay to conduct FGG, though such approaches have limited resolution for kinship. This study evaluates low-pass sequencing, an alternative strategy that is accessible to benchtop sequencers and can produce resolutions comparable to high-pass sequencing. Samples from a three-generation pedigree were augmented to include up to 7th degree relatives (using whole genome pedigree simulations) and the ability to recover the true kinship coefficient was assessed using algorithms qualitatively similar to those found in GEDmatch. We show that up to 7th degree relatives can be reliably inferred from 1× whole genome sequencing obtainable from desktop sequencers.
Affiliation(s)
- August E Woerner
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA.
| | - Nicole M Novroski
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA; Department of Anthropology, University of Toronto, Mississauga, ON, Canada
| | - Sammed Mandape
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA
| | - Jonathan L King
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA
| | - Benjamin Crysup
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA
| | - Michael D Coble
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, Fort Worth, TX, USA
| |
|
23
|
Simon A, Coop G. The contribution of gene flow, selection, and genetic drift to five thousand years of human allele frequency change. Proc Natl Acad Sci U S A 2024; 121:e2312377121. [PMID: 38363870 PMCID: PMC10907250 DOI: 10.1073/pnas.2312377121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 01/09/2024] [Indexed: 02/18/2024] Open
Abstract
Genomic time series from experimental evolution studies and ancient DNA datasets offer us a chance to directly observe the interplay of various evolutionary forces. We show how the genome-wide variance in allele frequency change between two time points can be decomposed into the contributions of gene flow, genetic drift, and linked selection. In closed populations, the contribution of linked selection is identifiable because it creates covariances between time intervals, and genetic drift does not. However, repeated gene flow between populations can also produce directionality in allele frequency change, creating covariances. We show how to accurately separate the fraction of variance in allele frequency change due to admixture and linked selection in a population receiving gene flow. We use two human ancient DNA datasets, spanning around 5,000 y, as time transects to quantify the contributions to the genome-wide variance in allele frequency change. We find that a large fraction of genome-wide change is due to gene flow. In both cases, after correcting for known major gene flow events, we do not observe a signal of genome-wide linked selection. Thus despite the known role of selection in shaping long-term polymorphism levels, and an increasing number of examples of strong selection on single loci and polygenic scores from ancient DNA, it appears to be gene flow and drift, and not selection, that are the main determinants of recent genome-wide allele frequency change. Our approach should be applicable to the growing number of contemporary and ancient temporal population genomics datasets.
Affiliation(s)
- Alexis Simon
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
| | - Graham Coop
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
| |
|
24
|
Chase MA, Vilcot M, Mugal CF. The role of recombination dynamics in shaping signatures of direct and indirect selection across the Ficedula flycatcher genome. Proc Biol Sci 2024; 291:20232382. [PMID: 38228173 DOI: 10.1098/rspb.2023.2382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 12/14/2023] [Indexed: 01/18/2024] Open
Abstract
Recombination is a central evolutionary process that reshuffles combinations of alleles along chromosomes, and consequently is expected to influence the efficacy of direct selection via Hill-Robertson interference. Additionally, the indirect effects of selection on neutral genetic diversity are expected to show a negative relationship with recombination rate, as background selection and genetic hitchhiking are stronger when recombination rate is low. However, owing to the limited availability of recombination rate estimates across divergent species, the impact of evolutionary changes in recombination rate on genomic signatures of selection remains largely unexplored. To address this question, we estimate recombination rate in two Ficedula flycatcher species, the taiga flycatcher (Ficedula albicilla) and collared flycatcher (Ficedula albicollis). We show that recombination rate is strongly correlated with signatures of indirect selection, and that evolutionary changes in recombination rate between species have observable impacts on this relationship. Conversely, signatures of direct selection on coding sequences show little to no relationship with recombination rate, even when restricted to genes where recombination rate is conserved between species. Thus, using measures of indirect and direct selection that bridge micro- and macro-evolutionary timescales, we demonstrate that the role of recombination rate and its dynamics varies for different signatures of selection.
Affiliation(s)
- Madeline A Chase
- Department of Ecology and Genetics, Uppsala University, 75236 Uppsala, Sweden
- Swiss Ornithological Institute, 6204 Sempach, Switzerland
| | - Maurine Vilcot
- Department of Ecology and Genetics, Uppsala University, 75236 Uppsala, Sweden
- CEFE, University of Montpellier, CNRS, EPHE, IRD, 34293 Montpellier 5, France
| | - Carina F Mugal
- Department of Ecology and Genetics, Uppsala University, 75236 Uppsala, Sweden
- Laboratory of Biometry and Evolutionary Biology, University of Lyon 1, CNRS UMR 5558, 69622 Villeurbanne cedex, France
| |
|
25
|
Simon A, Coop G. The contribution of gene flow, selection, and genetic drift to five thousand years of human allele frequency change. bioRxiv 2024:2023.07.11.548607. [PMID: 37503227 PMCID: PMC10370008 DOI: 10.1101/2023.07.11.548607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Genomic time series from experimental evolution studies and ancient DNA datasets offer us a chance to directly observe the interplay of various evolutionary forces. We show how the genome-wide variance in allele frequency change between two time points can be decomposed into the contributions of gene flow, genetic drift, and linked selection. In closed populations, the contribution of linked selection is identifiable because it creates covariances between time intervals, and genetic drift does not. However, repeated gene flow between populations can also produce directionality in allele frequency change, creating covariances. We show how to accurately separate the fraction of variance in allele frequency change due to admixture and linked selection in a population receiving gene flow. We use two human ancient DNA datasets, spanning around 5,000 years, as time transects to quantify the contributions to the genome-wide variance in allele frequency change. We find that a large fraction of genome-wide change is due to gene flow. In both cases, after correcting for known major gene flow events, we do not observe a signal of genome-wide linked selection. Thus despite the known role of selection in shaping long-term polymorphism levels, and an increasing number of examples of strong selection on single loci and polygenic scores from ancient DNA, it appears to be gene flow and drift, and not selection, that are the main determinants of recent genome-wide allele frequency change. Our approach should be applicable to the growing number of contemporary and ancient temporal population genomics datasets.
Affiliation(s)
- Alexis Simon
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
| | - Graham Coop
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
| |
|
26
|
Versoza CJ, Weiss S, Johal R, La Rosa B, Jensen JD, Pfeifer SP. Novel Insights into the Landscape of Crossover and Noncrossover Events in Rhesus Macaques (Macaca mulatta). Genome Biol Evol 2024; 16:evad223. [PMID: 38051960 PMCID: PMC10773715 DOI: 10.1093/gbe/evad223] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/04/2023] [Revised: 11/04/2023] [Accepted: 11/28/2023] [Indexed: 12/07/2023] Open
Abstract
Meiotic recombination landscapes differ greatly between distantly and closely related taxa, populations, individuals, sexes, and even within genomes; however, the factors driving this variation are yet to be well elucidated. Here, we directly estimate contemporary crossover rates and, for the first time, noncrossover rates in rhesus macaques (Macaca mulatta) from four three-generation pedigrees comprising 32 individuals. We further compare these results with historical, demography-aware, linkage disequilibrium-based recombination rate estimates. From paternal meioses in the pedigrees, 165 crossover events with a median resolution of 22.3 kb were observed, corresponding to a male autosomal map length of 2,357 cM, approximately 15% longer than an existing linkage map based on human microsatellite loci. In addition, 85 noncrossover events with a mean tract length of 155 bp were identified, similar to the tract lengths observed in the only other two primates in which noncrossovers have been studied to date, humans and baboons. Consistent with observations in other placental mammals with PRDM9-directed recombination, crossover (and to a lesser extent noncrossover) events in rhesus macaques clustered in intergenic regions and toward the chromosomal ends in males, a pattern in broad agreement with the historical, sex-averaged recombination rate estimates, and evidence of GC-biased gene conversion was observed at noncrossover sites.
Affiliation(s)
- Cyril J Versoza
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
| | - Sarah Weiss
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Ravneet Johal
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Bruno La Rosa
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
| | - Susanne P Pfeifer
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
| |
|
27
|
Dinh BL, Tang E, Taparra K, Nakatsuka N, Chen F, Chiang CWK. Recombination map tailored to Native Hawaiians may improve robustness of genomic scans for positive selection. Hum Genet 2024; 143:85-99. [PMID: 38157018 PMCID: PMC10794367 DOI: 10.1007/s00439-023-02625-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 11/25/2023] [Indexed: 01/03/2024]
Abstract
Recombination events establish the patterns of haplotypic structure in a population and estimates of recombination rates are used in several downstream population and statistical genetic analyses. Using suboptimal maps from distantly related populations may reduce the efficacy of genomic analyses, particularly for underrepresented populations such as the Native Hawaiians. To overcome this challenge, we constructed recombination maps using genome-wide array data from two study samples of Native Hawaiians: one reflecting the current admixed state of Native Hawaiians (NH map) and one based on individuals of enriched Polynesian ancestries (PNS map) with the potential to be used for less admixed Polynesian populations such as the Samoans. We found the recombination landscape to be less correlated with those from other continental populations (e.g. Spearman's rho = 0.79 between PNS and CEU (Utah residents with Northern and Western European ancestry) compared to 0.92 between YRI (Yoruba in Ibadan, Nigeria) and CEU at 50 kb resolution), likely driven by the unique demographic history of the Native Hawaiians. PNS also shared the fewest recombination hotspots with other populations (e.g. 8% of hotspots shared between PNS and CEU compared to 27% of hotspots shared between YRI and CEU). We found that downstream analyses in the Native Hawaiian population, such as local ancestry inference, imputation, and IBD segment and relatedness detections, would achieve similar efficacy when using the NH map compared to an omnibus map. However, for genome scans of adaptive loci using integrated haplotype scores, we found several loci with apparent genome-wide significant signals (|Z-score| > 4) in Native Hawaiians that would not have been significant when analyzed using NH-specific maps.
Population-specific recombination maps may therefore improve the robustness of haplotype-based statistics and help us better characterize the evolutionary history that may underlie Native Hawaiian-specific health conditions that persist today.
Affiliation(s)
- Bryan L Dinh
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Echo Tang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Kekoa Taparra
- Department of Radiation Oncology, Stanford University, Palo Alto, CA, USA
| | | | - Fei Chen
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Charleston W K Chiang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
| |
|
28
|
Zuo Z. The successive emergence of ERVL-MaLRs in primates. Virus Evol 2023; 9:vead072. [PMID: 38131004 PMCID: PMC10735291 DOI: 10.1093/ve/vead072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 11/01/2023] [Accepted: 12/01/2023] [Indexed: 12/23/2023] Open
Abstract
Although the ERVL-mammalian-apparent LTR retrotransposons (MaLRs) are the fourth largest family of transposable elements in the human genome, their evolutionary history and relationship have not been thoroughly studied. In this study, through RepeatMasker annotations of some representative species and construction of phylogenetic tree by sequence similarity, all primate-specific MaLR members are found to descend from MLT1A1 retrotransposon. Comparative genomic analysis, transposition-in-transposition inference, and sequence feature comparisons consistently show that each MaLR member evolved from its predecessor successively and had a limited activity period during primate evolution. Accordingly, a novel MaLR member was discovered as successor of MSTB1 in Tarsiiformes. At last, the identification of candidate precursor and intermediate THE1A elements provides further evidence for the previously proposed arms race model between ZNF430/ZNF100 and THE1B/THE1A. Taken together, this study sheds light on the evolutionary history of MaLRs and can serve as a foundation for future research on their interactions with zinc finger genes, gene regulation, and human health implications.
Affiliation(s)
- Zheng Zuo
- School of Life Science and Technology, Southeast University, Nanjing 210096, China
| |
|
29
|
Cotter DJ, Webster TH, Wilson MA. Genomic and demographic processes differentially influence genetic variation across the human X chromosome. PLoS One 2023; 18:e0287609. [PMID: 37910456 PMCID: PMC10619814 DOI: 10.1371/journal.pone.0287609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 06/08/2023] [Indexed: 11/03/2023] Open
Abstract
Many forces influence genetic variation across the genome including mutation, recombination, selection, and demography. Increased mutation and recombination both lead to increases in genetic diversity in a region-specific manner, while complex demographic patterns shape patterns of diversity on a more global scale. While these processes act across the entire genome, the X chromosome is particularly interesting because it contains several distinct regions that are subject to different combinations and strengths of these forces: the pseudoautosomal regions (PARs) and the X-transposed region (XTR). The X chromosome thus can serve as a unique model for studying how genetic and demographic forces act in different contexts to shape patterns of observed variation. We therefore sought to explore diversity, divergence, and linkage disequilibrium in each region of the X chromosome using genomic data from 26 human populations. Across populations, we find that both diversity and substitution rate are consistently elevated in PAR1 and the XTR compared to the rest of the X chromosome. In contrast, linkage disequilibrium is lowest in PAR1, consistent with the high recombination rate in this region, and highest in the region of the X chromosome that does not recombine in males. However, linkage disequilibrium in the XTR is intermediate between PAR1 and the autosomes, and much lower than the non-recombining X. Finally, in addition to these global patterns, we also observed variation in ratios of X versus autosomal diversity consistent with population-specific evolutionary history as well. While our results were generally consistent with previous work, two unexpected observations emerged. First, our results suggest that the XTR does not behave like the rest of the recombining X and may need to be evaluated separately in future studies. Second, the different regions of the X chromosome appear to exhibit unique patterns of linked selection across different human populations. 
Together, our results highlight profound regional differences across the X chromosome, simultaneously making it an ideal system for exploring the action of evolutionary forces as well as necessitating its careful consideration and treatment in genomic analyses.
Affiliation(s)
- Daniel J. Cotter
- Department of Genetics, Stanford University, Stanford, CA, United States of America
| | - Timothy H. Webster
- Department of Anthropology, University of Utah, Salt Lake City, UT, United States of America
- School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
| | - Melissa A. Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Center for Evolution and Medicine, Biodesign Institute, Arizona State University, Tempe, AZ, United States of America
| |
|
30
|
Cahoon CK, Richter CM, Dayton AE, Libuda DE. Sexual dimorphic regulation of recombination by the synaptonemal complex in C. elegans. eLife 2023; 12:e84538. [PMID: 37796106 PMCID: PMC10611432 DOI: 10.7554/elife.84538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 10/02/2023] [Indexed: 10/06/2023] Open
Abstract
In sexually reproducing organisms, germ cells faithfully transmit the genome to the next generation by forming haploid gametes, such as eggs and sperm. Although most meiotic proteins are conserved between eggs and sperm, many aspects of meiosis are sexually dimorphic, including the regulation of recombination. The synaptonemal complex (SC), a large ladder-like structure that forms between homologous chromosomes, is essential for regulating meiotic chromosome organization and promoting recombination. To assess whether sex-specific differences in the SC underpin sexually dimorphic aspects of meiosis, we examined Caenorhabditis elegans SC central region proteins (known as SYP proteins) in oogenesis and spermatogenesis and uncovered sex-specific roles for the SYPs in regulating meiotic recombination. We find that SC composition, specifically SYP-2, SYP-3, SYP-5, and SYP-6, is regulated by sex-specific mechanisms throughout meiotic prophase I. During pachytene, both oocytes and spermatocytes differentially regulate the stability of SYP-2 and SYP-3 within an assembled SC. Further, we uncover that the relative amount of SYP-2 and SYP-3 within the SC is independently regulated in both a sex-specific and a recombination-dependent manner. Specifically, we find that SYP-2 regulates the early steps of recombination in both sexes, while SYP-3 controls the timing and positioning of crossover recombination events across the genomic landscape in only oocytes. Finally, we find that SYP-2 and SYP-3 dosage can influence the composition of the other SYPs in the SC via sex-specific mechanisms during pachytene. Taken together, we demonstrate dosage-dependent regulation of individual SC components with sex-specific functions in recombination. These sexual dimorphic features of the SC provide insights into how spermatogenesis and oogenesis adapted similar chromosome structures to differentially regulate and execute recombination.
Affiliation(s)
- Cori K Cahoon
- Institute of Molecular Biology, Department of Biology, University of Oregon, Eugene, United States
- Colette M Richter
- Institute of Molecular Biology, Department of Biology, University of Oregon, Eugene, United States
- Amelia E Dayton
- Institute of Molecular Biology, Department of Biology, University of Oregon, Eugene, United States
- Diana E Libuda
- Institute of Molecular Biology, Department of Biology, University of Oregon, Eugene, United States
31
Dinh BL, Tang E, Taparra K, Nakatsuka N, Chen F, Chiang CWK. Recombination map tailored to Native Hawaiians improves robustness of genomic scans for positive selection. bioRxiv 2023:2023.07.12.548735. [PMID: 37503129 PMCID: PMC10370006 DOI: 10.1101/2023.07.12.548735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Recombination events establish the patterns of haplotypic structure in a population and estimates of recombination rates are used in several downstream population and statistical genetic analyses. Using suboptimal maps from distantly related populations may reduce the efficacy of genomic analyses, particularly for underrepresented populations such as the Native Hawaiians. To overcome this challenge, we constructed recombination maps using genome-wide array data from two study samples of Native Hawaiians: one reflecting the current admixed state of Native Hawaiians (NH map), and one based on individuals of enriched Polynesian ancestries (PNS map) with the potential to be used for less admixed Polynesian populations such as the Samoans. We found the recombination landscape to be less correlated with those from other continental populations (e.g. Spearman's rho = 0.79 between PNS and CEU (Utah residents with Northern and Western European ancestry) compared to 0.92 between YRI (Yoruba in Ibadan, Nigeria) and CEU at 50 kb resolution), likely driven by the unique demographic history of the Native Hawaiians. PNS also shared the fewest recombination hotspots with other populations (e.g. 8% of hotspots shared between PNS and CEU compared to 27% of hotspots shared between YRI and CEU). We found that downstream analyses in the Native Hawaiian population, such as local ancestry inference, imputation, and IBD segment and relatedness detections, would achieve similar efficacy when using the NH map compared to an omnibus map. However, for genome scans of adaptive loci using integrated haplotype scores, we found several loci with apparent genome-wide significant signals (|Z-score| > 4) in Native Hawaiians that would not have been significant when analyzed using NH-specific maps. 
Population-specific recombination maps may therefore improve the robustness of haplotype-based statistics and help us better characterize the evolutionary history that may underlie Native Hawaiian-specific health conditions that persist today.
Affiliation(s)
- Bryan L Dinh
- Department of Quantitative and Computational Biology, University of Southern California
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Echo Tang
- Department of Quantitative and Computational Biology, University of Southern California
- Kekoa Taparra
- Department of Radiation Oncology, Stanford University, Palo Alto, California
- Fei Chen
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Charleston W K Chiang
- Department of Quantitative and Computational Biology, University of Southern California
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
32
Chevy ET, Huerta-Sánchez E, Ramachandran S. Integrating sex-bias into studies of archaic introgression on chromosome X. PLoS Genet 2023; 19:e1010399. [PMID: 37578977 PMCID: PMC10449224 DOI: 10.1371/journal.pgen.1010399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 08/24/2023] [Accepted: 07/10/2023] [Indexed: 08/16/2023] Open
Abstract
Evidence of interbreeding between archaic hominins and humans comes from methods that infer the locations of segments of archaic haplotypes, or 'archaic coverage', using the genomes of people living today. As more estimates of archaic coverage have emerged, it has become clear that most of this coverage is found on the autosomes; very little is retained on chromosome X. Here, we summarize published estimates of archaic coverage on autosomes and chromosome X from extant human samples. We find on average 7 times more archaic coverage on autosomes than chromosome X, and identify broad continental patterns in this ratio: greatest in European samples, and least in South Asian samples. We also perform extensive simulation studies to investigate how the amount of archaic coverage, lengths of coverage, and rates of purging of archaic coverage are affected by sex-bias caused by an unequal sex ratio within the archaic introgressors. Our results generally confirm that, with increasing male sex-bias, less archaic coverage is retained on chromosome X. Ours is the first study to explicitly model such sex-bias and its potential role in creating the dearth of archaic coverage on chromosome X.
Affiliation(s)
- Elizabeth T. Chevy
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Emilia Huerta-Sánchez
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, Rhode Island, United States of America
- Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, Rhode Island, United States of America
- Data Science Initiative, Brown University, Providence, Rhode Island, United States of America
33
Medvedev A, Lebedev M, Ponomarev A, Kosaretskiy M, Osipenko D, Tischenko A, Kosaretskiy E, Wang H, Kolobkov D, Chamberlain-Evans V, Vakhitov R, Nikonorov P. GRAPE: genomic relatedness detection pipeline. F1000Res 2023; 11:589. [PMID: 37224332 PMCID: PMC10182380 DOI: 10.12688/f1000research.111658.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/22/2023] [Indexed: 05/26/2023] Open
Abstract
Classifying the degree of relatedness between pairs of individuals has both scientific and commercial applications. As an example, genome-wide association studies (GWAS) may suffer from high rates of false positive results due to unrecognized population structure. This problem becomes especially relevant with recent increases in large-cohort studies. Accurate relationship classification is also required for genetic linkage analysis to identify disease-associated loci. Additionally, DNA relative matching services are one of the leading drivers of the direct-to-consumer genetic testing market. Despite the availability of scientific and research information on the methods for determining kinship and the accessibility of relevant tools, assembling a pipeline that operates stably on real-world genotypic data requires significant research and development resources. Currently, there is no open-source end-to-end solution for relatedness detection in genomic data that is fast, reliable, and accurate for both close and distant degrees of kinship, combines all the necessary processing steps to work on real data, and is ready for production integration. To address this, we developed GRAPE: Genomic RelAtedness detection PipelinE. It combines data preprocessing, identity-by-descent (IBD) segment detection, and accurate relationship estimation. The project uses software development best practices, as well as Global Alliance for Genomics and Health (GA4GH) standards and tools. Pipeline efficiency is demonstrated on both simulated and real-world datasets. GRAPE is available from: https://github.com/genxnetwork/grape.
Affiliation(s)
- Alexander Medvedev
- Skolkovo Institute of Science and Technology, Moscow, Russian Federation
- GENXT, Hinxton, UK
- Hui Wang
- GENXT, Hinxton, UK
- Huazhong Agricultural University, Wuhan, China
34
Begg TJA, Schmidt A, Kocher A, Larmuseau MHD, Runfeldt G, Maier PA, Wilson JD, Barquera R, Maj C, Szolek A, Sager M, Clayton S, Peltzer A, Hui R, Ronge J, Reiter E, Freund C, Burri M, Aron F, Tiliakou A, Osborn J, Behar DM, Boecker M, Brandt G, Cleynen I, Strassburg C, Prüfer K, Kühnert D, Meredith WR, Nöthen MM, Attenborough RD, Kivisild T, Krause J. Genomic analyses of hair from Ludwig van Beethoven. Curr Biol 2023; 33:1431-1447.e22. [PMID: 36958333 DOI: 10.1016/j.cub.2023.02.041] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 05/29/2022] [Revised: 10/11/2022] [Accepted: 02/13/2023] [Indexed: 03/25/2023]
Abstract
Ludwig van Beethoven (1770-1827) remains among the most influential and popular classical music composers. Health problems significantly impacted his career as a composer and pianist, including progressive hearing loss, recurring gastrointestinal complaints, and liver disease. In 1802, Beethoven requested that following his death, his disease be described and made public. Medical biographers have since proposed numerous hypotheses, including many substantially heritable conditions. Here we attempt a genomic analysis of Beethoven in order to elucidate potential underlying genetic and infectious causes of his illnesses. We incorporated improvements in ancient DNA methods into existing protocols for ancient hair samples, enabling the sequencing of high-coverage genomes from small quantities of historical hair. We analyzed eight independently sourced locks of hair attributed to Beethoven, five of which originated from a single European male. We deemed these matching samples to be almost certainly authentic and sequenced Beethoven's genome to 24-fold genomic coverage. Although we could not identify a genetic explanation for Beethoven's hearing disorder or gastrointestinal problems, we found that Beethoven had a genetic predisposition for liver disease. Metagenomic analyses revealed furthermore that Beethoven had a hepatitis B infection during at least the months prior to his death. Together with the genetic predisposition and his broadly accepted alcohol consumption, these present plausible explanations for Beethoven's severe liver disease, which culminated in his death. Unexpectedly, an analysis of Y chromosomes sequenced from five living members of the Van Beethoven patrilineage revealed the occurrence of an extra-pair paternity event in Ludwig van Beethoven's patrilineal ancestry.
Affiliation(s)
- Tristan James Alexander Begg
- Department of Archaeology, University of Cambridge, CB2 3ER Cambridge, UK; Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Axel Schmidt
- Institute of Human Genetics, University Hospital of Bonn, Bonn 53127, Germany
- Arthur Kocher
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany; Transmission, Infection, Diversification and Evolution Group, Max Planck Institute for the Science of Human History, 07745 Jena, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Maarten H D Larmuseau
- Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium; Laboratory of Human Genetic Genealogy, Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium; ARCHES - Antwerp Cultural Heritage Sciences, Faculty of Design Sciences, University of Antwerp, 2000 Antwerp, Belgium; Histories vzw, 9000 Gent, Belgium
- John D Wilson
- Austrian Academy of Sciences, 1030 Vienna, Austria; University of Vienna, 1010 Vienna, Austria
- Rodrigo Barquera
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
- Carlo Maj
- Institute of Human Genetics, University Hospital of Bonn, Bonn 53127, Germany; Center for Human Genetics, University Hospital of Marburg, Marburg, Germany
- András Szolek
- Applied Bioinformatics, Department for Computer Science, University of Tübingen, Sand 14, 72076 Tübingen, Germany; Department of Immunology, Interfaculty Institute for Cell Biology, University of Tübingen, Tübingen, Germany
- Stephen Clayton
- Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Alexander Peltzer
- Quantitative Biology Center (QBiC) University of Tübingen, Tübingen, Germany
- Ruoyun Hui
- MacDonald Institute for Archaeological Research, University of Cambridge, Cambridge CB2 3ER, UK; Alan Turing Institute, 2QR, John Dodson House, London NW1 2DB, UK
- Ella Reiter
- Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany
- Cäcilia Freund
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Marta Burri
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Franziska Aron
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Anthi Tiliakou
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Joanna Osborn
- Department of Archaeology, University of Cambridge, CB2 3ER Cambridge, UK
- Doron M Behar
- Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Guido Brandt
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- Isabelle Cleynen
- Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium
- Christian Strassburg
- Department of Internal Medicine I, University Hospital Bonn, 53127 Bonn, Germany
- Kay Prüfer
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
- Denise Kühnert
- Transmission, Infection, Diversification and Evolution Group, Max Planck Institute for the Science of Human History, 07745 Jena, Germany; European Virus Bioinformatics Center (EVBC), Jena, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
- William Rhea Meredith
- American Beethoven Society, San Jose State University, San Jose, CA 95192, USA; Ira F. Brilliant Center for Beethoven Studies, San Jose State University, San Jose, CA 95192, USA; School of Music and Dance, San Jose State University, San Jose, CA 95192, USA
- Markus M Nöthen
- Institute of Human Genetics, University Hospital of Bonn, Bonn 53127, Germany
- Robert David Attenborough
- MacDonald Institute for Archaeological Research, University of Cambridge, Cambridge CB2 3ER, UK; School of Archaeology & Anthropology, Australian National University, Canberra, ACT 0200, Australia
- Toomas Kivisild
- Department of Archaeology, University of Cambridge, CB2 3ER Cambridge, UK; Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium; Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu 51010, Estonia
- Johannes Krause
- Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany; Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
35
Dissecting the Meiotic Recombination Patterns in a Brassica napus Double Haploid Population Using 60K SNP Array. Int J Mol Sci 2023; 24:ijms24054469. [PMID: 36901901 PMCID: PMC10003086 DOI: 10.3390/ijms24054469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 02/14/2023] [Accepted: 02/22/2023] [Indexed: 02/26/2023] Open
Abstract
Meiotic recombination not only maintains the stability of the chromosome structure but also creates genetic variations for adapting to changeable environments. A better understanding of the mechanism of crossover (CO) patterns at the population level is useful for crop improvement. However, there are limited cost-effective and universal methods to detect the recombination frequency at the population level in Brassica napus. Here, the Brassica 60K Illumina Infinium SNP array (Brassica 60K array) was used to systematically study the recombination landscape in a double haploid (DH) population of B. napus. It was found that COs were unevenly distributed across the whole genome, and a higher frequency of COs existed at the distal ends of each chromosome. A considerable number of genes (more than 30%) in the CO hot regions were associated with plant defense and regulation. In most tissues, the average gene expression level in the hot regions (CO frequency of greater than 2 cM/Mb) was significantly higher than that in the regions with a CO frequency of less than 1 cM/Mb. In addition, a bin map was constructed with 1995 recombination bins. For seed oil content, Bin 1131 to 1134, Bin 1308 to 1311, Bin 1864 to 1869, and Bin 2184 to 2230 were identified on chromosomes A08, A09, C03, and C06, respectively, which could explain 8.5%, 17.3%, 8.6%, and 3.9% of the phenotypic variation. These results not only deepen our understanding of meiotic recombination in B. napus at the population level and provide useful information for future rapeseed breeding, but also provide a reference for studying CO frequency in other species.
36
Wang F, Moon W, Letsou W, Sapkota Y, Wang Z, Im C, Baedke JL, Robison L, Yasui Y. Genome-Wide Analysis of Rare Haplotypes Associated with Breast Cancer Risk. Cancer Res 2023; 83:332-345. [PMID: 36354368 PMCID: PMC9852031 DOI: 10.1158/0008-5472.can-22-1888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2022] [Revised: 09/09/2022] [Accepted: 11/08/2022] [Indexed: 11/12/2022]
Abstract
Numerous common genetic variants have been linked to breast cancer risk, but they only partially explain the total breast cancer heritability. Inference from Nordic population-based twin data indicates rare high-risk loci as the chief determinant of breast cancer risk. Here, we use haplotypes, rather than single variants, to identify rare high-risk loci for breast cancer. With computationally phased genotypes from 181,034 white British women in the UK Biobank, a genome-wide haplotype-breast cancer association analysis was conducted using sliding windows of 5 to 500 consecutive array-genotyped variants. In the discovery stage, haplotype-breast cancer associations were evaluated retrospectively in the prestudy-enrollment data including 5,487 breast cancer cases. Breast cancer hazard ratios (HR) for additive haplotypic effects were estimated using Cox regression. The replication analysis included a prospective cohort of women free of breast cancer at enrollment, of whom 3,524 later developed breast cancer. This two-stage analysis detected 13 rare loci (frequency <1%), each associated with an appreciable breast cancer-risk increase (discovery: HRs = 2.84-6.10, P < 5 × 10⁻⁸; replication: HRs = 2.08-5.61, P < 0.01). In contrast, the variants that formed these rare haplotypes individually exhibited much smaller effects. Functional annotation revealed extensive cis-regulatory DNA elements in breast cancer-related cells underlying the replicated rare haplotypes. Using phased, imputed genotypes from 30,064 cases and 25,282 controls in the DRIVE OncoArray case-control study, 6 of the 13 rare-loci associations were found generalizable (odds ratio estimates: 1.48-7.67, P < 0.05). This study demonstrates the complementary advantage of utilizing rare haplotypes to capture novel risk loci and suggests the potential for the discovery of more genetic elements contributing to cancer heritability as large data sets of germline whole-genome sequencing become available.
SIGNIFICANCE: A genome-wide two-stage haplotype analysis identifies rare haplotypes associated with breast cancer risk and suggests that the rare risk haplotypes represent long-range interactions with regulatory consequences influencing cancer risk.
Affiliation(s)
- Fan Wang
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Wonjong Moon
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- William Letsou
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Yadav Sapkota
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Zhaoming Wang
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Cindy Im
- School of Public Health, University of Alberta, Edmonton, Alberta T6G 1C9, Canada
- Jessica L. Baedke
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Leslie Robison
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Yutaka Yasui
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- School of Public Health, University of Alberta, Edmonton, Alberta T6G 1C9, Canada
37
Abeyratne CR, Macaya-Sanz D, Zhou R, Barry KW, Daum C, Haiby K, Lipzen A, Stanton B, Yoshinaga Y, Zane M, Tuskan GA, DiFazio SP. High-resolution mapping reveals hotspots and sex-biased recombination in Populus trichocarpa. G3 (Bethesda) 2023; 13:jkac269. [PMID: 36250890 PMCID: PMC9836356 DOI: 10.1093/g3journal/jkac269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 09/28/2022] [Indexed: 12/14/2022]
Abstract
Fine-scale meiotic recombination is fundamental to the outcome of natural and artificial selection. Here, dense genetic mapping and haplotype reconstruction were used to estimate recombination for a full factorial Populus trichocarpa cross of 7 males and 7 females. Genomes of the resulting 49 full-sib families (N = 829 offspring) were resequenced, and high-fidelity biallelic SNP/INDELs and pedigree information were used to ascertain allelic phase and impute progeny genotypes to recover gametic haplotypes. The 14 parental genetic maps contained 1,820 SNP/INDELs on average that covered 376.7 Mb of physical length across 19 chromosomes. Comparison of parental and progeny haplotypes allowed fine-scale demarcation of cross-over regions, where 38,846 cross-over events in 1,658 gametes were observed. Cross-over events were positively associated with gene density and negatively associated with GC content and long-terminal repeats. One of the most striking findings was higher rates of cross-overs in males in 8 out of 19 chromosomes. Regions with elevated male cross-over rates had lower gene density and GC content than windows showing no sex bias. High-resolution analysis identified 67 candidate cross-over hotspots spread throughout the genome. DNA sequence motifs enriched in these regions showed striking similarity to those of maize, Arabidopsis, and wheat. These findings, and recombination estimates, will be useful for ongoing efforts to accelerate domestication of this and other biomass feedstocks, as well as future studies investigating broader questions related to evolutionary history, perennial development, phenology, wood formation, vegetative propagation, and dioecy that cannot be studied using annual plant model systems.
Affiliation(s)
- David Macaya-Sanz
- Department of Forest Ecology & Genetics, CIFOR-INIA, CSIC, Madrid 28040, Spain
- Ran Zhou
- Warnell School of Forestry and Natural Resources, Department of Genetics, and Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
- Kerrie W Barry
- Department of Energy Joint Genome Institute, Berkeley, CA 94720, USA
- Christopher Daum
- Department of Energy Joint Genome Institute, Berkeley, CA 94720, USA
- Anna Lipzen
- Department of Energy Joint Genome Institute, Berkeley, CA 94720, USA
- Yuko Yoshinaga
- Department of Energy Joint Genome Institute, Berkeley, CA 94720, USA
- Matthew Zane
- Department of Energy Joint Genome Institute, Berkeley, CA 94720, USA
- Gerald A Tuskan
- Biosciences Division, Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- Stephen P DiFazio
- Department of Biology, West Virginia University, Morgantown, WV 26506, USA
38
Souilmi Y, Tobler R, Johar A, Williams M, Grey ST, Schmidt J, Teixeira JC, Rohrlach A, Tuke J, Johnson O, Gower G, Turney C, Cox M, Cooper A, Huber CD. Admixture has obscured signals of historical hard sweeps in humans. Nat Ecol Evol 2022; 6:2003-2015. [PMID: 36316412 PMCID: PMC9715430 DOI: 10.1038/s41559-022-01914-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 11/21/2021] [Accepted: 09/16/2022] [Indexed: 11/06/2022]
Abstract
The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In particular, admixture between diverged populations can distort or hide prior selection events in modern genomes, though this process is not explicitly accounted for in most selection studies despite its apparent ubiquity in humans and other species. Through analyses of ancient and modern human genomes, we show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes. Our results imply that this canonical mode of selection has probably been underappreciated in the evolutionary history of humans and suggest that our current understanding of the tempo and mode of selection in natural populations may be inaccurate.
Affiliation(s)
- Yassine Souilmi
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Raymond Tobler
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Evolution of Cultural Diversity Initiative, Australian National University, Canberra, Australian Capital Territory, Australia
- Angad Johar
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA
- Matthew Williams
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Shane T Grey
- Transplantation Immunology Group, Immunology Division, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
- St Vincent's Clinical School, Faculty of Medicine, UNSW, Darlinghurst, New South Wales, Australia
- Joshua Schmidt
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- João C Teixeira
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Adam Rohrlach
- ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany
- Jonathan Tuke
- ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
- School of Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Olivia Johnson
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Graham Gower
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Chris Turney
- Chronos 14Carbon-Cycle Facility and Earth and Sustainability Science Research Centre, University of New South Wales, Sydney, New South Wales, Australia
- Murray Cox
- Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Alan Cooper
- South Australian Museum, Adelaide, South Australia, Australia
- BlueSky Genetics, Ashton, South Australia, Australia
- Christian D Huber
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Biology, Penn State University, University Park, PA, USA
39
Murga-Moreno J, Coronado-Zamora M, Casillas S, Barbadilla A. impMKT: the imputed McDonald and Kreitman test, a straightforward correction that significantly increases the evidence of positive selection of the McDonald and Kreitman test at the gene level. G3 Genes|Genomes|Genetics 2022; 12:6670623. [PMID: 35976111 PMCID: PMC9526038 DOI: 10.1093/g3journal/jkac206] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/21/2022] [Accepted: 07/28/2022] [Indexed: 11/14/2022]
Abstract
The McDonald and Kreitman test is one of the most powerful and widely used methods to detect and quantify recurrent natural selection in DNA sequence data. One of its main limitations is the underestimation of positive selection due to the presence of slightly deleterious variants segregating at low frequencies. Although several approaches have been developed to overcome this limitation, most of them work on gene pooled analyses. Here, we present the imputed McDonald and Kreitman test (impMKT), a new straightforward approach for the detection of positive selection and other selection components of the distribution of fitness effects at the gene level. We compare imputed McDonald and Kreitman test with other widely used McDonald and Kreitman test approaches considering both simulated and empirical data. By applying imputed McDonald and Kreitman test to humans and Drosophila data at the gene level, we substantially increase the statistical evidence of positive selection with respect to previous approaches (e.g. by 50% and 157% compared with the McDonald and Kreitman test in Drosophila and humans, respectively). Finally, we review the minimum number of genes required to obtain a reliable estimation of the proportion of adaptive substitution (α) in gene pooled analyses by using the imputed McDonald and Kreitman test compared with other McDonald and Kreitman test implementations. Because of its simplicity and increased power to detect recurrent positive selection on genes, we propose the imputed McDonald and Kreitman test as the first straightforward approach for testing specific evolutionary hypotheses at the gene level. The software implementation and population genomics data are available at the web-server imkt.uab.cat.
Affiliation(s)
- Jesús Murga-Moreno
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Marta Coronado-Zamora
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Sònia Casillas
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Antonio Barbadilla
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
|
40
|
Lyu R, Tsui V, Crismani W, Liu R, Shim H, McCarthy D. sgcocaller and comapr: personalised haplotype assembly and comparative crossover map analysis using single-gamete sequencing data. Nucleic Acids Res 2022; 50:e118. [PMID: 36107768 PMCID: PMC9723612 DOI: 10.1093/nar/gkac764] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/22/2022] [Revised: 08/17/2022] [Accepted: 09/06/2022] [Indexed: 12/24/2022]
Abstract
Profiling gametes of an individual enables the construction of personalised haplotypes and meiotic crossover landscapes, now achievable at larger scale than ever through the availability of high-throughput single-cell sequencing technologies. However, high-throughput single-gamete data commonly have low depth of coverage per gamete, which challenges existing gamete-based haplotype phasing methods. In addition, haplotyping a large number of single gametes from high-throughput single-cell DNA sequencing data and constructing meiotic crossover profiles using existing methods requires intensive processing. Here, we introduce efficient software tools for the essential tasks of generating personalised haplotypes and calling crossovers in gametes from single-gamete DNA sequencing data (sgcocaller), and constructing, visualising, and comparing individualised crossover landscapes from single gametes (comapr). With additional data pre-processing, the tools can also be applied to bulk-sequenced samples. We demonstrate that sgcocaller is able to generate impeccable phasing results for high-coverage datasets, on which it is more accurate and stable than existing methods, and also performs well on low-coverage single-gamete sequencing datasets for which current methods fail. Our tools achieve highly accurate results with user-friendly installation, comprehensive documentation, efficient computation times and minimal memory usage.
Affiliation(s)
- Ruqian Lyu
- Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, 9 Princes Street, Fitzroy, Victoria 3065, Australia; Melbourne Integrative Genomics/School of Mathematics and Statistics, Faculty of Science, The University of Melbourne, Building 184, Royal Parade, Parkville, Victoria 3010, Australia
- Vanessa Tsui
- DNA Repair and Recombination Laboratory, St Vincent’s Institute of Medical Research, 9 Princes Street, Fitzroy, Victoria 3065, Australia; The Faculty of Medicine, Dentistry and Health Science, The University of Melbourne, Melbourne, Victoria 3010, Australia
- Wayne Crismani
- DNA Repair and Recombination Laboratory, St Vincent’s Institute of Medical Research, 9 Princes Street, Fitzroy, Victoria 3065, Australia; The Faculty of Medicine, Dentistry and Health Science, The University of Melbourne, Melbourne, Victoria 3010, Australia
- Ruijie Liu
- Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, 9 Princes Street, Fitzroy, Victoria 3065, Australia
- Davis J McCarthy
- To whom correspondence should be addressed. Tel: +61 3 9231 2480; Fax: +61 3 9416 2676;
|
41
|
Avadhanam S, Williams AL. Simultaneous inference of parental admixture proportions and admixture times from unphased local ancestry calls. Am J Hum Genet 2022; 109:1405-1420. [PMID: 35908549 PMCID: PMC9388397 DOI: 10.1016/j.ajhg.2022.06.016] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/05/2022] [Accepted: 06/24/2022] [Indexed: 02/06/2023]
Abstract
Population genetic analyses of local ancestry tracts routinely assume that the ancestral admixture process is identical for both parents of an individual, an assumption that may be invalid when considering recent admixture. Here, we present Parental Admixture Proportion Inference (PAPI), a Bayesian tool for inferring the admixture proportions and admixture times for each parent of a single admixed individual. PAPI analyzes unphased local ancestry tracts and has two components: a binomial model that leverages genome-wide ancestry fractions to infer parental admixture proportions and a hidden Markov model (HMM) that infers admixture times from tract lengths. Crucially, the HMM accounts for unobserved within-ancestry recombination by approximating the pedigree crossover dynamics, enabling inference of parental admixture times. In simulations, we find that PAPI's admixture proportion estimates deviate from the truth by 0.047 on average, outperforming ANCESTOR and PedMix by 46.0% and 57.6%, respectively. Moreover, PAPI's admixture time estimates were strongly correlated with the truth (R = 0.76) but have an average downward bias of 1.01 generations that is partly attributable to inaccuracies in local ancestry inference. As an illustration of its utility, we ran PAPI on African American genotypes from the PAGE study (N = 5,786) and found strong evidence of assortative mating by ancestry proportion: couples' ancestry proportions are highly correlated (R = 0.87) and are closer to each other than expected under random mating (p < 10^-6). We anticipate that PAPI will be useful in studying the population dynamics of admixture and will also be of interest to individuals seeking to learn about their personal genealogies.
Affiliation(s)
- Siddharth Avadhanam
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Amy L Williams
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
|
42
|
Snedecor J, Fennell T, Stadick S, Homer N, Antunes J, Stephens K, Holt C. Fast and Accurate Kinship Estimation Using Sparse SNPs in Relatively Large Database Searches. Forensic Sci Int Genet 2022; 61:102769. [DOI: 10.1016/j.fsigen.2022.102769] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/15/2022] [Revised: 08/18/2022] [Accepted: 08/20/2022] [Indexed: 11/28/2022]
|
43
|
Turner SD, Nagraj V, Scholz M, Jessa S, Acevedo C, Ge J, Woerner AE, Budowle B. Evaluating the Impact of Dropout and Genotyping Error on SNP-Based Kinship Analysis With Forensic Samples. Front Genet 2022; 13:882268. [PMID: 35846115 PMCID: PMC9282869 DOI: 10.3389/fgene.2022.882268] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/23/2022] [Accepted: 05/16/2022] [Indexed: 11/13/2022]
Abstract
Technological advances in sequencing and single nucleotide polymorphism (SNP) genotyping microarray technology have facilitated advances in forensic analysis beyond short tandem repeat (STR) profiling, enabling the identification of unknown DNA samples and distant relationships. Forensic genetic genealogy (FGG) has facilitated the identification of distant relatives of both unidentified remains and unknown donors of crime scene DNA, invigorating the use of biological samples to resolve open cases. Forensic samples are often degraded or contain only trace amounts of DNA. In this study, the accuracy of genome-wide relatedness methods and identity by descent (IBD) segment approaches was evaluated in the presence of challenges commonly encountered with forensic data: missing data and genotyping error. Pedigree whole-genome simulations were used to estimate the genotypes of thousands of individuals with known relationships using multiple populations with different biogeographic ancestral origins. Simulations were also performed with varying error rates and types. Using these data, the performance of different methods for quantifying relatedness was benchmarked across these scenarios. When the genotyping error was low (<1%), IBD segment methods outperformed genome-wide relatedness methods for close relationships and are more accurate at distant relationship inference. However, with an increasing genotyping error (1–5%), methods that do not rely on IBD segment detection are more robust and outperform IBD segment methods. The reduced call rate had little impact on either class of methods. These results have implications for the use of dense SNP data in forensic genomics for distant kinship analysis and FGG, especially when the sample quality is low.
Affiliation(s)
- Stephen D. Turner
- Signature Science, LLC., Austin, TX, United States
- *Correspondence: Stephen D. Turner,
- V.P. Nagraj
- Signature Science, LLC., Austin, TX, United States
- Jianye Ge
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
- August E. Woerner
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
- Bruce Budowle
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
|
44
|
Ariano B, Mattiangeli V, Breslin EM, Parkinson EW, McLaughlin TR, Thompson JE, Power RK, Stock JT, Mercieca-Spiteri B, Stoddart S, Malone C, Gopalakrishnan S, Cassidy LM, Bradley DG. Ancient Maltese genomes and the genetic geography of Neolithic Europe. Curr Biol 2022; 32:2668-2680.e6. [PMID: 35588742 PMCID: PMC9245899 DOI: 10.1016/j.cub.2022.04.069] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/04/2021] [Revised: 02/07/2022] [Accepted: 04/22/2022] [Indexed: 12/14/2022]
Abstract
Archaeological consideration of maritime connectivity has ranged from a biogeographical perspective that considers the sea as a barrier to a view of seaways as ancient highways that facilitate exchange. Our results illustrate the former. We report three Late Neolithic human genomes from the Mediterranean island of Malta that are markedly enriched for runs of homozygosity, indicating inbreeding in their ancestry and an effective population size of only hundreds, a striking illustration of maritime isolation in this agricultural society. In the Late Neolithic, communities across mainland Europe experienced a resurgence of hunter-gatherer ancestry, pointing toward the persistence of different ancestral strands that subsequently admixed. This is absent in the Maltese genomes, giving a further indication of their genomic insularity. Imputation of genome-wide genotypes in our new and 258 published ancient individuals allowed shared identity-by-descent segment analysis, giving a fine-grained genetic geography of Neolithic Europe. This highlights the differentiating effects of seafaring Mediterranean expansion and also island colonization, including that of Ireland, Britain, and Orkney. These maritime effects contrast profoundly with a lack of migratory barriers in the establishment of Central European farming populations from Anatolia and the Balkans.
Affiliation(s)
- Bruno Ariano
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
- Emily M Breslin
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
- Eóin W Parkinson
- Department of Classics and Archaeology, University of Malta, Msida 2080, Malta
- T Rowan McLaughlin
- Department of Scientific Research, The British Museum, Great Russell Street, London WC1B 3DG, UK
- Jess E Thompson
- McDonald Institute for Archaeological Research, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
- Ronika K Power
- Department of History and Archaeology, Macquarie University, 25B Wally's Walk, Sydney, NSW, Australia
- Jay T Stock
- Department of Anthropology, Western University, 1151 Richmond St, London, ON N6G 2V4, Canada
- Simon Stoddart
- McDonald Institute for Archaeological Research, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
- Caroline Malone
- School of Natural and Built Environment, Queen's University Belfast, Elmwood Avenue, Belfast, UK
- Shyam Gopalakrishnan
- GLOBE Institute, University of Copenhagen, Øster Farimagsgade 5, 1353 København K, Denmark
- Lara M Cassidy
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
- Daniel G Bradley
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
|
45
|
Smith J, Qiao Y, Williams AL. Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification. G3 (BETHESDA, MD.) 2022; 12:jkac072. [PMID: 35348675 PMCID: PMC9157175 DOI: 10.1093/g3journal/jkac072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/21/2021] [Accepted: 03/07/2022] [Indexed: 11/29/2022]
Abstract
Despite decades of methods development for classifying relatives in genetic studies, pairwise relatedness methods' recalls are above 90% only for first through third-degree relatives. The top-performing approaches, which leverage identity-by-descent segments, often use only kinship coefficients, while others, including estimation of recent shared ancestry (ERSA), use the number of segments relatives share. To quantify the potential for using segment numbers in relatedness inference, we leveraged information theory measures to analyze exact (i.e. produced by a simulator) identity-by-descent segments from simulated relatives. Over a range of settings, we found that the mutual information between the relatives' degree of relatedness and a tuple of their kinship coefficient and segment number is on average 4.6% larger than between the degree and the kinship coefficient alone. We further evaluated identity-by-descent segment number utility by building a Bayes classifier to predict first through sixth-degree relationships using different feature sets. When trained and tested with exact segments, the inclusion of segment numbers improves the recall by between 0.28% and 3% for second through sixth-degree relatives. However, the recalls improve by less than 1.8% per degree when using inferred segments, suggesting limitations due to identity-by-descent detection accuracy. Last, we compared our Bayes classifier that includes segment numbers with both ERSA and IBIS and found comparable recalls, with the Bayes classifier and ERSA slightly outperforming each other across different degrees. Overall, this study shows that identity-by-descent segment numbers can improve relatedness inference, but errors from current SNP array-based detection methods yield dampened signals in practice.
Affiliation(s)
- Jesse Smith
- School of Applied and Engineering Physics, Cornell University, Ithaca, NY 14853, USA
- Ying Qiao
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Amy L Williams
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
|
46
|
Li Y, Chen S, Rapakoulia T, Kuwahara H, Yip KY, Gao X. Deep learning identifies and quantifies recombination hotspot determinants. Bioinformatics 2022; 38:2683-2691. [PMID: 35561158 PMCID: PMC9113300 DOI: 10.1093/bioinformatics/btac234] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2021] [Revised: 03/08/2022] [Accepted: 04/08/2022] [Indexed: 11/30/2022]
Abstract
MOTIVATION: Recombination is one of the essential genetic processes for sexually reproducing organisms, which can happen more frequently in some regions, called recombination hotspots. Although several factors, such as PRDM9 binding motifs, are known to be related to the hotspots, their contributions to the recombination hotspots have not been quantified, and other determinants are yet to be elucidated. Here, we propose a computational method, RHSNet, based on deep learning and signal processing, to identify and quantify the hotspot determinants in a purely data-driven manner, utilizing datasets from various studies, populations, sexes and species.
RESULTS: RHSNet can significantly outperform other sequence-based methods on multiple datasets across different species, sexes and studies. In addition to being able to identify hotspot regions and the well-known determinants accurately, more importantly, RHSNet can quantify the determinants that contribute significantly to the recombination hotspot formation in the relation between PRDM9 binding motif, histone modification and GC content. Further cross-sex, cross-population and cross-species studies suggest that the proposed method has the generalization power and potential to identify and quantify the evolutionary determinant motifs.
AVAILABILITY AND IMPLEMENTATION: https://github.com/frankchen121212/RHSNet.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yu Li
- To whom correspondence should be addressed.
- Hiroyuki Kuwahara
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
- Kevin Y Yip
- Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), 999077, Hong Kong SAR, China
- Xin Gao
- To whom correspondence should be addressed.
|
47
|
Lucotte EA, Albiñana C, Laurent R, Bhérer C, Bataillon T, Toupance B. Detection of sexually antagonistic transmission distortions in trio datasets. Evol Lett 2022; 6:203-216. [PMID: 35386833 PMCID: PMC8966469 DOI: 10.1002/evl3.271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2020] [Revised: 12/07/2021] [Accepted: 12/14/2021] [Indexed: 11/24/2022]
Abstract
Sexual dimorphisms are widespread in animals and plants, for morphological as well as physiological traits. Understanding the genetic basis of sexual dimorphism and its evolution is crucial for understanding biological differences between the sexes. Genetic variants with sex‐antagonistic effects on fitness are expected to segregate in populations at the early phases of sexual dimorphism emergence. Detecting such variants is notoriously difficult, and the few genome‐scan methods employed so far have limited power and little specificity. Here, we propose a new framework to detect a signature of sexually antagonistic (SA) selection. We rely on trio datasets where sex‐biased transmission distortions can be directly tracked from parents to offspring, and identify signals of SA transmission distortions in genomic regions. We report the genomic location of six candidate regions detected in human populations as potentially under sexually antagonistic selection. We find an enrichment of genes associated with embryonic development within these regions. Last, we highlight two candidate regions for SA selection in humans.
Affiliation(s)
- Elise A. Lucotte
- Bioinformatic Research Center, Aarhus University, Aarhus 8000, Denmark
- Eco‐anthropologie (EA), Muséum national d'Histoire naturelle, CNRS, Université de Paris, Paris 75016, France
- Cancer Epidemiology: Gene and Environment, INSERM U1018, Paris 75654, France
- Ecologie Systématique Evolution, Univ. Paris‐Sud, AgroParisTech, CNRS, Université Paris‐Saclay, Orsay 91400, France
- Clara Albiñana
- Bioinformatic Research Center, Aarhus University, Aarhus 8000, Denmark
- National Centre for Register‐based Research, Department of Economics and Business Economics, Aarhus BSS, Aarhus University, Aarhus 8210, Denmark
- Romain Laurent
- Eco‐anthropologie (EA), Muséum national d'Histoire naturelle, CNRS, Université de Paris, Paris 75016, France
- Claude Bhérer
- Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, QC H3G 2M1, Canada
- Thomas Bataillon
- Bioinformatic Research Center, Aarhus University, Aarhus 8000, Denmark
- Bruno Toupance
- Eco‐anthropologie (EA), Muséum national d'Histoire naturelle, CNRS, Université de Paris, Paris 75016, France
|
48
|
Colomer-Vilaplana A, Murga-Moreno J, Canalda-Baltrons A, Inserte C, Soto D, Coronado-Zamora M, Barbadilla A, Casillas S. PopHumanVar: an interactive application for the functional characterization and prioritization of adaptive genomic variants in humans. Nucleic Acids Res 2022; 50:D1069-D1076. [PMID: 34664660 PMCID: PMC8728255 DOI: 10.1093/nar/gkab925] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/14/2021] [Revised: 09/17/2021] [Accepted: 09/28/2021] [Indexed: 12/22/2022]
Abstract
Adaptive challenges that humans faced as they expanded across the globe left specific molecular footprints that can be decoded in today's genomes. Different sets of metrics are used to identify genomic regions that have undergone selection. However, there are fewer methods capable of pinpointing the allele ultimately responsible for this selection. Here, we present PopHumanVar, an interactive online application that is designed to facilitate the exploration and thorough analysis of candidate genomic regions by integrating both functional and population genomics data currently available. PopHumanVar generates useful summary reports of prioritized variants that are putatively causal of recent selective sweeps. It compiles data and graphically represents different layers of information, including natural selection statistics, as well as functional annotations and genealogical estimations of variant age, for biallelic single nucleotide variants (SNVs) of the 1000 Genomes Project phase 3. Specifically, PopHumanVar amasses SNV-based information from GEVA, SnpEFF, GWAS Catalog, ClinVar, RegulomeDB and DisGeNET databases, as well as accurate estimations of iHS, nSL and iSAFE statistics. Notably, PopHumanVar can successfully identify known causal variants of frequently reported candidate selection regions, including EDAR in East-Asians, ACKR1 (DARC) in Africans and LCT/MCM6 in Europeans. PopHumanVar is open and freely available at https://pophumanvar.uab.cat.
Affiliation(s)
- Aina Colomer-Vilaplana
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Jesús Murga-Moreno
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Aleix Canalda-Baltrons
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Clara Inserte
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Daniel Soto
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Marta Coronado-Zamora
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Antonio Barbadilla
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Sònia Casillas
- Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
|
49
|
Chan AW, Villwock SS, Williams AL, Jannink JL. Sexual dimorphism and the effect of wild introgressions on recombination in cassava (Manihot esculenta Crantz) breeding germplasm. G3 (BETHESDA, MD.) 2022; 12:jkab372. [PMID: 34791172 PMCID: PMC8728042 DOI: 10.1093/g3journal/jkab372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2019] [Accepted: 09/29/2021] [Indexed: 01/09/2023]
Abstract
Recombination has essential functions in meiosis, evolution, and breeding. The frequency and distribution of crossovers dictate the generation of new allele combinations and can vary across species and between sexes. Here, we examine recombination landscapes across the 18 chromosomes of cassava (Manihot esculenta Crantz) with respect to male and female meioses and known introgressions from the wild relative Manihot glaziovii. We used SHAPEIT2 and duoHMM to infer crossovers from genotyping-by-sequencing data and a validated multigenerational pedigree from the International Institute of Tropical Agriculture cassava breeding germplasm consisting of 7020 informative meioses. We then constructed new genetic maps and compared them to an existing map previously constructed by the International Cassava Genetic Map Consortium. We observed higher recombination rates in females compared to males, and lower recombination rates in M. glaziovii introgression segments on chromosomes 1 and 4, with suppressed recombination along the entire length of the chromosome in the case of the chromosome 4 introgression. Finally, we discuss hypothesized mechanisms underlying our observations of heterochiasmy and crossover suppression and discuss the broader implications for plant breeding.
Affiliation(s)
- Ariel W Chan
- Section of Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY 14853, USA
- Seren S Villwock
- Section of Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY 14853, USA
- Amy L Williams
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
- Jean-Luc Jannink
- RW Holley Center for Agriculture and Health, United States Department of Agriculture—Agricultural Research Service, School of Integrative Plant Sciences, Cornell University, Ithaca, NY 14853, USA
|
50
|
Lencz T, Backenroth D, Granot-Hershkovitz E, Green A, Gettler K, Cho JH, Weissbrod O, Zuk O, Carmi S. Utility of polygenic embryo screening for disease depends on the selection strategy. eLife 2021; 10:e64716. [PMID: 34635206 PMCID: PMC8510582 DOI: 10.7554/elife.64716] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 11/09/2020] [Accepted: 08/09/2021] [Indexed: 12/13/2022]
Abstract
Polygenic risk scores (PRSs) have been offered since 2019 to screen in vitro fertilization embryos for genetic liability to adult diseases, despite a lack of comprehensive modeling of expected outcomes. Here we predict, based on the liability threshold model, the expected reduction in complex disease risk following polygenic embryo screening for a single disease. A strong determinant of the potential utility of such screening is the selection strategy, a factor that has not been previously studied. When only embryos with a very high PRS are excluded, the achieved risk reduction is minimal. In contrast, selecting the embryo with the lowest PRS can lead to substantial relative risk reductions, given a sufficient number of viable embryos. We systematically examine the impact of several factors on the utility of screening, including: variance explained by the PRS, number of embryos, disease prevalence, parental PRSs, and parental disease status. We consider both relative and absolute risk reductions, as well as population-averaged and per-couple risk reductions, and also examine the risk of pleiotropic effects. Finally, we confirm our theoretical predictions by simulating 'virtual' couples and offspring based on real genomes from schizophrenia and Crohn's disease case-control studies. We discuss the assumptions and limitations of our model, as well as the potential emerging ethical concerns.
Affiliation(s)
- Todd Lencz
- Departments of Psychiatry and Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, United States
- Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of Northwell Health, Glen Oaks, United States
- Institute for Behavioral Science, The Feinstein Institutes for Medical Research, Manhasset, United States
- Daniel Backenroth
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Einat Granot-Hershkovitz
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Adam Green
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Kyle Gettler
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
- Judy H Cho
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, United States
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, United States
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, United States
- Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, United States
- Or Zuk
- Department of Statistics and Data Science, The Hebrew University of Jerusalem, Jerusalem, Israel
- Shai Carmi
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
|