1
Chen X, Baker D, Dolzhenko E, Devaney JM, Noya J, Berlyoung AS, Brandon R, Hruska KS, Lochovsky L, Kruszka P, Newman S, Farrow E, Thiffault I, Pastinen T, Kasperaviciute D, Gilissen C, Vissers L, Hoischen A, Berger S, Vilain E, Délot E, Eberle MA. Genome-wide profiling of highly similar paralogous genes using HiFi sequencing. Nat Commun 2025; 16:2340. [PMID: 40057485 PMCID: PMC11890787 DOI: 10.1038/s41467-025-57505-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 02/21/2025] [Indexed: 05/13/2025]
Abstract
Variant calling is hindered in segmental duplications by sequence homology. We developed Paraphase, a HiFi-based informatics method that resolves highly similar genes by phasing all haplotypes of paralogous genes together. We applied Paraphase to 160 long (>10 kb) segmental duplication regions across the human genome with high (>99%) sequence similarity, encoding 316 genes. Analysis across five ancestral populations revealed highly variable copy numbers of these regions. We identified 23 paralog groups with exceptionally low within-group diversity, where extensive gene conversion and unequal crossing over contribute to highly similar gene copies. Furthermore, our analysis of 36 trios identified 7 de novo SNVs and 4 de novo gene conversion events, 2 of which are non-allelic. Finally, we summarized extensive genetic diversity in 9 medically relevant genes previously considered challenging to genotype. Paraphase provides a framework for resolving gene paralogs, enabling accurate testing in medically relevant genes and population-wide studies of previously inaccessible genes.
Affiliation(s)
- Emily Farrow
  - Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
  - UMKC School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
  - Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO, USA
- Isabelle Thiffault
  - Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
  - UMKC School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
  - Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO, USA
- Tomi Pastinen
  - Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
  - UMKC School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
- Christian Gilissen
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
  - Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, The Netherlands
- Lisenka Vissers
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
  - Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, The Netherlands
- Alexander Hoischen
  - Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
  - Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, The Netherlands
  - Radboud Center for Infectious Diseases (RCI), Department of Internal Medicine, Radboud University Medical Center, Nijmegen, The Netherlands
  - Radboud Expertise Center for Immunodeficiency and Autoinflammation and Radboud Center for Infectious Disease (RCI), Radboud University Medical Center, Nijmegen, The Netherlands
- Seth Berger
  - Center for Genetics Medicine Research, Children's National Hospital, Washington, DC, USA
- Eric Vilain
  - Institute for Clinical and Translational Science, University of California, Irvine, CA, USA
- Emmanuèle Délot
  - Institute for Clinical and Translational Science, University of California, Irvine, CA, USA
2
Saunders PA, Muyle A. Sex Chromosome Evolution: Hallmarks and Question Marks. Mol Biol Evol 2024; 41:msae218. [PMID: 39417444 PMCID: PMC11542634 DOI: 10.1093/molbev/msae218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Revised: 10/14/2024] [Accepted: 10/15/2024] [Indexed: 10/19/2024]
Abstract
Sex chromosomes are widespread in species with separate sexes. They have evolved many times independently and display a truly remarkable diversity. New sequencing technologies and methodological developments have allowed the field of molecular evolution to explore this diversity in a large number of model and nonmodel organisms, broadening our vision on the mechanisms involved in their evolution. Diverse studies have allowed us to better capture the common evolutionary routes that shape sex chromosomes; however, we still mostly fail to explain why sex chromosomes are so diverse. We review over half a century of theoretical and empirical work on sex chromosome evolution and highlight pending questions on their origins, turnovers, rearrangements, degeneration, dosage compensation, gene content, and rates of evolution. We also report recent theoretical progress on our understanding of the ultimate reasons for sex chromosomes' existence.
Affiliation(s)
- Paul A Saunders
  - CEFE, University of Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Aline Muyle
  - CEFE, University of Montpellier, CNRS, EPHE, IRD, Montpellier, France
3
Laufer VA, Glover TW, Wilson TE. Applications of advanced technologies for detecting genomic structural variation. Mutation Research. Reviews in Mutation Research 2023; 792:108475. [PMID: 37931775 PMCID: PMC10792551 DOI: 10.1016/j.mrrev.2023.108475] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/26/2023] [Revised: 09/07/2023] [Accepted: 11/02/2023] [Indexed: 11/08/2023]
Abstract
Chromosomal structural variation (SV) encompasses a heterogenous class of genetic variants that exerts strong influences on human health and disease. Despite their importance, many structural variants (SVs) have remained poorly characterized at even a basic level, a discrepancy predicated upon the technical limitations of prior genomic assays. However, recent advances in genomic technology can identify and localize SVs accurately, opening new questions regarding SV risk factors and their impacts in humans. Here, we first define and classify human SVs and their generative mechanisms, highlighting characteristics leveraged by various SV assays. We next examine the first-ever gapless assembly of the human genome and the technical process of assembling it, which required third-generation sequencing technologies to resolve structurally complex loci. The new portions of that "telomere-to-telomere" and subsequent pangenome assemblies highlight aspects of SV biology likely to develop in the near-term. We consider the strengths and limitations of the most promising new SV technologies and when they or longstanding approaches are best suited to meeting salient goals in the study of human SV in population-scale genomics research, clinical, and public health contexts. It is a watershed time in our understanding of human SV when new approaches are expected to fundamentally change genomic applications.
Affiliation(s)
- Vincent A Laufer
  - Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Thomas W Glover
  - Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Thomas E Wilson
  - Department of Pathology, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
4
Bonito M, Ravasini F, Novelletto A, D'Atanasio E, Cruciani F, Trombetta B. Disclosing complex mutational dynamics at a Y chromosome palindrome evolving through intra- and inter-chromosomal gene conversion. Hum Mol Genet 2023; 32:65-78. [PMID: 35921243 DOI: 10.1093/hmg/ddac144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 06/21/2022] [Accepted: 06/21/2022] [Indexed: 01/17/2023]
Abstract
The human MSY ampliconic region is mainly composed of large duplicated sequences that are organized in eight palindromes (termed P1-P8), and may undergo arm-to-arm gene conversion. Although the importance of these elements is widely recognized, their evolutionary dynamics are still nuanced. Here, we focused on the P8 palindrome, which shows a complex evolutionary history, being involved in intra- and inter-chromosomal gene conversion. To disclose its evolutionary complexity, we performed a high-depth (50×) targeted next-generation sequencing of this element in 157 subjects belonging to the most divergent lineages of the Y chromosome tree. We found a total of 72 polymorphic paralogous sequence variants that have been exploited to identify 41 Y-Y gene conversion events that occurred during recent human history. Through our analysis, we were able to categorize P8 arms into three portions, whose molecular diversity was modelled by different evolutionary forces. Notably, the outer region of the palindrome is not involved in any gene conversion event and evolves exclusively through the action of mutational pressure. The inner region is affected by Y-Y gene conversion occurring at a rate of 1.52 × 10⁻⁵ conversions/base/year, with no bias towards the retention of the ancestral state of the sequence. In this portion, GC-biased gene conversion is counterbalanced by a mutational bias towards AT bases. Finally, the middle region of the arms, in addition to intra-chromosomal gene conversion, is involved in X-to-Y gene conversion (at a rate of 6.013 × 10⁻⁸ conversions/base/year) thus being a major force in the evolution of the VCY/VCX gene family.
Affiliation(s)
- Maria Bonito
  - Department of Biology and Biotechnology 'Charles Darwin', Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia - Fondazione Cenci Bolognetti, Rome 00185, Italy
- Francesco Ravasini
  - Department of Biology and Biotechnology 'Charles Darwin', Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia - Fondazione Cenci Bolognetti, Rome 00185, Italy
- Andrea Novelletto
  - Department of Biology, University of Rome Tor Vergata, Rome 00133, Italy
- Eugenia D'Atanasio
  - Institute of Molecular Biology and Pathology (IBPM), CNR, Rome 00185, Italy
- Fulvio Cruciani
  - Department of Biology and Biotechnology 'Charles Darwin', Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia - Fondazione Cenci Bolognetti, Rome 00185, Italy
  - Institute of Molecular Biology and Pathology (IBPM), CNR, Rome 00185, Italy
- Beniamino Trombetta
  - Department of Biology and Biotechnology 'Charles Darwin', Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia - Fondazione Cenci Bolognetti, Rome 00185, Italy
5
Stark-Dykema ER, Dulka EA, Gerlinger ER, Mueller JL. X-linked palindromic gene families 4930567H17Rik and Mageb5 are dispensable for male mouse fertility. Sci Rep 2022; 12:8554. [PMID: 35595785 PMCID: PMC9122934 DOI: 10.1038/s41598-022-12433-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/11/2022] [Accepted: 05/10/2022] [Indexed: 11/17/2022]
Abstract
Mammalian sex chromosomes are enriched for large, nearly-identical, palindromic sequences harboring genes expressed predominately in testicular germ cells. Discerning if individual palindrome-associated gene families are essential for male reproduction is difficult due to challenges in disrupting all copies of a gene family. Here we generate precise, independent, deletions to assess the reproductive roles of two X-linked palindromic gene families with spermatid-predominant expression, 4930567H17Rik and Mageb5. Sequence analyses reveal mouse 4930567H17Rik and Mageb5 are orthologs of human HSFX3 and MAGEB5, respectively, where 4930567H17Rik/HSFX3 is harbored in a palindrome in humans and mice, while Mageb5 is not. Additional sequence analyses show 4930567H17Rik and HSFX3 are rapidly diverging in rodents and primates, respectively. Mice lacking either 4930567H17Rik or Mageb5 gene families do not have detectable defects in male fertility, fecundity, spermatogenesis, or in gene regulation, but do show differences in sperm head morphology, suggesting a potential role in sperm function. We conclude that while all palindrome-associated gene families are not essential for male fertility, large palindromes influence the evolution of their associated gene families.
Affiliation(s)
- Evan R Stark-Dykema
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Eden A Dulka
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Emma R Gerlinger
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Jacob L Mueller
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
6
Bonito M, D’Atanasio E, Ravasini F, Cariati S, Finocchio A, Novelletto A, Trombetta B, Cruciani F. New insights into the evolution of human Y chromosome palindromes through mutation and gene conversion. Hum Mol Genet 2021; 30:2272-2285. [PMID: 34244762 PMCID: PMC8600007 DOI: 10.1093/hmg/ddab189] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/03/2021] [Revised: 07/01/2021] [Accepted: 07/05/2021] [Indexed: 12/16/2022]
Abstract
About one-quarter of the euchromatic portion of the male-specific region of the human Y chromosome consists of large duplicated sequences that are organized in eight palindromes (termed P1-P8), which undergo arm-to-arm gene conversion, a proposed mechanism for maintaining their sequence integrity. Although the relevance of gene conversion in the evolution of palindromic sequences has been profoundly recognized, the dynamic of this mechanism is still nuanced. To shed light into the evolution of these genomic elements, we performed a high-depth (50×) targeted next-generation sequencing of the palindrome P6 in 157 subjects belonging to the most divergent evolutionary lineages of the Y chromosome. We found 118 new paralogous sequence variants, which were placed into the context of a robust Y chromosome phylogeny based on 7240 SNPs of the X-degenerate region. We mapped along the phylogeny 80 gene conversion events that shaped the diversity of P6 arms during recent human history. In contrast to previous studies, we demonstrated that arm-to-arm gene conversion, which occurs at a rate of 6.01 × 10⁻⁶ conversions/base/year, is not biased toward the retention of the ancestral state of sequences. We also found a significantly lower mutation rate of the arms (6.18 × 10⁻¹⁰ mutations/base/year) compared with the spacer (9.16 × 10⁻¹⁰ mutations/base/year), a finding that may explain the observed higher inter-species conservation of arms, without invoking any bias of conversion. Finally, by formally testing the mutation/conversion balance in P6, we found that the arms of this palindrome reached a steady-state equilibrium between mutation and gene conversion.
Affiliation(s)
- Maria Bonito
  - Department of Biology and Biotechnology ‘Charles Darwin’, Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia-Fondazione Cenci Bolognetti, Rome 00185, Italy
- Eugenia D’Atanasio
  - Institute of Molecular Biology and Pathology (IBPM), CNR, Rome 00185, Italy
- Francesco Ravasini
  - Department of Biology and Biotechnology ‘Charles Darwin’, Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia-Fondazione Cenci Bolognetti, Rome 00185, Italy
- Selene Cariati
  - Department of Biology and Biotechnology ‘Charles Darwin’, Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia-Fondazione Cenci Bolognetti, Rome 00185, Italy
- Andrea Finocchio
  - Department of Biology, University of Rome Tor Vergata, Rome 00133, Italy
- Andrea Novelletto
  - Department of Biology, University of Rome Tor Vergata, Rome 00133, Italy
- Beniamino Trombetta
  - Department of Biology and Biotechnology ‘Charles Darwin’, Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia-Fondazione Cenci Bolognetti, Rome 00185, Italy
- Fulvio Cruciani
  - Department of Biology and Biotechnology ‘Charles Darwin’, Sapienza University of Rome, Laboratory affiliated to Istituto Pasteur Italia-Fondazione Cenci Bolognetti, Rome 00185, Italy
  - Institute of Molecular Biology and Pathology (IBPM), CNR, Rome 00185, Italy
7
Jackson EK, Bellott DW, Skaletsky H, Page DC. GC-biased gene conversion in X-chromosome palindromes conserved in human, chimpanzee, and rhesus macaque. G3 Genes|Genomes|Genetics 2021; 11:6317831. [PMID: 34849781 PMCID: PMC8981503 DOI: 10.1093/g3journal/jkab224] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/21/2021] [Accepted: 06/28/2021] [Indexed: 12/03/2022]
Abstract
Gene conversion is GC-biased across a wide range of taxa. Large palindromes on mammalian sex chromosomes undergo frequent gene conversion that maintains arm-to-arm sequence identity greater than 99%, which may increase their susceptibility to the effects of GC-biased gene conversion. Here, we demonstrate a striking history of GC-biased gene conversion in 12 palindromes conserved on the X chromosomes of human, chimpanzee, and rhesus macaque. Primate X-chromosome palindrome arms have significantly higher GC content than flanking single-copy sequences. Nucleotide replacements that occurred in human and chimpanzee palindrome arms over the past 7 million years are one-and-a-half times as GC-rich as the ancestral bases they replaced. Using simulations, we show that our observed pattern of nucleotide replacements is consistent with GC-biased gene conversion with a magnitude of 70%, similar to previously reported values based on analyses of human meioses. However, GC-biased gene conversion since the divergence of human and rhesus macaque explains only a fraction of the observed difference in GC content between palindrome arms and flanking sequence, suggesting that palindromes are older than 29 million years and/or had elevated GC content at the time of their formation. This work supports a greater than 2:1 preference for GC bases over AT bases during gene conversion and demonstrates that the evolution and composition of mammalian sex chromosome palindromes is strongly influenced by GC-biased gene conversion.
Affiliation(s)
- Emily K Jackson
  - Whitehead Institute, Cambridge, MA 02142, USA
  - Howard Hughes Medical Institute, Whitehead Institute, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Helen Skaletsky
  - Whitehead Institute, Cambridge, MA 02142, USA
  - Howard Hughes Medical Institute, Whitehead Institute, Cambridge, MA 02142, USA
- David C Page
  - Whitehead Institute, Cambridge, MA 02142, USA
  - Howard Hughes Medical Institute, Whitehead Institute, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
8
Jackson EK, Bellott DW, Cho TJ, Skaletsky H, Hughes JF, Pyntikova T, Page DC. Large palindromes on the primate X Chromosome are preserved by natural selection. Genome Res 2021; 31:1337-1352. [PMID: 34290043 PMCID: PMC8327919 DOI: 10.1101/gr.275188.120] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/29/2020] [Accepted: 05/17/2021] [Indexed: 12/27/2022]
Abstract
Mammalian sex chromosomes carry large palindromes that harbor protein-coding gene families with testis-biased expression. However, there are few known examples of sex-chromosome palindromes conserved between species. We identified 26 palindromes on the human X Chromosome, constituting more than 2% of its sequence, and characterized orthologous palindromes in the chimpanzee and the rhesus macaque using a clone-based sequencing approach that incorporates full-length nanopore reads. Many of these palindromes are missing or misassembled in the current reference assemblies of these species' genomes. We find that 12 human X palindromes have been conserved for at least 25 million years, with orthologs in both chimpanzee and rhesus macaque. Insertions and deletions between species are significantly depleted within the X palindromes' protein-coding genes compared to their noncoding sequence, demonstrating that natural selection has preserved these gene families. The spacers that separate the left and right arms of palindromes are a site of localized structural instability, with seven of 12 conserved palindromes showing no spacer orthology between human and rhesus macaque. Analysis of the 1000 Genomes Project data set revealed that human X-palindrome spacers are enriched for deletions relative to arms and flanking sequence, including a common spacer deletion that affects 13% of human X Chromosomes. This work reveals an abundance of conserved palindromes on primate X Chromosomes and suggests that protein-coding gene families in palindromes (most of which remain poorly characterized) promote X-palindrome survival in the face of ongoing structural instability.
Affiliation(s)
- Emily K Jackson
  - Whitehead Institute, Cambridge, Massachusetts 02142, USA
  - Howard Hughes Medical Institute, Whitehead Institute, Cambridge, Massachusetts 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Ting-Jan Cho
  - Whitehead Institute, Cambridge, Massachusetts 02142, USA
- Helen Skaletsky
  - Whitehead Institute, Cambridge, Massachusetts 02142, USA
  - Howard Hughes Medical Institute, Whitehead Institute, Cambridge, Massachusetts 02142, USA
- David C Page
  - Whitehead Institute, Cambridge, Massachusetts 02142, USA
  - Howard Hughes Medical Institute, Whitehead Institute, Cambridge, Massachusetts 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA