1. Lu W, Yao L, Wang Y, Li F, Zhou B, Ming W, Jiang Y, Liu X, Liu Y, Sun X, Wang Y, Bai Y. Characterization of extrachromosomal circular DNA associated with genomic repeat sequences in breast cancer. Int J Cancer 2025; 157:384-397. [PMID: 40135469 DOI: 10.1002/ijc.35423] [Received: 10/10/2024] [Revised: 02/26/2025] [Accepted: 03/05/2025] [Indexed: 03/27/2025]
Abstract
Extrachromosomal circular DNA (eccDNA) has emerged as a potential biomarker for disease due to its stable closed circular structure. However, the diagnostic utility of eccDNA remains underexplored. In this study, we demonstrate that the characteristics of eccDNA associated with genomic repetitive elements change in breast cancer patient tissues and plasma. These changes can serve as signatures for accurate cancer classification. We profiled eccDNA annotated to repeat elements across the genome in tissues and plasma, aggregating each repeat element to the superfamily and subfamily level. Our findings indicate that eccDNA associated with repetitive elements in cancer exhibits regular patterns of enrichment or depletion in specific elements, particularly at the family level. Additionally, these repeat element changes are present in different subtypes of breast cancer, correlated with varying hormone receptor expression. Although there are differences in the landscapes of eccDNA on repetitive elements between cancer tissues and paired plasma, the unique characteristics of eccDNA associated with repetitive sequences in the plasma of cancer patients facilitate better differentiation from normal individuals. These analyses reveal that changes in eccDNA associated with repeat sequences in human cancers can be used as diagnostic biomarkers for cancer patients.
Affiliation(s)
- Wenxiang Lu
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Lingsong Yao
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Ying Wang
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Fuyu Li
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Bingbo Zhou
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Wenlong Ming
  - Institute for AI in Medicine, School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing, China
- Yali Jiang
  - The Friendship Hospital of Ili Kazakh Autonomous Prefecture, Ili & Jiangsu Joint Institute of Health, Yining, Xinjiang Uygur Autonomous Region, China
- Xiaoan Liu
  - Department of Breast Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yun Liu
  - Department of Information, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Xiao Sun
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Yan Wang
  - The Friendship Hospital of Ili Kazakh Autonomous Prefecture, Ili & Jiangsu Joint Institute of Health, Yining, Xinjiang Uygur Autonomous Region, China
  - Department of Endoscopy, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yunfei Bai
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
2. Shadrina M, Kalay Ö, Demirkaya-Budak S, LeDuc CA, Chung WK, Turgut D, Budak G, Arslan E, Semenyuk V, Davis-Dusenbery B, Seidman CE, Yost HJ, Jain A, Gelb BD. Efficient identification of de novo mutations in family trios: a consensus-based informatic approach. Life Sci Alliance 2025; 8:e202403039. [PMID: 40155050 PMCID: PMC11953573 DOI: 10.26508/lsa.202403039] [Received: 09/11/2024] [Revised: 03/19/2025] [Accepted: 03/20/2025] [Indexed: 04/01/2025]
Abstract
Accurate identification of de novo variants (DNVs) remains challenging despite advances in sequencing technologies, often requiring ad hoc filters and manual inspection. Here, we explored a purely informatic, consensus-based approach for identifying DNVs in proband-parent trios using short-read genome sequencing data. We evaluated variant calls generated by three sequence analysis pipelines (GATK HaplotypeCaller, DeepTrio, and Velsera GRAF) and examined the assumption that a requirement of consensus can serve as an effective filter for high-quality DNVs. Comparison with a highly accurate DNV set, validated previously by manual inspection and Sanger sequencing, demonstrated that consensus filtering, followed by a force-calling procedure, effectively removed false-positive calls, achieving 98.0-99.4% precision. At the same time, sensitivity of the workflow based on the previously established DNVs reached 99.4%. Validation in the HG002-3-4 Genome-in-a-Bottle trio confirmed its robustness, with precision reaching 99.2% and sensitivity up to 96.6%. We believe that this consensus approach can be widely implemented as an automated bioinformatics workflow suitable for large-scale analyses without the need for manual intervention, especially when very high precision is valued over sensitivity.
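The consensus idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' workflow: it omits the force-calling step, and the variant representation, function names, and toy call sets below are all hypothetical. It only shows how requiring agreement among callers acts as a filter, and how precision and sensitivity would then be computed against a validated set.

```python
from typing import Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def consensus(call_sets: Iterable[Set[Variant]]) -> Set[Variant]:
    """Keep only the variants reported by every caller (set intersection)."""
    sets = list(call_sets)
    if not sets:
        return set()
    return set.intersection(*sets)


def precision_sensitivity(calls: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Precision = TP / calls made; sensitivity = TP / true variants."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    sensitivity = true_positives / len(truth) if truth else 0.0
    return precision, sensitivity


# Toy data, invented for illustration only.
v1, v2, v3 = ("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")
v4, v5, v6 = ("chr4", 400, "T", "C"), ("chr5", 500, "A", "T"), ("chr6", 600, "C", "G")

caller_a = {v1, v2, v3}
caller_b = {v1, v2, v4}
caller_c = {v1, v2, v3, v5}
truth = {v1, v2, v6}

agreed = consensus([caller_a, caller_b, caller_c])
precision, sensitivity = precision_sensitivity(agreed, truth)
```

In this toy case the caller-specific calls are dropped by the intersection, which raises precision at the cost of missing any true variant that not every caller reports. That is the trade-off the abstract notes when it says the approach suits settings where precision is valued over sensitivity.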
Affiliation(s)
- Mariya Shadrina
  - Mindich Child Health and Development Institute and the Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York, NY, USA
- Charles A LeDuc
  - Department of Pediatrics, Columbia University, New York, NY, USA
- Wendy K Chung
  - Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Christine E Seidman
  - Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- H Joseph Yost
  - Molecular Medicine Program, University of Utah, Salt Lake City, UT, USA
- Bruce D Gelb
  - Mindich Child Health and Development Institute and the Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York, NY, USA
  - Department of Pediatrics, Icahn School of Medicine, New York, NY, USA
3. Liu Q, Tian W. Association of human-specific expanded short tandem repeats with neuron-specific regulatory features. Sci Adv 2025; 11:eadp9707. [PMID: 40446031 PMCID: PMC12124357 DOI: 10.1126/sciadv.adp9707] [Received: 04/21/2024] [Accepted: 04/24/2025] [Indexed: 06/02/2025]
Abstract
Short tandem repeats (STRs), characterized by high-copy number mutations, represent one of the fastest-evolving genomic elements. However, human-specific expanded STRs (heSTRs) have lacked comprehensive genome-wide characterization. Leveraging 148 human and 26 nonhuman primate haploid genomes, we identified 8813 heSTRs with robust expansions in copy number distributions. Our analysis revealed notable associations between heSTRs and brain- and neuron-specific distal regulatory signals. Potential target genes regulated by heSTRs, identified by incorporating distal regulations, are enriched with neuronal development-related functions and disorders, displaying neuron-specific expression enhancement in humans. Moreover, heSTRs are associated with enhanced chromatin accessibility specifically in human neurons. In addition, heSTRs show substantial association with pathogenic STR loci exhibiting abnormal copy number variations, as reported by cohort studies on schizophrenia and autism. This study underscores the role of heSTRs in both human evolution and disorders, offering valuable insights for future research on STRs from an evolutionary perspective.
Affiliation(s)
- Qiming Liu
  - State Key Laboratory of Genetics and Development of Complex Phenotypes, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
- Weidong Tian
  - State Key Laboratory of Genetics and Development of Complex Phenotypes, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
  - Children’s Hospital of Fudan University, Shanghai, China
  - Children’s Hospital of Shandong University, Jinan, China
4. Nakashima T, Miyauchi T, Takeuchi R, Sugihara Y, Funakoshi Y, Ohka F, Maeda S, Hirato J, Yoshioka T, Okita H, Narita Y, Kanemura Y, Kojima Y, Watanabe Y, Saito R, Suzuki H. Diversity of U1 Small Nuclear RNAs and Diagnostic Methods for Their Mutations. Cancer Sci 2025. [PMID: 40425278 DOI: 10.1111/cas.70110] [Received: 03/26/2025] [Revised: 05/16/2025] [Accepted: 05/20/2025] [Indexed: 05/29/2025]
Abstract
U1 small nuclear RNA (snRNA) mutations are recurrent non-coding alterations found in various malignancies, yet their identification has proven challenging due to their repetitive nature. We characterized the complex interindividual diversity and genomic architecture of U1 snRNA loci using sequencing data and a pangenome reference. Our analysis uncovered copy number variations and the diversity of single-nucleotide variants in regions not predicted to have significant functional impact. Compared to traditional linear reference-based analyses for mutations, the pangenome graph demonstrated the best accuracy, successfully identifying previously undetectable mutations. This underscores the utility of pangenome graph references for cancer genome research, particularly in repetitive and highly diverse genomic regions. Additionally, we developed mutation detection methods employing targeted capture sequencing, rapid quantitative polymerase chain reaction, and a machine learning approach based on splicing patterns, all exhibiting high precision in identifying U1 snRNA mutations. Our findings elucidate the structural complexity of U1 snRNA loci and establish robust methodologies for precise mutation detection in these regions.
Affiliation(s)
- Takuma Nakashima
  - Division of Brain Tumor Translational Research, National Cancer Center Research Institute, Chuo City, Japan
  - Department of Neurosurgery, Nagoya University School of Medicine, Nagoya, Japan
- Tsubasa Miyauchi
  - Division of Brain Tumor Translational Research, National Cancer Center Research Institute, Chuo City, Japan
- Ryota Takeuchi
  - In Vitro Diagnostics Business, KYORIN Pharmaceutical Co., Ltd, Tokyo, Japan
- Yuriko Sugihara
  - Division of Brain Tumor Translational Research, National Cancer Center Research Institute, Chuo City, Japan
- Yusuke Funakoshi
  - Division of Brain Tumor Translational Research, National Cancer Center Research Institute, Chuo City, Japan
- Fumiharu Ohka
  - Department of Neurosurgery, Nagoya University School of Medicine, Nagoya, Japan
- Sachi Maeda
  - Department of Neurosurgery, Nagoya University School of Medicine, Nagoya, Japan
- Junko Hirato
  - Department of Pathology, Public Tomioka General Hospital, Tomioka, Japan
- Takako Yoshioka
  - Department of Pathology, National Center for Child Health and Development, Setagaya, Japan
- Hajime Okita
  - Division of Diagnostic Pathology, Keio University School of Medicine, Minato City, Japan
- Yoshitaka Narita
  - Department of Neurosurgery and Neuro-Oncology, National Cancer Center Hospital, Chuo City, Japan
- Yonehiro Kanemura
  - Department of Biomedical Research and Innovation, Institute for Clinical Research, NHO Osaka National Hospital, Osaka, Japan
  - Department of Neurosurgery, NHO Osaka National Hospital, Osaka, Japan
- Yasuhiro Kojima
  - Laboratory of Computational Life Science, National Cancer Center Research Institute, Tokyo, Japan
- Yuko Watanabe
  - Department of Pediatric Oncology, National Cancer Center Hospital, Chuo City, Japan
- Ryuta Saito
  - Department of Neurosurgery, Nagoya University School of Medicine, Nagoya, Japan
- Hiromichi Suzuki
  - Division of Brain Tumor Translational Research, National Cancer Center Research Institute, Chuo City, Japan
5. Santos R, Lee H, Williams A, Baffour-Kyei A, Lee SH, Troakes C, Al-Chalabi A, Breen G, Iacoangeli A. Investigating the Performance of Oxford Nanopore Long-Read Sequencing with Respect to Illumina Microarrays and Short-Read Sequencing. Int J Mol Sci 2025; 26:4492. [PMID: 40429637 PMCID: PMC12111203 DOI: 10.3390/ijms26104492] [Received: 03/03/2025] [Revised: 05/01/2025] [Accepted: 05/06/2025] [Indexed: 05/29/2025]
Abstract
Oxford Nanopore Technologies (ONT) long-read sequencing (LRS) has emerged as a promising genomic analysis tool, yet comprehensive benchmarks with established platforms across diverse datasets remain limited. This study aimed to benchmark LRS performance against Illumina short-read sequencing (SRS) and microarrays for variant detection across different genomic contexts and to evaluate the impact of experimental factors. We sequenced 14 human genomes using the three platforms and evaluated single nucleotide variants (SNVs), insertions/deletions (indels), and structural variants (SVs) detection, stratifying by high-complexity, low-complexity, and dark genome regions while assessing effects of multiplexing, depth, and read length. LRS SNV accuracy was slightly lower than that of SRS in high-complexity regions (F-measure: 0.954 vs. 0.967) but showed comparable sensitivity in low-complexity regions. LRS showed robust performance for small (1-5 bp) indels in high-complexity regions (F-measure: 0.869), but SRS agreement decreased significantly in low-complexity regions and for larger indel sizes. Within dark regions, LRS identified more indels than SRS, but showed lower base-level accuracy. LRS identified 2.86 times more SVs than SRS, excelling at detecting large variants (>6 kb), with SV detection improving with sequencing depth. Sequencing depth strongly influenced variant calling performance, whereas multiplexing effects were minimal. Our findings provide valuable insights for optimising LRS applications in genomic research and diagnostics.
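The F-measure values quoted in this abstract are the harmonic mean of precision and recall. As a hedged illustration only, the sketch below computes it from true-positive, false-positive, and false-negative counts per region stratum. The counts and stratum names are invented for the example and are not taken from the study.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical (tp, fp, fn) counts per genomic stratum, for illustration only.
toy_counts = {
    "high_complexity": (950, 40, 50),
    "low_complexity": (700, 200, 300),
}

scores = {region: f_measure(*counts) for region, counts in toy_counts.items()}
```

Computing the score separately per stratum, rather than over pooled counts, keeps a large well-behaved stratum from masking weaker performance in a smaller, harder one.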
Affiliation(s)
- Renato Santos
  - Department of Biostatistics & Health Informatics, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 16 De Crespigny Park, London SE5 8AB, UK
- Hyunah Lee
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 16 De Crespigny Park, London SE5 8AB, UK
- Alexander Williams
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 16 De Crespigny Park, London SE5 8AB, UK
- Anastasia Baffour-Kyei
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 16 De Crespigny Park, London SE5 8AB, UK
- Sang-Hyuck Lee
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 16 De Crespigny Park, London SE5 8AB, UK
- Claire Troakes
  - Department of Basic and Clinical Neuroscience, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 5 Cutcombe Rd, London SE5 9RX, UK
- Ammar Al-Chalabi
  - Department of Basic and Clinical Neuroscience, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 5 Cutcombe Rd, London SE5 9RX, UK
- Gerome Breen
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 16 De Crespigny Park, London SE5 8AB, UK
- Alfredo Iacoangeli
  - Department of Biostatistics & Health Informatics, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 16 De Crespigny Park, London SE5 8AB, UK
  - Department of Basic and Clinical Neuroscience, Institute of Psychiatry Psychology & Neuroscience, King’s College London, 5 Cutcombe Rd, London SE5 9RX, UK
  - Perron Institute for Neurological and Translational Science, Ground RR Block QE II Medical Centre Ralph & Patricia Sarich Neuroscience Building, 8 Verdun St, Nedlands, WA 6009, Australia
  - NIHR Maudsley Biomedical Research Centre (BRC), South London and Maudsley NHS Foundation Trust, 16 De Crespigny Park, London SE5 8AF, UK
6. Liu Y, Li M, Segal A, Zhang M, Sestan N. Decoding human brain evolution: Insights from genomics. Curr Opin Neurobiol 2025; 92:103033. [PMID: 40334295 DOI: 10.1016/j.conb.2025.103033] [Received: 08/28/2024] [Revised: 02/13/2025] [Accepted: 04/07/2025] [Indexed: 05/09/2025]
Abstract
The human brain has undergone remarkable structural and functional specializations compared to that of nonhuman primates (NHPs), underlying the advanced cognitive abilities unique to humans. However, the cellular and genetic basis driving these specializations remains largely unknown. Comparing humans to our closest living relatives, chimpanzee and other great apes, is essential for identifying truly human-specific features. Recent comparative studies with closely related NHPs at the single-cell resolution using multimodal genomic profiling, assisted with high-throughput functional screening have provided unprecedented insights into human-specific brain features and their genetic underpinnings. In this review, we synthesize the current knowledge of human brain evolution at cellular and molecular levels, emphasizing how genetic changes have shaped these adaptations. We also discuss the emerging opportunities presented by new technologies and comprehensive atlases for advancing our understanding of human brain evolution.
Affiliation(s)
- Yuting Liu
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
- Mingli Li
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
- Ashlea Segal
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
  - Wu-Tsai Institute, Yale University, New Haven, CT 06520, USA
- Menglei Zhang
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
- Nenad Sestan
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
  - Wu-Tsai Institute, Yale University, New Haven, CT 06520, USA
  - Departments of Comparative Medicine, Genetics, and Psychiatry, Program in Cellular Neuroscience, Neurodegeneration and Repair, and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
7. Zwartkruis MM, de Pagter MS, Gommers D, Koopmans M, Ottenheim CPE, Kortooms JV, Albring M, Elferink MG, Wadman RI, Asselman FL, Cuppen I, van der Pol WL, Nelen MR, van Haaften GW, Groen EJN. A de novo deletion underlying spinal muscular atrophy: implications for carrier testing and genetic counseling. Hum Mol Genet 2025; 34:894-904. [PMID: 40094379 PMCID: PMC12056310 DOI: 10.1093/hmg/ddaf035] [Received: 01/30/2025] [Revised: 02/24/2025] [Accepted: 02/28/2025] [Indexed: 03/19/2025]
Abstract
Spinal muscular atrophy (SMA) is an autosomal recessive disease most commonly caused by homozygous deletion of the SMN1 gene. Parents of affected children are typically carriers, with a recurrence risk of 25% for future pregnancies. Their close relatives have up to 50% chance of being carriers. Carriers typically possess a single copy of the SMN1 gene; however, some parents carry two copies of SMN1. Current standard diagnostic carrier tests are unable to distinguish between silent carriers with two copies on one chromosome (2 + 0 genotype) and non-carriers (1 + 1 genotype), where a de novo deletion occurred. This distinction is crucial for recurrence risk assessment, which highlights the unsolved challenge to carrier testing and genetic counseling. We combined microsatellite marker analysis, SMN copy number analysis, Sanger sequencing, long-read sequencing and de novo assembly to investigate the cause of the absence of SMN1 in a pedigree with an SMA patient identified through newborn screening, whose parents each carried two SMN1 copies. Our analysis revealed that the father is a silent carrier, while de novo assembly of the SMN locus showed a 1.4 megabase (Mb) de novo deletion between mother and child. This deletion encompasses SMN1 and SMN2 and represents the first reported nucleotide-level resolved SMA-causing deletion to date. Our findings allowed informed counseling of at-risk relatives and illustrate the complexity of SMA carrier testing and counseling. This case underscores the feasibility of and need for advanced genetic testing for SMA carriership in select cases, to improve genetic counseling practices, risk assessment, and family planning.
Affiliation(s)
- Maria M Zwartkruis
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Mirjam S de Pagter
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Demi Gommers
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Marije Koopmans
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Cecile P E Ottenheim
  - Department of Human Genetics, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, Amsterdam 1105 AZ, the Netherlands
- Joris V Kortooms
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Mirjan Albring
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Martin G Elferink
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Renske I Wadman
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Fay-Lynn Asselman
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Inge Cuppen
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- W Ludo van der Pol
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Marcel R Nelen
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Gijs W van Haaften
  - Department of Genetics, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
- Ewout J N Groen
  - Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Heidelberglaan 100, Utrecht 3584 CX, the Netherlands
8. Gozashti L, Harringmeyer OS, Hoekstra HE. How repeats rearrange chromosomes: The molecular basis of chromosomal inversions in deer mice. Cell Rep 2025; 44:115644. [PMID: 40327505 DOI: 10.1016/j.celrep.2025.115644] [Received: 07/09/2024] [Revised: 01/08/2025] [Accepted: 04/11/2025] [Indexed: 05/08/2025]
Abstract
Large genomic rearrangements, such as chromosomal inversions, can play a key role in evolution, but the mechanisms by which these rearrangements arise remain poorly understood. To study the origins of inversions, we generated chromosome-level de novo genome assemblies for four subspecies of the deer mouse (Peromyscus maniculatus) with known inversion polymorphisms. We identified ∼8,000 inversions, including 47 megabase-scale inversions, that together affect ∼30% of the genome. Analysis of inversion breakpoints suggests that while most small (<1 Mb) inversions arose via ectopic recombination between retrotransposons, large (>1 Mb) inversions are primarily associated with segmental duplications (SDs). Large inversion breakpoints frequently occur near centromeres, which may be explained by an accumulation of retrotransposons in pericentromeric regions driving SDs. Additionally, multiple large inversions likely arose from ectopic recombination between near-identical centromeric satellite arrays located megabases apart, suggesting that centromeric repeats may also facilitate inversions. Together, our results illuminate how repeats give rise to massive shifts in chromosome architecture.
Affiliation(s)
- Landen Gozashti
  - Department of Organismic & Evolutionary Biology, Department of Molecular & Cellular Biology, Museum of Comparative Zoology and Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
- Olivia S Harringmeyer
  - Department of Organismic & Evolutionary Biology, Department of Molecular & Cellular Biology, Museum of Comparative Zoology and Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
- Hopi E Hoekstra
  - Department of Organismic & Evolutionary Biology, Department of Molecular & Cellular Biology, Museum of Comparative Zoology and Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
9. Denizli I, Monteiro A, Elmer KR, Stevenson TJ. Photoperiod-driven testicular DNA methylation in gonadotropin and sex steroid receptor promoters in Siberian hamsters. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2025; 211:327-337. [PMID: 39954063 DOI: 10.1007/s00359-025-01733-w] [Received: 11/04/2024] [Revised: 01/28/2025] [Accepted: 01/30/2025] [Indexed: 02/17/2025]
Abstract
Seasonal cycles in breeding, often orchestrated by annual changes in photoperiod, are common in nature. Here, we studied how change in photoperiod affects DNA methylation in the testes of a highly seasonal breeder: the Siberian hamster (Phodopus sungorus). We hypothesized that DNA methylation in promoter regions associated with key reproductive genes such as follicle-stimulating hormone receptor in the testes is linked to breeding and non-breeding states. Using Oxford Nanopore sequencing, we identified more than 10 million (10,151,742) differentially methylated cytosine-guanine (CpG) sites in the genome between breeding long photoperiod and non-breeding short photoperiod conditions. ShinyGo enrichment analyses identified biological pathways consisting of reproductive system, hormone-mediated signalling and gonad development. We found that short photoperiod induced DNA methylation in the promoter regions for androgen receptor (Ar), estrogen receptors (Esr1, Esr2), kisspeptin1 receptor (kiss1r) and follicle-stimulating hormone receptor (Fshr). Long photoperiods were observed to have higher DNA methylation in promoters for basic helix-loop-helix ARNT-like 1 (Bmal1), progesterone receptor (Pgr) and thyroid-stimulating hormone receptor (Tshr). Our findings provide insights into the epigenetic mechanisms underlying seasonal adaptations in timing reproduction in Siberian hamsters and could be informative for understanding male fertility and reproductive disorders in mammals.
Affiliation(s)
- Irem Denizli
  - School of Biodiversity, One Health and Veterinary Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, G12 8QQ, UK
- Ana Monteiro
  - School of Biodiversity, One Health and Veterinary Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, G12 8QQ, UK
- Kathryn R Elmer
  - School of Biodiversity, One Health and Veterinary Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, G12 8QQ, UK
- Tyler J Stevenson
  - School of Biodiversity, One Health and Veterinary Medicine, College of Medical, Veterinary & Life Sciences, University of Glasgow, Glasgow, G12 8QQ, UK
10. Yoo D, Rhie A, Hebbar P, Antonacci F, Logsdon GA, Solar SJ, Antipov D, Pickett BD, Safonova Y, Montinaro F, Luo Y, Malukiewicz J, Storer JM, Lin J, Sequeira AN, Mangan RJ, Hickey G, Monfort Anez G, Balachandran P, Bankevich A, Beck CR, Biddanda A, Borchers M, Bouffard GG, Brannan E, Brooks SY, Carbone L, Carrel L, Chan AP, Crawford J, Diekhans M, Engelbrecht E, Feschotte C, Formenti G, Garcia GH, de Gennaro L, Gilbert D, Green RE, Guarracino A, Gupta I, Haddad D, Han J, Harris RS, Hartley GA, Harvey WT, Hiller M, Hoekzema K, Houck ML, Jeong H, Kamali K, Kellis M, Kille B, Lee C, Lee Y, Lees W, Lewis AP, Li Q, Loftus M, Loh YHE, Loucks H, Ma J, Mao Y, Martinez JFI, Masterson P, McCoy RC, McGrath B, McKinney S, Meyer BS, Miga KH, Mohanty SK, Munson KM, Pal K, Pennell M, Pevzner PA, Porubsky D, Potapova T, Ringeling FR, Rocha JL, Ryder OA, Sacco S, Saha S, Sasaki T, Schatz MC, Schork NJ, Shanks C, Smeds L, Son DR, Steiner C, Sweeten AP, Tassia MG, Thibaud-Nissen F, Torres-González E, Trivedi M, Wei W, Wertz J, Yang M, Zhang P, Zhang S, Zhang Y, Zhang Z, Zhao SA, Zhu Y, Jarvis ED, Gerton JL, Rivas-González I, Paten B, Szpiech ZA, Huber CD, Lenz TL, Konkel MK, Yi SV, Canzar S, Watson CT, Sudmant PH, Molloy E, Garrison E, Lowe CB, Ventura M, O'Neill RJ, Koren S, Makova KD, Phillippy AM, Eichler EE. Complete sequencing of ape genomes. Nature 2025; 641:401-418. [PMID: 40205052 PMCID: PMC12058530 DOI: 10.1038/s41586-025-08816-3] [Received: 07/31/2024] [Accepted: 02/19/2025] [Indexed: 04/11/2025]
Abstract
The most dynamic and repetitive regions of great ape genomes have traditionally been excluded from comparative studies [1-3]. Consequently, our understanding of the evolution of our species is incomplete. Here we present haplotype-resolved reference genomes and comparative analyses of six ape species: chimpanzee, bonobo, gorilla, Bornean orangutan, Sumatran orangutan and siamang. We achieve chromosome-level contiguity with substantial sequence accuracy (<1 error in 2.7 megabases) and completely sequence 215 gapless chromosomes telomere-to-telomere. We resolve challenging regions, such as the major histocompatibility complex and immunoglobulin loci, to provide in-depth evolutionary insights. Comparative analyses enabled investigations of the evolution and diversity of regions previously uncharacterized or incompletely studied without bias from mapping to the human reference genome. Such regions include newly minted gene families in lineage-specific segmental duplications, centromeric DNA, acrocentric chromosomes and subterminal heterochromatin. This resource serves as a comprehensive baseline for future evolutionary studies of humans and our closest living ape relatives.
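As a side note on the accuracy figure, an error rate can be restated on the Phred scale often used for assembly quality (QV = -10 × log10 of the per-base error probability). The sketch below is not from the paper: the function name `phred_qv` is ours, and it simply treats the abstract's bound of fewer than 1 error in 2.7 megabases as an error rate, which works out to a QV a little above 64.

```python
import math


def phred_qv(errors: float, bases: float) -> float:
    """Phred-scaled quality value: QV = -10 * log10(errors / bases)."""
    return -10.0 * math.log10(errors / bases)


# Bound quoted in the abstract: fewer than 1 error per 2.7 megabases.
# Because the true error rate is below this bound, the QV is at least this value.
qv_floor = phred_qv(errors=1, bases=2.7e6)
print(f"QV is at least {qv_floor:.1f}")
```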
Affiliation(s)
- DongAhn Yoo
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Arang Rhie
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Prajna Hebbar
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Francesca Antonacci
  - Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Steven J Solar
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Dmitry Antipov
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Brandon D Pickett
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Yana Safonova
  - Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA, USA
- Francesco Montinaro
  - Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy
  - Institute of Genomics, University of Tartu, Tartu, Estonia
- Yanting Luo
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
- Joanna Malukiewicz
  - Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
  - German Primate Center, Primate Genetics Laboratory, Goettingen, Germany
- Jessica M Storer
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Jiadong Lin
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Riley J Mangan
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Genetics Training Program, Harvard Medical School, Boston, MA, USA
- Glenn Hickey
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Anton Bankevich
  - Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA, USA
- Christine R Beck
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA
- Arjun Biddanda
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Gerard G Bouffard
  - NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Emry Brannan
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Shelise Y Brooks
  - NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Lucia Carbone
  - Department of Medicine, KCVI, Oregon Health Sciences University, Portland, OR, USA
  - Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA
- Laura Carrel
  - PSU Medical School, Penn State University School of Medicine, Hershey, PA, USA
- Agnes P Chan
  - The Translational Genomics Research Institute, City of Hope National Medical Center, Phoenix, AZ, USA
- Juyun Crawford
  - NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mark Diekhans
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Eric Engelbrecht
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
- Cedric Feschotte
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Giulio Formenti
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Gage H Garcia
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Luciana de Gennaro
  - Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy
- David Gilbert
  - San Diego Biomedical Research Institute, San Diego, CA, USA
- Richard E Green
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Ishaan Gupta
  - Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA
- Diana Haddad
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Junmin Han
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Robert S Harris
  - Department of Biology, Penn State University, University Park, PA, USA
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Michael Hiller
  - LOEWE Centre for Translational Biodiversity Genomics, Frankfurt, Germany
  - Senckenberg Research Institute, Frankfurt, Germany
  - Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University Frankfurt, Frankfurt, Germany
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Hyeonsoo Jeong
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kaivan Kamali
  - Department of Biology, Penn State University, University Park, PA, USA
- Manolis Kellis
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
  - The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bryce Kille
  - Department of Computer Science, Rice University, Houston, TX, USA
- Chul Lee
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
- Youngho Lee
  - Laboratory of Bioinformatics and Population Genetics, Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- William Lees
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
  - Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Qiuhui Li
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Mark Loftus
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
- Yong Hwee Eddie Loh
  - Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
- Hailey Loucks
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Jian Ma
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Yafei Mao
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
  - Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
  - Shanghai Jiao Tong University Chongqing Research Institute, Chongqing, China
- Juan F I Martinez
  - Computer Science and Engineering Department, Huck Institutes of Life Sciences, Pennsylvania State University, State College, PA, USA
- Patrick Masterson
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Rajiv C McCoy
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Barbara McGrath
  - Department of Biology, Penn State University, University Park, PA, USA
- Sean McKinney
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Britta S Meyer
  - Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
- Karen H Miga
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Saswat K Mohanty
  - Department of Biology, Penn State University, University Park, PA, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Karol Pal
  - Department of Biology, Penn State University, University Park, PA, USA
- Matt Pennell
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Pavel A Pevzner
  - Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Tamara Potapova
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Francisca R Ringeling
  - Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
- Joana L Rocha
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
- Samuel Sacco
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Swati Saha
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
- Takayo Sasaki
  - San Diego Biomedical Research Institute, San Diego, CA, USA
- Michael C Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Nicholas J Schork
  - The Translational Genomics Research Institute, City of Hope National Medical Center, Phoenix, AZ, USA
- Cole Shanks
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Linnéa Smeds
  - Department of Biology, Penn State University, University Park, PA, USA
- Dongmin R Son
  - Department of Ecology, Evolution and Marine Biology, Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
- Alexander P Sweeten
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Michael G Tassia
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Mihir Trivedi
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Wenjie Wei
  - School of Life Sciences, Westlake University, Hangzhou, China
  - National Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Julie Wertz
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Muyu Yang
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Panpan Zhang
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Shilong Zhang
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Yang Zhang
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Zhenmiao Zhang
  - Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA
- Sarah A Zhao
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Yixin Zhu
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Erich D Jarvis
  - Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Iker Rivas-González
  - Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Zachary A Szpiech
  - Department of Biology, Penn State University, University Park, PA, USA
- Christian D Huber
  - Department of Biology, Penn State University, University Park, PA, USA
- Tobias L Lenz
  - Research Unit for Evolutionary Immunogenomics, Department of Biology, University of Hamburg, Hamburg, Germany
- Miriam K Konkel
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
- Soojin V Yi
  - Department of Ecology, Evolution and Marine Biology, Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
  - Department of Molecular, Cellular and Developmental Biology, Neuroscience Research Institute, University of California, Santa Barbara, Santa Barbara, CA, USA
- Stefan Canzar
  - Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
- Corey T Watson
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Louisville, Louisville, KY, USA
- Peter H Sudmant
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
  - Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Erin Molloy
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Craig B Lowe
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
- Mario Ventura
  - Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy
- Rachel J O'Neill
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
  - Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, USA
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Sergey Koren
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Kateryna D Makova
  - Department of Biology, Penn State University, University Park, PA, USA
- Adam M Phillippy
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
11
Porubsky D, Dashnow H, Sasani TA, Logsdon GA, Hallast P, Noyes MD, Kronenberg ZN, Mokveld T, Koundinya N, Nolan C, Steely CJ, Guarracino A, Dolzhenko E, Harvey WT, Rowell WJ, Grigorev K, Nicholas TJ, Goldberg ME, Oshima KK, Lin J, Ebert P, Watkins WS, Leung TY, Hanlon VCT, McGee S, Pedersen BS, Happ HC, Jeong H, Munson KM, Hoekzema K, Chan DD, Wang Y, Knuth J, Garcia GH, Fanslow C, Lambert C, Lee C, Smith JD, Levy S, Mason CE, Garrison E, Lansdorp PM, Neklason DW, Jorde LB, Quinlan AR, Eberle MA, Eichler EE. Human de novo mutation rates from a four-generation pedigree reference. Nature 2025:10.1038/s41586-025-08922-2. [PMID: 40269156 DOI: 10.1038/s41586-025-08922-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2024] [Accepted: 03/20/2025] [Indexed: 04/25/2025]
Abstract
Understanding the human de novo mutation (DNM) rate requires complete sequence information [1]. Here using five complementary short-read and long-read sequencing technologies, we phased and assembled more than 95% of each diploid human genome in a four-generation, twenty-eight-member family (CEPH 1463). We estimate 98-206 DNMs per transmission, including 74.5 de novo single-nucleotide variants, 7.4 non-tandem repeat indels, 65.3 de novo indels or structural variants originating from tandem repeats, and 4.4 centromeric DNMs. Among male individuals, we find 12.4 de novo Y chromosome events per generation. Short tandem repeats and variable-number tandem repeats are the most mutable, with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 16% of de novo single-nucleotide variants are postzygotic in origin with no paternal bias, including early germline mosaic mutations. We place all this variation in the context of a high-resolution recombination map (~3.4 kb breakpoint resolution) and find no correlation between meiotic crossover and de novo structural variants. These near-telomere-to-telomere familial genomes provide a truth set to understand the most fundamental processes underlying human genetic variation.
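The per-class figures quoted in the abstract can be cross-checked against the overall range with a few lines of arithmetic. This is only a reader's sanity check, not the authors' analysis: the dictionary labels are ours, and the male-only Y chromosome events are left out on the assumption that they do not apply to every transmission.

```python
# Per-transmission DNM counts by class, as quoted in the abstract.
# Labels are informal shorthand chosen for this sketch.
dnm_components = {
    "de novo single-nucleotide variants": 74.5,
    "non-tandem-repeat indels": 7.4,
    "tandem-repeat indels or structural variants": 65.3,
    "centromeric DNMs": 4.4,
}

# The 12.4 Y chromosome events per generation are excluded here because
# they are reported for male individuals only (an assumption of this sketch).
total = sum(dnm_components.values())

# Overall range reported in the abstract: 98-206 DNMs per transmission.
low, high = 98, 206
within_range = low <= total <= high

print(f"Sum of listed classes: {total:.1f} (within {low}-{high}: {within_range})")
```

The listed classes sum to about 151.6, which falls inside the reported 98-206 range.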
Affiliation(s)
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Harriet Dashnow
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Thomas A Sasani
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Pille Hallast
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Michelle D Noyes
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Nidhi Koundinya
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Cody J Steely
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
- Andrea Guarracino
  - Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kirill Grigorev
  - Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
  - Blue Marble Space Institute of Science, Seattle, WA, USA
- Thomas J Nicholas
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Michael E Goldberg
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Keisuke K Oshima
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jiadong Lin
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Peter Ebert
  - Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
  - Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- W Scott Watkins
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Tiffany Y Leung
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Vincent C T Hanlon
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Sean McGee
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Brent S Pedersen
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Hannah C Happ
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Hyeonsoo Jeong
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Altos Labs, San Diego, CA, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Daniel D Chan
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Yanni Wang
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Jordan Knuth
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Gage H Garcia
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Joshua D Smith
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
  - The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
- Erik Garrison
  - Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Peter M Lansdorp
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Deborah W Neklason
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Lynn B Jorde
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Aaron R Quinlan
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
12
Gao Y, Yang L, Kuhn K, Li W, Zanton G, Bowman M, Zhao P, Zhou Y, Fang L, Cole JB, Rosen BD, Ma L, Li C, Baldwin RL, Van Tassell CP, Zhang Z, Smith TPL, Liu GE. Long read and preliminary pangenome analyses reveal breed-specific structural variations and novel sequences in Holstein and Jersey cattle. J Adv Res 2025:S2090-1232(25)00258-9. [PMID: 40258473 DOI: 10.1016/j.jare.2025.04.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2024] [Revised: 04/06/2025] [Accepted: 04/10/2025] [Indexed: 04/23/2025] Open
Abstract
INTRODUCTION: Most structural variation (SV) studies in livestock rely on short-read sequencing, whose limited read length makes it difficult to characterize large genomic variants accurately.
OBJECTIVES: Our goal is to reveal structural variation and novel sequences specific to Holstein and Jersey cattle breeds using long-read and pangenome analyses.
METHODS: We sequenced 20 Holstein and 8 Jersey cattle using PacBio HiFi to 20× coverage, and integrated five read-based SV callers and one assembly-based SV caller to determine SVs.
RESULTS: We assembled the 28 genomes, averaging 3.25 Gb with a contig N50 of 69.36 Mb. Using the ARS-UCD1.2 reference, we acquired Holstein/Jersey SV catalogs with 74,068/54,689 events spanning 202/135 Mb (7.43%/4.97% of the genome). SVs were enriched in less conserved, non-coding, and non-regulatory regions. Comparing Holsteins with differing feed efficiency (FE), SVs unique to high FE were linked to energy metabolism and olfactory receptors, while those specific to low FE were associated with material transport. We constructed Holstein/Jersey pangenome graphs with 148,598/105,875 nodes and 208,891/147,990 edges, representing 47,028/37,137 biallelic and multi-allelic events, and 63.75/42.34 Mb of novel sequence. We observed SV count saturation with 20 Holsteins, while adding Jerseys significantly increased the SV count, highlighting breed-specific SV events.
CONCLUSION: Our long-read data and SV catalogs are valuable resources, revealing that the cattle genome is more complex than previously thought.
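The paired span and percentage figures imply a genome size, which gives a quick consistency check on the abstract's numbers. The sketch below is a reader's back-of-envelope calculation, not part of the study; the helper name `implied_genome_mb` is ours. Both breeds yield roughly 2.7 Gb, suggesting the percentages are expressed against a reference of about that size rather than against the 3.25 Gb average assembly size.

```python
def implied_genome_mb(span_mb: float, percent_of_genome: float) -> float:
    """Genome size (Mb) implied by an SV span and its share of the genome."""
    return span_mb / (percent_of_genome / 100.0)


# Figures quoted in the abstract: SV span in Mb and percentage of the genome.
holstein_mb = implied_genome_mb(span_mb=202, percent_of_genome=7.43)
jersey_mb = implied_genome_mb(span_mb=135, percent_of_genome=4.97)

print(f"Holstein implies ~{holstein_mb:.0f} Mb; Jersey implies ~{jersey_mb:.0f} Mb")
```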
Affiliation(s)
- Yahui Gao
  - State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China; Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
- Liu Yang
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
- Kristen Kuhn
  - USDA, ARS, U.S. Meat Animal Research Center (USMARC), Clay Center, NE, USA.
- Wenli Li
  - US Dairy Forage Research Center, USDA-ARS, Madison, WI, USA.
- Geoffrey Zanton
  - US Dairy Forage Research Center, USDA-ARS, Madison, WI, USA.
- Mary Bowman
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
- Pengju Zhao
  - Hainan Institute, Zhejiang University, Yongyou Industry Park, Yazhou Bay Sci-Tech City, Sanya 572000, China.
- Yang Zhou
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China.
- Lingzhao Fang
  - Quantitative Genetics and Genomics (QGG), Aarhus University, Aarhus, Denmark.
- John B Cole
  - Council on Dairy Cattle Breeding, 4201 Northview Dr, Bowie, MD 20716, USA; Department of Animal Sciences, Donald Henry Barron Reproductive and Perinatal Biology Research Program, and the Genetics Institute, University of Florida, Gainesville, FL 32611-0910, USA; Department of Animal Science, North Carolina State University, Raleigh, NC 27695-7621, USA.
- Benjamin D Rosen
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
- Li Ma
  - Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
- Congjun Li
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
- Ransom L Baldwin
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
- Curtis P Van Tassell
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
- Zhe Zhang
  - State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China.
- Timothy P L Smith
  - USDA, ARS, U.S. Meat Animal Research Center (USMARC), Clay Center, NE, USA.
- George E Liu
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
13
Ma W, Chaisson M. Genotyping sequence-resolved copy number variation using pangenomes reveals paralog-specific global diversity and expression divergence of duplicated genes. bioRxiv 2025:2024.08.11.607269. [PMID: 39149335 PMCID: PMC11326217 DOI: 10.1101/2024.08.11.607269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Copy number variant (CNV) genes are important in evolution and disease, yet sequence variation in CNV genes remains a blind spot in large-scale studies. We present ctyper, a method that leverages pangenomes to produce allele-specific copy numbers with locally phased variants from next-generation sequencing (NGS) reads. Benchmarking on 3,351 CNV genes, including HLA, SMN, and CYP2D6, and 212 challenging medically relevant (CMR) genes that are poorly mapped by NGS, ctyper captures 96.5% of phased variants with ≥99.1% correctness of copy number on CNV genes and 94.8% of phased variants on CMR genes. Applying alignment-free algorithms, ctyper requires 1.5 hours per genome on a single CPU. The results improve prediction of gene expression compared to known expression quantitative trait loci (eQTL) variants. Allele-specific expression quantified divergent expression on 7.94% of paralogs and tissue-specific biases on 4.68% of paralogs. We found reduced expression of SMN-2 due to SMN1 conversion, potentially affecting spinal muscular atrophy, and increased expression of translocated duplications of AMY2B. Overall, ctyper enables biobank-scale genotyping of CNV and CMR genes.
|
14
|
Fernandez-Luna L, Aguilar-Perez C, Grochowski CM, Mehaffey MG, Carvalho CMB, Gonzaga-Jauregui C. Genome-wide maps of highly-similar intrachromosomal repeats that can mediate ectopic recombination in three human genome assemblies. HGG Advances 2025; 6:100396. [PMID: 39722459 PMCID: PMC11794170 DOI: 10.1016/j.xhgg.2024.100396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 12/23/2024] [Accepted: 12/23/2024] [Indexed: 12/28/2024] Open
Abstract
Repeated sequences spread throughout the genome play important roles in shaping the structure of chromosomes and facilitating the generation of new genomic variation through structural rearrangements. Several mechanisms of structural variation formation use shared nucleotide similarity between repeated sequences as substrate for ectopic recombination. We performed genome-wide analyses of direct and inverted intrachromosomal repeated sequence pairs with 200 bp or more and 80% or greater sequence identity in three human genome assemblies, GRCh37, GRCh38, and T2T-CHM13. Overall, the composition and distribution of direct and inverted repeated sequences identified was similar among the three assemblies involving 13%-15% of the haploid genome, with an increased, albeit not significant, number of repeated sequences in T2T-CHM13. Interestingly, the majority of repeated sequences are below 1 kb in length with a median of 84.2% identity, highlighting the potential relevance of smaller, less identical repeats, such as Alu-Alu pairs, for ectopic recombination. We cross-referenced the identified repeated sequences with protein-coding genes to identify those at risk for being involved in genomic rearrangements. Olfactory receptors and immune response genes were enriched among those impacted.
Affiliation(s)
- Luis Fernandez-Luna
- International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
| | - Carlos Aguilar-Perez
- International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
| | | | | | | | - Claudia Gonzaga-Jauregui
- International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México; Pacific Northwest Research Institute, Seattle, WA, USA.
| |
|
15
|
Hartley GA, Okhovat M, Hoyt SJ, Fuller E, Pauloski N, Alexandre N, Alexandrov I, Drennan R, Dubocanin D, Gilbert DM, Mao Y, McCann C, Neph S, Ryabov F, Sasaki T, Storer JM, Svendsen D, Troy W, Wells J, Core L, Stergachis A, Carbone L, O'Neill RJ. Centromeric transposable elements and epigenetic status drive karyotypic variation in the eastern hoolock gibbon. Cell Genomics 2025; 5:100808. [PMID: 40088887 PMCID: PMC12008813 DOI: 10.1016/j.xgen.2025.100808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Revised: 12/10/2024] [Accepted: 02/12/2025] [Indexed: 03/17/2025]
Abstract
Great apes have maintained a stable karyotype with few large-scale rearrangements; in contrast, gibbons have undergone a high rate of chromosomal rearrangements coincident with rapid centromere turnover. Here, we characterize fully assembled centromeres in the eastern hoolock gibbon, Hoolock leuconedys (HLE), finding a diverse group of transposable elements (TEs) that differ from the canonical alpha-satellites found across centromeres of other apes. We find that HLE centromeres contain a CpG methylation centromere dip region, providing evidence that this epigenetic feature is conserved in the absence of satellite arrays. We uncovered a variety of atypical centromeric features, including protein-coding genes and mismatched replication timing. Further, we identify duplications and deletions in HLE centromeres that distinguish them from other gibbons. Finally, we observed differentially methylated TEs, topologically associated domain boundaries, and segmental duplications at chromosomal breakpoints, and thus propose that a combination of multiple genomic attributes with propensities for chromosome instability shaped gibbon centromere evolution.
Affiliation(s)
- Gabrielle A Hartley
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Mariam Okhovat
- Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA
| | - Savannah J Hoyt
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Emily Fuller
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Nicole Pauloski
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Nicolas Alexandre
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Ivan Alexandrov
- Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Ryan Drennan
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Danilo Dubocanin
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
| | - David M Gilbert
- San Diego Biomedical Research Institute, San Diego, CA 92121, USA
| | - Yizi Mao
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Christine McCann
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Shane Neph
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Fedor Ryabov
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Takayo Sasaki
- San Diego Biomedical Research Institute, San Diego, CA 92121, USA
| | - Jessica M Storer
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Derek Svendsen
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | | | - Jackson Wells
- Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA
| | - Leighton Core
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Andrew Stergachis
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Lucia Carbone
- Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA; Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, USA; Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA; Division of Genetics, Oregon National Primate Research Center, Portland, OR, USA
| | - Rachel J O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA; Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA; Department of Genetics and Genome Sciences, UConn Health, Farmington, CT, USA.
| |
|
16
|
Rodrigues PS, Burssed B, Bellucco F, Rosolen DCB, Kim CA, Melaragno MI. Cytogenomic characterization of karyotypes with additional autosomal material. Sci Rep 2025; 15:12191. [PMID: 40204846 PMCID: PMC11982272 DOI: 10.1038/s41598-025-97077-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2024] [Accepted: 04/02/2025] [Indexed: 04/11/2025] Open
Abstract
Chromosomal rearrangements involving additional material in individuals with phenotypic alterations usually result in partial trisomy, often accompanied by partial monosomy. This study aimed to characterize chromosomal rearrangements and analyze genomic characteristics in the breakpoint regions in 31 patients with additional material on an autosomal chromosome. Different tests were performed to characterize these patients, including karyotyping, chromosomal microarray analysis (CMA), and fluorescent in situ hybridization (FISH). In silico analyses evaluated A/B chromosomal compartments, segmental duplications, and repetitive elements at breakpoints. The 31 rearrangements resulted in 47 copy number variations (CNVs), and a range of structural aberrations was identified, including six tandem duplications, 19 derivative chromosomes, two intrachromosomal rearrangements, one recombinant, two dicentric chromosomes, and one triplication. A deleted segment was associated with the duplication in 16 of the 19 patients with derivative chromosomes from translocation. Among the trios whose chromosome rearrangement origin could be investigated, 54.5% were de novo, 31.9% were maternally inherited, and 13.6% were paternally inherited from balanced translocations or inversion. Breakpoint analysis revealed that 22 were in the A compartment (euchromatin), 25 were in the B compartment (heterochromatin), and five were in an undefined compartment. Additionally, 14 patients had breakpoints in regions of segmental duplications and repeat elements. Our study found that a deletion accompanied by additional genetic material was present in 51.6% of the patients, uncovering the underlying genetic imbalances. Statistical analyses revealed a positive correlation between chromosome size and the occurrence of CNVs in the rearrangements. Furthermore, no preference was observed for breakpoints occurring in compartments A and B, repetitive elements, or segmental duplications.
Affiliation(s)
| | - Bruna Burssed
- Genetics Division, Universidade Federal de São Paulo, São Paulo, Brazil
| | - Fernanda Bellucco
- Genetics Division, Universidade Federal de São Paulo, São Paulo, Brazil
| | | | - Chong Ae Kim
- Genetics Unit, Instituto da Criança, Universidade de São Paulo, São Paulo, Brazil
| | - Maria Isabel Melaragno
- Genetics Division, Universidade Federal de São Paulo, São Paulo, Brazil.
- Genetics Division, Department of Morphology and Genetics, Universidade Federal de São Paulo, Rua Botucatu, 740, São Paulo, CEP 04023-900, SP, Brazil.
| |
|
17
|
Takeda A, Nonaka D, Imazu Y, Fukunaga T, Hamada M. REPrise: de novo interspersed repeat detection using inexact seeding. Mob DNA 2025; 16:16. [PMID: 40181468 PMCID: PMC11966803 DOI: 10.1186/s13100-025-00353-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Accepted: 03/17/2025] [Indexed: 04/05/2025] Open
Abstract
BACKGROUND: Interspersed repeats occupy a large part of many eukaryotic genomes, and thus their accurate annotation is essential for various genome analyses. Database-free de novo repeat detection approaches are powerful for annotating genomes that lack well-curated repeat databases. However, existing tools do not yet have sufficient repeat detection performance. RESULTS: In this study, we developed REPrise, a de novo interspersed repeat detection software program based on a seed-and-extension method. Although the algorithm of REPrise is similar to that of RepeatScout, which is currently the de facto standard tool, we incorporated three unique techniques into REPrise: inexact seeding, affine gap scoring and loose masking. Analyses of rice and simulation genome datasets showed that REPrise outperformed RepeatScout in terms of sensitivity, especially when the repeat sequences contained many mutations. Furthermore, when applied to the complete human genome dataset T2T-CHM13, REPrise demonstrated the potential to detect novel repeat sequence families. CONCLUSION: REPrise can detect interspersed repeats with high sensitivity even in long genomes. Our software enhances repeat annotation in diverse genomic studies, contributing to a deeper understanding of genomic structures.
Affiliation(s)
- Atsushi Takeda
- Department of Electrical Engineering and Bioscience, Graduate School of Advanced Science and Engineering, Waseda University, Tokyo, 1698555, Japan
- Computational Bio Big-Data Open Innovation Laboratory, AIST-Waseda University, Tokyo, 1698555, Japan
| | - Daisuke Nonaka
- Department of Computer Science, Graduate School of Information Science and Technology, the University of Tokyo, Tokyo, 1130032, Japan
| | - Yuta Imazu
- Department of Electrical Engineering and Bioscience, School of Advanced Science and Engineering, Waseda University, Tokyo, 1698555, Japan
| | - Tsukasa Fukunaga
- Department of Computer Science, Graduate School of Information Science and Technology, the University of Tokyo, Tokyo, 1130032, Japan.
- Waseda Institute for Advanced Study, Waseda University, Tokyo, 1690051, Japan.
| | - Michiaki Hamada
- Department of Electrical Engineering and Bioscience, Graduate School of Advanced Science and Engineering, Waseda University, Tokyo, 1698555, Japan.
- Computational Bio Big-Data Open Innovation Laboratory, AIST-Waseda University, Tokyo, 1698555, Japan.
- Graduate School of Medicine, Nippon Medical School, Tokyo, 1138602, Japan.
| |
|
18
|
Zhang S, Xu N, Fu L, Yang X, Ma K, Li Y, Yang Z, Li Z, Feng Y, Jiang X, Han J, Hu R, Zhang L, Lian D, de Gennaro L, Paparella A, Ryabov F, Meng D, He Y, Wu D, Yang C, Mao Y, Bian X, Lu Y, Antonacci F, Ventura M, Shepelev VA, Miga KH, Alexandrov IA, Logsdon GA, Phillippy AM, Su B, Zhang G, Eichler EE, Lu Q, Shi Y, Sun Q, Mao Y. Integrated analysis of the complete sequence of a macaque genome. Nature 2025; 640:714-721. [PMID: 40011769 PMCID: PMC12003069 DOI: 10.1038/s41586-025-08596-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Accepted: 01/03/2025] [Indexed: 02/28/2025]
Abstract
The crab-eating macaques (Macaca fascicularis) and rhesus macaques (Macaca mulatta) are pivotal in biomedical and evolutionary research [1-3]. However, their genomic complexity and interspecies genetic differences remain unclear [4]. Here, we present a complete genome assembly of a crab-eating macaque, revealing 46% fewer segmental duplications and 3.83 times longer centromeres than those of humans [5,6]. We also characterize 93 large-scale genomic differences between macaques and humans at a single-base-pair resolution, highlighting their impact on gene regulation in primate evolution. Using ten long-read macaque genomes, hundreds of short-read macaque genomes and full-length transcriptome data, we identified roughly 2 Mbp of fixed-genetic variants, roughly 240 Mbp of complex loci, 16.76 Mbp genetic differentiation regions and 110 alternative splice events, potentially associated with various phenotypic differences between the two macaque species. In summary, the integrated genetic analysis enhances understanding of lineage-specific phenotypes, adaptation and primate evolution, thereby improving their biomedical applications in human disease research.
Affiliation(s)
- Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
| | - Ning Xu
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Lianting Fu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
| | - Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Kaiyue Ma
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yamei Li
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Zikun Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Zhengtong Li
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yu Feng
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
| | - Xinrui Jiang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Junmin Han
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Ruixing Hu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Lu Zhang
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Da Lian
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Luciana de Gennaro
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Annalisa Paparella
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
| | - Dan Meng
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yaoxi He
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
| | - Dongya Wu
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- School of Medicine, Zhejiang University, Hangzhou, China
| | - Chentao Yang
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Yuxiang Mao
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xinyan Bian
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Yong Lu
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Francesca Antonacci
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Mario Ventura
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Valery A Shepelev
- Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Karen H Miga
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Ivan A Alexandrov
- Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Glennis A Logsdon
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Adam M Phillippy
- Center for Genomics and Data Science Research, Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Bing Su
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Guojie Zhang
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- School of Medicine, Zhejiang University, Hangzhou, China
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Qing Lu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
| | - Qiang Sun
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China.
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China.
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
- University of Chinese Academy of Sciences, Beijing, China.
| | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China.
- Shanghai Key Laboratory of Embryo Original Diseases, International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
| |
|
19
|
Fu Y, Timp W, Sedlazeck FJ. Computational analysis of DNA methylation from long-read sequencing. Nat Rev Genet 2025:10.1038/s41576-025-00822-5. [PMID: 40155770 DOI: 10.1038/s41576-025-00822-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/30/2025] [Indexed: 04/01/2025]
Abstract
DNA methylation is a critical epigenetic mechanism in numerous biological processes, including gene regulation, development, ageing and the onset of various diseases such as cancer. Studies of methylation are increasingly using single-molecule long-read sequencing technologies to simultaneously measure epigenetic states such as DNA methylation with genomic variation. These long-read data sets have spurred the continuous development of advanced computational methods to gain insights into the roles of methylation in regulating chromatin structure and gene regulation. In this Review, we discuss the computational methods for calling methylation signals, contrasting methylation between samples, analysing cell-type diversity and gaining additional genomic insights, and then further discuss the challenges and future perspectives of tool development for DNA methylation research.
Affiliation(s)
- Yilei Fu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Winston Timp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Computer Science, Rice University, Houston, TX, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
| |
|
20
|
Zwartkruis MM, Elferink MG, Gommers D, Signoria I, Blasco-Pérez L, Costa-Roger M, van der Sel J, Renkens IJ, Green JW, Kortooms JV, Vermeulen C, Straver R, van Deutekom HWM, Veldink JH, Asselman F, Tizzano EF, Wadman RI, van der Pol WL, van Haaften GW, Groen EJN. Long-read sequencing identifies copy-specific markers of SMN gene conversion in spinal muscular atrophy. Genome Med 2025; 17:26. [PMID: 40119448 PMCID: PMC11927269 DOI: 10.1186/s13073-025-01448-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Accepted: 03/07/2025] [Indexed: 03/24/2025] Open
Abstract
BACKGROUND: The complex 2 Mb survival motor neuron (SMN) locus on chromosome 5q13, including the spinal muscular atrophy (SMA)-causing gene SMN1 and modifier SMN2, remains incompletely resolved due to numerous segmental duplications. Variation in SMN2 copy number, presumably influenced by SMN1 to SMN2 gene conversion, affects disease severity, though SMN2 copy number alone has insufficient prognostic value due to limited genotype-phenotype correlations. With advancements in newborn screening and SMN-targeted therapies, identifying genetic markers to predict disease progression and treatment response is crucial. Progress has thus far been limited by methodological constraints. METHODS: To address this, we developed HapSMA, a method to perform polyploid phasing of the SMN locus to enable copy-specific analysis of SMN and its surrounding genes. We used HapSMA on publicly available Oxford Nanopore Technologies (ONT) sequencing data of 29 healthy controls and performed long-read, targeted ONT sequencing of the SMN locus of 31 patients with SMA. RESULTS: In healthy controls, we identified single nucleotide variants (SNVs) specific to SMN1 and SMN2 haplotypes that could serve as gene conversion markers. Broad phasing including the NAIP gene allowed for a more complete view of SMN locus variation. Genetic variation in SMN2 haplotypes was larger in SMA patients. Forty-two percent of SMN2 haplotypes of SMA patients showed varying SMN1 to SMN2 gene conversion breakpoints, serving as direct evidence of gene conversion as a common genetic characteristic in SMA and highlighting the importance of inclusion of SMA patients when investigating the SMN locus. CONCLUSIONS: Our findings illustrate that both methodological advances and the analysis of patient samples are required to advance our understanding of complex genetic loci and address critical clinical challenges.
Affiliation(s)
- M M Zwartkruis
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
- Department of Genetics, University Medical Center Utrecht, Utrecht, the Netherlands
| | - M G Elferink
- Department of Genetics, University Medical Center Utrecht, Utrecht, the Netherlands
| | - D Gommers
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
- Department of Genetics, University Medical Center Utrecht, Utrecht, the Netherlands
| | - I Signoria
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - L Blasco-Pérez
- Medicine Genetics Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
- Department of Clinical and Molecular Genetics, Hospital Vall d'Hebron, Barcelona, Spain
| | - M Costa-Roger
- Medicine Genetics Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
- Department of Clinical and Molecular Genetics, Hospital Vall d'Hebron, Barcelona, Spain
| | - J van der Sel
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
- Department of Genetics, University Medical Center Utrecht, Utrecht, the Netherlands
| | - I J Renkens
- Department of Genetics, University Medical Center Utrecht, Utrecht, the Netherlands
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, the Netherlands
- Utrecht Sequencing Facility, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, the Netherlands
| | - J W Green
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - J V Kortooms
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - C Vermeulen
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
| | - R Straver
- Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
| | - H W M van Deutekom
- Department of Genetics, University Medical Center Utrecht, Utrecht, the Netherlands
| | - J H Veldink
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - F Asselman
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - E F Tizzano
- Medicine Genetics Group, Vall d'Hebron Research Institute (VHIR), Barcelona, Spain
- Department of Clinical and Molecular Genetics, Hospital Vall d'Hebron, Barcelona, Spain
| | - R I Wadman
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - W L van der Pol
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands
| | - G W van Haaften
- Department of Genetics, University Medical Center Utrecht, Utrecht, the Netherlands.
| | - E J N Groen
- Department of Neurology and Neurosurgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht, the Netherlands.
|
21
|
Balle CM, Lildballe DL, Bedei I, Luschka R, Skakkebæk A, Chang S, Agirman Z, Keller J, Weber A, Schäfer RE, Becker-Follmann J, Gravholt CH. Reliable detection of sex chromosome abnormalities by quantitative fluorescence polymerase chain reaction. Clin Chem Lab Med 2025:cclm-2024-1400. [PMID: 40103208 DOI: 10.1515/cclm-2024-1400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2024] [Accepted: 03/10/2025] [Indexed: 03/20/2025]
Abstract
OBJECTIVES Many patients with sex chromosome abnormalities (SCAs) are diagnosed late in life or remain undiagnosed, leading to delayed or inadequate medical intervention and care. This study aimed to develop a reliable, rapid and cost-effective test for identifying SCAs using a blood sample - an essential step toward establishing a neonatal screening program. METHODS A total of 360 blood samples (180 SCA patients, and 180 controls) were obtained from four cross-sectional studies of adult patients with SCAs and age-matched controls. Informed consent was collected, and all procedures followed the Declaration of Helsinki. Multiplex quantitative fluorescence polymerase chain reaction (QF-PCR) utilizing short tandem repeat (STR) and X-linked segmental duplication (SD) markers was performed. Results were analyzed using an automated algorithm. Deviant results were manually reviewed to differentiate errors in the PCR process from those in automated data analysis. RESULTS Following automated data analysis of QF-PCR results, the method accurately identified 174 SCA patients (sensitivity: 96.7 %) and 171 controls (specificity: 95.0 %). Mosaic karyotypes were particularly challenging to diagnose. Manual reanalysis of the QF-PCR results corrected all false positives, achieving 100 % specificity. CONCLUSIONS This method is promising for reliable SCA detection in blood samples, offering cost-effectiveness and scalability. The specificity following automated data analysis was not satisfactory. The underlying PCR technique, however, demonstrated 100 % specificity, indicating that refining the automated analysis algorithm would significantly reduce false positive results. With further refinements, we believe this test would be highly suitable for further evaluation in a newborn screening setting.
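The reported sensitivity and specificity follow directly from the counts given in the abstract (174 of 180 SCA patients and 171 of 180 controls classified correctly by the automated analysis). A minimal sketch of that arithmetic, in which the helper name `rate` is only illustrative and not part of the study:

```python
def rate(correct: int, total: int) -> float:
    """Return the percentage of correctly classified samples, rounded to 0.1."""
    return round(100 * correct / total, 1)

# Counts taken from the abstract: 180 SCA patients and 180 controls.
sensitivity = rate(174, 180)   # SCA patients correctly identified
specificity = rate(171, 180)   # controls correctly identified
false_positives = 180 - 171    # controls flagged by the automated analysis

print(sensitivity, specificity, false_positives)
```

The nine false positives are the calls that, according to the abstract, manual reanalysis corrected to reach 100 % specificity.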
Affiliation(s)
- Camilla Mains Balle
- Department of Endocrinology, Aarhus University Hospital, Aarhus N, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus C, Denmark
| | - Dorte L Lildballe
- Department of Clinical Medicine, Aarhus University, Aarhus C, Denmark
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus N, Denmark
| | - Ivonne Bedei
- Department of Obstetrics and Gynecology, Division of Prenatal Medicine and Fetal Therapy, Justus Liebig University Giessen, Giessen, Germany
| | - Ruth Luschka
- Department of Pediatric Surgery and Urology, Children and Youth Hospital Auf Der Bult, Hannover, Germany
| | - Anne Skakkebæk
- Department of Clinical Medicine, Aarhus University, Aarhus C, Denmark
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus N, Denmark
- Department of Clinical Genetics, Aarhus University Hospital, Aarhus N, Denmark
| | - Simon Chang
- Department of Endocrinology, Aarhus University Hospital, Aarhus N, Denmark
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus N, Denmark
| | | | | | - Axel Weber
- Center for Human Genetics, University of Marburg, Marburg, Germany
- Medizinische Genetik Mainz, Limbach Genetics, Mainz, Germany
| | | | | | - Claus H Gravholt
- Department of Endocrinology, Aarhus University Hospital, Aarhus N, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus C, Denmark
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus N, Denmark
|
22
|
Real TD, Hebbar P, Yoo D, Antonacci F, Pačar I, Diekhans M, Mikol GJ, Popoola OG, Mallory BJ, Vollger MR, Dishuck PC, Guitart X, Rozanski AN, Munson KM, Hoekzema K, Ranchalis JE, Neph SJ, Sedeño-Cortes AE, Paten B, Salama SR, Stergachis AB, Eichler EE. Genetic diversity and regulatory features of human-specific NOTCH2NL duplications. bioRxiv 2025:2025.03.14.643395. [PMID: 40166283 PMCID: PMC11956922 DOI: 10.1101/2025.03.14.643395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
NOTCH2NL (NOTCH2-N-terminus-like) genes arose from incomplete, recent chromosome 1 segmental duplications implicated in human brain cortical expansion. Genetic characterization of these loci and their regulation is complicated by the fact they are embedded in large, nearly identical duplications that predispose to recurrent microdeletion syndromes. Using nearly complete long-read assemblies generated from 67 human and 12 ape haploid genomes, we show independent recurrent duplication among apes with functional copies emerging in humans ~2.1 million years ago. We distinguish NOTCH2NL paralogs present in every human haplotype (NOTCH2NLA) from copy number variable ones. We also characterize large-scale structural variation, including gene conversion, for 28% of haplotypes leading to a previously undescribed paralog, NOTCH2tv. Finally, we apply Fiber-seq and long-read transcript sequencing to human cortical neurospheres to characterize the regulatory landscape and find that the most fixed paralogs, NOTCH2 and NOTCH2NLA, harbor the greatest number of paralog-specific elements potentially driving their regulation.
Affiliation(s)
- Taylor D. Real
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Prajna Hebbar
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Francesca Antonacci
- Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, 70125, Italy
| | - Ivana Pačar
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
| | - Gregory J. Mikol
- College of Natural & Agricultural Sciences, University of California, Riverside, Riverside, CA 92521, USA
| | - Oyeronke G. Popoola
- Department of Psychology and Neuroscience, University of North Carolina, Chapel Hill, Chapel Hill, NC 27514, USA
| | - Benjamin J. Mallory
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Mitchell R. Vollger
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Philip C. Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Xavi Guitart
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Allison N. Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Katherine M. Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Jane E. Ranchalis
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Shane J. Neph
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Adriana E. Sedeño-Cortes
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Benedict Paten
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
| | - Sofie R. Salama
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Andrew B. Stergachis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
23
|
Dobry J, Zhu Z, Zhou Q, Wapstra E, Deakin JE, Ezaz T. The role of unbalanced segmental duplication in sex chromosome evolution in Australian ridge-tailed goannas. Sci Rep 2025; 15:8545. [PMID: 40074818 PMCID: PMC11903900 DOI: 10.1038/s41598-025-93574-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/29/2024] [Accepted: 03/07/2025] [Indexed: 03/14/2025] Open
Abstract
Varanids are known for conserved sex chromosomes, but the W chromosome differs in size, though not in morphology, among species representing varying stages of sex chromosome evolution. We tested for homology of the ZW sex chromosome system with size differences in varanids among four species from two lineages in Australia, the Odatria and the Gouldii. While DNA sequences of the sex chromosomes are conserved in the species we tested, we also identified a homologous region on an enlarged autosomal microchromosome that shares sequences with the W chromosome in some isolated populations of V. acanthurus and V. citrinus from the Odatria lineage. The enlarged microchromosome was unpaired in all individuals tested and is likely an unbalanced segmental duplication translocated between chromosome 1, the W, and another microchromosome. This suggests an ancient balanced duplication homologous to the W and the terminal region of the long arm of chromosome 1. The most parsimonious explanation is that the duplicated region originated on chromosome 1. We hypothesised in our reconstruction that genes and related DNA sequences associated with the sex-linkage group likely originated on an autosome. Subsequently, the sequences may have undergone duplication and translocation to the W chromosome, followed by the accumulation of lineage-specific repeat elements and amplifications on the W at different rates in various lineages. Lastly, these sequences are likely to have undergone duplication and translocation to another autosomal microchromosome. Given the role of segmental duplications and translocations as important evolutionary drivers of speciation in other taxa, together with the rapid speciation that has occurred in Australian varanids, our findings provide broader insight into the evolutionary pathway leading to rapid chromosomal and genic divergence of species.
Affiliation(s)
- Jason Dobry
- Centre for Conservation Ecology and Genomics, Institute for Applied Ecology, Faculty of Science and Technology, University of Canberra, Canberra, ACT, 2601, Australia
| | - Zexian Zhu
- MOE Laboratory of Biosystems Homeostasis and Protection and Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, 310058, China
| | - Qi Zhou
- MOE Laboratory of Biosystems Homeostasis and Protection and Zhejiang Provincial Key Laboratory for Cancer Molecular Cell Biology, Life Sciences Institute, Zhejiang University, Hangzhou, 310058, China
- Center for Reproductive Medicine, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Erik Wapstra
- School of Natural Sciences, University of Tasmania, Hobart, TAS, 7001, Australia
| | - Janine E Deakin
- Centre for Conservation Ecology and Genomics, Institute for Applied Ecology, Faculty of Science and Technology, University of Canberra, Canberra, ACT, 2601, Australia
| | - Tariq Ezaz
- Centre for Conservation Ecology and Genomics, Institute for Applied Ecology, Faculty of Science and Technology, University of Canberra, Canberra, ACT, 2601, Australia.
|
24
|
Yang Q, Sun J, Wang X, Wang J, Liu Q, Ru J, Zhang X, Wang S, Hao R, Bian P, Dai X, Gong M, Zhang Z, Wang A, Bai F, Li R, Cai Y, Jiang Y. SVLearn: a dual-reference machine learning approach enables accurate cross-species genotyping of structural variants. Nat Commun 2025; 16:2406. [PMID: 40069188 PMCID: PMC11897243 DOI: 10.1038/s41467-025-57756-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Accepted: 03/04/2025] [Indexed: 03/15/2025] Open
Abstract
Structural variations (SVs) are diverse forms of genetic alterations and drive a wide range of human diseases. Accurately genotyping SVs from short-read sequencing data, particularly those occurring in repetitive genomic regions, remains challenging. Here, we introduce SVLearn, a machine-learning approach for genotyping bi-allelic SVs. It exploits a dual-reference strategy to engineer a curated set of genomic, alignment, and genotyping features based on a reference genome in concert with an allele-based alternative genome. Using 38,613 human-derived SVs, we show that SVLearn significantly outperforms four state-of-the-art tools, with precision improvements of up to 15.61% for insertions and 13.75% for deletions in repetitive regions. On two additional sets of 121,435 cattle SVs and 113,042 sheep SVs, SVLearn generalizes well to cross-species SV genotyping, with a weighted genotype concordance score of up to 90%. Notably, SVLearn enables accurate genotyping of SVs at low sequencing coverage, with accuracy comparable to that at 30× coverage. Our studies suggest that SVLearn can accelerate the understanding of associations between genome-scale, high-quality genotyped SVs and diseases across multiple species.
Affiliation(s)
- Qimeng Yang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
| | - Xinyu Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Jiong Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Quanzhong Liu
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Jinlong Ru
- Institute of Virology, Helmholtz Centre Munich - German Research Centre for Environmental Health, Neuherberg, Germany
| | - Xin Zhang
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Sizhe Wang
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Ran Hao
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Peipei Bian
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Xuelei Dai
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Yazhouwan National Laboratory, Sanya, Hainan, China
| | - Mian Gong
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
| | - Zhuangbiao Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Ao Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Fengting Bai
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Ran Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Yudong Cai
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China.
| | - Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China.
|
25
|
Chen X, Baker D, Dolzhenko E, Devaney JM, Noya J, Berlyoung AS, Brandon R, Hruska KS, Lochovsky L, Kruszka P, Newman S, Farrow E, Thiffault I, Pastinen T, Kasperaviciute D, Gilissen C, Vissers L, Hoischen A, Berger S, Vilain E, Délot E, Eberle MA. Genome-wide profiling of highly similar paralogous genes using HiFi sequencing. Nat Commun 2025; 16:2340. [PMID: 40057485 PMCID: PMC11890787 DOI: 10.1038/s41467-025-57505-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 02/21/2025] [Indexed: 05/13/2025] Open
Abstract
Variant calling is hindered in segmental duplications by sequence homology. We developed Paraphase, a HiFi-based informatics method that resolves highly similar genes by phasing all haplotypes of paralogous genes together. We applied Paraphase to 160 long (>10 kb) segmental duplication regions across the human genome with high (>99%) sequence similarity, encoding 316 genes. Analysis across five ancestral populations revealed highly variable copy numbers of these regions. We identified 23 paralog groups with exceptionally low within-group diversity, where extensive gene conversion and unequal crossing over contribute to highly similar gene copies. Furthermore, our analysis of 36 trios identified 7 de novo SNVs and 4 de novo gene conversion events, 2 of which are non-allelic. Finally, we summarized extensive genetic diversity in 9 medically relevant genes previously considered challenging to genotype. Paraphase provides a framework for resolving gene paralogs, enabling accurate testing in medically relevant genes and population-wide studies of previously inaccessible genes.
Affiliation(s)
| | | | | | | | | | | | | | | | | | | | | | - Emily Farrow
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
- UMKC School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Isabelle Thiffault
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
- UMKC School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
- Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO, USA
| | - Tomi Pastinen
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
- UMKC School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
| | | | - Christian Gilissen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Lisenka Vissers
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Alexander Hoischen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Research Institute for Medical Innovation, Radboud University Medical Center, Nijmegen, The Netherlands
- Radboud Center for Infectious Diseases (RCI), Department of Internal Medicine, Radboud University Medical Center, Nijmegen, The Netherlands
- Radboud Expertise Center for Immunodeficiency and Autoinflammation and Radboud Center for Infectious Disease (RCI), Radboud University Medical Center, Nijmegen, The Netherlands
| | - Seth Berger
- Center for Genetics Medicine Research, Children's National Hospital, Washington, DC, USA
| | - Eric Vilain
- Institute for Clinical and Translational Science, University of California, Irvine, CA, USA
| | - Emmanuèle Délot
- Institute for Clinical and Translational Science, University of California, Irvine, CA, USA
|
26
|
Ma K, Yang X, Mao Y. Advancing evolutionary medicine with complete primate genomes and advanced biotechnologies. Trends Genet 2025; 41:201-217. [PMID: 39627062 DOI: 10.1016/j.tig.2024.11.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/26/2024] [Revised: 11/03/2024] [Accepted: 11/06/2024] [Indexed: 03/06/2025]
Abstract
Evolutionary medicine, which integrates evolutionary biology and medicine, significantly enhances our understanding of human traits and disease susceptibility. However, previous studies in this field have often focused on single-nucleotide variants due to technological limitations in characterizing complex genomic regions, hindering comprehensive analyses of their evolutionary origins and clinical significance. In this review, we summarize recent advancements in complete telomere-to-telomere (T2T) primate genomes and other primate resources, and illustrate how these resources facilitate research on complex regions. We focus on several biomedically relevant regions to examine the relationship between primate genome evolution and human diseases. We also highlight the potential of high-throughput functional genomic technologies for assessing candidate loci. Finally, we discuss future directions for primate research within the context of evolutionary medicine.
Affiliation(s)
- Kaiyue Ma
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China; Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, Zhejiang, China.
|
27
|
Liang SA, Ren T, Zhang J, He J, Wang X, Jiang X, He Y, McCoy RC, Fu Q, Akey JM, Mao Y, Chen L. A refined analysis of Neanderthal-introgressed sequences in modern humans with a complete reference genome. Genome Biol 2025; 26:32. [PMID: 39962554 PMCID: PMC11834205 DOI: 10.1186/s13059-025-03502-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Accepted: 02/11/2025] [Indexed: 02/20/2025] Open
Abstract
BACKGROUND Leveraging long-read sequencing technologies, the first complete human reference genome, T2T-CHM13, corrects assembly errors in previous references and resolves the remaining 8% of the genome. While studies on archaic admixture in modern humans have so far relied on the GRCh37 reference due to the availability of archaic genome data, the impact of T2T-CHM13 in this field remains unexplored. RESULTS We remap the sequencing reads of the high-quality Altai Neanderthal and Denisovan genomes onto GRCh38 and T2T-CHM13. Compared to GRCh37, we find that T2T-CHM13 significantly improves read mapping quality in archaic samples. We then apply IBDmix to identify Neanderthal-introgressed sequences in 2504 individuals from 26 geographically diverse populations using different reference genomes. We observe that commonly used pre-phasing filtering strategies in public datasets substantially influence archaic ancestry determination, underscoring the need for careful filter selection. Our analysis identifies approximately 51 Mb of Neanderthal sequences unique to T2T-CHM13, predominantly in genomic regions where GRCh38 and T2T-CHM13 assemblies diverge. Additionally, we uncover novel instances of population-specific archaic introgression in diverse populations, spanning genes involved in metabolism, olfaction, and ion-channel function. Finally, to facilitate the exploration of archaic alleles and adaptive signals in human genomics and evolutionary research, we integrate these introgressed sequences and adaptive signals across all reference genomes into a visualization database, ASH (www.arcseqhub.com). CONCLUSIONS Our study enhances the detection of archaic variations in modern humans, highlights the importance of utilizing the T2T-CHM13 reference, and provides novel insights into the functional consequences of archaic hominin admixture.
Affiliation(s)
- Shen-Ao Liang
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Tianxin Ren
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Jiayu Zhang
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Jiahui He
- Ministry of Education Key Laboratory of Contemporary Anthropology, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Xuankai Wang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Xinrui Jiang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Yuan He
- Ministry of Education Key Laboratory of Contemporary Anthropology, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, MD, 21212, USA
| | - Qiaomei Fu
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, 100044, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
| | - Joshua M Akey
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 08540, USA
| | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China.
- Center for Genomic Research, International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University, Yiwu, 322000, China.
| | - Lu Chen
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China.
|
28
|
Bein B, Chrysostomakis I, Arantes LS, Brown T, Gerheim C, Schell T, Schneider C, Leushkin E, Chen Z, Sigwart J, Gonzalez V, Wong NLWS, Santos FR, Blom MPK, Mayer F, Mazzoni CJ, Böhne A, Winkler S, Greve C, Hiller M. Long-read sequencing and genome assembly of natural history collection samples and challenging specimens. Genome Biol 2025; 26:25. [PMID: 39930463 PMCID: PMC11809032 DOI: 10.1186/s13059-025-03487-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Accepted: 01/27/2025] [Indexed: 02/14/2025] Open
Abstract
Museum collections harbor millions of samples, largely unutilized for long-read sequencing. Here, we use ethanol-preserved samples containing kilobase-sized DNA to show that amplification-free protocols can yield contiguous genome assemblies. Additionally, using a modified amplification-based protocol, employing an alternative polymerase to overcome PCR bias, we assemble the 3.1 Gb maned sloth genome, surpassing the previous 500 Mb protocol size limit. Our protocol also improves assemblies of other difficult-to-sequence molluscs and arthropods, including millimeter-sized organisms. By highlighting collections as valuable sample resources and facilitating genome assembly of tiny and challenging organisms, our study advances efforts to obtain reference genomes of all eukaryotes.
Affiliation(s)
- Bernhard Bein
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, Frankfurt, 60325, Germany
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University, Max-Von-Laue-Str. 9, Frankfurt, 60438, Germany
| | - Ioannis Chrysostomakis
- Center for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn, Adenauerallee 127, Bonn, 53113, Germany
| | - Larissa S Arantes
- Berlin Center for Genomics in Biodiversity Research (BeGenDiv), Königin-Luise-Straße 2-4, Berlin, 14195, Germany
- Department of Evolutionary Genetics, Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, Berlin, 10315, Germany
| | - Tom Brown
- Berlin Center for Genomics in Biodiversity Research (BeGenDiv), Königin-Luise-Straße 2-4, Berlin, 14195, Germany
- Department of Evolutionary Genetics, Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, Berlin, 10315, Germany
| | - Charlotte Gerheim
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, Frankfurt, 60325, Germany
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany
| | - Tilman Schell
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, Frankfurt, 60325, Germany
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany
| | - Clément Schneider
- Senckenberg Research Institute, Am Museum 1, Görlitz, 02826, Germany
| | - Evgeny Leushkin
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, Frankfurt, 60325, Germany
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany
| | - Zeyuan Chen
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany
| | - Julia Sigwart
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, Frankfurt, 60325, Germany
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany
| | - Vanessa Gonzalez
- Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, Washington, DC, 20013, USA
| | - Nur Leena W S Wong
- International Institute of Aquaculture and Aquatic Sciences, Universiti Putra Malaysia, Port Dickson, Negeri Sembilan, 71050, Malaysia
| | - Fabricio R Santos
- Laboratório de Biodiversidade E Evolução Molecular, Departamento de Genética, Universidade Federal de Minas Gerais, Ecologia E Evolução, Belo Horizonte, Minas Gerais, Brazil
| | - Mozes P K Blom
- Museum Für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Invalidenstraße 43, Berlin, 10115, Germany
| | - Frieder Mayer
- Museum Für Naturkunde, Leibniz Institute for Evolution and Biodiversity Science, Invalidenstraße 43, Berlin, 10115, Germany
| | - Camila J Mazzoni
- Berlin Center for Genomics in Biodiversity Research (BeGenDiv), Königin-Luise-Straße 2-4, Berlin, 14195, Germany
- Department of Evolutionary Genetics, Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, Berlin, 10315, Germany
| | - Astrid Böhne
- Center for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Museum Koenig Bonn, Adenauerallee 127, Bonn, 53113, Germany
| | - Sylke Winkler
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, Dresden, 01307, Germany
- DRESDEN Concept Genome Center, Technische Universität Dresden, Fetscherstraße 105, Dresden, 01307, Germany
| | - Carola Greve
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, Frankfurt, 60325, Germany
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany
| | - Michael Hiller
- LOEWE Centre for Translational Biodiversity Genomics, Senckenberganlage 25, Frankfurt, 60325, Germany.
- Senckenberg Research Institute, Senckenberganlage 25, Frankfurt, 60325, Germany.
- Institute of Cell Biology and Neuroscience, Faculty of Biosciences, Goethe University, Max-Von-Laue-Str. 9, Frankfurt, 60438, Germany.
|
29
|
Negi S, Stenton SL, Berger SI, Canigiula P, McNulty B, Violich I, Gardner J, Hillaker T, O'Rourke SM, O'Leary MC, Carbonell E, Austin-Tse C, Lemire G, Serrano J, Mangilog B, VanNoy G, Kolmogorov M, Vilain E, O'Donnell-Luria A, Délot E, Miga KH, Monlong J, Paten B. Advancing long-read nanopore genome assembly and accurate variant calling for rare disease detection. Am J Hum Genet 2025; 112:428-449. [PMID: 39862869 PMCID: PMC11866955 DOI: 10.1016/j.ajhg.2025.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 12/22/2024] [Accepted: 01/02/2025] [Indexed: 01/27/2025] Open
Abstract
More than 50% of families with suspected rare monogenic diseases remain unsolved after whole-genome analysis by short-read sequencing (SRS). Long-read sequencing (LRS) could help bridge this diagnostic gap by capturing variants inaccessible to SRS, facilitating long-range mapping and phasing and providing haplotype-resolved methylation profiling. To evaluate LRS's additional diagnostic yield, we sequenced a rare-disease cohort of 98 samples from 41 families, using nanopore sequencing, achieving per sample ∼36× average coverage and 32-kb read N50 from a single flow cell. Our Napu pipeline generated assemblies, phased variants, and methylation calls. LRS covered, on average, coding exons in ∼280 genes and ∼5 known Mendelian disease-associated genes that were not covered by SRS. In comparison to SRS, LRS detected additional rare, functionally annotated variants, including structural variants (SVs) and tandem repeats, and completely phased 87% of protein-coding genes. LRS detected additional de novo variants and could be used to distinguish postzygotic mosaic variants from prezygotic de novos. Diagnostic variants were established by LRS in 11 probands, with diverse underlying genetic causes including de novo and compound heterozygous variants, large-scale SVs, and epigenetic modifications. Our study demonstrates LRS's potential to enhance diagnostic yield for rare monogenic diseases, implying utility in future clinical genomics workflows.
Affiliation(s)
- Shloka Negi
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Sarah L Stenton
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
| | - Seth I Berger
- Children's National Research Institute, Washington, DC, USA
| | | | - Brandy McNulty
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Ivo Violich
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Joshua Gardner
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Todd Hillaker
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Sara M O'Rourke
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Melanie C O'Leary
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Elizabeth Carbonell
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Christina Austin-Tse
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Gabrielle Lemire
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
| | - Jillian Serrano
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
| | - Brian Mangilog
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Grace VanNoy
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Mikhail Kolmogorov
- Cancer Data Science Laboratory, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Eric Vilain
- Institute for Clinical and Translational Science, University of California, Irvine, Irvine, CA, USA
| | - Anne O'Donnell-Luria
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Emmanuèle Délot
- Institute for Clinical and Translational Science, University of California, Irvine, Irvine, CA, USA
| | - Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Jean Monlong
- Institut de Recherche en Santé Digestive, Université de Toulouse, INSERM, INRA, ENVT, UPS, Toulouse, France.
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
|
30
|
Navasca A, Singh J, Rivera-Varas V, Gill U, Secor G, Baldwin T. Dispensable genome and segmental duplications drive the genome plasticity in Fusarium solani. Frontiers in Fungal Biology 2025; 6:1432339. [PMID: 39974207 PMCID: PMC11835900 DOI: 10.3389/ffunb.2025.1432339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 01/20/2025] [Indexed: 02/21/2025]
Abstract
Fusarium solani is a species complex encompassing a large phylogenetic clade with diverse members occupying varied habitats. We recently reported a unique opportunistic F. solani associated with unusual dark galls in sugarbeet. We assembled the chromosome-level genome of the F. solani sugarbeet isolate strain SB1 using Oxford Nanopore and Hi-C sequencing. The average size of F. solani genomes is 54 Mb, whereas SB1 has a larger genome of 59.38 Mb, organized into 15 chromosomes. The genome expansion of strain SB1 is due to the high repeats and segmental duplications within its three potentially accessory chromosomes. These chromosomes are absent in the closest reference genome with chromosome-level assembly, F. vanettenii 77-13-4. Segmental duplications were found in three chromosomes but are most extensive between two specific SB1 chromosomes, suggesting that this isolate may have doubled its accessory genes. Further comparison of the F. solani strain SB1 genome demonstrates inversions and syntenic regions to an accessory chromosome of F. vanettenii 77-13-4. The pan-genome of 12 publicly available F. solani isolates nearly reached gene saturation, with few new genes discovered after the addition of the last genome. Based on orthogroups and average nucleotide identity, F. solani is not grouped by lifestyle or origin. The pan-genome analysis further revealed the enrichment of several enzyme-coding genes within the dispensable (accessory + unique genes) genome, such as hydrolases, transferases, oxidoreductases, lyases, ligases, isomerases, and dehydrogenases. The evidence presented here suggests that genome plasticity, genetic diversity, and adaptive traits in Fusarium solani are driven by the dispensable genome with significant contributions from segmental duplications.
Affiliation(s)
| | | | | | | | | | - Thomas Baldwin
- Department of Plant Pathology, North Dakota State University, Fargo, ND, United States
|
31
|
Sandroni V, Chaumette B. Understanding the Emergence of Schizophrenia in the Light of Human Evolution: New Perspectives in Genetics. Genes, Brain, and Behavior 2025; 24:e70013. [PMID: 39801370 PMCID: PMC11725983 DOI: 10.1111/gbb.70013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2024] [Revised: 12/17/2024] [Accepted: 12/21/2024] [Indexed: 01/16/2025]
Abstract
Schizophrenia is a frequent and disabling disease. The persistence of the disorder despite its harmful consequences represents an evolutionary paradox. Based on recent discoveries in genetics, scientists have formulated the "price-to-pay" hypothesis: schizophrenia would be intimately related to human evolution, particularly to brain development and human-specific higher cognitive functions. The objective of the present work is to question scientific literature about the relationship between schizophrenia and human evolution from a genetic point of view. In the last two decades, research investigated the association between schizophrenia and a few genetic evolutionary markers: Human accelerated regions, segmental duplications, and highly repetitive DNA such as the Olduvai domain. Other studies focused on the action of natural selection on schizophrenia-associated genetic variants, also thanks to the complete sequencing of archaic hominins' genomes (Neanderthal, Denisova). Results suggested that a connection between human evolution and schizophrenia may exist; nonetheless, much research is still needed, and it is possible that a definitive answer to the evolutionary paradox of schizophrenia will never be found.
Affiliation(s)
- Veronica Sandroni
- Université Paris Cité, Institute of Psychiatry and Neuroscience of Paris (IPNP), Paris, France
- GHU-Paris Psychiatrie et Neurosciences, Hôpital Sainte Anne, Paris, France
| | - Boris Chaumette
- Université Paris Cité, Institute of Psychiatry and Neuroscience of Paris (IPNP), Paris, France
- GHU-Paris Psychiatrie et Neurosciences, Hôpital Sainte Anne, Paris, France
- Human Genetics and Cognitive Functions, Institut Pasteur, Université Paris Cité, Paris, France
- Department of Psychiatry, McGill University, Montreal, Canada
|
32
|
Xia S, Chen J, Arsala D, Emerson JJ, Long M. Functional innovation through new genes as a general evolutionary process. Nat Genet 2025; 57:295-309. [PMID: 39875578 DOI: 10.1038/s41588-024-02059-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Accepted: 12/15/2024] [Indexed: 01/30/2025]
Abstract
In the past decade, our understanding of how new genes originate in diverse organisms has advanced substantially, and more than a dozen molecular mechanisms for generating initial gene structures were identified, in addition to gene duplication. These new genes have been found to integrate into and modify pre-existing gene networks primarily through mutation and selection, revealing new patterns and rules with stable origination rates across various organisms. This progress has challenged the prevailing belief that new proteins evolve from pre-existing genes, as new genes may arise de novo from noncoding DNA sequences in many organisms, with high rates observed in flowering plants. New genes have important roles in phenotypic and functional evolution across diverse biological processes and structures, with detectable fitness effects of sexual conflict genes that can shape species divergence. Such knowledge of new genes can be of translational value in agriculture and medicine.
Affiliation(s)
- Shengqian Xia
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
| | - Jianhai Chen
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
| | - Deanna Arsala
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA
| | - J J Emerson
- Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
| | - Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, IL, USA.
|
33
|
Jeong H, Dishuck PC, Yoo D, Harvey WT, Munson KM, Lewis AP, Kordosky J, Garcia GH, Human Genome Structural Variation Consortium (HGSVC), Yilmaz F, Hallast P, Lee C, Pastinen T, Eichler EE. Structural polymorphism and diversity of human segmental duplications. Nat Genet 2025; 57:390-401. [PMID: 39779957 PMCID: PMC11821543 DOI: 10.1038/s41588-024-02051-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Accepted: 12/04/2024] [Indexed: 01/11/2025]
Abstract
Segmental duplications (SDs) contribute significantly to human disease, evolution and diversity but have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies (from 85 samples representing 38 Africans and 47 non-Africans) in which the majority of autosomal SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms and sex chromosomes, we identify 173.2 Mb of duplicated sequence (47.4 Mb not present in the telomere-to-telomere reference) distinguishing fixed from structurally polymorphic events. We find that intrachromosomal SDs are among the most variable, with rare events mapping near their progenitor sequences. African genomes harbor significantly more intrachromosomal SDs and are more likely to have recently duplicated gene families with higher copy numbers than non-African samples. Comparison to a resource of 563 million full-length isoform sequencing reads identifies 201 novel, potentially protein-coding genes corresponding to these copy number polymorphic SDs.
Affiliation(s)
- Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Altos Labs, San Diego, CA, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Jennifer Kordosky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Gage H Garcia
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | - Feyza Yilmaz
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Tomi Pastinen
- Children's Mercy Hospital and University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
|
34
|
Ren W, Fang Z, Dolzhenko E, Saunders CT, Cheng Z, Popic V, Peltz G. A Murine Database of Structural Variants Enables the Genetic Architecture of a Spontaneous Murine Lymphoma to be Characterized. bioRxiv: the preprint server for biology 2025:2025.01.09.632219. [PMID: 39868308 PMCID: PMC11761040 DOI: 10.1101/2025.01.09.632219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2025]
Abstract
A more complete map of the pattern of genetic variation among inbred mouse strains is essential for characterizing the genetic architecture of the many available mouse genetic models of important biomedical traits. Although structural variants (SVs) are a major component of genetic variation, they have not been adequately characterized among inbred strains due to methodological limitations. To address this, we generated high-quality long-read sequencing data for 40 inbred strains and designed a pipeline to optimally identify and validate different types of SVs. This generated a database for 40 inbred strains with 573,191 SVs, which included 10,815 duplications and 2,115 inversions, that also has 70 million SNPs and 7.5 million insertions/deletions. Analysis of this SV database led to the discovery of a novel bi-genic model for susceptibility to a B cell lymphoma that spontaneously develops in SJL mice, which was initially described 55 years ago. The first genetic factor is a previously identified endogenous retrovirus encoded protein that stimulates CD4 T cells to produce the cytokines required for lymphoma growth. The second genetic factor is a newly found deletion SV, which ablates a protein whose promotes B lymphoma development in SJL mice. Characterizing the genetic architecture of SJL lymphoma susceptibility could provide new insight into the pathogenesis of a human lymphoma that has similarities with this murine lymphoma.
Affiliation(s)
- Wenlong Ren
- Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford CA 94305
| | - Zhuoqing Fang
- Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford CA 94305
| | | | | | - Zhuanfen Cheng
- Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford CA 94305
| | | | - Gary Peltz
- Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford CA 94305
|
35
|
Kim J, Park J, Yang J, Kim S, Joe S, Park G, Hwang T, Cho MJ, Lee S, Lee JE, Park JH, Yeo MK, Kim SY. Highly accurate Korean draft genomes reveal structural variation highlighting human telomere evolution. Nucleic Acids Res 2025; 53:gkae1294. [PMID: 39778865 PMCID: PMC11707537 DOI: 10.1093/nar/gkae1294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Revised: 12/09/2024] [Accepted: 01/06/2025] [Indexed: 01/11/2025] Open
Abstract
Given the presence of highly repetitive genomic regions such as subtelomeric regions, understanding human genomic evolution remains challenging. Recently, long-read sequencing technology has facilitated the identification of complex genetic variants, including structural variants (SVs), at the single-nucleotide level. Here, we resolved SVs and their underlying DNA damage-repair mechanisms in subtelomeric regions, which are among the most uncharted genomic regions. We generated ∼20 × high-fidelity long-read sequencing data from three Korean individuals and their partially phased high-quality de novo genome assemblies (contig N50: 6.3-58.2 Mb). We identified 131 138 deletion and 121 461 insertion SVs, 41.6% of which were prevalent in the East Asian population. The commonality of the SVs identified among the Korean population was examined by short-read sequencing data from 103 Korean individuals, providing the first comprehensive SV set representing the population based on the long-read assemblies. Manual investigation of 19 large subtelomeric SVs (≥5 kb) and their associated repair signatures revealed the potential repair mechanisms leading to the formation of these SVs. Our study provides mechanistic insight into human telomere evolution and can facilitate our understanding of human SV formation.
Affiliation(s)
- Jun Kim
- Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Personalized Genomic Medicine Research Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
| | - Jong Lyul Park
- Personalized Genomic Medicine Research Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Department of Bioscience, University of Science and Technology (UST), 217, Gajeong-ro, Yuseong-gu, Daejeon 34113, Republic of Korea
| | - Jin Ok Yang
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science & Technology (KAIST), 291, Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
| | - Sangok Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Department of Bioscience, University of Science and Technology (UST), 217, Gajeong-ro, Yuseong-gu, Daejeon 34113, Republic of Korea
| | - Soobok Joe
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
| | - Gunwoo Park
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
| | - Taeyeon Hwang
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
| | - Mun-Jeong Cho
- Department of Bioscience, University of Science and Technology (UST), 217, Gajeong-ro, Yuseong-gu, Daejeon 34113, Republic of Korea
| | - Seungjae Lee
- DNALink, Inc, 31, Magokjungang 8-ro 3-gil, Gangseo-gu, Seoul 07793, Republic of Korea
| | - Jong-Eun Lee
- DNALink, Inc, 31, Magokjungang 8-ro 3-gil, Gangseo-gu, Seoul 07793, Republic of Korea
| | - Ji-Hwan Park
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Department of Biological Science, Ajou University, 206, World cup-ro, Yeongtong-gu, Suwon 16499, Republic of Korea
| | - Min-Kyung Yeo
- Department of Pathology, Chungnam National University School of Medicine, 282, Munhwa-ro, Jung-gu, Daejeon 35015, Republic of Korea
| | - Seon-Young Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, 125, Gwahak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Department of Bioscience, University of Science and Technology (UST), 217, Gajeong-ro, Yuseong-gu, Daejeon 34113, Republic of Korea
|
36
|
Chakraborty A, Chopde S, Madhusudhan M. Motif distribution in genomes gives insights into gene clustering and co-regulation. Nucleic Acids Res 2025; 53:gkae1178. [PMID: 39657779 PMCID: PMC11724300 DOI: 10.1093/nar/gkae1178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 09/17/2024] [Accepted: 11/15/2024] [Indexed: 12/12/2024] Open
Abstract
We read the genome as proteins in the cell would - by studying the distributions of 5-6 base motifs of DNA in the whole genome or smaller stretches such as parts of, or whole chromosomes. This led us to some interesting findings about motif clustering and chromosome organization. It is quite clear that the motif distribution in genomes is not random at the length scales we examined: 1 kb to entire chromosomes. The observed-to-expected (OE) ratios of motif distributions show strong correlations in pairs of chromosomes that are susceptible to translocations. With the aid of examples, we suggest that similarity in motif distributions in promoter regions of genes could imply co-regulation. A simple extension of this idea empowers us with the ability to construct gene regulatory networks. Further, we could make inferences about the spatial proximity of genomic fragments using these motif distributions. Spatially proximal regions, as deduced by Hi-C or pcHi-C, were ∼3.5 times more likely to have their motif distributions correlated than non-proximal regions. These correlations had strong contributions from the CTCF protein recognizing motifs which are known markers of topologically associated domains. In general, correlating genomic regions by motif distribution comparisons alone is rife with functional information.
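As an illustration of the observed-to-expected (OE) ratio mentioned in this abstract, the following is a minimal sketch, not the authors' method. It assumes a simple null model in which the expected count of a motif is the number of valid windows multiplied by the product of the observed single-base frequencies; the function name `oe_ratios` and that null model are hypothetical choices made only for this example.

```python
from collections import Counter
from math import prod


def oe_ratios(seq: str, k: int = 5) -> dict[str, float]:
    """Return the observed-to-expected ratio of every k-mer found in seq.

    Assumption (not from the abstract): expected counts come from a
    zero-order model, i.e. independent bases at their observed frequencies.
    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    base_counts = Counter(b for b in seq if b in "ACGT")
    total_bases = sum(base_counts.values())

    # Count every length-k window made up only of unambiguous bases.
    observed = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    n_windows = sum(observed.values())
    if n_windows == 0:
        return {}

    ratios = {}
    for motif, count in observed.items():
        # Probability of the motif under independent base frequencies.
        p_motif = prod(base_counts[b] / total_bases for b in motif)
        ratios[motif] = count / (n_windows * p_motif)
    return ratios
```

Under this sketch, a ratio above 1 means the motif occurs more often in the region than base composition alone would predict. Comparing the ratio vectors of two regions (for example by correlation) would be one way to approximate the region-to-region comparison the abstract describes.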
Affiliation(s)
- Atreyi Chakraborty
- Department of Biology, Indian Institute of Science Education and Research, Dr Homi Bhabha Rd, Pashan, Pune, Maharashtra 411008, India
| | - Sumant Chopde
- Department of Data Science, Indian Institute of Science Education and Research, Dr Homi Bhabha Rd, Pashan, Pune, Maharashtra 411008, India
| | - Mallur Srivatsan Madhusudhan
- Department of Biology, Indian Institute of Science Education and Research, Dr Homi Bhabha Rd, Pashan, Pune, Maharashtra 411008, India
- Department of Data Science, Indian Institute of Science Education and Research, Dr Homi Bhabha Rd, Pashan, Pune, Maharashtra 411008, India
|
37
|
Cao C, Miao J, Xie Q, Sun J, Cheng H, Zhang Z, Wu F, Liu S, Ye X, Gong H, Zhang Z, Wang Q, Pan Y, Wang Z. A near telomere-to-telomere genome assembly of the Jinhua pig: enabling more accurate genetic research. Gigascience 2025; 14:giaf048. [PMID: 40372724 PMCID: PMC12080228 DOI: 10.1093/gigascience/giaf048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2024] [Revised: 01/29/2025] [Accepted: 03/31/2025] [Indexed: 05/16/2025] Open
Abstract
BACKGROUND Pigs are crucial sources of meat and protein, valuable animal models, and potential donors for xenotransplantation. However, the existing reference genome for pigs is incomplete, with thousands of segments and centromeres and telomeres missing, which limits our understanding of the important traits in these genomic regions. FINDINGS We present a near-complete genome assembly for the Jinhua pig (JH-T2T) and provide a set of diploid Jinhua reference genomes, constructed using PacBio HiFi, ONT long reads, and Hi-C reads. This assembly includes all 18 autosomes and the X and Y sex chromosomes, with only 6 gaps. It features annotations of 46.90% repetitive sequences, 33 telomeres, 17 centromeres, and 23,924 high-confidence genes. Compared to Sscrofa11.1, JH-T2T closes nearly all gaps, extends sequences by 177 Mb, predicts more intact telomeres and centromeres, gains 799 genes, and loses 114 genes. Moreover, it enhances the mapping rate for both Western and Chinese local pigs, outperforming Sscrofa11.1 as a reference genome. Additionally, this comprehensive genome assembly will facilitate large-scale variant detection. CONCLUSIONS This study produced a near-gapless assembly of the pig genome and provides a set of haploid Jinhua reference genomes. Our findings represent a significant advance in pig genomics, providing a robust resource that enhances genetic research, breeding programs, and biomedical applications.
Affiliation(s)
- Caiyun Cao
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Hainan Institute of Zhejiang University, Building 11, Yongyou Industrial Park, Yazhou Bay Science and Technology City, Yazhou District, Sanya 572025 Hainan, China
| | - Jian Miao
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Qinqin Xie
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Jiabao Sun
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Hong Cheng
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Zhenyang Zhang
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Fen Wu
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Shuang Liu
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Xiaowei Ye
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Huanfa Gong
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Zhe Zhang
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Qishan Wang
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Hainan Institute of Zhejiang University, Building 11, Yongyou Industrial Park, Yazhou Bay Science and Technology City, Yazhou District, Sanya 572025 Hainan, China
| | - Yuchun Pan
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Hainan Institute of Zhejiang University, Building 11, Yongyou Industrial Park, Yazhou Bay Science and Technology City, Yazhou District, Sanya 572025 Hainan, China
| | - Zhen Wang
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
38
|
Sobral AF, Dinis-Oliveira RJ, Barbosa DJ. CRISPR-Cas technology in forensic investigations: Principles, applications, and ethical considerations. Forensic Sci Int Genet 2025; 74:103163. [PMID: 39437497 DOI: 10.1016/j.fsigen.2024.103163] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/20/2024] [Revised: 10/08/2024] [Accepted: 10/09/2024] [Indexed: 10/25/2024]
Abstract
CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated proteins) systems are adaptive immune systems originally present in bacteria, where they are essential to protect against external genetic elements, including viruses and plasmids. Taking advantage of this system, CRISPR-Cas-based technologies have emerged as incredible tools for precise genome editing, thus significantly advancing several research fields. Forensic sciences represent a multidisciplinary field that explores scientific methods to investigate and resolve legal issues, particularly criminal investigations and subject identification. Consequently, it plays a critical role in the justice system, providing scientific evidence to support judicial investigations. Although less explored, CRISPR-Cas-based methodologies demonstrate strong potential in the field of forensic sciences due to their high accuracy and sensitivity, including DNA profiling and identification, interpretation of crime scene investigations, detection of food contamination or fraud, and other aspects related to environmental forensics. However, using CRISPR-Cas-based methodologies in human samples raises several ethical issues and concerns regarding the potential misuse of individual genetic information. In this manuscript, we provide an overview of potential applications of CRISPR-Cas-based methodologies in several areas of forensic sciences and discuss the legal implications that challenge their routine implementation in this research field.
Affiliation(s)
- Ana Filipa Sobral
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, University Institute of Health Sciences - CESPU, Gandra 4585-116, Portugal; UCIBIO - Applied Molecular Biosciences Unit, Toxicologic Pathology Research Laboratory, University Institute of Health Sciences (1H-TOXRUN, IUCS-CESPU), Gandra 4585-116, Portugal.
| | - Ricardo Jorge Dinis-Oliveira
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, University Institute of Health Sciences - CESPU, Gandra 4585-116, Portugal; UCIBIO - Applied Molecular Biosciences Unit, Translational Toxicology Research Laboratory, University Institute of Health Sciences (1H-TOXRUN, IUCS-CESPU), Gandra 4585-116, Portugal; Department of Public Health and Forensic Sciences and Medical Education, Faculty of Medicine, University of Porto, Porto 4200-319, Portugal; FOREN - Forensic Science Experts, Dr. Mário Moutinho Avenue, No. 33-A, Lisbon 1400-136, Portugal.
| | - Daniel José Barbosa
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, University Institute of Health Sciences - CESPU, Gandra 4585-116, Portugal; UCIBIO - Applied Molecular Biosciences Unit, Translational Toxicology Research Laboratory, University Institute of Health Sciences (1H-TOXRUN, IUCS-CESPU), Gandra 4585-116, Portugal.
|
39
|
Luo LY, Wu H, Zhao LM, Zhang YH, Huang JH, Liu QY, Wang HT, Mo DX, EEr HH, Zhang LQ, Chen HL, Jia SG, Wang WM, Li MH. Telomere-to-telomere sheep genome assembly identifies variants associated with wool fineness. Nat Genet 2025; 57:218-230. [PMID: 39779954 DOI: 10.1038/s41588-024-02037-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 11/19/2024] [Indexed: 01/11/2025]
Abstract
Ongoing efforts to improve sheep reference genome assemblies still leave many gaps and incomplete regions, resulting in a few common failures and errors in genomic studies. Here, we report a 2.85-Gb gap-free telomere-to-telomere genome of a ram (T2T-sheep1.0), including all autosomes and the X and Y chromosomes. This genome adds 220.05 Mb of previously unresolved regions and 754 new genes to the most updated reference assembly ARS-UI_Ramb_v3.0; it contains four types of repeat units (SatI, SatII, SatIII and CenY) in centromeric regions. T2T-sheep1.0 has a base accuracy of more than 99.999%, corrects several structural errors in previous reference assemblies and improves structural variant detection in repetitive sequences. Alignment of whole-genome short-read sequences of global domestic and wild sheep against T2T-sheep1.0 identifies 2,664,979 new single-nucleotide polymorphisms in previously unresolved regions, which improves the population genetic analyses and detection of selective signals for domestication (for example, ABCC4) and wool fineness (for example, FOXQ1).
Affiliation(s)
- Ling-Yun Luo
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Hui Wu
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Li-Ming Zhao
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs; Engineering Research Center of Grassland Industry, Ministry of Education; College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
| | - Ya-Hui Zhang
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Jia-Hui Huang
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Qiu-Yue Liu
- Institute of Genetics and Developmental Biology, The Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
| | - Hai-Tao Wang
- Institute of Genetics and Developmental Biology, The Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
| | - Dong-Xin Mo
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - He-Hua EEr
- Institute of Animal Science, Ningxia Academy of Agriculture and Forestry Sciences, Yinchuan, China
| | - Lian-Quan Zhang
- Ningxia Shuomuyanchi Tan Sheep Breeding Co. Ltd., Wuzhong, China
| | | | - Shan-Gang Jia
- College of Grassland Science and Technology, China Agricultural University, Beijing, China.
| | - Wei-Min Wang
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems; Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs; Engineering Research Center of Grassland Industry, Ministry of Education; College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China.
| | - Meng-Hua Li
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, China.
|
40
|
Liu J, Li Q, Hu Y, Yu Y, Zheng K, Li D, Qin L, Yu X. The complete telomere-to-telomere sequence of a mouse genome. Science 2024; 386:1141-1146. [PMID: 39636971 DOI: 10.1126/science.adq8191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Accepted: 10/24/2024] [Indexed: 12/07/2024]
Abstract
The current reference genome of Mus musculus, GRCm39, has major gaps in both euchromatic and heterochromatic regions associated with repetitive sequences. In this work, we have sequenced and assembled the telomere-to-telomere genome of mouse haploid embryonic stem cells. The results reveal more than 7.7% of previously uncovered sequences of the mouse genome, including ribosomal DNA arrays and pericentromeric and subtelomeric regions, as well as an additional 140 genes predicted to be protein-coding. This study helps to address knowledge gaps in the mouse genome.
Affiliation(s)
- Junli Liu
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Qilin Li
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Yixuan Hu
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Yi Yu
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Kai Zheng
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Dengfeng Li
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Lexin Qin
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
| | - Xiaochun Yu
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
- Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou, Zhejiang, China
|
41
|
Holland M, Rutkowski R, Levin TC. Evolutionary Dynamics of Proinflammatory Caspases in Primates and Rodents. Mol Biol Evol 2024; 41:msae220. [PMID: 39431598 PMCID: PMC11630849 DOI: 10.1093/molbev/msae220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Revised: 10/05/2024] [Accepted: 10/14/2024] [Indexed: 10/22/2024] Open
Abstract
Caspase-1 and related proteases are key players in inflammation and innate immunity. Here, we characterize the evolutionary history of caspase-1 and its close relatives across 19 primates and 21 rodents, focusing on differences that may cause discrepancies between humans and animal studies. While caspase-1 has been retained in all these taxa, other members of the caspase-1 subfamily (caspase-4, caspase-5, caspase-11, and caspase-12 and CARD16, 17, and 18) each have unique evolutionary trajectories. Caspase-4 is found across simian primates, whereas we identified multiple pseudogenization and gene loss events in caspase-5, caspase-11, and the CARDs. Because caspase-4 and caspase-11 are both key players in the noncanonical inflammasome pathway, we expected that these proteins would be likely to evolve rapidly. Instead, we found that these two proteins are largely conserved, whereas caspase-4's close paralog, caspase-5, showed significant indications of positive selection, as did primate caspase-1. Caspase-12 is a nonfunctional pseudogene in humans. We find this extends across most primates, although many rodents and some primates retain an intact, and likely functional, caspase-12. In mouse laboratory lines, we found that 50% of common strains carry nonsynonymous variants that may impact the functions of caspase-11 and caspase-12 and therefore recommend specific strains to be used (and avoided). Finally, unlike rodents, primate caspases have undergone repeated rounds of gene conversion, duplication, and loss leading to a highly dynamic proinflammatory caspase repertoire. Thus, we uncovered many differences in the evolution of primate and rodent proinflammatory caspases and discuss the potential implications of this history for caspase gene functions.
Affiliation(s)
- Mische Holland
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Rachel Rutkowski
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Tera C. Levin
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
|
42
|
Kim S, Kim J. Units containing telomeric repeats are prevalent in subtelomeric regions of a Mesorhabditis isolate collected from the Republic of Korea. Genes Genomics 2024; 46:1461-1472. [PMID: 39367283 DOI: 10.1007/s13258-024-01576-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 09/11/2024] [Indexed: 10/06/2024]
Abstract
BACKGROUND Mesorhabditis is known for its somatic genome being only a small portion of the germline genome due to programmed DNA elimination. This phenotype may be associated with the maintenance of telomeres at the ends of fragmented somatic chromosomes. OBJECTIVE To comprehensively investigate the telomeric regions of Mesorhabditis nematodes at the sequence level, we endeavored to collect a Mesorhabditis nematode in the Republic of Korea and acquire its highly contiguous genome sequences. METHODS We isolated a Mesorhabditis nematode and assembled its 108-Mb draft genome using both 6.3 Gb (53×) of short-read and 3.0 Gb (25×, N50 = 5.7 kb) of nanopore-based long-read sequencing data. Our genome assembly exhibits comparable quality to the public genome of Mesorhabditis belari in terms of contiguity and evolutionarily conserved genes. RESULTS Unexpectedly, our Mesorhabditis genome has many more interstitial telomeric sequences (ITSs), specifically subtelomeric ones, compared to the genomes of Caenorhabditis elegans and M. belari. Moreover, several subtelomeric sequences containing ITSs had 4-26 homologous sequences, implying they are highly repetitive. Based on this highly repetitive nature, we hypothesize that subtelomeric ITSs might have accumulated through the action of transposable elements containing ITSs. CONCLUSIONS It remains unclear whether these ITS-containing units are associated with programmed DNA elimination, but they may facilitate new telomere formation after DNA elimination. Our genomic resources for Mesorhabditis can aid in understanding how its distinct phenotypes have evolved.
Affiliation(s)
- Seoyeon Kim
- Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, Daejeon, 34134, Republic of Korea
| | - Jun Kim
- Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, Daejeon, 34134, Republic of Korea.
|
43
|
de los Angeles Becerra Rodriguez M, Gonzalez Muñoz E, Moore T. Oligodendrocyte-specific expression of PSG8-AS1 suggests a role in myelination with prognostic value in oligodendroglioma. Noncoding RNA Res 2024; 9:1061-1068. [PMID: 39022681 PMCID: PMC11254506 DOI: 10.1016/j.ncrna.2024.06.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 05/03/2024] [Accepted: 06/10/2024] [Indexed: 07/20/2024] Open
Abstract
The segmentally duplicated Pregnancy-specific glycoprotein (PSG) locus on chromosome 19q13 may be one of the most rapidly evolving in the human genome. It comprises ten coding genes (PSG1-9, 11) and one predominantly non-coding gene (PSG10) that are expressed in the placenta and gut, in addition to several poorly characterized long non-coding RNAs. We report that long non-coding RNA PSG8-AS1 has an oligodendrocyte-specific expression pattern and is co-expressed with genes encoding key myelin constituents. PSG8-AS1 exhibits two peaks of expression during human brain development coinciding with the most active periods of oligodendrogenesis and myelination. PSG8-AS1 orthologs were found in the genomes of several primates but significant expression was found only in the human, suggesting a recent evolutionary origin of its proposed role in myelination. Additionally, because co-deletion of chromosomes 1p/19q is a genomic marker of oligodendroglioma, expression of PSG8-AS1 was examined in these tumors. PSG8-AS1 may be a promising diagnostic biomarker for glioma, with prognostic value in oligodendroglioma.
Affiliation(s)
- Maria de los Angeles Becerra Rodriguez
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
- SFI Centre for Research Training in Genomics Data Science, University College Cork, Cork, Ireland
| | - Elena Gonzalez Muñoz
- Instituto de Investigación Biomédica de Málaga y Plataforma en Nanomedicina-IBIMA Plataforma BIONAND, 29590, Málaga, Spain
- Universidad de Malaga, Dpto. Biología Celular, Genética y Fisiología, 29071, Málaga, Spain
| | - Tom Moore
- School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
|
44
|
Moss ND, Lollis D, Silver DL. How our brains are built: emerging approaches to understand human-specific features. Curr Opin Genet Dev 2024; 89:102278. [PMID: 39549607 DOI: 10.1016/j.gde.2024.102278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Revised: 10/08/2024] [Accepted: 10/21/2024] [Indexed: 11/18/2024]
Abstract
Understanding what makes us uniquely human is a long-standing question permeating fields from genomics, neuroscience, and developmental biology to medicine. The discovery of human-specific genomic sequences has enabled a new understanding of the molecular features of human brain evolution. Advances in sequencing, computational, and in vitro screening approaches collectively reveal new roles of uniquely human sequences in regulating gene expression. Here, we review the landscape of human-specific loci and describe how emerging technologies are being used to understand their molecular functions and impact on brain development. We describe current challenges in the field and the potential of integrating new hypotheses and approaches to propel our understanding of the human brain.
Affiliation(s)
- Nicole D Moss
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
| | - Davoneshia Lollis
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA. https://twitter.com/@_mlollis
| | - Debra L Silver
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Department of Cell Biology and Neurobiology, Duke University Medical Center, Durham, NC 27710, USA; Duke Institute for Brain Sciences and Duke Regeneration Center, Duke University Medical Center, Durham, NC 27710, USA.
|
45
|
Mastrorosa FK, Oshima KK, Rozanski AN, Harvey WT, Eichler EE, Logsdon GA. Identification and annotation of centromeric hypomethylated regions with CDR-Finder. Bioinformatics 2024; 40:btae733. [PMID: 39657946 PMCID: PMC11663805 DOI: 10.1093/bioinformatics/btae733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 11/26/2024] [Accepted: 12/06/2024] [Indexed: 12/12/2024] Open
Abstract
MOTIVATION Centromeres are chromosomal regions historically understudied with sequencing technologies due to their repetitive nature and short-read mapping limitations. However, recent improvements in long-read sequencing allow for the investigation of complex regions of the genome at the sequence and epigenetic levels. RESULTS Here, we present Centromere Dip Region (CDR)-Finder: a tool to identify regions of hypomethylation within the centromeres of high-quality, contiguous genome assemblies. These regions are typically associated with a unique type of chromatin containing the histone H3 variant CENP-A, which marks the location of the kinetochore. CDR-Finder identifies the CDRs in large and short centromeres and generates a BED file indicating the location of the CDRs within the centromere. It also outputs a plot for visualization, validation, and downstream analysis. AVAILABILITY AND IMPLEMENTATION CDR-Finder is available at https://github.com/EichlerLab/CDR-Finder.
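The general idea of calling hypomethylated dips and writing them to a BED file can be sketched as follows. This is an illustrative approximation only and does not reproduce CDR-Finder's actual algorithm or command-line interface: the function names `find_dips` and `to_bed`, the median-relative threshold `ratio`, and the minimum size `min_size` are hypothetical, and the input is assumed to be sorted, contiguous windows of mean methylation fraction.

```python
from statistics import median


def find_dips(chrom, windows, ratio=0.5, min_size=2000):
    """Return (chrom, start, end) runs of hypomethylated windows.

    windows: sorted, contiguous (start, end, methylation_fraction) tuples.
    A window counts as hypomethylated when its methylation is below
    ratio * median of all windows (a hypothetical rule for illustration).
    Runs shorter than min_size bp are discarded.
    """
    if not windows:
        return []
    threshold = ratio * median(m for _, _, m in windows)
    dips = []
    run_start = run_end = None
    for start, end, meth in windows:
        if meth < threshold:
            if run_start is None:
                run_start = start  # open a new run
            run_end = end          # extend the current run
        else:
            if run_start is not None and run_end - run_start >= min_size:
                dips.append((chrom, run_start, run_end))
            run_start = None
    # Close a run that reaches the end of the region
    if run_start is not None and run_end - run_start >= min_size:
        dips.append((chrom, run_start, run_end))
    return dips


def to_bed(dips):
    """Format dips as tab-separated BED lines (0-based, half-open)."""
    return "\n".join(f"{c}\t{s}\t{e}" for c, s, e in dips)
```

With 1-kb windows, a call such as `to_bed(find_dips("chr1", windows))` would yield one BED line per retained dip; real data would likely need smoothing and parameter tuning that this sketch omits.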
Affiliation(s)
- Francesco Kumara Mastrorosa
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, United States
| | - Keisuke K Oshima
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
| | - Allison N Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, United States
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, United States
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, United States
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, United States
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, United States
|
46
|
Guitart X, Porubsky D, Yoo D, Dougherty ML, Dishuck PC, Munson KM, Lewis AP, Hoekzema K, Knuth J, Chang S, Pastinen T, Eichler EE. Independent expansion, selection, and hypervariability of the TBC1D3 gene family in humans. Genome Res 2024; 34:1798-1810. [PMID: 39107043 DOI: 10.1101/gr.279299.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 07/29/2024] [Indexed: 08/09/2024]
Abstract
TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on Chromosome 17. We find that all human copy-number variation maps to two distinct clusters located at Chromosome 17q12 and that humans are highly structurally variable at this locus, differing by as many as 20 copies and ∼1 Mbp in length depending on haplotypes. We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Last, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.
Affiliation(s)
- Xavi Guitart
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Max L Dougherty
- Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Jordan Knuth
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Stephen Chang
- Department of Biochemistry
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University, Stanford, California 94305, USA
| | - Tomi Pastinen
- Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, Missouri 64108, USA
- Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, Missouri 64108, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
|
47
|
Wu H, Luo LY, Zhang YH, Zhang CY, Huang JH, Mo DX, Zhao LM, Wang ZX, Wang YC, He-Hua EE, Bai WL, Han D, Dou XT, Ren YL, Dingkao R, Chen HL, Ye Y, Du HD, Zhao ZQ, Wang XJ, Jia SG, Liu ZH, Li MH. Telomere-to-telomere genome assembly of a male goat reveals variants associated with cashmere traits. Nat Commun 2024; 15:10041. [PMID: 39567477 PMCID: PMC11579321 DOI: 10.1038/s41467-024-54188-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 10/30/2024] [Indexed: 11/22/2024] Open
Abstract
A complete goat (Capra hircus) reference genome enhances analyses of genetic variation, thus providing insights into domestication and selection in goats and related species. Here, we assemble a telomere-to-telomere (T2T) gap-free genome (2.86 Gb) from a cashmere goat (T2T-goat1.0), including a Y chromosome of 20.96 Mb. With a base accuracy of >99.999%, T2T-goat1.0 corrects numerous genome-wide structural and base errors in previous assemblies and adds 288.5 Mb of previously unresolved regions and 446 newly assembled genes to the reference genome. We sequence the genomes of five representative goat breeds using PacBio reads, and use T2T-goat1.0 as a reference to identify a total of 63,417 structural variations (SVs), with up to 4711 (7.42%) in the previously unresolved regions. T2T-goat1.0 was applied in population analyses of global wild and domestic goats, which revealed 32,419 SVs and 25,397,794 SNPs, including 870 SVs and 545,026 SNPs in the previously unresolved regions. Also, our analyses reveal a set of selective variants and genes associated with domestication (e.g., NKG2D and ABCC4) and cashmere traits (e.g., ABCC4 and ASIP).
Affiliation(s)
- Hui Wu
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Northern Agriculture and Animal Husbandry Technical Innovation Center, Chinese Academy of Agricultural Sciences, Hohhot, China
- Ling-Yun Luo
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Ya-Hui Zhang
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Chong-Yan Zhang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Jia-Hui Huang
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Dong-Xin Mo
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
- Li-Ming Zhao
- State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
- Zhi-Xin Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- Yi-Chuan Wang
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China
- EEr He-Hua
- Institute of Animal Science, NingXia Academy of Agriculture and Forestry Sciences, Yinchuan, China
- Wen-Lin Bai
- College of Animal Science and Veterinary Medicine, Shenyang Agricultural University, Shenyang, China
- Di Han
- Modern Agricultural Production Base Construction Engineering Center of Liaoning Province, Liaoyang, China
- Xing-Tang Dou
- Liaoning Province Liaoning Cashmere Goat Original Breeding Farm Co., Ltd., Liaoyang, China
- Yan-Ling Ren
- Shandong Binzhou Academy of Animal Science and Veterinary Medicine, Binzhou, China
- Yong Ye
- Zhongwei Goat Breeding Center of Ningxia Province, Zhongwei, China
- Hai-Dong Du
- Zhongwei Goat Breeding Center of Ningxia Province, Zhongwei, China
- Zhan-Qiang Zhao
- Zhongwei Goat Breeding Center of Ningxia Province, Zhongwei, China
- Xi-Jun Wang
- Jiaxiang Animal Husbandry and Veterinary Development Center, Jining, China
- Shan-Gang Jia
- College of Grassland Science and Technology, China Agricultural University, Beijing, China.
- Zhi-Hong Liu
- College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China.
- Meng-Hua Li
- Frontiers Science Center for Molecular Design Breeding (MOE); State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China.
|
48
|
Iyer SV, Goodwin S, McCombie WR. Leveraging the power of long reads for targeted sequencing. Genome Res 2024; 34:1701-1718. [PMID: 39567237 PMCID: PMC11610587 DOI: 10.1101/gr.279168.124] [Received: 03/15/2024] [Accepted: 10/01/2024] [Indexed: 11/22/2024]
Abstract
Long-read sequencing technologies have improved the contiguity and, as a result, the quality of genome assemblies by generating reads long enough to span and resolve complex or repetitive regions of the genome. Several groups have shown the power of long reads in detecting thousands of genomic and epigenomic features that were previously missed by short-read sequencing approaches. While these studies demonstrate how long reads can help resolve repetitive and complex regions of the genome, they also highlight the throughput and coverage requirements needed to accurately resolve variant alleles across large populations using these platforms. At the time of this review, whole-genome long-read sequencing is more expensive than short-read sequencing on the highest throughput short-read instruments; thus, achieving sufficient coverage to detect low-frequency variants (such as somatic variation) in heterogeneous samples remains challenging. Targeted sequencing, on the other hand, provides the depth necessary to detect these low-frequency variants in heterogeneous populations. Here, we review currently used and recently developed targeted sequencing strategies that leverage existing long-read technologies to increase the resolution with which we can look at nucleic acids in a variety of biological contexts.
Affiliation(s)
- Shruti V Iyer
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- Sara Goodwin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
49
|
Auwerx C, Kutalik Z, Reymond A. The pleiotropic spectrum of proximal 16p11.2 CNVs. Am J Hum Genet 2024; 111:2309-2346. [PMID: 39332410 PMCID: PMC11568765 DOI: 10.1016/j.ajhg.2024.08.015] [Received: 03/28/2024] [Revised: 08/18/2024] [Accepted: 08/21/2024] [Indexed: 09/29/2024]
Abstract
Recurrent genomic rearrangements at 16p11.2 BP4-5 represent one of the most common causes of genomic disorders. Originally associated with increased risk for autism spectrum disorder, schizophrenia, and intellectual disability, as well as adiposity and head circumference, these CNVs have since been associated with a plethora of phenotypic alterations, albeit with high variability in expressivity and incomplete penetrance. Here, we comprehensively review the pleiotropy associated with 16p11.2 BP4-5 rearrangements to shine light on its full phenotypic spectrum. Illustrating this phenotypic heterogeneity, we expose many parallels between findings gathered from clinical versus population-based cohorts, which often point to the same physiological systems, and emphasize the role of the CNV beyond neuropsychiatric and anthropometric traits. Revealing the complex and variable clinical manifestations of this CNV is crucial for accurate diagnosis and personalized treatment strategies for carrier individuals. Furthermore, we discuss areas of research that will be key to identifying factors contributing to phenotypic heterogeneity and gaining mechanistic insights into the molecular pathways underlying observed associations, while demonstrating how diversity in affected individuals, cohorts, experimental models, and analytical approaches can catalyze discoveries.
Affiliation(s)
- Chiara Auwerx
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, Lausanne, Switzerland
- Zoltán Kutalik
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, Lausanne, Switzerland
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland.
|
50
|
Kumara Mastrorosa F, Oshima KK, Rozanski AN, Harvey WT, Eichler EE, Logsdon GA. Identification and annotation of centromeric hypomethylated regions with Centromere Dip Region (CDR)-Finder. bioRxiv 2024:2024.11.01.621587. [PMID: 39574726 PMCID: PMC11580854 DOI: 10.1101/2024.11.01.621587] [Indexed: 11/30/2024]
Abstract
Centromeres are chromosomal regions historically understudied with sequencing technologies due to their repetitive nature and short-read mapping limitations. However, recent improvements in long-read sequencing allowed for the investigation of complex regions of the genome at the sequence and epigenetic levels. Here, we present Centromere Dip Region (CDR)-Finder: a tool to identify regions of hypomethylation within the centromeres of high-quality, contiguous genome assemblies. These regions are typically associated with a unique type of chromatin containing the histone H3 variant CENP-A, which marks the location of the kinetochore. CDR-Finder identifies the CDRs in large and short centromeres and generates a BED file indicating the location of the CDRs within the centromere. It also outputs a plot for visualization, validation, and downstream analysis. CDR-Finder is available at https://github.com/EichlerLab/CDR-Finder.
Affiliation(s)
- F. Kumara Mastrorosa
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Keisuke K. Oshima
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Allison N. Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- William T. Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Present address: Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|