1
Werren EA, Kalsner L, Ewald J, Peracchio M, King C, Vats P, Audano PA, Robinson PN, Adams MD, Kelly MA, Matson AP. Phenotypic Expansion of Knobloch Syndrome Type 2 in an Individual With a De Novo PAK2 Variant. Am J Med Genet A 2025; 197:e64006. [PMID: 39876536 PMCID: PMC12052494 DOI: 10.1002/ajmg.a.64006] [Received: 10/31/2024] [Revised: 01/08/2025] [Accepted: 01/16/2025] [Indexed: 01/30/2025]
Abstract
P21-activated kinase 2 (PAK2) is a serine/threonine kinase essential for a variety of cellular processes including signal transduction, cellular survival, proliferation, and migration. A recent report proposed that monoallelic PAK2 variants cause Knobloch syndrome type 2 (KNO2), a developmental disorder primarily characterized by ocular anomalies. Here, we identified a novel de novo heterozygous missense variant in PAK2, NM_002577.4:c.1273G>A, p.(D425N), by genome sequencing in an individual with features consistent with KNO2. Notable clinical phenotypes observed in this individual were global developmental delay, congenital retinal detachment, mild cerebral ventriculomegaly, hypotonia, failure to thrive, pyloric stenosis, feeding intolerance, patent ductus arteriosus, and mild facial dysmorphism. The p.(D425N) variant lies within the protein kinase domain and is predicted to be functionally damaging by in silico analysis. Previous clinical genetic testing did not report this variant because the relevance of PAK2 variants was unknown at the time of testing, highlighting the importance of reanalysis. Our findings substantiate the candidacy of PAK2 variants in KNO2 and expand the KNO2 clinical phenotypic spectrum.
Affiliation(s)
- Elizabeth A. Werren
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, 06032, USA
- Louisa Kalsner
  - Department of Pediatrics, University of Connecticut School of Medicine, Farmington, CT 06030, USA
  - Division of Genetics, Connecticut Children’s Medical Center, Hartford, CT 06106, USA
  - Division of Neurology, Connecticut Children’s Medical Center, Hartford, CT 06106, USA
- Jessica Ewald
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, 06032, USA
- Michael Peracchio
  - Division of Genetics, Connecticut Children’s Medical Center, Hartford, CT 06106, USA
- Cameron King
  - Department of Research, Connecticut Children’s Medical Center, Hartford, CT 06106, USA
- Purva Vats
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, 06032, USA
- Peter A. Audano
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, 06032, USA
- Peter N. Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, 06032, USA
- Mark D. Adams
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, 06032, USA
- Melissa A. Kelly
  - The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, 06032, USA
- Adam P. Matson
  - Department of Pediatrics, University of Connecticut School of Medicine, Farmington, CT 06030, USA
  - Division of Neonatology, Connecticut Children’s Medical Center, Hartford, CT 06106, USA
  - Department of Immunology, UConn Health, Farmington, Connecticut, 06030, USA
2
Luo J, Luo C, Han M, Wang Q, Song Z, Zhang H, Gao Q, Lin T, Huang C, Zhao Y, Ma C. A natural variation of flavone synthase II gene enhances flavone accumulation and confers drought adaptation in chrysanthemum. THE NEW PHYTOLOGIST 2025. [PMID: 40448392 DOI: 10.1111/nph.70255] [Received: 01/06/2025] [Accepted: 05/06/2025] [Indexed: 06/02/2025]
Abstract
Flavones, a key group of flavonoids, play a significant role in plant adaptation to ecological niches and are valuable medicinal resources. However, the genetic basis underlying their contribution to ecological adaptation remains largely unknown. Here, using a metabolite-based genome-wide association study, we report that the natural variation of flavone content in Chrysanthemum indicum, a wild chrysanthemum and medicinal herb, is mainly determined by a recently duplicated flavone synthase II gene, CiFNSII-1.2. Enzymatic assays and molecular dynamics simulations reveal that the key amino acid residues at positions 246 and 261 confer the higher enzymatic activity of CiFNSII-1.2 compared with its ancestral form. These residues act as critical modulators, regulating the flexibility of the external entrance and contributing to the enzyme's improved functionality. Transgenic evaluation demonstrates that CiFNSII-1.2 contributes to flavone accumulation and drought adaptation. Our findings provide insights into the biochemical and evolutionary role of flavones in facilitating adaptation to drought-prone habitats in chrysanthemum.
Affiliation(s)
- Jiayi Luo
  - Department of Ornamental Horticulture, College of Horticulture, Frontiers Science Center for Molecular Design Breeding (MOE), China Agricultural University, Beijing, 100193, China
- Chang Luo
  - Institute of Grassland, Flowers and Ecology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100092, China
- Mingzheng Han
  - Department of Ornamental Horticulture, College of Horticulture, Frontiers Science Center for Molecular Design Breeding (MOE), China Agricultural University, Beijing, 100193, China
- Qinrui Wang
  - DP Technology, No. 3 Zhongguancun Street, Beijing, 100089, China
- Zhenzhen Song
  - Department of Ornamental Horticulture, College of Horticulture, Frontiers Science Center for Molecular Design Breeding (MOE), China Agricultural University, Beijing, 100193, China
- Haixia Zhang
  - Department of Ornamental Horticulture, College of Horticulture, Frontiers Science Center for Molecular Design Breeding (MOE), China Agricultural University, Beijing, 100193, China
- Qiang Gao
  - Qi Biodesign, No. 9 South penglaiyuan Street, Beijing, 102209, China
- Tao Lin
  - State Key Laboratory of Agrobiotechnology, Beijing Key Laboratory of Growth and Developmental Regulation for Protected Vegetable Crops, College of Horticulture, China Agricultural University, Beijing, 100193, China
- Conglin Huang
  - Institute of Grassland, Flowers and Ecology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100092, China
- Yafei Zhao
  - Department of Ornamental Horticulture, College of Horticulture, Frontiers Science Center for Molecular Design Breeding (MOE), China Agricultural University, Beijing, 100193, China
- Chao Ma
  - Department of Ornamental Horticulture, College of Horticulture, Frontiers Science Center for Molecular Design Breeding (MOE), China Agricultural University, Beijing, 100193, China
3
Liu Q, Tian W. Association of human-specific expanded short tandem repeats with neuron-specific regulatory features. SCIENCE ADVANCES 2025; 11:eadp9707. [PMID: 40446031 PMCID: PMC12124357 DOI: 10.1126/sciadv.adp9707] [Received: 04/21/2024] [Accepted: 04/24/2025] [Indexed: 06/02/2025]
Abstract
Short tandem repeats (STRs), characterized by high-copy number mutations, represent one of the fastest-evolving genomic elements. However, human-specific expanded STRs (heSTRs) have lacked comprehensive genome-wide characterization. Leveraging 148 human and 26 nonhuman primate haploid genomes, we identified 8813 heSTRs with robust expansions in copy number distributions. Our analysis revealed notable associations between heSTRs and brain- and neuron-specific distal regulatory signals. Potential target genes regulated by heSTRs, identified by incorporating distal regulations, are enriched with neuronal development-related functions and disorders, displaying neuron-specific expression enhancement in humans. Moreover, heSTRs are associated with enhanced chromatin accessibility specifically in human neurons. In addition, heSTRs show substantial association with pathogenic STR loci exhibiting abnormal copy number variations, as reported by cohort studies on schizophrenia and autism. This study underscores the role of heSTRs in both human evolution and disorders, offering valuable insights for future research on STRs from an evolutionary perspective.
Affiliation(s)
- Qiming Liu
  - State Key Laboratory of Genetics and Development of Complex Phenotypes, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
- Weidong Tian
  - State Key Laboratory of Genetics and Development of Complex Phenotypes, Department of Computational Biology, School of Life Sciences, Fudan University, Shanghai, China
  - Children’s Hospital of Fudan University, Shanghai, China
  - Children’s Hospital of Shandong University, Jinan, China
4
Chen F, Zhang Y, Li W, Sedlazeck FJ, Shen L, Creighton CJ. Global DNA methylation differences involving germline structural variation impact gene expression in pediatric brain tumors. Nat Commun 2025; 16:4713. [PMID: 40399292 PMCID: PMC12095544 DOI: 10.1038/s41467-025-60110-y] [Received: 11/18/2024] [Accepted: 05/13/2025] [Indexed: 05/23/2025]
Abstract
The extent of genetic variation and its influence on gene expression across multiple tissue and cellular contexts is still being characterized, with germline Structural Variants (SVs) being historically understudied. DNA methylation also represents a component of normal germline variation across individuals. Here, we combine germline SVs (by short-read sequencing) with tumor DNA methylation across 1292 pediatric brain tumor patients. For thousands of methylation probes for CpG Islands (CGIs) or enhancers, rare and common SV breakpoints upstream or downstream associate with differential methylation in tumors spanning various histologic types, a significant subset involving genes with SV-associated differential expression. Cancer predisposition genes involving SV-associated differential methylation and expression include MSH2, RSPA, and PALB2. SV breakpoints falling within CGIs or histone marks H3K36me3 or H3K9me3 associate with differential CGI methylation. Genes with SVs and CGI methylation associated with patient survival include POLD4. Our results capture a class of normal phenotypic variation having disease implications.
Affiliation(s)
- Fengju Chen
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Yiqun Zhang
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Wei Li
  - Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA, 92697, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Lanlan Shen
  - USDA Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Chad J Creighton
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Medicine, Baylor College of Medicine, Houston, TX, 77030, USA
5
Li C, Ge M, Long K, Han Z, Li J, Li M, Zhang Z. Parental Phasing Study Identified Lineage-Specific Variants Associated with Gene Expression and Epigenetic Modifications in European-Chinese Hybrid Pigs. Animals (Basel) 2025; 15:1494. [PMID: 40427370 PMCID: PMC12108307 DOI: 10.3390/ani15101494] [Received: 04/17/2025] [Revised: 05/09/2025] [Accepted: 05/14/2025] [Indexed: 05/29/2025]
Abstract
Understanding how hybrids integrate lineage-specific regulatory variants at the haplotype level is crucial for elucidating the genetic basis of heterosis in livestock. In this study, we established three crossbred pig families derived from distant genetic lineages and systematically identified variants from different lineages, including single nucleotide polymorphisms (SNPs) and structural variations (SVs). At the phase level, we quantitatively analyzed gene expression, four histone modifications (H3K4me3, H3K27ac, H3K4me1, and H3K27me3), and the binding strength of transcription factor (CTCF) in backfat (BF) and longissimus dorsi (LD) muscle. By colocalization analysis of phased genetic variants with phased gene expression levels and with phased epigenetic modifications, we identified 18,670 expression quantitative trait loci (eQTL) (FDR < 0.05) and 8,652 epigenetic modification quantitative trait loci (epiQTL) (FDR < 0.05). The integration of eQTL and epiQTL allowed us to explore the potential regulatory mechanisms by which lineage-specific genetic variants simultaneously influence gene expression and epigenetic modifications. For example, we identified a Large White lineage-specific duplication (DUP) encompassing the KIT gene that was significantly associated with its promoter activity (FDR = 7.83 × 10⁻⁴) and expression levels (FDR = 9.03 × 10⁻⁴). Additionally, we found that a Duroc lineage-specific SNP located upstream of AMIGO2 was significantly associated with a Duroc-specific H3K27ac peak (FDR = 0.035) and also showed a significant association with AMIGO2 expression levels (FDR = 5.12 × 10⁻⁴). These findings underscore the importance of phased regulatory variants in shaping lineage-specific transcriptional programs and highlight how the haplotype-resolved integration of eQTL and epigenetic signals can reveal the mechanistic underpinnings of hybrid regulatory architecture. Our results offer insights for molecular marker development in precision pig breeding.
Affiliation(s)
- Chenyu Li
  - National Key Laboratory for Swine Genetic Improvement and Germplasm innovation Technology, Jiangxi Agricultural University, Nanchang 330045, China
- Mei Ge
  - National Key Laboratory for Swine Genetic Improvement and Germplasm innovation Technology, Jiangxi Agricultural University, Nanchang 330045, China
- Keren Long
  - State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Ziyin Han
  - State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Jing Li
  - State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Mingzhou Li
  - State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
- Zhiyan Zhang
  - National Key Laboratory for Swine Genetic Improvement and Germplasm innovation Technology, Jiangxi Agricultural University, Nanchang 330045, China
6
Amudaria S, Jawhar SJ. MIMI-ONET: Multi-Modal image augmentation via Butterfly Optimized neural network for Huntington Disease Detection. Brain Res 2025; 1855:149530. [PMID: 40010625 DOI: 10.1016/j.brainres.2025.149530] [Received: 12/23/2024] [Revised: 02/18/2025] [Accepted: 02/22/2025] [Indexed: 02/28/2025]
Abstract
Huntington's disease (HD) is a chronic neurodegenerative disorder that causes cognitive decline, motor impairment, and psychiatric symptoms. However, existing HD detection methods struggle with limited annotated datasets, which restricts their generalization performance. This work proposes a novel MIMI-ONET for primary detection of HD using augmented multi-modal brain MRI images. The two-dimensional stationary wavelet transform (2DSWT) decomposes the MRI images into wavelet sub-bands of different frequencies. These sub-bands are enhanced with Contrast Stretching Adaptive Histogram Equalization (CSAHE) and Multi-scale Adaptive Retinex (MSAR), which reduce irrelevant distortions. The proposed MIMI-ONET introduces a Hepta Generative Adversarial Network (Hepta-GAN) to generate different noise-free HD images based on seven azimuth angles (45°, 90°, 135°, 180°, 225°, 270°, 315°). Hepta-GAN incorporates an Affine Estimation Module (AEM) that extracts multi-scale features using dilated convolutional layers for efficient HD image generation. Moreover, Hepta-GAN is normalized with the Butterfly Optimization (BO) algorithm, which enhances augmentation performance by balancing the parameters. Finally, the generated images are passed to a deep neural network (DNN) for classification into normal control (NC), Adult-Onset HD (AHD), and Juvenile HD (JHD) cases. The proposed MIMI-ONET is evaluated with precision, specificity, F1 score, recall, accuracy, PSNR, and MSE. In the experiments, the proposed MIMI-ONET attains an accuracy of 98.85% and a PSNR value of 48.05 on the gathered Image-HD dataset. It improves overall accuracy by 9.96%, 1.85%, 5.91%, 13.80%, and 13.5% over the 3DCNN, KNN, FCN, RNN, and ML frameworks, respectively.
Affiliation(s)
- S Amudaria
  - Department of Computer Science and Engineering, Arunachala College of Engineering for Women, Manavilai, Nagercoil, Tamil Nadu, India
- S Joseph Jawhar
  - Department of Electrical and Electronics Engineering, Arunachala College of Engineering for Women, Manavilai, Nagercoil, Tamil Nadu, India
7
He Y, Zhang X, Peng MS, Li YC, Liu K, Zhang Y, Mao L, Guo Y, Ma Y, Zhou B, Zheng W, Yue T, Liao Y, Liang SA, Chen L, Zhang W, Chen X, Tang B, Yang X, Ye K, Gao S, Lu Y, Wang Y, Wan S, Hao R, Wang X, Mao Y, Dai S, Gao Z, Yang LQ, Guo J, Li J, Liu C, Wang J, Sovannary T, Bunnath L, Kampuansai J, Inta A, Srikummool M, Kutanan W, Ho HQ, Pham KD, Singthong S, Sochampa S, Kyaing UW, Pongamornkul W, Morlaeku C, Rattanakrajangsri K, Kong QP, Zhang YP, Su B. Genome diversity and signatures of natural selection in mainland Southeast Asia. Nature 2025:10.1038/s41586-025-08998-w. [PMID: 40369069 DOI: 10.1038/s41586-025-08998-w] [Received: 02/14/2024] [Accepted: 04/09/2025] [Indexed: 05/16/2025]
Abstract
Mainland Southeast Asia (MSEA) has rich ethnic and cultural diversity with a population of nearly 300 million [1,2]. However, people from MSEA are underrepresented in the current human genomic databases. Here we present the SEA3K genome dataset (phase I), generated by deep short-read whole-genome sequencing of 3,023 individuals from 30 MSEA populations, and long-read whole-genome sequencing of 37 representative individuals. We identified 79.59 million small variants and 96,384 structural variants, among which 22.83 million small variants and 24,622 structural variants are unique to this dataset. We observed a high genetic heterogeneity across MSEA populations, reflected by the varied combinations of genetic components. We identified 44 genomic regions with strong signatures of Darwinian positive selection, covering 89 genes involved in varied physiological systems such as physical traits and immune response. Furthermore, we observed varied patterns of archaic Denisovan introgression in MSEA populations, supporting the proposal of at least two distinct instances of Denisovan admixture into modern humans in Asia [3]. We also detected genomic regions that suggest adaptive archaic introgressions in MSEA populations. The large number of novel genomic variants in MSEA populations highlights the necessity of studying regional populations that can help answer key questions related to prehistory, genetic adaptation and complex diseases.
Affiliation(s)
- Yaoxi He
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
- Xiaoming Zhang
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
- Min-Sheng Peng
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yu-Chun Li
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming, China
  - Kunming Key Laboratory of Healthy Aging Study, Kunming, China
- Kai Liu
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yu Zhang
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Leyan Mao
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yongbo Guo
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yujie Ma
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Bin Zhou
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Wangshan Zheng
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Tian Yue
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yuwen Liao
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Shen-Ao Liang
  - State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, China
- Lu Chen
  - State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, China
- Weijie Zhang
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Xiaoning Chen
  - National Genomics Data Center, China National Center for Bioinformation, Beijing, China
- Bixia Tang
  - National Genomics Data Center, China National Center for Bioinformation, Beijing, China
- Xiaofei Yang
  - School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
  - Center for Mathematical Medical, the First Affiliated Hospital, Xi'an Jiaotong University, Xi'an, China
- Kai Ye
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
  - Center for Mathematical Medical, the First Affiliated Hospital, Xi'an Jiaotong University, Xi'an, China
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
  - Genome Institute, the First Affiliated Hospital, Xi'an Jiaotong University, Xi'an, China
  - School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China
  - Faculty of Science, Leiden University, Leiden, The Netherlands
- Shenghan Gao
  - School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Yurun Lu
  - CEMS, NCMIS, HCMS, MADIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Yong Wang
  - CEMS, NCMIS, HCMS, MADIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Shijie Wan
  - School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Rushan Hao
  - School of Medicine, Yunnan University, Kunming, China
- Xuankai Wang
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Yafei Mao
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
  - Center for Genomic Research, International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Shanshan Dai
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Zongliang Gao
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Kunming Key Laboratory of Healthy Aging Study, Kunming, China
- Li-Qin Yang
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
  - Kunming Key Laboratory of Healthy Aging Study, Kunming, China
- Jianxin Guo
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Jiangguo Li
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
- Chao Liu
  - Laboratory Animal Center, Kunming Institute of Zoology, the Chinese Academy of Sciences, Kunming, China
  - National Resource Center for Non-Human Primates, Kunming, China
- Jianhua Wang
  - Department of Anthropology, School of Sociology, Yunnan Minzu University, Kunming, China
- Tuot Sovannary
  - Department of Geography and Land Management, Royal University of Phnom Penh, Phnom Penh, Cambodia
- Long Bunnath
  - Department of Geography and Land Management, Royal University of Phnom Penh, Phnom Penh, Cambodia
- Jatupol Kampuansai
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
- Angkhana Inta
  - Department of Biology, Faculty of Science, Chiang Mai University, Chiang Mai, Thailand
- Metawee Srikummool
  - Department of Biochemistry, Faculty of Medical Science, Naresuan University, Phitsanulok, Thailand
- Wibhu Kutanan
  - Department of Biology, Faculty of Science, Naresuan University, Phitsanulok, Thailand
- Huy Quang Ho
  - Department of Immunology, Ha Noi Medical University, Ha Noi, Vietnam
- Khoa Dang Pham
  - Department of Immunology, Ha Noi Medical University, Ha Noi, Vietnam
- U Win Kyaing
  - Field School of Archaeology, Paukkhaung, Myanmar
- Wittaya Pongamornkul
  - Queen Sirikit Botanic Garden (QSBG), The Botanical Garden Organization, Chiang Mai, Thailand
- Chutima Morlaeku
  - Inter Mountain Peoples Education and Culture in Thailand Association (IMPECT), Sansai, Thailand
- Qing-Peng Kong
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming, China
  - Kunming Key Laboratory of Healthy Aging Study, Kunming, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
- Ya-Ping Zhang
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - KIZ/CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming, China
  - University of Chinese Academy of Sciences, Beijing, China
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, School of Life Sciences, Yunnan University, Kunming, China
- Bing Su
  - State Key Laboratory of Genetic Evolution and Animal Models, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
8
Lakbir S, de Wit R, de Bruijn I, Kundra R, Madupuri R, Gao J, Schultz N, Meijer GA, Heringa J, Fijneman RJA, Abeln S. Tumor break load quantitates structural variant-associated genomic instability with biological and clinical relevance across cancers. NPJ Precis Oncol 2025; 9:140. [PMID: 40369102 PMCID: PMC12078582 DOI: 10.1038/s41698-025-00922-9] [Received: 12/04/2024] [Accepted: 04/24/2025] [Indexed: 05/16/2025]
Abstract
While structural variants (SVs) are a clear sign of genomic instability, they have not been systematically quantified per patient since declining costs have only recently enabled large-scale profiling. Therefore, the biological and clinical impact of high numbers of SVs in patients is unknown. We introduce tumor break load (TBL), defined as the sum of unbalanced SVs, as a measure for SV-associated genomic instability. Using pan-cancer data from TCGA, PCAWG, and CCLE, we show that a high TBL is associated with significant changes in gene expression in 26/31 cancer types that consistently involve upregulation of DNA damage repair and downregulation of immune response pathways. Patients with a high TBL show a higher risk of recurrence and shorter median survival times for 5/15 cancer types. Our data demonstrate that TBL is a biologically and clinically relevant feature of genomic instability that may aid patient prognostication and treatment stratification. For the datasets analyzed in this study, TBL has been made available in cBioPortal.
Affiliation(s)
- Soufyan Lakbir
- Bioinformatics Section, Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Translational Gastrointestinal Oncology Group, Department of Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- AI Technology for Life Group, Department of Information and Computing Science; Department of Biology, Utrecht University, Utrecht, The Netherlands
| | - Renske de Wit
- Translational Gastrointestinal Oncology Group, Department of Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
- AI Technology for Life Group, Department of Information and Computing Science; Department of Biology, Utrecht University, Utrecht, The Netherlands
| | - Ino de Bruijn
- Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Ritika Kundra
- Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Jianjiong Gao
- Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Gerrit A Meijer
- Translational Gastrointestinal Oncology Group, Department of Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Jaap Heringa
- Bioinformatics Section, Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
| | - Remond J A Fijneman
- Translational Gastrointestinal Oncology Group, Department of Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands.
| | - Sanne Abeln
- Bioinformatics Section, Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.
- AI Technology for Life Group, Department of Information and Computing Science; Department of Biology, Utrecht University, Utrecht, The Netherlands.
| |
|
9
|
Liu S, Shi C, Chen C, Tan Y, Tian Y, Macqueen DJ, Li Q. Haplotype-resolved genomes provide insights into the origins and functional significance of genome diversity in bivalves. Cell Rep 2025; 44:115697. [PMID: 40349337 DOI: 10.1016/j.celrep.2025.115697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2024] [Revised: 03/20/2025] [Accepted: 04/23/2025] [Indexed: 05/14/2025] Open
Abstract
Bivalves are famed for exhibiting vast genetic diversity of poorly understood origins and functional significance. Through comparative genomics, we demonstrate that high genetic diversity in these invertebrates is not directly linked to genome size. Using oysters as a representative clade, we show that despite genome size reduction during evolution, these bivalves maintain remarkable genetic variability. By constructing a haplotype-resolved genome for Crassostrea sikamea, we identify widespread haplotype divergent sequences (HDSs), representing genomic regions unique to each haplotype. We show that HDSs are driven by transposable elements, playing a key role in creating and maintaining genetic diversity during oyster evolution. Comparisons of haplotype-resolved genomes across four bivalve orders uncover diverse HDS origins, highlighting a role in genetic innovation and expression regulation across broad timescales. Further analyses show that, in oysters, haplotype polymorphisms drive gene expression variation, which is likely to promote phenotypic plasticity and adaptation. These findings advance our understanding of the relationships among genome structure, diversity, and adaptability in a highly successful invertebrate group.
Affiliation(s)
- Shikai Liu
- Key Laboratory of Mariculture (Ocean University of China), Ministry of Education, and College of Fisheries, Ocean University of China, Qingdao, China; Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao Marine Science and Technology Center, Qingdao, Shandong, China.
| | - Chenyu Shi
- Key Laboratory of Mariculture (Ocean University of China), Ministry of Education, and College of Fisheries, Ocean University of China, Qingdao, China
| | - Chenguang Chen
- Key Laboratory of Mariculture (Ocean University of China), Ministry of Education, and College of Fisheries, Ocean University of China, Qingdao, China
| | - Ying Tan
- Key Laboratory of Mariculture (Ocean University of China), Ministry of Education, and College of Fisheries, Ocean University of China, Qingdao, China
| | - Yuan Tian
- Key Laboratory of Mariculture (Ocean University of China), Ministry of Education, and College of Fisheries, Ocean University of China, Qingdao, China
| | - Daniel J Macqueen
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
| | - Qi Li
- Key Laboratory of Mariculture (Ocean University of China), Ministry of Education, and College of Fisheries, Ocean University of China, Qingdao, China; Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao Marine Science and Technology Center, Qingdao, Shandong, China.
| |
|
10
|
Eisfeldt J, Ek M, Nordenskjöld M, Lindstrand A. Toward clinical long-read genome sequencing for rare diseases. Nat Genet 2025:10.1038/s41588-025-02160-y. [PMID: 40335760 DOI: 10.1038/s41588-025-02160-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2024] [Accepted: 03/11/2025] [Indexed: 05/09/2025]
Abstract
Genetic diagnostics is driven by technological advances, forming a tight interface between research, clinic and industry, which enables rapid implementation of new technologies. Short-read genome and exome sequencing, the current state of the art in clinical genetics, can detect a broad spectrum of genetic variants across the genome. However, despite these advancements, more than half of individuals with rare diseases remain undiagnosed after genomic investigations. Long-read whole-genome sequencing (LR-WGS) is a promising technology that identifies previously difficult-to-detect variants while also enabling phasing and methylation analysis and has the potential of generating complete personal assemblies. To pave the way for clinical use of LR-WGS, the clinical genomic community must establish standardized protocols and quality parameters while also developing innovative tools for data analysis and interpretation. In this Perspective, we explore the key challenges and benefits in integrating LR-WGS into routine clinical diagnostics.
Affiliation(s)
- Jesper Eisfeldt
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Department of Clinical Genetics and Genomics, Karolinska University Hospital, Stockholm, Sweden
- Science for Life Laboratory, Karolinska Institutet Science Park, Solna, Sweden
| | - Marlene Ek
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Department of Clinical Genetics and Genomics, Karolinska University Hospital, Stockholm, Sweden
| | - Magnus Nordenskjöld
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
- Department of Clinical Genetics and Genomics, Karolinska University Hospital, Stockholm, Sweden
| | - Anna Lindstrand
- Department of Molecular Medicine and Surgery and Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden.
- Department of Clinical Genetics and Genomics, Karolinska University Hospital, Stockholm, Sweden.
| |
|
11
|
Ghorbani M, Moosa S, Siddig Z, Farhad R, Naeem H, Harvey WT, Mastrorosa FK, Munson KM, Mohamad Razali R, Aliyev E, Diboun I, Abouelhassan R, Tauro M, Hassan S, Mathew R, Al Hashmi M, Mathew LS, Wang K, Salhab AR, Vempalli FR, El Khouly A, Alazwani I, Tomei S, Fakhro KA, Satti A, Benini R, Rhie A, Eichler EE, Mokrab Y. Near-complete Middle Eastern genomes refine autozygosity and enhance disease-causing and population-specific variant discovery. Nat Genet 2025; 57:1119-1131. [PMID: 40325133 DOI: 10.1038/s41588-025-02173-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Accepted: 03/18/2025] [Indexed: 05/07/2025]
Abstract
Advances in long-read sequencing have enabled routine complete assembly of human genomes, but much remains to be done to represent broader populations and show impact on disease-gene discovery. Here, we report highly accurate, near-complete and phased genomes from six Middle Eastern (ME) family trios (n = 18) with neurodevelopmental conditions, representing ancestries from Sudan, Jordan, Syria, Qatar and Afghanistan. These genomes revealed 42.2 Mb of new sequence (13.8% impacting known genes), 75 new HLA/KIR alleles and strong signals of inbreeding, with ROH covering up to one-third of chromosomes 6 and 12 in one individual. Using assembly-based variant calling, we identified 23 de novo and recessive variants as strong candidates for causing previously unresolved symptoms in the probands. The ME genomes revealed unique variation relative to existing references, showing enhanced mappability and variant calling. These results underscore the value of de novo assembly for disease variant discovery and the need for sampled ME-specific references to better characterize population-relevant variation.
Affiliation(s)
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Rozaimi Mohamad Razali
- Department of Biomedical Science, College of Health Sciences, Qatar University, Doha, Qatar
| | - Khalid A Fakhro
- Sidra Medicine, Doha, Qatar
- Department of Genetic Medicine, Weill Cornell Medicine, Doha, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
| | - Ruba Benini
- Sidra Medicine, Doha, Qatar
- Department of Genetic Medicine, Weill Cornell Medicine, Doha, Qatar
| | - Arang Rhie
- National Human Genome Research Institute, Bethesda, MD, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Younes Mokrab
- Sidra Medicine, Doha, Qatar.
- Department of Biomedical Science, College of Health Sciences, Qatar University, Doha, Qatar.
- Department of Genetic Medicine, Weill Cornell Medicine, Doha, Qatar.
| |
|
12
|
Cuenca-Guardiola J, de la Morena-Barrio B, Corral J, Fernández-Breis JT. Advanced analysis of retrotransposon variation in the human genome with nanopore sequencing using RetroInspector. Sci Rep 2025; 15:14489. [PMID: 40281075 PMCID: PMC12032414 DOI: 10.1038/s41598-025-98847-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Accepted: 04/15/2025] [Indexed: 04/29/2025] Open
Abstract
Transposable elements (TEs) make up 45% of the human genome, are a source of genetic variability that is difficult to detect, and are involved in processes related to gene regulation and disease. Nanopore sequencing is recognized as one of the best technologies for detecting TEs; however, tools for analyzing human TE insertions and deletions with nanopore-based data can be improved. RetroInspector is an easy-to-use, configurable Snakemake pipeline that performs detection, annotation, enrichment, and genotyping of TEs. RetroInspector requires the FASTQ files of the samples and the reference genome to start the identification and analysis of TEs. The user can also set the threshold for the number of supporting reads for the variant filtering. RetroInspector also allows users to compare the results of two samples. Different versions of the reference genome can be used and the presence of retrotransposition features can be annotated. RetroInspector has been run on three nanopore sequencing datasets and validated experimentally using proprietary and public data with over 80% precision.
Affiliation(s)
- Javier Cuenca-Guardiola
- Departamento de Informática y Sistemas, IMIB-Pascual Parrilla, CEIR Campus Mare Nostrum, Universidad de Murcia, 30100, Murcia, Spain
| | - Belén de la Morena-Barrio
- Servicio de Hematología, CIBERER-ISCIII, IMIB-Pascual Parrilla, Centro Regional de Hemodonación, Hospital Universitario Morales Meseguer, Universidad de Murcia, 30003, Murcia, Spain
| | - Javier Corral
- Servicio de Hematología, CIBERER-ISCIII, IMIB-Pascual Parrilla, Centro Regional de Hemodonación, Hospital Universitario Morales Meseguer, Universidad de Murcia, 30003, Murcia, Spain
| | - Jesualdo Tomás Fernández-Breis
- Departamento de Informática y Sistemas, IMIB-Pascual Parrilla, CEIR Campus Mare Nostrum, Universidad de Murcia, 30100, Murcia, Spain.
| |
|
13
|
Porubsky D, Dashnow H, Sasani TA, Logsdon GA, Hallast P, Noyes MD, Kronenberg ZN, Mokveld T, Koundinya N, Nolan C, Steely CJ, Guarracino A, Dolzhenko E, Harvey WT, Rowell WJ, Grigorev K, Nicholas TJ, Goldberg ME, Oshima KK, Lin J, Ebert P, Watkins WS, Leung TY, Hanlon VCT, McGee S, Pedersen BS, Happ HC, Jeong H, Munson KM, Hoekzema K, Chan DD, Wang Y, Knuth J, Garcia GH, Fanslow C, Lambert C, Lee C, Smith JD, Levy S, Mason CE, Garrison E, Lansdorp PM, Neklason DW, Jorde LB, Quinlan AR, Eberle MA, Eichler EE. Human de novo mutation rates from a four-generation pedigree reference. Nature 2025:10.1038/s41586-025-08922-2. [PMID: 40269156 DOI: 10.1038/s41586-025-08922-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2024] [Accepted: 03/20/2025] [Indexed: 04/25/2025]
Abstract
Understanding the human de novo mutation (DNM) rate requires complete sequence information. Here using five complementary short-read and long-read sequencing technologies, we phased and assembled more than 95% of each diploid human genome in a four-generation, twenty-eight-member family (CEPH 1463). We estimate 98-206 DNMs per transmission, including 74.5 de novo single-nucleotide variants, 7.4 non-tandem repeat indels, 65.3 de novo indels or structural variants originating from tandem repeats, and 4.4 centromeric DNMs. Among male individuals, we find 12.4 de novo Y chromosome events per generation. Short tandem repeats and variable-number tandem repeats are the most mutable, with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 16% of de novo single-nucleotide variants are postzygotic in origin with no paternal bias, including early germline mosaic mutations. We place all this variation in the context of a high-resolution recombination map (~3.4 kb breakpoint resolution) and find no correlation between meiotic crossover and de novo structural variants. These near-telomere-to-telomere familial genomes provide a truth set to understand the most fundamental processes underlying human genetic variation.
Affiliation(s)
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Harriet Dashnow
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Thomas A Sasani
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Michelle D Noyes
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Nidhi Koundinya
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Cody J Steely
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Department of Internal Medicine, University of Kentucky College of Medicine, Lexington, KY, USA
| | - Andrea Guarracino
- Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Kirill Grigorev
- Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
- Blue Marble Space Institute of Science, Seattle, WA, USA
| | - Thomas J Nicholas
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Michael E Goldberg
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Keisuke K Oshima
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Jiadong Lin
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter Ebert
- Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
| | - W Scott Watkins
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Tiffany Y Leung
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Vincent C T Hanlon
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Sean McGee
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Brent S Pedersen
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Hannah C Happ
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Altos Labs, San Diego, CA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Daniel D Chan
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Yanni Wang
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Jordan Knuth
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Gage H Garcia
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | - Joshua D Smith
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
| | - Erik Garrison
- Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Peter M Lansdorp
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Deborah W Neklason
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Lynn B Jorde
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Aaron R Quinlan
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
| |
|
14
|
Choppavarapu L, Fang K, Liu T, Ohihoin AG, Jin VX. Hi-C profiling in tissues reveals 3D chromatin-regulated breast tumor heterogeneity informing a looping-mediated therapeutic avenue. Cell Rep 2025; 44:115450. [PMID: 40112000 PMCID: PMC12103084 DOI: 10.1016/j.celrep.2025.115450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2024] [Revised: 01/12/2025] [Accepted: 02/28/2025] [Indexed: 03/22/2025] Open
Abstract
The limitations of Hi-C (high-throughput chromosome conformation capture) profiling in in vitro cell culture include failing to recapitulate disease-specific physiological properties and lacking a clinically relevant disease microenvironment. In this study, we conduct Hi-C profiling in a pilot cohort of 12 breast tissues comprising two normal tissues, five ER+ breast primary tumors, and five tamoxifen-treated recurrent tumors. We demonstrate 3D chromatin-regulated breast tumor heterogeneity and identify a looping-mediated target gene, CA2, which might play a role in driving tamoxifen resistance. The inhibition of CA2 impedes tumor growth both in vitro and in vivo and reverses chromatin looping. The disruption of CA2 looping reduces tamoxifen-resistant cancer cell proliferation, decreases CA2 mRNA and protein expression, and weakens the looping interaction. Our study thus provides mechanistic and functional insights into the role of 3D chromatin architecture in regulating breast tumor heterogeneity and informs a new looping-mediated therapeutic avenue for treating breast cancer.
Affiliation(s)
- Lavanya Choppavarapu
- Division of Biostatistics, Data Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA; MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA; Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Kun Fang
- Division of Biostatistics, Data Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA; MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA; Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Tianxiang Liu
- Division of Biostatistics, Data Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA; MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA; Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Aigbe G Ohihoin
- Cell and Developmental Biology PhD program, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Victor X Jin
- Division of Biostatistics, Data Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA; MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA; Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI 53226, USA.
| |
|
15
|
Gao Y, Yang L, Kuhn K, Li W, Zanton G, Bowman M, Zhao P, Zhou Y, Fang L, Cole JB, Rosen BD, Ma L, Li C, Baldwin RL, Van Tassell CP, Zhang Z, Smith TPL, Liu GE. Long read and preliminary pangenome analyses reveal breed-specific structural variations and novel sequences in Holstein and Jersey cattle. J Adv Res 2025:S2090-1232(25)00258-9. [PMID: 40258473 DOI: 10.1016/j.jare.2025.04.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2024] [Revised: 04/06/2025] [Accepted: 04/10/2025] [Indexed: 04/23/2025] Open
Abstract
INTRODUCTION: Most SV studies in livestock rely on short-read sequencing, posing challenges in accurately characterizing large genomic variants due to their limited read length. OBJECTIVES: Our goal is to reveal structural variation and novel sequences specific to Holstein and Jersey cattle breeds using long-read and pan-genome analyses. METHODS: We sequenced 20 Holsteins and 8 Jersey cattle using PacBio HiFi to 20×, and integrated five read-based and one assembly-based SV caller to determine SVs. RESULTS: We assembled the 28 genomes averaging 3.25 Gb with a contig N50 of 69.36 Mb and using the ARS-UCD1.2 reference, we acquired Holstein/Jersey SV catalogs with 74,068/54,689 events spanning 202/135 Mb (7.43%/4.97% of the genome). SVs were enriched in less conserved, non-coding, and non-regulatory regions. Comparing Holsteins with differing feed efficiency (FE), SVs unique to high FE were linked to energy metabolism and olfactory receptors, while those specific to low FE were associated with material transport. We constructed Holstein/Jersey pangenome graphs with 148,598/105,875 nodes and 208,891/147,990 edges, representing 47,028/37,137 biallelic and multi-allelic events, and 63.75/42.34 Mb of novel sequence. We observed SV count saturation with 20 Holsteins, while adding Jerseys significantly increased the SV count, highlighting breed-specific SV events. CONCLUSION: Our long-read data and SV catalogs are valuable resources, revealing that the cattle genome is more complex than previously thought.
Affiliation(s)
- Yahui Gao
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China; Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
| | - Liu Yang
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
| | - Kristen Kuhn
- USDA, ARS, U.S. Meat Animal Research Center (USMARC), Clay Center, NE, USA.
| | - Wenli Li
- US Dairy Forage Research Center, USDA-ARS, Madison, WI, USA.
| | - Geoffrey Zanton
- US Dairy Forage Research Center, USDA-ARS, Madison, WI, USA.
| | - Mary Bowman
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
| | - Pengju Zhao
- Hainan Institute, Zhejiang University, Yongyou Industry Park, Yazhou Bay Sci-Tech City, Sanya 572000, China.
| | - Yang Zhou
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan 430070, China.
| | - Lingzhao Fang
- Quantitative Genetics and Genomics (QGG), Aarhus University, Aarhus, Denmark.
| | - John B Cole
- Council on Dairy Cattle Breeding, 4201 Northview Dr, Bowie, MD 20716, USA; Department of Animal Sciences, Donald Henry Barron Reproductive and Perinatal Biology Research Program, and the Genetics Institute, University of Florida, Gainesville, FL 32611-0910, USA; Department of Animal Science, North Carolina State University, Raleigh, NC 27695-7621, USA.
| | - Benjamin D Rosen
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
| | - Li Ma
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA.
| | - Congjun Li
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
| | - Ransom L Baldwin
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
| | - Curtis P Van Tassell
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
| | - Zhe Zhang
- State Key Laboratory of Swine and Poultry Breeding Industry, National Engineering Research Center for Breeding Swine Industry, Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China.
| | - Timothy P L Smith
- USDA, ARS, U.S. Meat Animal Research Center (USMARC), Clay Center, NE, USA.
| | - George E Liu
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD 20705, USA.
| |
|
16
|
Gompert Z, Feder JL, Parchman TL, Planidin NP, Whiting FJH, Nosil P. Adaptation repeatedly uses complex structural genomic variation. Science 2025; 388:eadp3745. [PMID: 40245138 DOI: 10.1126/science.adp3745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 09/30/2024] [Accepted: 02/04/2025] [Indexed: 04/19/2025]
Abstract
Structural elements are widespread across genomes, but their complexity and role in repeatedly driving local adaptation remain unclear. In this work, we use phased genome assemblies to show that adaptive divergence in cryptic color pattern in a stick insect is repeatedly underlain by structural variation, but not a simple chromosomal inversion. We found that color pattern in populations of stick insects on two mountains is associated with translocations that have also been inverted. These translocations differ in size and origin on each mountain, but they overlap partially and involve some of the same gene regions. Moreover, this structural variation is subject to divergent selection and arose without introgression between species. Our results show how the origin of structural variation provides a mechanism for repeated bouts of adaptation.
Affiliation(s)
| | - Jeffrey L Feder
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA
| | - Patrik Nosil
- CEFE, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
| |
|
17
|
Wu X, Li Y, Li P, Lu G, Wu J, Wang Z, Wen Q, Cui B, Wang J, Zhang F. Structural Variations in Ulcerative Colitis-associated Escherichia coli Reduce Fructose Utilization and Aggravate Inflammation Under High-Fructose Diet. Gastroenterology 2025:S0016-5085(25)00635-3. [PMID: 40250773 DOI: 10.1053/j.gastro.2025.03.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 02/16/2025] [Accepted: 03/09/2025] [Indexed: 04/20/2025]
Abstract
BACKGROUND AND AIMS: Structural variations (SVs) have significant effects on microbial phenotypes. The underlying mechanism of functional changes caused by gut microbial SVs in the development of ulcerative colitis (UC) needs further investigation. METHODS: We performed long-read (Oxford Nanopore Technology-based) and short-read (Illumina-based) metagenomic sequencing on stool samples from 93 patients with UC and 100 healthy controls (HCs) and analyzed microbial SVs. A total of 648 Escherichia coli strains from fecal samples of patients with UC (UC-strains) and HCs (HC-strains) were isolated. SV-associated scrK gene deletion was verified via whole-genome sequencing or targeted polymerase chain reaction. Then, representative UC-strains, HC-strains, and scrK-knockout E coli were used for the in vitro and in vivo experiments to investigate the effects of specific SVs in E coli on fructose utilization ability and colitis. RESULTS: E coli in UC with the highest fold change had SV-affected functional differences in fructose metabolism compared with that of HCs. The fructose utilization gene deletion was common in UC-strains, ostensibly reducing fructose utilization in vitro and leading to fructose-dependent aggravation of colitis in murine models. UC-strains and HC-strains induced comparable colitis under low fructose. However, high fructose exacerbated colitis severity exclusively in UC-strain-colonized mice, with elevated intestinal fructose residues, significant microbiome/metabolome changes, increased inflammation, and gut barrier disruption. These changes were mechanistically dependent on the deletion of the fructose utilization gene scrK. CONCLUSIONS: SV-caused differences in fructose utilization and proinflammatory properties in E coli from patients with UC influence the development of UC, emphasizing the importance of fine-scale metagenomic studies in disease.
Affiliation(s)
- Xia Wu
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yuejuan Li
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Pan Li
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Gaochen Lu
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Jianyu Wu
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Zheyu Wang
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Quan Wen
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Bota Cui
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Jun Wang
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Faming Zhang
- Department of Microbiota Medicine & Medical Center for Digestive Diseases, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China

18
Li Q, Keskus AG, Wagner J, Izydorczyk MB, Timp W, Sedlazeck FJ, Klein AP, Zook JM, Kolmogorov M, Schatz MC. Unraveling the hidden complexity of cancer through long-read sequencing. Genome Res 2025; 35:599-620. [PMID: 40113261 PMCID: PMC12047254 DOI: 10.1101/gr.280041.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2025]
Abstract
Cancer is fundamentally a disease of the genome, characterized by extensive genomic, transcriptomic, and epigenomic alterations. Most current studies predominantly use short-read sequencing, gene panels, or microarrays to explore these alterations; however, these technologies can systematically miss or misrepresent certain types of alterations, especially structural variants, complex rearrangements, and alterations within repetitive regions. Long-read sequencing is rapidly emerging as a transformative technology for cancer research by providing a comprehensive view across the genome, transcriptome, and epigenome, including the ability to detect alterations that previous technologies have overlooked. In this Perspective, we explore the current applications of long-read sequencing for both germline and somatic cancer analysis. We provide an overview of the computational methodologies tailored to long-read data and highlight key discoveries and resources within cancer genomics that were previously inaccessible with prior technologies. We also address future opportunities and persistent challenges, including the experimental and computational requirements needed to scale to larger sample sizes, the hurdles in sequencing and analyzing complex cancer genomes, and opportunities for leveraging machine learning and artificial intelligence technologies for cancer informatics. We further discuss how the telomere-to-telomere genome and the emerging human pangenome could enhance the resolution of cancer genome analysis, potentially revolutionizing early detection and disease monitoring in patients. Finally, we outline strategies for transitioning long-read sequencing from research applications to routine clinical practice.
Affiliation(s)
- Qiuhui Li
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Ayse G Keskus
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA
- Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA
- Michal B Izydorczyk
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Winston Timp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Texas 77030, USA
- Department of Computer Science, Rice University, Houston, Texas 77251, USA
- Alison P Klein
- Sidney Kimmel Comprehensive Cancer Center, Department of Oncology, Johns Hopkins Medicine, Baltimore, Maryland 21031, USA
- Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA
- Mikhail Kolmogorov
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland 20892, USA
- Michael C Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Sidney Kimmel Comprehensive Cancer Center, Department of Oncology, Johns Hopkins Medicine, Baltimore, Maryland 21031, USA

19
Mahmoud M, Agustinho DP, Sedlazeck FJ. A Hitchhiker's Guide to long-read genomic analysis. Genome Res 2025; 35:545-558. [PMID: 40228901 PMCID: PMC12047252 DOI: 10.1101/gr.279975.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2025]
Abstract
Over the past decade, long-read sequencing has evolved into a pivotal technology for uncovering the hidden and complex regions of the genome. Significant cost efficiency, scalability, and accuracy advancements have driven this evolution. Concurrently, novel analytical methods have emerged to harness the full potential of long reads. These advancements have enabled milestones such as the first fully completed human genome, enhanced identification and understanding of complex genomic variants, and deeper insights into the interplay between epigenetics and genomic variation. This mini-review provides a comprehensive overview of the latest developments in long-read DNA sequencing analysis, encompassing reference-based and de novo assembly approaches. We explore the entire workflow, from initial data processing to variant calling and annotation, focusing on how these methods improve our ability to interpret a wide array of genomic variants. Additionally, we discuss the current challenges, limitations, and future directions in the field, offering a detailed examination of the state-of-the-art bioinformatics methods for long-read sequencing.
Affiliation(s)
- Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Daniel P Agustinho
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
- Department of Computer Science, Rice University, Houston, Texas 77005, USA

20
Rausch T, Marschall T, Korbel JO. The impact of long-read sequencing on human population-scale genomics. Genome Res 2025; 35:593-598. [PMID: 40228902 PMCID: PMC12047236 DOI: 10.1101/gr.280120.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2025]
Abstract
Long-read sequencing technologies, particularly those from Pacific Biosciences and Oxford Nanopore Technologies, are revolutionizing genome research by providing high-resolution insights into complex and repetitive regions of the human genome that were previously inaccessible. These advances have been particularly enabling for the comprehensive detection of genomic structural variants (SVs), which is critical for linking genotype to phenotype in population-scale and rare disease studies, as well as in cancer. Recent developments in sequencing throughput and computational methods, such as pangenome graphs and haplotype-resolved assemblies, are paving the way for the future inclusion of long-read sequencing in clinical cohort studies and disease diagnostics. DNA methylation signals directly obtained from long reads enhance the utility of single-molecule long-read sequencing technologies by enabling molecular phenotypes to be interpreted, and by allowing the identification of the parent of origin of de novo mutations. Despite this recent progress, challenges remain in scaling long-read technologies to large populations due to cost, computational complexity, and the lack of tools to facilitate the efficient interpretation of SVs in graphs. This perspective provides a succinct review on the current state of long-read sequencing in genomics by highlighting its transformative potential and key hurdles, and emphasizing future opportunities for advancing the understanding of human genetic diversity and diseases through population-scale long-read analysis.
Affiliation(s)
- Tobias Rausch
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, 69117 Heidelberg, Germany
- Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University, 40225 Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, 40225 Düsseldorf, Germany
- Jan O Korbel
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, 69117 Heidelberg, Germany

21
Zeng T, Liao H, Xia L, You S, Huang Y, Zhang J, Liu Y, Liu X, Xie D. Multisite long-read sequencing reveals the early contributions of somatic structural variations to HBV-related hepatocellular carcinoma tumorigenesis. Genome Res 2025; 35:671-685. [PMID: 40037842 PMCID: PMC12047258 DOI: 10.1101/gr.279617.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Accepted: 01/30/2025] [Indexed: 03/06/2025]
Abstract
Somatic structural variations (SVs) represent a critical category of genomic mutations in hepatocellular carcinoma (HCC). However, the accurate identification of somatic SVs using short-read high-throughput sequencing is challenging. Here, we applied long-read nanopore sequencing and multisite sampling in a cohort of 42 samples from five patients. We found that adjacent nontumor tissue is not entirely normal, as significant somatic SV alterations were detected in these nontumor genomes. The adjacent nontumor tissue is highly similar to tumor tissue in terms of somatic SVs but differs in somatic single-nucleotide variants and copy number variations. The types of SVs in adjacent nontumor and tumor tissue are markedly different, with somatic insertions and deletions identified as early genomic events associated with HCC. Notably, hepatitis B virus (HBV) DNA integration frequently results in the generation of somatic SVs, particularly inducing interchromosomal translocations (TRAs). Although HBV DNA integration into the liver genome occurs randomly, multisite shared HBV-induced SVs are early driving events in the pathogenesis of HCC. Long-read RNA sequencing reveals that some HBV-induced SVs impact cancer-associated genes, with TRAs being capable of inducing the formation of fusion genes. These findings enhance our understanding of somatic SVs in HCC and their role in early tumorigenesis.
Affiliation(s)
- Tianfu Zeng
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Haotian Liao
- Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy and Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Lin Xia
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Siyao You
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yanqun Huang
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Jiaxun Zhang
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yahui Liu
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Xuyan Liu
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Dan Xie
- Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China

22
Del Gobbo GF, Boycott KM. The additional diagnostic yield of long-read sequencing in undiagnosed rare diseases. Genome Res 2025; 35:559-571. [PMID: 39900460 PMCID: PMC12047273 DOI: 10.1101/gr.279970.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2025]
Abstract
Long-read sequencing (LRS) is a promising technology positioned to study the significant proportion of rare diseases (RDs) that remain undiagnosed as it addresses many of the limitations of short-read sequencing, detecting and clarifying additional disease-associated variants that may be missed by the current standard diagnostic workflow for RDs. Some key areas where additional diagnostic yields may be realized include: (1) detection and resolution of structural variants (SVs); (2) detection and characterization of tandem repeat expansions; (3) coverage of regions of high sequence similarity; (4) variant phasing; (5) the use of de novo genome assemblies for reference-based or graph genome variant detection; and (6) epigenetic and transcriptomic evaluations. Examples from over 50 studies support that the main areas of added diagnostic yield currently lie in SV detection and characterization, repeat expansion assessment, and phasing (with or without DNA methylation information). Several emerging studies applying LRS in cohorts of undiagnosed RDs also demonstrate that LRS can boost diagnostic yields following negative standard-of-care clinical testing and provide an added yield of 7%-17% following negative short-read genome sequencing. With this evidence of improved diagnostic yield, we discuss the incorporation of LRS into the diagnostic care pathway for undiagnosed RDs, including current challenges and considerations, with the ultimate goal of ending the diagnostic odyssey for countless individuals with RDs.
Affiliation(s)
- Giulia F Del Gobbo
- Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Ontario, Canada K1H 5B2
- Kym M Boycott
- Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Ontario, Canada K1H 5B2
- Department of Genetics, Children's Hospital of Eastern Ontario, Ottawa, Ontario, Canada K1H 8L1

23
Fernandez-Luna L, Aguilar-Perez C, Grochowski CM, Mehaffey MG, Carvalho CMB, Gonzaga-Jauregui C. Genome-wide maps of highly-similar intrachromosomal repeats that can mediate ectopic recombination in three human genome assemblies. HGG ADVANCES 2025; 6:100396. [PMID: 39722459 PMCID: PMC11794170 DOI: 10.1016/j.xhgg.2024.100396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 12/23/2024] [Accepted: 12/23/2024] [Indexed: 12/28/2024] Open
Abstract
Repeated sequences spread throughout the genome play important roles in shaping the structure of chromosomes and facilitating the generation of new genomic variation through structural rearrangements. Several mechanisms of structural variation formation use shared nucleotide similarity between repeated sequences as substrate for ectopic recombination. We performed genome-wide analyses of direct and inverted intrachromosomal repeated sequence pairs with 200 bp or more and 80% or greater sequence identity in three human genome assemblies, GRCh37, GRCh38, and T2T-CHM13. Overall, the composition and distribution of direct and inverted repeated sequences identified was similar among the three assemblies involving 13%-15% of the haploid genome, with an increased, albeit not significant, number of repeated sequences in T2T-CHM13. Interestingly, the majority of repeated sequences are below 1 kb in length with a median of 84.2% identity, highlighting the potential relevance of smaller, less identical repeats, such as Alu-Alu pairs, for ectopic recombination. We cross-referenced the identified repeated sequences with protein-coding genes to identify those at risk for being involved in genomic rearrangements. Olfactory receptors and immune response genes were enriched among those impacted.
Affiliation(s)
- Luis Fernandez-Luna
- International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
- Carlos Aguilar-Perez
- International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
- Claudia Gonzaga-Jauregui
- International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México; Pacific Northwest Research Institute, Seattle, WA, USA

24
Xu IRL, Danzi MC, Raposo J, Züchner S. The continued promise of genomic technologies and software in neurogenetics. J Neuromuscul Dis 2025:22143602251325345. [PMID: 40208247 DOI: 10.1177/22143602251325345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2025]
Abstract
The continued evolution of genomic technologies over the past few decades has revolutionized the field of neurogenetics, offering profound insights into the genetic underpinnings of neurological disorders. Identification of causal genes for numerous monogenic neurological conditions has informed key aspects of disease mechanisms and facilitated research into critical proteins and molecular pathways, laying the groundwork for therapeutic interventions. However, the question remains: has this transformative trend reached its zenith? In this review, we suggest that despite significant strides in genome sequencing and advanced computational analyses, there is still ample room for methodological refinement. We anticipate further major genetic breakthroughs corresponding with the increased use of long-read genomes, variant calling software, AI tools, and data aggregation databases. Genetic progress has historically been driven by technological advancements from the commercial sector, which are developed in response to academic research needs, creating a continuous cycle of innovation and discovery. This review explores the potential of genomic technologies to address the challenges of neurogenetic disorders. By outlining both established and modern resources, we aim to emphasize the importance of genetic technologies as we enter an era poised for discoveries.
Affiliation(s)
- Isaac R L Xu
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Matt C Danzi
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Jacquelyn Raposo
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Stephan Züchner
- Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA

25
Mortazavi M, Batalov S, Lenberg J, Blucher C, Omorodion A, Helbling D, Van Der Kraan L, Bezares-Orin Z, Ramalingam A, Bainbridge MN, Sebat J, Besterman AD. Long-Read Genome Sequencing in Clinical Psychiatry: RFX3 Haploinsufficiency in a Hospitalized Adolescent With Autism, Intellectual Disability, and Behavioral Decompensation. Am J Psychiatry 2025:appiajp20240471. [PMID: 40200712 DOI: 10.1176/appi.ajp.20240471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/10/2025]
Affiliation(s)
- Milad Mortazavi
- Sergey Batalov
- Jerica Lenberg
- Corrine Blucher
- Aisha Omorodion
- Daniel Helbling
- Lucita Van Der Kraan
- Zaira Bezares-Orin
- Arivudainambi Ramalingam
- Matthew N Bainbridge
- Jonathan Sebat
- Aaron D Besterman
- Department of Psychiatry (Mortazavi, Omorodion, Sebat, Besterman), Department of Cellular and Molecular Medicine and Pediatrics (Sebat), and Institute for Genomic Medicine (Sebat), University of California San Diego, La Jolla; Rady Children's Institute for Genomic Medicine, San Diego (Batalov, Lenberg, Blucher, Helbling, Van Der Kraan, Bezares-Orin, Ramalingam, Bainbridge, Besterman); Rady Children's Hospital San Diego, San Diego (Omorodion, Besterman); Codified Genomics, Houston (Bainbridge)

26
Zhang S, Xu N, Fu L, Yang X, Ma K, Li Y, Yang Z, Li Z, Feng Y, Jiang X, Han J, Hu R, Zhang L, Lian D, de Gennaro L, Paparella A, Ryabov F, Meng D, He Y, Wu D, Yang C, Mao Y, Bian X, Lu Y, Antonacci F, Ventura M, Shepelev VA, Miga KH, Alexandrov IA, Logsdon GA, Phillippy AM, Su B, Zhang G, Eichler EE, Lu Q, Shi Y, Sun Q, Mao Y. Integrated analysis of the complete sequence of a macaque genome. Nature 2025; 640:714-721. [PMID: 40011769 PMCID: PMC12003069 DOI: 10.1038/s41586-025-08596-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Accepted: 01/03/2025] [Indexed: 02/28/2025]
Abstract
The crab-eating macaques (Macaca fascicularis) and rhesus macaques (Macaca mulatta) are pivotal in biomedical and evolutionary research [1-3]. However, their genomic complexity and interspecies genetic differences remain unclear [4]. Here, we present a complete genome assembly of a crab-eating macaque, revealing 46% fewer segmental duplications and 3.83 times longer centromeres than those of humans [5,6]. We also characterize 93 large-scale genomic differences between macaques and humans at a single-base-pair resolution, highlighting their impact on gene regulation in primate evolution. Using ten long-read macaque genomes, hundreds of short-read macaque genomes and full-length transcriptome data, we identified roughly 2 Mbp of fixed-genetic variants, roughly 240 Mbp of complex loci, 16.76 Mbp genetic differentiation regions and 110 alternative splice events, potentially associated with various phenotypic differences between the two macaque species. In summary, the integrated genetic analysis enhances understanding of lineage-specific phenotypes, adaptation and primate evolution, thereby improving their biomedical applications in human disease research.
Affiliation(s)
- Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
| | - Ning Xu
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Lianting Fu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
| | - Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Kaiyue Ma
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yamei Li
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Zikun Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Zhengtong Li
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yu Feng
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
| | - Xinrui Jiang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Junmin Han
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Ruixing Hu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Lu Zhang
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Da Lian
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Luciana de Gennaro
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Annalisa Paparella
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
| | - Dan Meng
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yaoxi He
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
| | - Dongya Wu
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- School of Medicine, Zhejiang University, Hangzhou, China
| | - Chentao Yang
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Yuxiang Mao
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xinyan Bian
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Yong Lu
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Francesca Antonacci
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Mario Ventura
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Valery A Shepelev
- Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Karen H Miga
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Ivan A Alexandrov
- Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Glennis A Logsdon
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Adam M Phillippy
- Center for Genomics and Data Science Research, Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Bing Su
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Guojie Zhang
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- School of Medicine, Zhejiang University, Hangzhou, China
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Qing Lu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
| | - Qiang Sun
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China.
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China.
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
- University of Chinese Academy of Sciences, Beijing, China.
| | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China.
- Shanghai Key Laboratory of Embryo Original Diseases, International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
| |
|
27
|
Nachtigall PG, Nystrom GS, Broussard EM, Wray KP, Junqueira-de-Azevedo ILM, Parkinson CL, Margres MJ, Rokyta DR. A Segregating Structural Variant Defines Novel Venom Phenotypes in the Eastern Diamondback Rattlesnake. Mol Biol Evol 2025; 42:msaf058. [PMID: 40101100 PMCID: PMC11965796 DOI: 10.1093/molbev/msaf058] [Received: 11/04/2024] [Revised: 03/03/2025] [Accepted: 03/06/2025] [Indexed: 03/20/2025] Open
Abstract
Of all mutational mechanisms contributing to phenotypic variation, structural variants are both among the most capable of causing major effects as well as the most technically challenging to identify. Intraspecific variation in snake venoms is widely reported, and one of the most dramatic patterns described is the parallel evolution of streamlined neurotoxic rattlesnake venoms from hemorrhagic ancestors by means of deletion of snake venom metalloproteinase (SVMP) toxins and recruitment of neurotoxic dimeric phospholipase A2 (PLA2) toxins. While generating a haplotype-resolved, chromosome-level genome assembly for the eastern diamondback rattlesnake (Crotalus adamanteus), we discovered that our genome animal was heterozygous for a ∼225 Kb deletion containing six SVMP genes, paralleling one of the two steps involved in the origin of neurotoxic rattlesnake venoms. Range-wide population-genomic analysis revealed that, although this deletion is rare overall, it is the dominant homozygous genotype near the northwestern periphery of the species' range, where this species is vulnerable to extirpation. Although major SVMP deletions have been described in at least five other rattlesnake species, C. adamanteus is unique in not additionally gaining neurotoxic PLA2s. Previous work established a superficially complementary north-south gradient in myotoxin (MYO) expression based on copy number variation with high expression in the north and low in the south, yet we found that the SVMP and MYO genotypes vary independently, giving rise to an array of diverse, novel venom phenotypes across the range. Structural variation, therefore, forms the basis for the major axes of geographic venom variation for C. adamanteus.
Affiliation(s)
- Pedro G Nachtigall
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
- Laboratório de Toxinologia Aplicada, CeTICS, Instituto Butantan, São Paulo, SP, Brazil
| | - Gunnar S Nystrom
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
| | - Emilie M Broussard
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
| | - Kenneth P Wray
- Biodiversity Center, University of Texas at Austin, Austin, TX, USA
| | | | | | - Mark J Margres
- Department of Integrative Biology, University of South Florida, Tampa, FL, USA
| | - Darin R Rokyta
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
| |
|
28
|
Moya ND, Yan SM, McCoy RC, Andersen EC. The long and short of hyperdivergent regions. Trends Genet 2025; 41:303-314. [PMID: 39706705 PMCID: PMC11981857 DOI: 10.1016/j.tig.2024.11.005] [Received: 07/09/2024] [Revised: 11/11/2024] [Accepted: 11/14/2024] [Indexed: 12/23/2024]
Abstract
The increasing prevalence of genome sequencing and assembly has uncovered evidence of hyperdivergent genomic regions - loci with excess genetic diversity - in species across the tree of life. Hyperdivergent regions are often enriched for genes that mediate environmental responses, such as immunity, parasitism, and sensory perception. Especially in self-fertilizing species where the majority of the genome is homozygous, the existence of hyperdivergent regions might imply the historical action of evolutionary forces such as introgression and/or balancing selection. We anticipate that the application of new sequencing technologies, broader taxonomic sampling, and evolutionary modeling of hyperdivergent regions will provide insights into the mechanisms that generate and maintain genetic diversity within and between species.
Affiliation(s)
- Nicolas D Moya
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Stephanie M Yan
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
| | - Erik C Andersen
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
| |
|
29
|
Farstad-O’Halloran K, Sooda A, Iqbal T, Wilton S, Aung-Htut MT. Discovery of Novel APOC3 Isoforms in Hepatic and Intestinal Cell Models Using Long-Read RNA Sequencing. Genes (Basel) 2025; 16:412. [PMID: 40282372 PMCID: PMC12027394 DOI: 10.3390/genes16040412] [Received: 03/14/2025] [Revised: 03/28/2025] [Accepted: 03/29/2025] [Indexed: 04/29/2025] Open
Abstract
BACKGROUND Apolipoprotein C-III (APOC3) plays a crucial role in triglyceride metabolism and is closely associated with cardiovascular disease risk. Elevated APOC3 levels contribute to higher plasma triglycerides and increased risk of atherosclerosis, making APOC3 expression an attractive and logical therapeutic target. METHODS While studying various APOC3 transcript isoforms expressed in hepatoma cell lines (HepG2, Huh7) and healthy liver tissue using publicly available long-read RNA sequencing, we found three novel APOC3 isoforms. These isoforms were validated through RT-PCR and Sanger sequencing. RESULTS All three novel isoforms are splicing variants of the MANE transcript, APOC3-201. Isoforms 1 and 2 exhibit splicing patterns similar to APOC3-201 from exons 2-4; however, isoform 1 shares its exon 1 splicing pattern with APOC3-203, while isoform 2 features an extended exon 1 that includes exon 1a, the adjacent intronic region, and exon 1b. The third isoform closely resembles APOC3-201, but lacks exon 2, which contains the translation start codon. Remarkably, similar APOC3 splicing patterns and transcript variants were observed in Caco-2 cells, a model of the small intestine, indicating that these isoforms are not liver-specific. CONCLUSIONS This study identifies three novel APOC3 isoforms and highlights their expression in both hepatic and intestinal cell models. Further studies are needed to elucidate the functional roles of these novel isoforms and their contribution to the regulation of APOC3 gene expression.
Affiliation(s)
- Kara Farstad-O’Halloran
- Personalised Medicine Centre, Health Futures Institute, Murdoch University, Murdoch, Perth, WA 6150, Australia
| | - Anuradha Sooda
- Personalised Medicine Centre, Health Futures Institute, Murdoch University, Murdoch, Perth, WA 6150, Australia
| | - Tooba Iqbal
- Personalised Medicine Centre, Health Futures Institute, Murdoch University, Murdoch, Perth, WA 6150, Australia
| | - Steve Wilton
- Personalised Medicine Centre, Health Futures Institute, Murdoch University, Murdoch, Perth, WA 6150, Australia
- Perron Institute for Neurological and Translational Science, The University of Western Australia, Nedlands, Perth, WA 6009, Australia
| | - May T. Aung-Htut
- Personalised Medicine Centre, Health Futures Institute, Murdoch University, Murdoch, Perth, WA 6150, Australia
- Perron Institute for Neurological and Translational Science, The University of Western Australia, Nedlands, Perth, WA 6009, Australia
| |
|
30
|
Saunders CT, Holt JM, Baker DN, Lake JA, Belyeu JR, Kronenberg Z, Rowell WJ, Eberle MA. Sawfish: improving long-read structural variant discovery and genotyping with local haplotype modeling. Bioinformatics 2025; 41:btaf136. [PMID: 40203061 PMCID: PMC12000528 DOI: 10.1093/bioinformatics/btaf136] [Received: 08/21/2024] [Revised: 02/20/2025] [Accepted: 04/07/2025] [Indexed: 04/11/2025] Open
Abstract
MOTIVATION Structural variants (SVs) play an important role in evolutionary and functional genomics but are challenging to characterize. High-accuracy, long-read sequencing can substantially improve SV characterization when coupled with effective calling methods. While state-of-the-art long-read SV callers are highly accurate, further improvements are achievable by systematically modeling local haplotypes during SV discovery and genotyping. RESULTS We describe sawfish, an SV caller for mapped high-quality long reads incorporating systematic SV haplotype modeling to improve accuracy and resolution. Assessment against the draft Genome in a Bottle (GIAB) SV benchmark from the T2T-HG002-Q100 diploid assembly shows that sawfish has the highest accuracy among state-of-the-art long-read SV callers across every tested SV size group. Additionally, sawfish maintains the highest accuracy at every tested depth level from 10- to 32-fold coverage, such that other callers required at least 30-fold coverage to match sawfish accuracy at 15-fold coverage. Sawfish also shows the highest accuracy in the GIAB challenging medically relevant genes benchmark, demonstrating improvements in both comprehensive and medically relevant contexts. When joint-genotyping seven samples from CEPH-1463, sawfish has over 9000 more pedigree-concordant calls than other state-of-the-art SV callers, with the highest proportion of concordant SVs (81%). Sawfish's quality model enables selection for an even higher proportion of concordant SVs (88%), while still calling nearly 5000 more pedigree-concordant SVs than other callers. These results demonstrate that sawfish improves on the state-of-the-art for long-read SV calling accuracy across both individual and joint-sample analyses. AVAILABILITY AND IMPLEMENTATION Sawfish source code, pre-compiled Linux binaries, and documentation are released on GitHub: https://github.com/PacificBiosciences/sawfish.
Affiliation(s)
| | - James M Holt
- Computational Biology, PacBio, Menlo Park, CA 94025, United States
| | - Daniel N Baker
- Computational Biology, PacBio, Menlo Park, CA 94025, United States
| | - Juniper A Lake
- Computational Biology, PacBio, Menlo Park, CA 94025, United States
| | | | - Zev Kronenberg
- Computational Biology, PacBio, Menlo Park, CA 94025, United States
| | - William J Rowell
- Computational Biology, PacBio, Menlo Park, CA 94025, United States
| | - Michael A Eberle
- Computational Biology, PacBio, Menlo Park, CA 94025, United States
| |
|
31
|
Rao J, Luo H, An D, Liang X, Peng L, Chen F. Performance evaluation of structural variation detection using DNBSEQ whole-genome sequencing. BMC Genomics 2025; 26:299. [PMID: 40133825 PMCID: PMC11938577 DOI: 10.1186/s12864-025-11494-0] [Received: 12/30/2024] [Accepted: 03/17/2025] [Indexed: 03/27/2025] Open
Abstract
BACKGROUND DNBSEQ platforms have been widely used for variation detection, including single-nucleotide variants (SNVs) and short insertions and deletions (INDELs), with performance comparable to that of Illumina platforms. However, the performance and characteristics of structural variation (SV) detection using DNBSEQ platforms are still unclear. RESULTS In this study, we assessed the detection of SVs using 40 tools on eight DNBSEQ whole-genome sequencing (WGS) datasets and two Illumina WGS datasets of NA12878. Our findings confirmed that the performance of SV detection using the same tool on DNBSEQ and Illumina datasets was highly consistent, with correlations greater than 0.80 on metrics of number, size, precision and sensitivity, respectively. Furthermore, we constructed a "DNBSEQ" SV set (4,785 SVs) from the DNBSEQ datasets and an "Illumina" SV set (6,797 SVs) from the Illumina datasets. We found that these two SV sets were highly consistent in SV sites and genomic characteristics, including repetitive regions, GC distribution, difficult-to-sequence regions, and gene features, indicating the robustness of our comparative analysis and highlighting the value of both platforms in understanding the genomic context of SVs. CONCLUSIONS Our study systematically analyzed and characterized germline SVs detected on WGS datasets sequenced from DNBSEQ platforms, providing a benchmark resource for further studies of SVs using DNBSEQ platforms.
Affiliation(s)
- Junhua Rao
- MGI Tech, Shenzhen, 518083, China
- BGI, Shenzhen, 518083, China
| | | | - Dan An
- MGI Tech, Shenzhen, 518083, China
- BGI, Shenzhen, 518083, China
| | - Xinming Liang
- MGI Tech, Shenzhen, 518083, China
- BGI, Shenzhen, 518083, China
| | | | - Fang Chen
- MGI Tech, Shenzhen, 518083, China.
- BGI, Shenzhen, 518083, China.
| |
|
32
|
Yang Q, Sun J, Wang X, Wang J, Liu Q, Ru J, Zhang X, Wang S, Hao R, Bian P, Dai X, Gong M, Zhang Z, Wang A, Bai F, Li R, Cai Y, Jiang Y. SVLearn: a dual-reference machine learning approach enables accurate cross-species genotyping of structural variants. Nat Commun 2025; 16:2406. [PMID: 40069188 PMCID: PMC11897243 DOI: 10.1038/s41467-025-57756-z] [Received: 09/03/2024] [Accepted: 03/04/2025] [Indexed: 03/15/2025] Open
Abstract
Structural variations (SVs) are diverse forms of genetic alterations and drive a wide range of human diseases. Accurately genotyping SVs, particularly occurring at repetitive genomic regions, from short-read sequencing data remains challenging. Here, we introduce SVLearn, a machine-learning approach for genotyping bi-allelic SVs. It exploits a dual-reference strategy to engineer a curated set of genomic, alignment, and genotyping features based on a reference genome in concert with an allele-based alternative genome. Using 38,613 human-derived SVs, we show that SVLearn significantly outperforms four state-of-the-art tools, with precision improvements of up to 15.61% for insertions and 13.75% for deletions in repetitive regions. On two additional sets of 121,435 cattle SVs and 113,042 sheep SVs, SVLearn demonstrates a strong generalizability to cross-species genotype SVs with a weighted genotype concordance score of up to 90%. Notably, SVLearn enables accurate genotyping of SVs at low sequencing coverage, which is comparable to the accuracy at 30× coverage. Our studies suggest that SVLearn can accelerate the understanding of associations between the genome-scale, high-quality genotyped SVs and diseases across multiple species.
Affiliation(s)
- Qimeng Yang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
| | - Xinyu Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Jiong Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Quanzhong Liu
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Jinlong Ru
- Institute of Virology, Helmholtz Centre Munich - German Research Centre for Environmental Health, Neuherberg, Germany
| | - Xin Zhang
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Sizhe Wang
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Ran Hao
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
| | - Peipei Bian
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Xuelei Dai
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- Yazhouwan National Laboratory, Sanya, Hainan, China
| | - Mian Gong
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
- State Key Laboratory of Animal Biotech Breeding, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China
| | - Zhuangbiao Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Ao Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Fengting Bai
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Ran Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Yudong Cai
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China.
| | - Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China.
| |
|
33
|
English AC, Dolzhenko E, Ziaei Jam H, McKenzie SK, Olson ND, De Coster W, Park J, Gu B, Wagner J, Eberle MA, Gymrek M, Chaisson MJP, Zook JM, Sedlazeck FJ. Analysis and benchmarking of small and large genomic variants across tandem repeats. Nat Biotechnol 2025; 43:431-442. [PMID: 38671154 PMCID: PMC11952744 DOI: 10.1038/s41587-024-02225-z] [Received: 10/30/2023] [Accepted: 03/28/2024] [Indexed: 04/28/2024]
Abstract
Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits and are linked to over 60 disease phenotypes. However, they are often excluded from at-scale studies because of challenges with variant calling and representation, as well as a lack of a genome-wide standard. Here, to promote the development of TR methods, we created a catalog of TR regions and explored TR properties across 86 haplotype-resolved long-read human assemblies. We curated variants from the Genome in a Bottle (GIAB) HG002 individual to create a TR dataset to benchmark existing and future TR analysis methods. We also present an improved variant comparison method that handles variants greater than 4 bp in length and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ~24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 'truth-set' TR benchmark. We demonstrate the utility of this pipeline across short-read and long-read technologies.
Affiliation(s)
- Adam C English
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
| | | | - Helyaneh Ziaei Jam
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
| | | | - Nathan D Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Wouter De Coster
- Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
- Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Jonghun Park
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
| | - Bida Gu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | | | - Melissa Gymrek
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
- Department of Computer Science, Rice University, Houston, TX, USA.
| |
|
34
|
Solovyov A, Behr JM, Hoyos D, Banks E, Drong AW, Thornlow B, Zhong JZ, Garcia-Rivera E, McKerrow W, Chu C, Arisdakessian C, Zaller DM, Kamihara J, Diao L, Fromer M, Greenbaum BD. Pan-cancer multi-omic model of LINE-1 activity reveals locus heterogeneity of retrotransposition efficiency. Nat Commun 2025; 16:2049. [PMID: 40021663 PMCID: PMC11871128 DOI: 10.1038/s41467-025-57271-1] [Received: 05/02/2024] [Accepted: 02/12/2025] [Indexed: 03/03/2025] Open
Abstract
Somatic mobilization of LINE-1 (L1) has been implicated in cancer etiology. We analyzed a recent TCGA data release comprising nearly 5000 pan-cancer paired tumor-normal whole-genome sequencing (WGS) samples and ~9000 tumor RNA samples. We developed TotalReCall, an improved algorithm and pipeline for detection of L1 retrotransposition (RT), and found a high correlation between L1 expression and "RT burden" per sample. Furthermore, we mathematically model the dual regulatory roles of p53, where mutations in TP53 disrupt regulation of both L1 expression and retrotransposition. We found that individuals with Li-Fraumeni Syndrome (LFS), who carry heritable pathogenic and likely pathogenic TP53 variants, bear similarly high L1 activity compared to matched cancers from patients without LFS, suggesting that this population be considered in attempts to target L1 therapeutically. Owing to improved sensitivity, we detect over 10 genes beyond TP53 whose mutations correlate with L1, including ATRX, suggesting that other, potentially targetable, mechanisms underlying L1 regulation in cancer remain to be discovered.
Affiliation(s)
- Alexander Solovyov
- Halvorsen Center for Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
| | | | - David Hoyos
- Halvorsen Center for Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Eric Banks
- ROME Therapeutics, Inc., Boston, MA, USA
- Acorn Biosciences, Cambridge, MA, USA
| | | | | | | | | | | | - Chong Chu
- ROME Therapeutics, Inc., Boston, MA, USA
| | | | | | - Junne Kamihara
- Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Division of Population Sciences, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | | | | | - Benjamin D Greenbaum
- Halvorsen Center for Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
- Physiology, Biophysics & Systems Biology, Weill Cornell Medical College, New York, NY, USA.
| |
|
35
|
Bozkurt-Yozgatli T, Lun MY, Bengtsson JD, Sezerman U, Chinn IK, Coban-Akdemir Z, Carvalho CMB. Investigation of a pathogenic inversion in UNC13D and comprehensive analysis of chromosomal inversions across diverse datasets. Eur J Hum Genet 2025:10.1038/s41431-025-01817-w. [PMID: 40021841 DOI: 10.1038/s41431-025-01817-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2024] [Revised: 12/19/2024] [Accepted: 02/12/2025] [Indexed: 03/03/2025] Open
Abstract
Inversions are known contributors to the pathogenesis of genetic diseases. Identifying inversions poses significant challenges, making them one of the most demanding types of structural variants (SVs) to detect and interpret. Recent advancements in sequencing technologies and the development of publicly available SV datasets have substantially enhanced our capability to explore inversions. However, a cross-comparison of those datasets remains unexplored. In this study, we report a proband with familial hemophagocytic lymphohistiocytosis type-3 carrying a splicing variant (c.1389+1G>A) in trans with an inversion that disrupts UNC13D and is present in 0.006345% of individuals in gnomAD (v4.0). Based on this result, we investigated the features of potentially pathogenic inversions in gnomAD, which revealed that 98.9% of them are rare and disrupt 5% of protein-coding genes associated with a phenotype in OMIM. We then conducted a comparative analysis of additional public datasets, including DGV, 1KGP, and two recent studies from the Human Genome Structural Variation Consortium, which revealed common and dataset-specific inversion characteristics suggesting methodological detection biases. Next, we investigated the genetic features of inversions disrupting protein-coding genes. Notably, we found that the majority of protein-coding genes in OMIM disrupted by inversions are associated with autosomal recessive phenotypes, supporting the hypothesis that inversions in trans with other variants are potential hidden causes of monogenic diseases. This effort aims to fill the gap in our understanding of the molecular characteristics of inversions with low frequency in the population and highlights the importance of identifying them in rare disease studies.
Affiliation(s)
- Tugce Bozkurt-Yozgatli
- Department of Biostatistics and Bioinformatics, Institute of Health Sciences, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Ming Yin Lun
- Pacific Northwest Research Institute, Seattle, WA, USA
| | | | - Ugur Sezerman
- Department of Biostatistics and Bioinformatics, Institute of Health Sciences, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
- Department of Biostatistics and Medical Informatics, School of Medicine, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
| | - Ivan K Chinn
- Department of Pediatrics, Division of Immunology, Allergy, and Retrovirology, Baylor College of Medicine and Texas Children's Hospital, Houston, TX, USA
- Center for Human Immunobiology of Texas Children's Hospital/Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
| | - Zeynep Coban-Akdemir
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA.
| | | |
|
36
|
Qiao X, Shi J, Xu H, Liu K, Pu Y, Xue X, Zheng W, Guo Y, Ma H, Wang CC, Bitsue HK, Xu X, Wang S, Zhao J, Guo X, Hou X, Wang X, Peng L, Qiu Z, Su B, Tang W, He Y, Guo J, Yang Z. Genetic diversity and dietary adaptations of the Central Plains Han Chinese population in East Asia. Commun Biol 2025; 8:291. [PMID: 39987348 PMCID: PMC11846999 DOI: 10.1038/s42003-025-07760-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Accepted: 02/17/2025] [Indexed: 02/24/2025] Open
Abstract
The Central Plains Han Chinese (CPHC) is the typical agricultural population of East Asia. Investigating the genome of the CPHC is crucial to understanding the genetic structure and adaptation of modern humans in East Asia. Here, we performed whole-genome sequencing of 492 CPHC individuals and obtained 22.65 million SNPs, 4.26 million INDELs and 41,959 SVs. We found that the CPHC has a higher level of genetic diversity and that glycolipid metabolic genes, e.g., LONP2, FADS2, FGF21 and SLC19A2, show strong selection signals. Ancient DNA analyses suggest that the domestication of crops drove the emergence of the candidate mutations. Notably, East Asian-specific SVs, e.g., DEL_21699 (LINC01749) and DEL_38406 (FAM102A), may be associated with the high prevalence of esophageal squamous carcinoma and primary angle-closure glaucoma. Our results provide an important genetic resource and show that dietary adaptations play an important role in phenotypic evolution in East Asian populations.
Affiliation(s)
- Xiaoyang Qiao
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Jianxiang Shi
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Hongen Xu
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Kai Liu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Youwei Pu
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Xia Xue
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Wangshan Zheng
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Yongbo Guo
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Hao Ma
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, China
| | - Chuan-Chao Wang
- State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, China
| | - Habtom K Bitsue
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Xiaoyu Xu
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Shanshan Wang
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Jingru Zhao
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Xiangqian Guo
- Zhongyuan Intelligent Medical Laboratory, School of Basic Medical Sciences, Henan University, Kaifeng, China
| | - Xinyue Hou
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Xinwei Wang
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Lei Peng
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Zan Qiu
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China
| | - Bing Su
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
| | - Wenxue Tang
- The Research and Application Center of Precision Medicine, Departments of Otolaryngology, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
| | - Yaoxi He
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
| | - Jiancheng Guo
- The Research and Application Center of Precision Medicine, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
| | - Zhaohui Yang
- Tianjian Laboratory of Advanced Biomedical Sciences, Academy of Medical Science, Zhengzhou University, Zhengzhou, China.
| |
|
37
|
Nesta A, Veiga DFT, Banchereau J, Anczukow O, Beck CR. Alternative splicing of transposable elements in human breast cancer. Mob DNA 2025; 16:6. [PMID: 39987084 PMCID: PMC11846448 DOI: 10.1186/s13100-025-00341-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2024] [Accepted: 01/09/2025] [Indexed: 02/24/2025] Open
Abstract
Transposable elements (TEs) drive genome evolution and can affect gene expression through diverse mechanisms. In breast cancer, disrupted regulation of TE sequences may facilitate tumor-specific transcriptomic alterations. We examine 142,514 full-length isoforms derived from long-read RNA sequencing (LR-seq) of 30 breast samples to investigate the effects of TEs on the breast cancer transcriptome. Approximately half of these isoforms contain TE sequences, and these contribute to half of the novel annotated splice junctions. We quantify splicing of these LR-seq derived isoforms in 1,135 breast tumors from The Cancer Genome Atlas (TCGA) and 1,329 healthy tissue samples from the Genotype-Tissue Expression (GTEx) project, and find 300 TE-overlapping tumor-specific splicing events. Some splicing events are enriched in specific breast cancer subtypes, for example, a TE-driven transcription start site upstream of ERBB2 in HER2+ tumors, and several TE-mediated splicing events are associated with patient survival and poor prognosis. The full-length sequences we capture with LR-seq reveal thousands of isoforms with signatures of RNA editing, including a novel isoform belonging to RHOA, a gene previously implicated in tumor progression. We utilize our full-length isoforms to discover polymorphic TE insertions that alter splicing and validate one of these events in breast cancer cell lines. Together, our results demonstrate the widespread effects of dysregulated TEs on breast cancer transcriptomes and highlight the advantages of long-read isoform sequencing for understanding TE biology. TE-derived isoforms may alter the expression of genes important in cancer and can potentially be used as novel, disease-specific therapeutic targets or biomarkers. One sentence summary: Transposable elements generate alternative isoforms and alter post-transcriptional regulation in human breast cancer.
Affiliation(s)
- Alex Nesta
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA.
| | - Diogo F T Veiga
- Department of Translational Medicine, School of Medical Sciences, University of Campinas (UNICAMP), Campinas, SP, 13083, Brazil
| | - Jacques Banchereau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Immunoledge LLC, Montclair, NJ, 07042, USA
| | - Olga Anczukow
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, 06269, USA
| | - Christine R Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA.
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, 06269, USA.
| |
|
38
|
Liang SA, Ren T, Zhang J, He J, Wang X, Jiang X, He Y, McCoy RC, Fu Q, Akey JM, Mao Y, Chen L. A refined analysis of Neanderthal-introgressed sequences in modern humans with a complete reference genome. Genome Biol 2025; 26:32. [PMID: 39962554 PMCID: PMC11834205 DOI: 10.1186/s13059-025-03502-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Accepted: 02/11/2025] [Indexed: 02/20/2025] Open
Abstract
BACKGROUND: Leveraging long-read sequencing technologies, the first complete human reference genome, T2T-CHM13, corrects assembly errors in previous references and resolves the remaining 8% of the genome. While studies on archaic admixture in modern humans have so far relied on the GRCh37 reference due to the availability of archaic genome data, the impact of T2T-CHM13 in this field remains unexplored.
RESULTS: We remap the sequencing reads of the high-quality Altai Neanderthal and Denisovan genomes onto GRCh38 and T2T-CHM13. Compared to GRCh37, we find that T2T-CHM13 significantly improves read mapping quality in archaic samples. We then apply IBDmix to identify Neanderthal-introgressed sequences in 2504 individuals from 26 geographically diverse populations using different reference genomes. We observe that commonly used pre-phasing filtering strategies in public datasets substantially influence archaic ancestry determination, underscoring the need for careful filter selection. Our analysis identifies approximately 51 Mb of Neanderthal sequences unique to T2T-CHM13, predominantly in genomic regions where GRCh38 and T2T-CHM13 assemblies diverge. Additionally, we uncover novel instances of population-specific archaic introgression in diverse populations, spanning genes involved in metabolism, olfaction, and ion-channel function. Finally, to facilitate the exploration of archaic alleles and adaptive signals in human genomics and evolutionary research, we integrate these introgressed sequences and adaptive signals across all reference genomes into a visualization database, ASH (www.arcseqhub.com).
CONCLUSIONS: Our study enhances the detection of archaic variations in modern humans, highlights the importance of utilizing the T2T-CHM13 reference, and provides novel insights into the functional consequences of archaic hominin admixture.
Affiliation(s)
- Shen-Ao Liang
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Tianxin Ren
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Jiayu Zhang
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Jiahui He
- Ministry of Education Key Laboratory of Contemporary Anthropology, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Xuankai Wang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Xinrui Jiang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Yuan He
- Ministry of Education Key Laboratory of Contemporary Anthropology, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, MD, 21212, USA
| | - Qiaomei Fu
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, 100044, China
- University of the Chinese Academy of Sciences, Beijing, 100049, China
| | - Joshua M Akey
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 08540, USA
| | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, 200030, China.
- Center for Genomic Research, International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University, Yiwu, 322000, China.
| | - Lu Chen
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, School of Life Science, Fudan University, Shanghai, 200438, China.
| |
|
39
|
Catlin NS, Agha HI, Platts AE, Munasinghe M, Hirsch CN, Josephs EB. Structural Variants Contribute to Phenotypic Variation in Maize. Mol Ecol 2025:e17662. [PMID: 39945381 DOI: 10.1111/mec.17662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Revised: 12/04/2024] [Accepted: 12/31/2024] [Indexed: 02/19/2025]
Abstract
Comprehensively identifying the loci shaping trait variation has been challenging, in part because standard approaches often miss many types of genetic variants. Structural variants (SVs), especially transposable elements (TEs), are likely to affect phenotypic variation but we lack methods that can detect polymorphic SVs and TEs using short-read sequencing data. Here, we used a whole genome alignment between two maize genotypes to identify polymorphic SVs and then genotyped a large maize diversity panel for these variants using short-read sequencing data. After characterising SV variation in the panel, we identified SV polymorphisms that are associated with life history traits and genotype-by-environment (GxE) interactions. While most of the SVs associated with traits contained TEs, only two of the SVs had boundaries that clearly matched TE breakpoints indicative of a TE insertion, while the other polymorphisms were likely caused by deletions. One of the SVs that appeared to be caused by a TE insertion had the most associations with gene expression compared to other trait-associated SVs. All of the SVs associated with traits were in linkage disequilibrium with nearby single nucleotide polymorphisms (SNPs), suggesting that the approach used here did not identify unique associations that would have been missed in a SNP association study. Overall, we have (1) created a technique to genotype SV polymorphisms across a large diversity panel using support from genomic short-read sequencing alignments and (2) connected this presence/absence SV variation to diverse traits and GxE interactions.
Affiliation(s)
- Nathan S Catlin
- Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
- Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, Michigan, USA
- Plant Resilience Institute, Michigan State University, East Lansing, Michigan, USA
| | - Husain I Agha
- Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
- Plant Resilience Institute, Michigan State University, East Lansing, Michigan, USA
| | - Adrian E Platts
- Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
| | - Manisha Munasinghe
- Department of Plant and Microbial Biology, University of Minnesota, St. Paul, Minnesota, USA
| | - Candice N Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota, USA
| | - Emily B Josephs
- Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
- Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, Michigan, USA
- Plant Resilience Institute, Michigan State University, East Lansing, Michigan, USA
| |
|
40
|
Gong J, Sun H, Wang K, Zhao Y, Huang Y, Chen Q, Qiao H, Gao Y, Zhao J, Ling Y, Cao R, Tan J, Wang Q, Ma Y, Li J, Luo J, Wang S, Wang J, Zhang G, Xu S, Qian F, Zhou F, Tang H, Li D, Sedlazeck FJ, Jin L, Guan Y, Fan S. Long-read sequencing of 945 Han individuals identifies structural variants associated with phenotypic diversity and disease susceptibility. Nat Commun 2025; 16:1494. [PMID: 39929826 PMCID: PMC11811171 DOI: 10.1038/s41467-025-56661-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 01/22/2025] [Indexed: 02/13/2025] Open
Abstract
Genomic structural variants (SVs) are a major source of genetic diversity in humans. Here, through long-read sequencing of 945 Han Chinese genomes, we identify 111,288 SVs, including 24.56% unreported variants, many with predicted functional importance. By integrating human population-level phenotypic and multi-omics data as well as two humanized mouse models, we demonstrate the causal roles of two SVs: one SV that emerges at the common ancestor of modern humans, Neanderthals, and Denisovans in GSDMD for bone mineral density and one modern-human-specific SV in WWP2 impacting height, weight, fat, craniofacial phenotypes and immunity. Our results suggest that the GSDMD SV could serve as a rapid and cost-effective biomarker for assessing the risk of cisplatin-induced acute kidney injury. The functional conservation from human to mouse and widespread signals of positive natural selection suggest that both SVs likely influence local adaptation, phenotypic diversity, and disease susceptibility across diverse human populations.
Affiliation(s)
- Jiao Gong
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Huiru Sun
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Kaiyuan Wang
- Shanghai Frontiers Science Center of Genome Editing and Cell Therapy, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Yanhui Zhao
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Yechao Huang
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Qinsheng Chen
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Hui Qiao
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Yang Gao
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Jialin Zhao
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Yunchao Ling
- Bio-Med Big Data Center, Chinese Academy of Sciences Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Ruifang Cao
- Bio-Med Big Data Center, Chinese Academy of Sciences Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Jingze Tan
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Qi Wang
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Yanyun Ma
- Department of Anthropology and Human Genetics, Institute for Six-sector Economy, and MOE Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, China
| | - Jing Li
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Jingchun Luo
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Sijia Wang
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Jiucun Wang
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
- Research Unit of dissecting the population genetics and developing new technologies for treatment and prevention of skin phenotypes and dermatological diseases (2019RU058), Chinese Academy of Medical Sciences, Shanghai, China
| | - Guoqing Zhang
- Bio-Med Big Data Center, Chinese Academy of Sciences Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Shuhua Xu
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Feng Qian
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Fang Zhou
- School of Data Science and Engineering, East China Normal University, Shanghai, China
| | - Huiru Tang
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China
| | - Dali Li
- Shanghai Frontiers Science Center of Genome Editing and Cell Therapy, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
| | - Li Jin
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China.
- Research Unit of dissecting the population genetics and developing new technologies for treatment and prevention of skin phenotypes and dermatological diseases (2019RU058), Chinese Academy of Medical Sciences, Shanghai, China.
| | - Yuting Guan
- Shanghai Frontiers Science Center of Genome Editing and Cell Therapy, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China.
| | - Shaohua Fan
- State Key Laboratory of Genetic Engineering, Lab for Evolutionary Synthesis, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, China.
| |
|
41
|
Hu M, Wan P, Chen C, Tang S, Chen J, Wang L, Chakraborty M, Zhou Y, Chen J, Gaut BS, Emerson J, Liao Y. Benchmarking, detection, and genotyping of structural variants in a population of whole-genome assemblies using the SVGAP pipeline. bioRxiv 2025:2025.02.07.637096. [PMID: 39975360 PMCID: PMC11839052 DOI: 10.1101/2025.02.07.637096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Comparisons of complete genome assemblies offer a direct procedure for characterizing all genetic differences among them. However, existing tools are often limited to specific aligners or optimized for specific organisms, narrowing their applicability, particularly for large and repetitive plant genomes. Here, we introduce SVGAP, a pipeline for structural variant (SV) discovery, genotyping, and annotation from high-quality genome assemblies at the population level. Through extensive benchmarks using simulated SV datasets at individual, population, and phylogenetic contexts, we demonstrate that SVGAP performs favorably relative to existing tools in SV discovery. Additionally, SVGAP is one of the few tools to address the challenge of genotyping SVs within large assembled genome samples, and it generates fully genotyped VCF files. Applying SVGAP to 26 maize genomes revealed hidden genomic diversity in centromeres, driven by abundant insertions of centromere-specific LTR-retrotransposons. The output of SVGAP is well-suited for pan-genome construction and facilitates the interpretation of previously unexplored genomic regions.
Affiliation(s)
- Ming Hu
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (South China), Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangdong 510642, China
- These authors contributed equally to this work
- Penglong Wan
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (South China), Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangdong 510642, China
- These authors contributed equally to this work
- Chengjie Chen
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences & National Key Laboratory for Tropical Crop Breeding & Laboratory of Crop Gene Resources and Germplasm Enhancement in South China, Ministry of Agriculture and Rural Affairs & Key Laboratory of Tropical Crops Germplasm Resources Genetic Improvement and Innovation of Hainan Province, Hainan, 571101, China
- These authors contributed equally to this work
- Shuyuan Tang
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (South China), Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangdong 510642, China
- Jiahao Chen
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (South China), Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangdong 510642, China
- Liang Wang
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (South China), Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangdong 510642, China
- Mahul Chakraborty
- Department of Biology, Texas A&M University, College Station, TX, 77843, USA
- Yongfeng Zhou
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences & National Key Laboratory for Tropical Crop Breeding & Laboratory of Crop Gene Resources and Germplasm Enhancement in South China, Ministry of Agriculture and Rural Affairs & Key Laboratory of Tropical Crops Germplasm Resources Genetic Improvement and Innovation of Hainan Province, Hainan, 571101, China
- Jinfeng Chen
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Brandon S. Gaut
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
- J.J. Emerson
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
- Yi Liao
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (South China), Ministry of Agriculture and Rural Affairs, College of Horticulture, South China Agricultural University, Guangdong 510642, China
42
Negi S, Stenton SL, Berger SI, Canigiula P, McNulty B, Violich I, Gardner J, Hillaker T, O'Rourke SM, O'Leary MC, Carbonell E, Austin-Tse C, Lemire G, Serrano J, Mangilog B, VanNoy G, Kolmogorov M, Vilain E, O'Donnell-Luria A, Délot E, Miga KH, Monlong J, Paten B. Advancing long-read nanopore genome assembly and accurate variant calling for rare disease detection. Am J Hum Genet 2025; 112:428-449. [PMID: 39862869 PMCID: PMC11866955 DOI: 10.1016/j.ajhg.2025.01.002] [Received: 08/21/2024] [Revised: 12/22/2024] [Accepted: 01/02/2025] [Indexed: 01/27/2025]
Abstract
More than 50% of families with suspected rare monogenic diseases remain unsolved after whole-genome analysis by short-read sequencing (SRS). Long-read sequencing (LRS) could help bridge this diagnostic gap by capturing variants inaccessible to SRS, facilitating long-range mapping and phasing and providing haplotype-resolved methylation profiling. To evaluate LRS's additional diagnostic yield, we sequenced a rare-disease cohort of 98 samples from 41 families, using nanopore sequencing, achieving per sample ∼36× average coverage and 32-kb read N50 from a single flow cell. Our Napu pipeline generated assemblies, phased variants, and methylation calls. LRS covered, on average, coding exons in ∼280 genes and ∼5 known Mendelian disease-associated genes that were not covered by SRS. In comparison to SRS, LRS detected additional rare, functionally annotated variants, including structural variants (SVs) and tandem repeats, and completely phased 87% of protein-coding genes. LRS detected additional de novo variants and could be used to distinguish postzygotic mosaic variants from prezygotic de novos. Diagnostic variants were established by LRS in 11 probands, with diverse underlying genetic causes including de novo and compound heterozygous variants, large-scale SVs, and epigenetic modifications. Our study demonstrates LRS's potential to enhance diagnostic yield for rare monogenic diseases, implying utility in future clinical genomics workflows.
Affiliation(s)
- Shloka Negi
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Sarah L Stenton
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Seth I Berger
- Children's National Research Institute, Washington, DC, USA
- Brandy McNulty
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Ivo Violich
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Joshua Gardner
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Todd Hillaker
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Sara M O'Rourke
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Melanie C O'Leary
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Elizabeth Carbonell
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Christina Austin-Tse
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Gabrielle Lemire
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Jillian Serrano
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Brian Mangilog
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Grace VanNoy
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mikhail Kolmogorov
- Cancer Data Science Laboratory, National Cancer Institute, NIH, Bethesda, MD, USA
- Eric Vilain
- Institute for Clinical and Translational Science, University of California, Irvine, Irvine, CA, USA
- Anne O'Donnell-Luria
- Center for Mendelian Genomics, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Emmanuèle Délot
- Institute for Clinical and Translational Science, University of California, Irvine, Irvine, CA, USA
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Jean Monlong
- Institut de Recherche en Santé Digestive, Université de Toulouse, INSERM, INRA, ENVT, UPS, Toulouse, France.
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
43
Dishuck PC, Munson KM, Lewis AP, Dougherty ML, Underwood JG, Harvey WT, Hsieh P, Pastinen T, Eichler EE. Structural variation, selection, and diversification of the NPIP gene family from the human pangenome. bioRxiv 2025:2025.02.04.636496. [PMID: 39975192 PMCID: PMC11838601 DOI: 10.1101/2025.02.04.636496] [Indexed: 02/21/2025]
Abstract
The NPIP (nuclear pore interacting protein) gene family has expanded to high copy number in humans and African apes where it has been subject to an excess of amino acid replacement consistent with positive selection (1). Due to the limitations of short-read sequencing, NPIP human genetic diversity has been poorly understood. Using highly accurate assemblies generated from long-read sequencing as part of the human pangenome, we completely characterize 169 human haplotypes (4,665 NPIP paralogs and alleles). Of the 28 NPIP paralogs, just three (NPIPB2, B11, and B14) are fixed at a single copy, and only a single locus, B2, shows no structural variation. Four NPIP paralogs map to large segmental duplication blocks that mediate polymorphic inversions (355 kbp-1.6 Mbp) corresponding to microdeletions associated with developmental delay and autism. Haplotype-based tests of positive selection and selective sweeps identify two paralogs, B9 and B15, within the top percentile for both tests. Using full-length cDNA data from 101 tissue/cell types, we construct paralog-specific gene models and show that 56% (31/55 most abundant isoforms) have not been previously described in RefSeq. We define six distinct translation start sites and other protein structural features that distinguish paralogs, including a variable number tandem repeat that encodes a beta helix of variable size that emerged ~3.1 million years ago in human evolution. Among the 28 NPIP paralogs, we identify distinct tissue and developmental patterns of expression with only a few maintaining the ancestral testis-enriched expression. A subset of paralogs (NPIPA1, A5, A6-9, B3-5, and B12/B13) show increased brain expression. Our results suggest ongoing positive selection in the human population and rapid diversification of NPIP gene models.
Affiliation(s)
- Philip C. Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M. Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P. Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Max L. Dougherty
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Present address: Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jason G. Underwood
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Pacific Biosciences (PacBio) of California, Incorporated, Menlo Park, CA, USA
- William T. Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- PingHsun Hsieh
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Department of Genetics, Cell Biology, and Development, Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Tomi Pastinen
- Genomic Medicine Center, Department of Pediatrics, Children’s Mercy Kansas City, Kansas City, KS, USA
- UMKC School of Medicine, University of Missouri, Kansas City, Kansas City, KS, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
44
Zhang Z, Gupta I, Pevzner PA. GenomeDecoder: inferring segmental duplications in highly repetitive genomic regions. Bioinformatics 2025; 41:btaf058. [PMID: 39908455 PMCID: PMC11842051 DOI: 10.1093/bioinformatics/btaf058] [Received: 07/26/2024] [Revised: 12/02/2024] [Accepted: 02/03/2025] [Indexed: 02/07/2025]
Abstract
MOTIVATION: The emergence of the 'telomere-to-telomere' genomics brought the challenge of identifying segmental duplications (SDs) in complete genomes. It further opened a possibility for identifying the differences in SDs across individual human genomes and studying the SD evolution. These newly emerged challenges require algorithms for reconstructing SDs in the most complex genomic regions that evaded all previous attempts to analyze their architecture, such as rapidly evolving immunoglobulin loci. RESULTS: We describe the GenomeDecoder algorithm for inferring SDs and apply it to analyzing genomic architectures of various loci in primate genomes. Our analysis revealed that multiple duplications/deletions led to a rapid birth/death of immunoglobulin genes within the human population and large changes in genomic architecture of immunoglobulin loci across primate genomes. Comparison of immunoglobulin loci across primate genomes suggests that they are subjected to diversifying selection. AVAILABILITY AND IMPLEMENTATION: GenomeDecoder is available at https://github.com/ZhangZhenmiao/GenomeDecoder. The software version and test data used in this paper are uploaded to https://doi.org/10.5281/zenodo.14753844.
Affiliation(s)
- Zhenmiao Zhang
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, United States
- Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China
- Ishaan Gupta
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, United States
- Pavel A Pevzner
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, United States
45
Chen K, Zhang Y, Pan Y, Xiang X, Peng C, He J, Huang G, Wang Z, Zhao P. Genomic insights into demographic history, structural variation landscape, and complex traits from 514 Hu sheep genomes. J Genet Genomics 2025; 52:245-257. [PMID: 39643267 DOI: 10.1016/j.jgg.2024.11.015] [Received: 08/28/2024] [Revised: 11/21/2024] [Accepted: 11/24/2024] [Indexed: 12/09/2024]
Abstract
Hu sheep is an indigenous breed from the Taihu Lake Plain in China, known for its high fertility. Although Hu sheep belong to the Mongolian group, their demographic history and genetic architecture remain inconclusive. Here, we analyze 697 sheep genomes from representatives of Mongolian sheep breeds. Our study suggests that the ancestral Hu sheep first separated from the Mongolian group approximately 3000 years ago. As Hu sheep migrated from the north and flourished in the Taihu Lake Plain around 1000 years ago, they developed a unique genetic foundation and phenotypic characteristics, which are evident in the genomic footprints of selective sweeps and structural variation landscape. Genes associated with reproductive traits (BMPR1B and TDRD10) and horn phenotype (RXFP2) exhibit notable selective sweeps in the genome of Hu sheep. A genome-wide association analysis reveals that structural variations at LOC101110773, MAST2, and ZNF385B may significantly impact polledness, teat number, and early growth in Hu sheep, respectively. Our study offers insights into the evolutionary history of Hu sheep and may serve as a valuable genetic resource to enhance the understanding of complex traits in Hu sheep.
Affiliation(s)
- Kaiyu Chen
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Yuelang Zhang
- Hainan Institute, Zhejiang University, Sanya, Hainan 572000, China
- Yizhe Pan
- Agricultural Product Quality and Safety Research Center of Huzhou City, Huzhou, Zhejiang 313000, China
- Xin Xiang
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Chen Peng
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China; Hainan Institute, Zhejiang University, Sanya, Hainan 572000, China
- Jiayi He
- Hainan Institute, Zhejiang University, Sanya, Hainan 572000, China
- Guiqing Huang
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China; Hainan Institute, Zhejiang University, Sanya, Hainan 572000, China
- Zhengguang Wang
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China; Hainan Institute, Zhejiang University, Sanya, Hainan 572000, China.
- Pengju Zhao
- College of Animal Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China; Hainan Institute, Zhejiang University, Sanya, Hainan 572000, China.
46
He G, Liu C, Wang M. Perspectives and opportunities in forensic human, animal, and plant integrative genomics in the Pangenome era. Forensic Sci Int 2025; 367:112370. [PMID: 39813779 DOI: 10.1016/j.forsciint.2025.112370] [Received: 11/18/2024] [Revised: 12/24/2024] [Accepted: 01/08/2025] [Indexed: 01/18/2025]
Abstract
The Human Pangenome Reference Consortium, the Chinese Pangenome Consortium, and other plant and animal pangenome projects have announced the completion of pilot work aimed at constructing high-quality, haplotype-resolved reference graph genomes representative of global ethno-linguistically different populations or different plant and animal species. These graph-based, gapless pangenome references, which are enriched in terms of genomic diversity, completeness, and contiguity, have the potential for enhancing long-read sequencing (LRS)-based genomic research, as well as improving mappability and variant genotyping on traditional short-read sequencing platforms. We comprehensively discuss the advancements in pangenome-based integrative genomic discoveries across forensic-related species (humans, animals, and plants) and summarize their applications in variant identification and forensic genomics, epigenetics, transcriptomics, and microbiome research. Recent developments in multiplexed array sequencing have introduced a highly efficient and programmable technique to overcome the limitations of short forensic marker lengths in LRS platforms. This technique enables the concatenation of short RNA transcripts and DNA fragments into LRS-optimal molecules for sequencing, assembly, and genotyping. The integration of new pangenome reference coordinates and corresponding computational algorithms will benefit forensic integrative genomics by facilitating new marker identification, accurate genotyping, high-resolution panel development, and the updating of statistical algorithms. This review highlights the necessity of integrating LRS-based platforms, pangenome-based study designs, and graph-based pangenome references in short-read mapping and LRS-based innovations to achieve precision forensic science.
Affiliation(s)
- Guanglin He
- Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu 610000, China; Center for Archaeological Science, Sichuan University, Chengdu 610000, China.
- Chao Liu
- Anti-Drug Technology Center of Guangdong Province, Guangzhou 510230, China.
- Mengge Wang
- Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu 610000, China; Center for Archaeological Science, Sichuan University, Chengdu 610000, China; Department of Forensic Medicine, College of Basic Medicine, Chongqing Medical University, Chongqing 400331, China.
47
Todd C, Jin L, McQuillan I. SV-JIM, detailed pairwise structural variant calling using long-reads and genome assemblies. Methods 2025; 234:305-313. [PMID: 39826659 DOI: 10.1016/j.ymeth.2024.12.015] [Received: 10/11/2024] [Revised: 12/21/2024] [Accepted: 12/30/2024] [Indexed: 01/22/2025]
Abstract
This paper proposes a detailed process for SV calling that permits a data-driven assessment of multiple SV callers that uses both genome assemblies and long-reads. The process is implemented as a software pipeline named Structural Variant - Jaccard Index Measure, or SV-JIM, using the Snakemake [20] workflow management system. Like most state-of-the-art SV callers, SV-JIM detects the presence of variations between pairs of genomes, but it streamlines the numerous SV calling stages into a single process for user convenience and evaluates the multiple SV sets produced using the Jaccard index measure to identify those with the highest consistency among the included SV callers. SV-JIM then produces aggregated SV results based on how many callers supported the reported SVs. For validation, SV-JIM was assessed through three case studies on the Homo sapiens genome and two plant genomes - Brassica nigra and Arabidopsis thaliana. Executing SV-JIM identified a significant amount of inter-caller variance which varied by tens of thousands of results on the larger Brassica nigra and Homo sapiens genomes. Further, aggregating the SV sets helped simplify better retention of the less frequently occurring SV types by requiring a level of minimum support rather than from a specific SV caller combination. Finally, these case studies identified a potential for inflated precision reporting that can occur during evaluation. SV-JIM is available publicly under MIT license at https://github.com/USask-BINFO/SV-JIM.
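As an illustration of the two ideas named in this abstract, the following minimal Python sketch computes a pairwise Jaccard index between SV call sets and then aggregates calls by how many callers support them. It is not SV-JIM's implementation: the caller names, the toy coordinates, the function names, and the exact-match comparison of SVs are all assumptions made for the example (real pipelines typically match SVs with positional tolerance).

```python
from collections import Counter
from itertools import combinations

# Hypothetical model: an SV is a tuple (chromosome, start, end, type).
# Exact tuple equality stands in for SV matching; this is a simplification.

def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two SV sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def pairwise_jaccard(calls):
    """Jaccard index for every pair of callers, keyed by (caller1, caller2)."""
    return {(x, y): jaccard(calls[x], calls[y])
            for x, y in combinations(sorted(calls), 2)}

def aggregate(calls, min_support):
    """Return the SVs reported by at least `min_support` callers."""
    support = Counter(sv for sv_set in calls.values() for sv in sv_set)
    return {sv for sv, n in support.items() if n >= min_support}

# Toy call sets from three hypothetical callers.
calls = {
    "caller_a": {("chr1", 100, 200, "DEL"), ("chr1", 500, 900, "INS"),
                 ("chr2", 50, 80, "DUP")},
    "caller_b": {("chr1", 100, 200, "DEL"), ("chr1", 500, 900, "INS")},
    "caller_c": {("chr1", 100, 200, "DEL"), ("chr3", 10, 40, "INV")},
}

if __name__ == "__main__":
    for pair, score in pairwise_jaccard(calls).items():
        print(pair, round(score, 3))
    print(sorted(aggregate(calls, min_support=2)))
```

Filtering on a minimum support count, rather than on a fixed combination of callers, keeps an SV whenever any sufficiently large subset of callers agrees on it.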
Affiliation(s)
- Clarence Todd
- Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada.
- Lingling Jin
- Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada.
- Ian McQuillan
- Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada.
48
Wang S, Lin J, Jia P, Xu T, Li X, Liu Y, Xu D, Bush SJ, Meng D, Ye K. De novo and somatic structural variant discovery with SVision-pro. Nat Biotechnol 2025; 43:181-185. [PMID: 38519720 PMCID: PMC11825360 DOI: 10.1038/s41587-024-02190-7] [Received: 08/01/2023] [Accepted: 02/27/2024] [Indexed: 03/25/2024]
Abstract
Long-read-based de novo and somatic structural variant (SV) discovery remains challenging, necessitating genomic comparison between samples. We developed SVision-pro, a neural-network-based instance segmentation framework that represents genome-to-genome-level sequencing differences visually and discovers SV comparatively between genomes without any prerequisite for inference models. SVision-pro outperforms state-of-the-art approaches, in particular, the resolving of complex SVs is improved, with low Mendelian error rates, high sensitivity of low-frequency SVs and reduced false-positive rates compared with SV merging approaches.
Affiliation(s)
- Songbo Wang
- Department of Gynecology and Obstetrics, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Jiadong Lin
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Peng Jia
- Department of Gynecology and Obstetrics, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Tun Xu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Xiujuan Li
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Yuezhuangnan Liu
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China
- Dan Xu
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China
- Stephen J Bush
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- Deyu Meng
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China
- Macau Institute of Systems Engineering, Macau University of Science and Technology, Taipa, Macau
- Pazhou Laboratory (Huangpu), Guangzhou, Guangdong, China
- Kai Ye
- Department of Gynecology and Obstetrics, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China.
- Faculty of Science, Leiden University, Leiden, The Netherlands.
- Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.
49
Mutlu MB, Karakaya T, Çelebi HBG, Duymuş F, Seyhan S, Yılmaz S, Yiş U, Atik T, Yetkin MF, Gümüş H. Utility of Optical Genome Mapping in Repeat Disorders. Clin Genet 2025; 107:188-195. [PMID: 39435674 DOI: 10.1111/cge.14633] [Received: 07/13/2024] [Revised: 09/30/2024] [Accepted: 10/01/2024] [Indexed: 10/23/2024]
Abstract
Genomic repeat sequences are patterns of nucleic acids that exist in multiple copies throughout the genome. More than 60 Mendelian disorders are caused by the expansion or contraction of these repeats. Various specific methods for determining tandem repeat variations have been developed. However, these methods are highly specific to the genomic region being studied and sometimes require specialized tools. In this study, we have investigated the use of Optical Genome Mapping (OGM) as a diagnostic tool for detecting repeat disorders. We evaluated 19 patients with a prediagnosis of repeat disorders and explained the molecular etiology of 9 of them with OGM (5 patients with Facioscapulohumeral Muscular Dystrophy (FSHD), 2 patients with Friedreich's Ataxia (FA), 1 patient with Fragile X Syndrome (FXS), and 1 patient with Progressive Myoclonic Epilepsy 1A (EPM1A)). We confirmed OGM results with more widely used fragment analysis techniques. This study highlights the utility of OGM as a diagnostic tool for repeat expansion and contraction diseases such as FA, FXS, EPM1A, and FSHD.
Affiliation(s)
- Taner Karakaya
- Department of Medical Genetics, Samsun Education and Research Hospital, Samsun, Türkiye
- Fahrettin Duymuş
- Department of Medical Genetics, Konya City Hospital, Konya, Türkiye
- Serhat Seyhan
- Laboratory of Genetics, Memorial Şişli Hospital, Istanbul, Türkiye
- Sanem Yılmaz
- Department of Pediatrics, Division of Pediatric Neurology, Ege University Faculty of Medicine, Izmir, Türkiye
- Uluç Yiş
- Department of Pediatrics, Division of Pediatric Neurology, Dokuz Eylül University Faculty of Medicine, Izmir, Türkiye
- Tahir Atik
- Department of Pediatrics, Division of Pediatric Genetics, Ege University Faculty of Medicine, Izmir, Türkiye
- Mehmet Fatih Yetkin
- Department of Neurology, Erciyes University Faculty of Medicine, Kayseri, Türkiye
- Hakan Gümüş
- Department of Pediatrics, Division of Pediatric Neurology, Erciyes University Faculty of Medicine, Kayseri, Türkiye
50
Jeong H, Dishuck PC, Yoo D, Harvey WT, Munson KM, Lewis AP, Kordosky J, Garcia GH, Human Genome Structural Variation Consortium (HGSVC), Yilmaz F, Hallast P, Lee C, Pastinen T, Eichler EE. Structural polymorphism and diversity of human segmental duplications. Nat Genet 2025; 57:390-401. [PMID: 39779957 PMCID: PMC11821543 DOI: 10.1038/s41588-024-02051-8] [Received: 06/04/2024] [Accepted: 12/04/2024] [Indexed: 01/11/2025]
Abstract
Segmental duplications (SDs) contribute significantly to human disease, evolution and diversity but have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies (from 85 samples representing 38 Africans and 47 non-Africans) in which the majority of autosomal SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms and sex chromosomes, we identify 173.2 Mb of duplicated sequence (47.4 Mb not present in the telomere-to-telomere reference) distinguishing fixed from structurally polymorphic events. We find that intrachromosomal SDs are among the most variable, with rare events mapping near their progenitor sequences. African genomes harbor significantly more intrachromosomal SDs and are more likely to have recently duplicated gene families with higher copy numbers than non-African samples. Comparison to a resource of 563 million full-length isoform sequencing reads identifies 201 novel, potentially protein-coding genes corresponding to these copy number polymorphic SDs.
Affiliation(s)
- Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Altos Labs, San Diego, CA, USA
- Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Jennifer Kordosky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Gage H Garcia
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Feyza Yilmaz
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Tomi Pastinen
- Children's Mercy Hospital and University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.