1
Krishna Murthy SB, Yang S, Bheda S, Tomar N, Li H, Yaghoobi A, Khan A, Kiryluk K, Motelow JE, Ren N, Gharavi AG, Milo Rasouly H. Assisting the analysis of insertions and deletions using regional allele frequencies. Funct Integr Genomics 2024; 24:104. [PMID: 38764005 DOI: 10.1007/s10142-024-01358-3] [Received: 01/18/2024] [Revised: 04/02/2024] [Accepted: 04/12/2024] [Indexed: 05/21/2024]
Abstract
Accurate estimation of population allele frequency (AF) is crucial for gene discovery and genetic diagnostics. However, determining AF for frameshift-inducing small insertions and deletions (indels) faces challenges due to discrepancies in mapping and variant calling methods. Here, we propose an innovative approach to assess indel AF. We developed CRAFTS-indels (Calculating Regional Allele Frequency Targeting Small indels), an algorithm that combines AF of distinct indels within a given region and provides "regional AF" (rAF). We tested and validated CRAFTS-indels using three independent datasets: gnomAD v2 (n=125,748 samples), an internal dataset (IGM; n=39,367), and the UK Biobank (UKBB; n=469,835). By comparing rAF against standard AF (sAF), we identified rare indels with rAF exceeding standard AF (sAF ≤ 10⁻⁴ and rAF > 10⁻⁴) as "rAF-hi" indels. Notably, a high percentage of rare indels were "rAF-hi", with a higher proportion in gnomAD v2 (11-20%) and IGM (11-22%) compared to the UKBB (5-9%, depending on the CRAFTS-indels parameters). Analysis of the overlap of regions based on their rAF with low complexity regions and with ClinVar classification supported the pertinence of rAF. Using the internal dataset, we illustrated the utility of CRAFTS-indels in the analysis of de novo variants and the potential negative impact of rAF-hi indels in gene discovery. In summary, annotation of indels with cohort-specific rAF can be used to handle some of the limitations of current annotation pipelines and facilitate detection of novel gene-disease associations. CRAFTS-indels offers a user-friendly approach to providing rAF annotation. It can be integrated into public databases such as gnomAD and UKBB and used by ClinVar to revise indel classifications.
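The "rAF-hi" definition in this abstract can be illustrated with a minimal Python sketch. This is not the CRAFTS-indels implementation: the abstract says only that the AFs of distinct indels within a region are combined, so the summation rule, the +/- 10 bp window, and the names `Indel`, `regional_af`, and `is_raf_hi` are all assumptions made for illustration. Only the 10⁻⁴ threshold comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indel:
    chrom: str
    pos: int
    af: float  # standard allele frequency (sAF) of this indel


def regional_af(target, indels, window=10):
    """Sum the sAF of every indel within +/- window bp of the target.

    Summation and the window size are assumptions, not the published rule.
    """
    return sum(
        v.af
        for v in indels
        if v.chrom == target.chrom and abs(v.pos - target.pos) <= window
    )


def is_raf_hi(target, indels, window=10, threshold=1e-4):
    """Rare by sAF (<= threshold) but above the threshold by regional AF."""
    return target.af <= threshold and regional_af(target, indels, window) > threshold


# Hypothetical data: two nearby rare indels and one isolated rare indel.
indels = [Indel("1", 100, 5e-5), Indel("1", 105, 8e-5), Indel("1", 500, 2e-5)]
flags = [is_raf_hi(v, indels) for v in indels]
```

With this made-up data, the two neighbouring indels are each rare on their own but jointly exceed the threshold, while the isolated indel does not.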
Affiliation(s)
- Sarath Babu Krishna Murthy
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
- Sandy Yang
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
- Shiraz Bheda
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
- Nikita Tomar
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
- Haiyue Li
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
- Amir Yaghoobi
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
- Atlas Khan
  - Division of Nephrology, Department of Medicine, Columbia University, New York, NY, USA
- Krzysztof Kiryluk
  - Division of Nephrology, Department of Medicine, Columbia University, New York, NY, USA
- Joshua E Motelow
  - Division of Critical Care and Hospital Medicine, Department of Pediatrics, Columbia University Irving Medical Center, New York-Presbyterian Morgan Stanley Children's Hospital, New York, New York, USA
- Nick Ren
  - Institute for Genomic Medicine, Columbia University, New York, NY, USA
- Ali G Gharavi
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
  - Division of Nephrology, Department of Medicine, Columbia University, New York, NY, USA
- Hila Milo Rasouly
  - Center for Precision Genetics and Genomics, Department of Medicine, Columbia University, New York, NY, USA
  - Division of Nephrology, Department of Medicine, Columbia University, New York, NY, USA
2
Steyaert W, Sagath L, Demidov G, Yépez VA, Esteve-Codina A, Gagneur J, Ellwanger K, Derks R, Weiss M, den Ouden A, van den Heuvel S, Swinkels H, Zomer N, Steehouwer M, O'Gorman L, Astuti G, Neveling K, Schüle R, Xu J, Synofzik M, Beijer D, Hengel H, Schöls L, Claeys KG, Baets J, Van de Vondel L, Ferlini A, Selvatici R, Morsy H, Saeed Abd Elmaksoud M, Straub V, Müller J, Pini V, Perry L, Sarkozy A, Zaharieva I, Muntoni F, Bugiardini E, Polavarapu K, Horvath R, Reid E, Lochmüller H, Spinazzi M, Savarese M, Matalonga L, Laurie S, Brunner HG, Graessner H, Beltran S, Ossowski S, Vissers LELM, Gilissen C, Hoischen A. Unravelling undiagnosed rare disease cases by HiFi long-read genome sequencing. medRxiv 2024:2024.05.03.24305331. [PMID: 38746462 PMCID: PMC11092722 DOI: 10.1101/2024.05.03.24305331] [Indexed: 05/16/2024]
Abstract
Solve-RD is a pan-European rare disease (RD) research program that aims to identify disease-causing genetic variants in previously undiagnosed RD families. We utilised 10-fold coverage HiFi long-read sequencing (LRS) for detecting causative structural variants (SVs), single nucleotide variants (SNVs), insertion-deletions (InDels), and short tandem repeat (STR) expansions in extensively studied RD families without clear molecular diagnoses. Our cohort includes 293 individuals from 114 genetically undiagnosed RD families selected by European Rare Disease Network (ERN) experts. Of these, 21 families were affected by so-called 'unsolvable' syndromes for which genetic causes remain unknown, and 93 families had at least one individual affected by a rare neurological, neuromuscular, or epilepsy disorder without genetic diagnosis despite extensive prior testing. Clinical interpretation and orthogonal validation of variants in known disease genes yielded thirteen novel genetic diagnoses due to de novo and rare inherited SNVs, InDels, SVs, and STR expansions. In an additional four families, we identified a candidate disease-causing SV affecting several genes including an MCF2/FGF13 fusion and PSMA3 deletion. However, no common genetic cause was identified in any of the 'unsolvable' syndromes. Taken together, we found (likely) disease-causing genetic variants in 13.0% of previously unsolved families and additional candidate disease-causing SVs in another 4.3% of these families. In conclusion, our results demonstrate the added value of HiFi long-read genome sequencing in undiagnosed rare diseases.
3
Qiao L, Li C, Lin W, He X, Mi J, Tong Y, Gao J. ViroISDC: a method for calling integration sites of hepatitis B virus based on feature encoding. BMC Bioinformatics 2024; 25:177. [PMID: 38704528 PMCID: PMC11070082 DOI: 10.1186/s12859-024-05763-0] [Received: 09/22/2023] [Accepted: 03/26/2024] [Indexed: 05/06/2024] Open
Abstract
BACKGROUND: Hepatitis B virus (HBV) integrates into human chromosomes and can lead to genomic instability and hepatocarcinogenesis. Current tools for HBV integration site detection lack accuracy and stability.
RESULTS: This study proposes a deep learning-based method, named ViroISDC, for detecting integration sites. ViroISDC generates corresponding grammar rules and encodes the characteristics of the language data to predict integration sites accurately. Compared with Lumpy, Pindel, Seeksv, and SurVirus, ViroISDC exhibits better overall performance and is less sensitive to sequencing depth and integration sequence length, displaying good reliability, stability, and generality. Further downstream analysis of integrated sites detected by ViroISDC reveals the integration patterns and features of HBV. It is observed that HBV integration exhibits specific chromosomal preferences and tends to integrate into cancerous tissue. Moreover, HBV integration frequency was higher in males than females, and high-frequency integration sites were more likely to be present on hepatocarcinogenesis- and anti-cancer-related genes, validating the reliability of ViroISDC.
CONCLUSIONS: The ViroISDC pipeline exhibits superior precision, stability, and reliability across various datasets when compared to similar software. It is invaluable in exploring HBV infection in the human body, holding significant implications for the diagnosis, treatment, and prognosis assessment of hepatocellular carcinoma (HCC).
Affiliation(s)
- Lei Qiao
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Chang Li
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Wei Lin
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Xiaoqi He
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Jia Mi
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Yigang Tong
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Jingyang Gao
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
4
Tsardakas Renhuldt N, Bentzer J, Ahrén D, Marmon S, Sirijovski N. Phenotypic characterization and candidate gene analysis of a short kernel and brassinosteroid insensitive mutant from hexaploid oat (Avena sativa). Front Plant Sci 2024; 15:1358490. [PMID: 38736447 PMCID: PMC11082396 DOI: 10.3389/fpls.2024.1358490] [Received: 12/19/2023] [Accepted: 03/27/2024] [Indexed: 05/14/2024]
Abstract
In an ethyl methanesulfonate oat (Avena sativa) mutant population we have found a mutant with striking differences to the wild-type (WT) cv. Belinda. We phenotyped the mutant and compared it to the WT. The mutant was crossed to the WT and mapping-by-sequencing was performed on a pool of F2 individuals sharing the mutant phenotype, and variants were called. The impacts of the variants on genes present in the reference genome annotation were estimated. The mutant allele frequency distribution was combined with expression data to identify which among the affected genes was likely to cause the observed phenotype. A brassinosteroid sensitivity assay was performed to validate one of the identified candidates. A literature search was performed to identify homologs of genes known to be involved in seed shape from other species. The mutant had short kernels, compact spikelets, altered plant architecture, and was found to be insensitive to brassinosteroids when compared to the WT. The segregation of WT and mutant phenotypes in the F2 population was indicative of a recessive mutation of a single locus. The causal mutation was found to be one of 123 single-nucleotide polymorphisms (SNPs) spanning the entire chromosome 3A, with further filtering narrowing this down to six candidate genes. In-depth analysis of these candidate genes and the brassinosteroid sensitivity assay suggest that a Pro303Leu substitution in AVESA.00010b.r2.3AG0419820.1 could be the causal mutation of the short kernel mutant phenotype. We identified 298 oat proteins belonging to orthogroups of previously published seed shape genes, with AVESA.00010b.r2.3AG0419820.1 being the only of these affected by a SNP in the mutant. The AVESA.00010b.r2.3AG0419820.1 candidate is functionally annotated as a GSK3/SHAGGY-like kinase with homologs in Arabidopsis, wheat, barley, rice, and maize, with several of these proteins having known mutants giving rise to brassinosteroid insensitivity and shorter seeds. 
The substitution in AVESA.00010b.r2.3AG0419820.1 affects a residue with a known gain-of-function substitution in Arabidopsis BRASSINOSTEROID-INSENSITIVE2. We propose a gain-of-function mutation in AVESA.00010b.r2.3AG0419820.1 as the most likely cause of the observed phenotype, and name the gene AsGSK2.1. The findings presented here provide potential targets for oat breeders, and a step on the way towards understanding brassinosteroid signaling, seed shape and nutrition in oats.
Affiliation(s)
- Nikos Tsardakas Renhuldt
  - ScanOats Industrial Research Centre, Department of Chemistry, Division of Pure and Applied Biochemistry, Lund University, Lund, Sweden
- Johan Bentzer
  - ScanOats Industrial Research Centre, Department of Chemistry, Division of Pure and Applied Biochemistry, Lund University, Lund, Sweden
- Dag Ahrén
  - National Bioinformatics Infrastructure Sweden (NBIS), SciLifeLab, Department of Biology, Lund University, Lund, Sweden
- Sofia Marmon
  - ScanOats Industrial Research Centre, Department of Chemistry, Division of Pure and Applied Biochemistry, Lund University, Lund, Sweden
- Nick Sirijovski
  - ScanOats Industrial Research Centre, Department of Chemistry, Division of Pure and Applied Biochemistry, Lund University, Lund, Sweden
  - CropTailor AB, Department of Chemistry, Division of Pure and Applied Biochemistry, Lund University, Lund, Sweden
5
Villani F, Guarracino A, Ward RR, Green T, Emms M, Pravenec M, Prins P, Garrison E, Williams RW, Chen H, Colonna V. Pangenome reconstruction in rats enhances genotype-phenotype mapping and novel variant discovery. bioRxiv 2024:2024.01.10.575041. [PMID: 38260597 PMCID: PMC10802574 DOI: 10.1101/2024.01.10.575041] [Indexed: 01/24/2024]
Abstract
The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We built a pangenome graph from 10x Genomics Linked-Read data for 31 recombinant inbred rats to study genetic variation and association mapping. The pangenome includes 0.2 Gb of sequence that is not present in the reference mRatBN7.2, confirming the capture of substantial additional variation. We validated variants in challenging regions, including complex structural variants resolving into multiple haplotypes. Phenome-wide association analysis of validated SNPs uncovered variants associated with glucose/insulin levels and hippocampal gene expression. We propose an interaction between Pirl1l1, chromogranin expression, TNF-α levels, and insulin regulation. This study demonstrates the utility of linked-read pangenomes for comprehensive variant detection and mapping phenotypic diversity in a widely used rat genetic reference panel.
Affiliation(s)
- Flavia Villani
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Andrea Guarracino
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Rachel R Ward
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center
- Tomomi Green
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center
- Madeleine Emms
  - Institute of Genetics and Biophysics, National Research Council, Naples, 80111, Italy
- Michal Pravenec
  - Institute of Physiology, Czech Academy of Sciences, 14200 Prague, Czech Republic
- Pjotr Prins
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Erik Garrison
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Robert W. Williams
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Hao Chen
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center
- Vincenza Colonna
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Institute of Genetics and Biophysics, National Research Council, Naples, 80111, Italy
6
de Jong TV, Pan Y, Rastas P, Munro D, Tutaj M, Akil H, Benner C, Chen D, Chitre AS, Chow W, Colonna V, Dalgard CL, Demos WM, Doris PA, Garrison E, Geurts AM, Gunturkun HM, Guryev V, Hourlier T, Howe K, Huang J, Kalbfleisch T, Kim P, Li L, Mahaffey S, Martin FJ, Mohammadi P, Ozel AB, Polesskaya O, Pravenec M, Prins P, Sebat J, Smith JR, Solberg Woods LC, Tabakoff B, Tracey A, Uliano-Silva M, Villani F, Wang H, Sharp BM, Telese F, Jiang Z, Saba L, Wang X, Murphy TD, Palmer AA, Kwitek AE, Dwinell MR, Williams RW, Li JZ, Chen H. A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats. Cell Genom 2024; 4:100527. [PMID: 38537634 PMCID: PMC11019364 DOI: 10.1016/j.xgen.2024.100527] [Received: 10/02/2023] [Revised: 12/26/2023] [Accepted: 02/29/2024] [Indexed: 04/09/2024]
Abstract
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors by approximately 9-fold, and increases contiguity by 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomics datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ∼20.0 million sequence variations, of which 18,700 are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.
Affiliation(s)
- Tristan V de Jong
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
- Yanchao Pan
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Pasi Rastas
  - Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Daniel Munro
  - Department of Psychiatry, University of California San Diego, San Diego, CA, USA; Department of Integrative Structural and Computational Biology, Scripps Research, San Diego, CA, USA
- Monika Tutaj
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA; Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
- Huda Akil
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI, USA
- Chris Benner
  - Department of Medicine, University of California San Diego, San Diego, CA, USA
- Denghui Chen
  - Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- Apurva S Chitre
  - Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- William Chow
  - Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Vincenza Colonna
  - Institute of Genetics and Biophysics, National Research Council, Naples, Italy; Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Clifton L Dalgard
  - Department of Anatomy, Physiology & Genetics, The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Wendy M Demos
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA; Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
- Peter A Doris
  - The Brown Foundation Institute of Molecular Medicine, Center for Human Genetics, University of Texas Health Science Center, Houston, TX, USA
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Aron M Geurts
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA
- Hakan M Gunturkun
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
- Victor Guryev
  - Genome Structure and Ageing, University of Groningen, UMC, Groningen, the Netherlands
- Thibaut Hourlier
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus in Hinxton, Cambridgeshire, UK
- Kerstin Howe
  - Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Jun Huang
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
- Ted Kalbfleisch
  - Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Louisville, KY, USA
- Panjun Kim
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Ling Li
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA; Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Spencer Mahaffey
  - Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Fergal J Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus in Hinxton, Cambridgeshire, UK
- Pejman Mohammadi
  - Center for Immunity and Immunotherapies, Seattle Children's Research Institute, Seattle, WA, USA; Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, USA
- Ayse Bilge Ozel
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Oksana Polesskaya
  - Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- Michal Pravenec
  - Institute of Physiology, Czech Academy of Sciences, Prague, Czechia
- Pjotr Prins
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Jonathan Sebat
  - Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- Jennifer R Smith
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA; Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
- Leah C Solberg Woods
  - Department of Internal Medicine, Section on Molecular Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Boris Tabakoff
  - Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Alan Tracey
  - Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Flavia Villani
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Hongyang Wang
  - Department of Animal Sciences, Washington State University, Pullman, WA, USA
- Burt M Sharp
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Francesca Telese
  - Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- Zhihua Jiang
  - Department of Animal Sciences, Washington State University, Pullman, WA, USA
- Laura Saba
  - Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Xusheng Wang
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA; Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Terence D Murphy
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Abraham A Palmer
  - Department of Psychiatry, University of California San Diego, San Diego, CA, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
- Anne E Kwitek
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA; Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
- Melinda R Dwinell
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA; Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
- Robert W Williams
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Jun Z Li
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Hao Chen
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
7
Zhang S, Xu N, Fu L, Yang X, Li Y, Yang Z, Feng Y, Ma K, Jiang X, Han J, Hu R, Zhang L, de Gennaro L, Ryabov F, Meng D, He Y, Wu D, Yang C, Paparella A, Mao Y, Bian X, Lu Y, Antonacci F, Ventura M, Shepelev VA, Miga KH, Alexandrov IA, Logsdon GA, Phillippy AM, Su B, Zhang G, Eichler EE, Lu Q, Shi Y, Sun Q, Mao Y. Comparative genomics of macaques and integrated insights into genetic variation and population history. bioRxiv 2024:2024.04.07.588379. [PMID: 38645259 PMCID: PMC11030432 DOI: 10.1101/2024.04.07.588379] [Indexed: 04/23/2024]
Abstract
The crab-eating macaques (Macaca fascicularis) and rhesus macaques (M. mulatta) are widely studied nonhuman primates in biomedical and evolutionary research. Despite their significance, the current understanding of the complex genomic structure in macaques and the differences between species requires substantial improvement. Here, we present a complete genome assembly of a crab-eating macaque and 20 haplotype-resolved macaque assemblies to investigate the complex regions and major genomic differences between species. Segmental duplication in macaques is ∼42% lower, while centromeres are ∼3.7 times longer than those in humans. The characterization of ∼2 Mbp fixed genetic variants and ∼240 Mbp complex loci highlights potential associations with metabolic differences between the two macaque species (e.g., CYP2C76 and EHBP1L1). Additionally, hundreds of alternative splicing differences show post-transcriptional regulation divergence between these two species (e.g., PNPO). We also characterize 91 large-scale genomic differences between macaques and humans at a single-base-pair resolution and highlight their impact on gene regulation in primate evolution (e.g., FOLH1 and PIEZO2). Finally, population genetics recapitulates macaque speciation and selective sweeps, highlighting the potential genetic basis of reproduction and tail phenotype differences (e.g., STAB1, SEMA3F, and HOXD13). In summary, the integrated analysis of genetic variation and population genetics in macaques greatly enhances our comprehension of lineage-specific phenotypes, adaptation, and primate evolution, thereby improving their biomedical applications in human diseases.
8
Dumont BL, Gatti DM, Ballinger MA, Lin D, Phifer-Rixey M, Sheehan MJ, Suzuki TA, Wooldridge LK, Frempong HO, Lawal RA, Churchill GA, Lutz C, Rosenthal N, White JK, Nachman MW. Into the Wild: A novel wild-derived inbred strain resource expands the genomic and phenotypic diversity of laboratory mouse models. PLoS Genet 2024; 20:e1011228. [PMID: 38598567 PMCID: PMC11034653 DOI: 10.1371/journal.pgen.1011228] [Received: 11/28/2023] [Revised: 04/22/2024] [Accepted: 03/18/2024] [Indexed: 04/12/2024] Open
Abstract
The laboratory mouse has served as the premier animal model system for both basic and preclinical investigations for over a century. However, laboratory mice capture only a subset of the genetic variation found in wild mouse populations, ultimately limiting the potential of classical inbred strains to uncover phenotype-associated variants and pathways. Wild mouse populations are reservoirs of genetic diversity that could facilitate the discovery of new functional and disease-associated alleles, but the scarcity of commercially available, well-characterized wild mouse strains limits their broader adoption in biomedical research. To overcome this barrier, we have recently developed, sequenced, and phenotyped a set of 11 inbred strains derived from wild-caught Mus musculus domesticus. Each of these "Nachman strains" immortalizes a unique wild haplotype sampled from one of five environmentally distinct locations across North and South America. Whole genome sequence analysis reveals that each strain carries between 4.73 and 6.54 million single nucleotide differences relative to the GRCm39 mouse reference, with 42.5% of variants in the Nachman strain genomes absent from current classical inbred mouse strain panels. We phenotyped the Nachman strains on a customized pipeline to assess the scope of disease-relevant neurobehavioral, biochemical, physiological, metabolic, and morphological trait variation. The Nachman strains exhibit significant inter-strain variation in >90% of 1119 surveyed traits and expand the range of phenotypic diversity captured in classical inbred strain panels. These novel wild-derived inbred mouse strain resources are set to empower new discoveries in both basic and preclinical research.
Affiliation(s)
- Beth L. Dumont
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
  - Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
- Daniel M. Gatti
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Mallory A. Ballinger
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, New York, United States of America
- Dana Lin
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Megan Phifer-Rixey
  - Department of Biology, Drexel University, Philadelphia, Pennsylvania, United States of America
- Michael J. Sheehan
  - Department of Neurobiology and Behavior, Cornell University, Ithaca, New York, United States of America
- Taichi A. Suzuki
  - College of Health Solutions and Biodesign Center for Health Through Microbiomes, Arizona State University, Tempe, Arizona, United States of America
- Lydia K. Wooldridge
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Hilda Opoku Frempong
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
- Raman Akinyanju Lawal
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Gary A. Churchill
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
  - Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
- Cathleen Lutz
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Nadia Rosenthal
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
  - Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
  - National Heart and Lung Institute, Imperial College London, London, United Kingdom
- Jacqueline K. White
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Michael W. Nachman
  - Department of Integrative Biology, Museum of Vertebrate Zoology, and Center for Computational Biology, University of California, Berkeley, Berkeley, California, United States of America
|
9
|
Leonard AS, Mapel XM, Pausch H. Pangenome-genotyped structural variation improves molecular phenotype mapping in cattle. Genome Res 2024; 34:300-309. [PMID: 38355307 PMCID: PMC10984387 DOI: 10.1101/gr.278267.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 02/01/2024] [Indexed: 02/16/2024]
Abstract
Expression and splicing quantitative trait loci (e/sQTL) are large contributors to phenotypic variability. Achieving sufficient statistical power for e/sQTL mapping requires large cohorts with both genotypes and molecular phenotypes, and so, the genomic variation is often called from short-read alignments, which are unable to comprehensively resolve structural variation. Here we build a pangenome from 16 HiFi haplotype-resolved cattle assemblies to identify small and structural variation and genotype them with PanGenie in 307 short-read samples. We find high (>90%) concordance of PanGenie-genotyped and DeepVariant-called small variation and confidently genotype close to 21 million small and 43,000 structural variants in the larger population. We validate 85% of these structural variants (with MAF > 0.1) directly with a subset of 25 short-read samples that also have medium coverage HiFi reads. We then conduct e/sQTL mapping with this comprehensive variant set in a subset of 117 cattle that have testis transcriptome data, and find 92 structural variants as causal candidates for eQTL and 73 for sQTL. We find that roughly half of the top associated structural variants affecting expression or splicing are transposable elements, such as SV-eQTL for STN1 and MYH7 and SV-sQTL for CEP89 and ASAH2. Extensive linkage disequilibrium between small and structural variation results in only 28 additional eQTL and 17 sQTL discovered when including SVs, although many top associated SVs are compelling candidates.
Affiliation(s)
- Xena M Mapel
  - Animal Genomics, ETH Zurich, 8092 Zurich, Switzerland
- Hubert Pausch
  - Animal Genomics, ETH Zurich, 8092 Zurich, Switzerland
|
10
|
Olbrich M, Bartels L, Wohlers I. Sequencing technologies and hardware-accelerated parallel computing transform computational genomics research. Frontiers in Bioinformatics 2024; 4:1384497. [PMID: 38567256 PMCID: PMC10985184 DOI: 10.3389/fbinf.2024.1384497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 03/07/2024] [Indexed: 04/04/2024] Open
Affiliation(s)
- Michael Olbrich
  - Center for Biotechnology, Khalifa University for Science and Technology, Abu Dhabi, United Arab Emirates
- Lennart Bartels
  - Biomolecular Data Science in Pneumology, Research Center Borstel, Borstel, Germany
- Inken Wohlers
  - Biomolecular Data Science in Pneumology, Research Center Borstel, Borstel, Germany
  - University of Lübeck, Lübeck, Germany
|
11
|
Ablooglu AJ, Chen WS, Xie Z, Desai A, Paul S, Lack JB, Scott LA, Eisch AR, Dudek AZ, Parikh SM, Druey KM. Intrinsic endothelial hyperresponsiveness to inflammatory mediators drives acute episodes in models of Clarkson disease. J Clin Invest 2024; 134:e169137. [PMID: 38502192 DOI: 10.1172/jci169137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 03/08/2024] [Indexed: 03/21/2024] Open
Abstract
Clarkson disease, or monoclonal gammopathy-associated idiopathic systemic capillary leak syndrome (ISCLS), is a rare, relapsing-remitting disorder featuring the abrupt extravasation of fluids and proteins into peripheral tissues, which in turn leads to hypotensive shock, severe hemoconcentration, and hypoalbuminemia. The specific leakage factor(s) and pathways in ISCLS are unknown, and there is no effective treatment for acute flares. Here, we characterize an autonomous vascular endothelial defect in ISCLS that was recapitulated in patient-derived endothelial cells (ECs) in culture and in a mouse model of disease. ISCLS-derived ECs were functionally hyperresponsive to permeability-inducing factors like VEGF and histamine, in part due to increased endothelial nitric oxide synthase (eNOS) activity. eNOS blockade by administration of N(γ)-nitro-L-arginine methyl ester (L-NAME) ameliorated vascular leakage in an SJL/J mouse model of ISCLS induced by histamine or VEGF challenge. eNOS mislocalization and decreased protein phosphatase 2A (PP2A) expression may contribute to eNOS hyperactivation in ISCLS-derived ECs. Our findings provide mechanistic insights into microvascular barrier dysfunction in ISCLS and highlight a potential therapeutic approach.
Affiliation(s)
- Ararat J Ablooglu
  - Lung and Vascular Inflammation Section, Laboratory of Allergic Diseases, and
- Wei-Sheng Chen
  - Lung and Vascular Inflammation Section, Laboratory of Allergic Diseases, and
- Zhihui Xie
  - Lung and Vascular Inflammation Section, Laboratory of Allergic Diseases, and
- Abhishek Desai
  - Lung and Vascular Inflammation Section, Laboratory of Allergic Diseases, and
- Subrata Paul
  - Integrative Data Sciences Section, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland, USA
- Justin B Lack
  - Integrative Data Sciences Section, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland, USA
- Linda A Scott
  - Lung and Vascular Inflammation Section, Laboratory of Allergic Diseases, and
- A Robin Eisch
  - Lung and Vascular Inflammation Section, Laboratory of Allergic Diseases, and
- Arkadiusz Z Dudek
  - Division of Medical Oncology, Mayo Clinic, Rochester, Minnesota, USA
- Samir M Parikh
  - Division of Nephrology, Departments of Internal Medicine and Pharmacology, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Kirk M Druey
  - Lung and Vascular Inflammation Section, Laboratory of Allergic Diseases, and
|
12
|
Wang N, Chen P, Xu Y, Guo L, Li X, Yi H, Larkin RM, Zhou Y, Deng X, Xu Q. Phased genomics reveals hidden somatic mutations and provides insight into fruit development in sweet orange. Horticulture Research 2024; 11:uhad268. [PMID: 38371640 PMCID: PMC10873711 DOI: 10.1093/hr/uhad268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Accepted: 12/01/2023] [Indexed: 02/20/2024]
Abstract
Although revisiting the discoveries and implications of genetic variations using phased genomics is critical, such efforts are still lacking. Somatic mutations represent a crucial source of genetic diversity for breeding and are especially remarkable in heterozygous perennial and asexual crops. In this study, we focused on a diploid sweet orange (Citrus sinensis) and constructed a haplotype-resolved genome using high fidelity (HiFi) reads, which revealed 10.6% new sequences. Based on the phased genome, we elucidate significant genetic admixtures and haplotype differences. We developed a somatic detection strategy that reveals hidden somatic mutations overlooked in a single reference genome. We generated a phased somatic variation map by combining high-depth whole-genome sequencing (WGS) data from 87 sweet orange somatic varieties. Notably, we found twice as many somatic mutations relative to a single reference genome. Using these hidden somatic mutations, we separated sweet oranges into seven major clades and provide insight into unprecedented genetic mosaicism and strong positive selection. Furthermore, these phased genomics data indicate that genomic heterozygous variations contribute to allele-specific expression during fruit development. By integrating allelic expression differences and somatic mutations, we identified a somatic mutation that induces increases in fruit size. Applications of phased genomics will lead to powerful approaches for discovering genetic variations and uncovering their effects in highly heterozygous plants. Our data provide insight into the hidden somatic mutation landscape in the sweet orange genome, which will facilitate citrus breeding.
Affiliation(s)
- Nan Wang
  - Institute of Horticultural Research, Hunan Academy of Agricultural Sciences, Changsha, China
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - National Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Peng Chen
  - Institute of Horticultural Research, Hunan Academy of Agricultural Sciences, Changsha, China
  - Yuelu Mountain Laboratory, Changsha, China
- Yuanyuan Xu
  - Institute of Horticultural Research, Hunan Academy of Agricultural Sciences, Changsha, China
  - Yuelu Mountain Laboratory, Changsha, China
- Lingxia Guo
  - Institute of Horticultural Research, Hunan Academy of Agricultural Sciences, Changsha, China
  - Yuelu Mountain Laboratory, Changsha, China
- Xianxin Li
  - Institute of Horticultural Research, Hunan Academy of Agricultural Sciences, Changsha, China
  - Yuelu Mountain Laboratory, Changsha, China
- Hualin Yi
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Robert M Larkin
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Yongfeng Zhou
  - National Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
  - National Key Laboratory of Tropical Crop Breeding, Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
- Xiuxin Deng
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Qiang Xu
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
|
13
|
Mohsin N, Hunt D, Yan J, Jabbour AJ, Nghiem P, Choi J, Zhang Y, Freeman AF, Bergerson JRE, Dell’Orso S, Lachance K, Kulikauskas R, Collado L, Cao W, Lack J, Similuk M, Seifert BA, Ghosh R, Walkiewicz MA, Brownell I. Genetic Risk Factors for Early-Onset Merkel Cell Carcinoma. JAMA Dermatol 2024; 160:172-178. [PMID: 38170500 PMCID: PMC10765310 DOI: 10.1001/jamadermatol.2023.5362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Accepted: 11/03/2023] [Indexed: 01/05/2024]
Abstract
Importance: Merkel cell carcinoma (MCC) is a rare, aggressive neuroendocrine skin cancer. Of the patients who develop MCC annually, only 4% are younger than 50 years. Objective: To identify genetic risk factors for early-onset MCC via genomic sequencing. Design, Setting, and Participants: The study represents a multicenter collaboration between the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), the National Institute of Allergy and Infectious Diseases (NIAID), and the University of Washington. Participants with early-onset and later-onset MCC were prospectively enrolled in an institutional review board-approved study at the University of Washington between January 2003 and May 2019. Unrelated controls were enrolled in the NIAID Centralized Sequencing Program (CSP) between September 2017 and September 2021. Analysis was performed from September 2021 to March 2023. Early-onset MCC was defined as disease occurrence in individuals younger than 50 years. Later-onset MCC was defined as disease occurrence at age 50 years or older. Unrelated controls were evaluated by the NIAID CSP for reasons other than familial cancer syndromes, including immunological, neurological, and psychiatric disorders. Results: This case-control analysis included 1012 participants: 37 with early-onset MCC, 45 with later-onset MCC, and 930 unrelated controls. Among 37 patients with early-onset MCC, 7 (19%) had well-described variants in genes associated with cancer predisposition. Six patients had variants associated with hereditary cancer syndromes (ATM = 2, BRCA1 = 2, BRCA2 = 1, and TP53 = 1) and 1 patient had a variant associated with immunodeficiency and lymphoma (MAGT1). Compared with 930 unrelated controls, the early-onset MCC cohort was significantly enriched for cancer-predisposing pathogenic or likely pathogenic variants in these 5 genes (odds ratio, 30.35; 95% CI, 8.89-106.30; P < .001). No germline disease variants in these genes were identified in 45 patients with later-onset MCC. Additional variants in DNA repair genes were also identified among patients with MCC. Conclusions and Relevance: Because variants in certain DNA repair and cancer predisposition genes are associated with early-onset MCC, genetic counseling and testing should be considered for patients presenting at younger than 50 years.
Affiliation(s)
- Noreen Mohsin
  - Dermatology Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), National Institutes of Health (NIH), Bethesda, Maryland
- Devin Hunt
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland
- Jia Yan
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland
- Paul Nghiem
  - Division of Dermatology, University of Washington, Seattle
- Jaehyuk Choi
  - Northwestern University Department of Dermatology and Department of Biochemistry and Molecular Genetics, Chicago, Illinois
- Yue Zhang
  - Northwestern University Department of Dermatology and Department of Biochemistry and Molecular Genetics, Chicago, Illinois
- Alexandra F. Freeman
  - Laboratory of Clinical Immunology and Microbiology, NIAID, NIH, Bethesda, Maryland
- Loren Collado
  - Dermatology Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), National Institutes of Health (NIH), Bethesda, Maryland
- Wenjia Cao
  - Collaborative Bioinformatics Resource, NIAID, NIH, Bethesda, Maryland
- Justin Lack
  - Collaborative Bioinformatics Resource, NIAID, NIH, Bethesda, Maryland
- Morgan Similuk
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland
- Bryce A. Seifert
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland
- Rajarshi Ghosh
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland
- Magdalena A. Walkiewicz
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases (NIAID), NIH, Bethesda, Maryland
- Isaac Brownell
  - Dermatology Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), National Institutes of Health (NIH), Bethesda, Maryland
|
14
|
Mapel XM, Kadri NK, Leonard AS, He Q, Lloret-Villas A, Bhati M, Hiltpold M, Pausch H. Molecular quantitative trait loci in reproductive tissues impact male fertility in cattle. Nat Commun 2024; 15:674. [PMID: 38253538 PMCID: PMC10803364 DOI: 10.1038/s41467-024-44935-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 01/08/2024] [Indexed: 01/24/2024] Open
Abstract
Breeding bulls are well suited to investigate inherited variation in male fertility because they are genotyped and their reproductive success is monitored through semen analyses and thousands of artificial inseminations. However, functional data from relevant tissues are lacking in cattle, which prevents fine-mapping fertility-associated genomic regions. Here, we characterize gene expression and splicing variation in testis, epididymis, and vas deferens transcriptomes of 118 mature bulls and conduct association tests between 414,667 molecular phenotypes and 21,501,032 genome-wide variants to identify 41,156 regulatory loci. We show broad consensus in tissue-specific and tissue-enriched gene expression between the three bovine tissues and their human and murine counterparts. Expression- and splicing-mediating variants are more than three times as frequent in testis as in epididymis and vas deferens, highlighting the transcriptional complexity of testis. Finally, we identify genes (WDR19, SPATA16, KCTD19, ZDHHC1) and molecular phenotypes that are associated with quantitative variation in male fertility through transcriptome-wide association and colocalization analyses.
Affiliation(s)
- Xena Marie Mapel
  - Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Naveen Kumar Kadri
  - Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Alexander S Leonard
  - Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Qiongyu He
  - Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Meenu Bhati
  - Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
  - Roslin Institute, The University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Maya Hiltpold
  - Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet Tolosan, France
- Hubert Pausch
  - Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
|
15
|
Barbitoff YA, Ushakov MO, Lazareva TE, Nasykhova YA, Glotov AS, Predeus AV. Bioinformatics of germline variant discovery for rare disease diagnostics: current approaches and remaining challenges. Brief Bioinform 2024; 25:bbad508. [PMID: 38271481 PMCID: PMC10810331 DOI: 10.1093/bib/bbad508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/18/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.
Affiliation(s)
- Yury A Barbitoff
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
  - Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
- Mikhail O Ushakov
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Tatyana E Lazareva
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Yulia A Nasykhova
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Andrey S Glotov
  - Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Alexander V Predeus
  - Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
|
16
|
Poterba T, Vittal C, King D, Goldstein D, Goldstein JI, Schultz P, Karczewski KJ, Seed C, Neale BM. The Scalable Variant Call Representation: Enabling Genetic Analysis Beyond One Million Genomes. bioRxiv: the preprint server for biology 2024:2024.01.09.574205. [PMID: 38260295 PMCID: PMC10802441 DOI: 10.1101/2024.01.09.574205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
The Variant Call Format (VCF) is widely used in genome sequencing but scales poorly. For instance, we estimate a 150,000 genome VCF would occupy 900 TiB, making it both costly and complicated to produce and analyze. The issue stems from VCF's requirement to densely represent both reference-genotypes and allele-indexed arrays. These requirements lead to unnecessary data duplication and, ultimately, very large files. To address these challenges, we introduce the Scalable Variant Call Representation (SVCR). This representation reduces file sizes by ensuring they scale linearly with samples. SVCR achieves this by adopting reference blocks from the Genomic Variant Call Format (GVCF) and employing local allele indices. SVCR is also lossless and mergeable, allowing for N+1 and N+K incremental joint-calling. We present two implementations of SVCR: SVCR-VCF, which encodes SVCR in VCF format, and VDS, which uses Hail's native format. Our experiments confirm the linear scalability of SVCR-VCF and VDS, in contrast to the super-linear growth seen with standard VCF files. We also discuss the VDS Combiner, a scalable, open-source tool for producing a VDS from GVCFs and unique features of VDS which enable rapid data analysis. SVCR, and VDS in particular, ensure the scientific community can generate, analyze, and disseminate genetics datasets with millions of samples.
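The "local allele indices" idea in the abstract above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the helper names `to_local` and `to_global` are hypothetical, the field names `la`, `lgt`, and `lad` are chosen for this example, and the sketch assumes a diploid call whose read depth is nonzero only at alleles the sample actually carries (plus the reference).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LocalCall:
    """One sample's call at a multiallelic site, stored with local indices."""
    la: List[int]          # site-level indices of alleles kept for this sample; reference (0) first
    lgt: Tuple[int, int]   # diploid genotype expressed as positions within `la`
    lad: List[int]         # allelic depths, one entry per allele in `la`


def to_local(gt: Tuple[int, int], ad: List[int]) -> LocalCall:
    """Keep only the reference and the alleles present in the genotype.

    `ad` has one entry per site-level allele, so its length grows with the
    number of alleles seen across the whole cohort; `lad` does not.
    """
    la = [0] + sorted({a for a in gt if a != 0})
    position = {site_index: i for i, site_index in enumerate(la)}
    lgt = (position[gt[0]], position[gt[1]])
    lad = [ad[site_index] for site_index in la]
    return LocalCall(la=la, lgt=lgt, lad=lad)


def to_global(call: LocalCall, n_alleles: int) -> Tuple[Tuple[int, int], List[int]]:
    """Expand a local call back to site-level indexing.

    Depths for alleles outside `la` are filled with 0, which matches the
    input only under the assumption stated above.
    """
    gt = (call.la[call.lgt[0]], call.la[call.lgt[1]])
    ad = [0] * n_alleles
    for i, site_index in enumerate(call.la):
        ad[site_index] = call.lad[i]
    return gt, ad


# A site with 5 alleles cohort-wide; this sample is heterozygous for allele 3.
call = to_local((0, 3), [12, 0, 0, 9, 0])
print(call.la, call.lgt, call.lad)
print(to_global(call, 5))
```

In this toy encoding the per-sample arrays have length 2 regardless of how many alternate alleles the cohort contributes at the site, which is the intuition behind avoiding dense allele-indexed arrays.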
Affiliation(s)
- Timothy Poterba
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Christopher Vittal
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Daniel King
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Daniel Goldstein
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Jacqueline I. Goldstein
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Patrick Schultz
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Konrad J. Karczewski
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Cotton Seed
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Benjamin M. Neale
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
|
17
|
Nosková A, Li C, Wang X, Leonard AS, Pausch H, Kadri N. Exploiting public databases of genomic variation to quantify evolutionary constraint on the branch point sequence in 30 plant and animal species. Nucleic Acids Res 2023; 51:12069-12075. [PMID: 37953306 PMCID: PMC10711541 DOI: 10.1093/nar/gkad970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 10/06/2023] [Accepted: 10/18/2023] [Indexed: 11/14/2023] Open
Abstract
The branch point sequence is a degenerate intronic heptamer required for the assembly of the spliceosome during pre-mRNA splicing. Disruption of this motif may promote alternative splicing and eventually cause phenotype variation. Despite its functional relevance, the branch point sequence is not included in most genome annotations. Here, we predict branch point sequences in 30 plant and animal species and attempt to quantify their evolutionary constraints using public variant databases. We find an implausible variant distribution in the databases from 16 of 30 examined species. Comparative analysis of variants from whole-genome sequencing shows that variants submitted from exome sequencing or false positive variants are widespread in public databases and cause these irregularities. We then investigate evolutionary constraint with largely unbiased public variant databases in 14 species and find that the fourth and sixth positions of the branch point sequence are more constrained than coding nucleotides. Our findings show that public variant databases should be scrutinized for possible biases before they are used to analyze evolutionary constraint.
Affiliation(s)
- Adéla Nosková
  - Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
- Chao Li
  - Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
  - International Joint Agriculture Research Center for Animal Bio-Breeding, Ministry of Agriculture and Rural Affairs/Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Xiaolong Wang
  - International Joint Agriculture Research Center for Animal Bio-Breeding, Ministry of Agriculture and Rural Affairs/Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Hubert Pausch
  - Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
- Naveen Kumar Kadri
  - Animal Genomics, ETH Zürich, Universitätstrasse 2, 8092 Zürich, Switzerland
|
18
|
Zhang YJ, Luo Z, Sun Y, Liu J, Chen Z. From beasts to bytes: Revolutionizing zoological research with artificial intelligence. Zool Res 2023; 44:1115-1131. [PMID: 37933101 PMCID: PMC10802096 DOI: 10.24272/j.issn.2095-8137.2023.263] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/21/2023] [Accepted: 10/30/2023] [Indexed: 11/08/2023] Open
Abstract
Since the late 2010s, Artificial Intelligence (AI) including machine learning, boosted through deep learning, has boomed as a vital tool to leverage computer vision, natural language processing and speech recognition in revolutionizing zoological research. This review provides an overview of the primary tasks, core models, datasets, and applications of AI in zoological research, including animal classification, resource conservation, behavior, development, genetics and evolution, breeding and health, disease models, and paleontology. Additionally, we explore the challenges and future directions of integrating AI into this field. Based on numerous case studies, this review outlines various avenues for incorporating AI into zoological research and underscores its potential to enhance our understanding of the intricate relationships that exist within the animal kingdom. As we build a bridge between beast and byte realms, this review serves as a resource for envisioning novel AI applications in zoological research that have not yet been explored.
Affiliation(s)
- Yu-Juan Zhang
  - Chongqing Key Laboratory of Vector Insects
  - Chongqing Key Laboratory of Animal Biology
  - College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Zeyu Luo
  - Chongqing Key Laboratory of Vector Insects
  - Chongqing Key Laboratory of Animal Biology
  - College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Yawen Sun
  - Chongqing Key Laboratory of Vector Insects
  - Chongqing Key Laboratory of Animal Biology
  - College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Junhao Liu
  - Chongqing Key Laboratory of Vector Insects
  - Chongqing Key Laboratory of Animal Biology
  - College of Life Science, Chongqing Normal University, Chongqing 401331, China
- Zongqing Chen
  - School of Mathematical Sciences
  - National Center for Applied Mathematics in Chongqing, Chongqing Normal University, Chongqing 401331, China
19
Pagnamenta AT, Camps C, Giacopuzzi E, Taylor JM, Hashim M, Calpena E, Kaisaki PJ, Hashimoto A, Yu J, Sanders E, Schwessinger R, Hughes JR, Lunter G, Dreau H, Ferla M, Lange L, Kesim Y, Ragoussis V, Vavoulis DV, Allroggen H, Ansorge O, Babbs C, Banka S, Baños-Piñero B, Beeson D, Ben-Ami T, Bennett DL, Bento C, Blair E, Brasch-Andersen C, Bull KR, Cario H, Cilliers D, Conti V, Davies EG, Dhalla F, Dacal BD, Dong Y, Dunford JE, Guerrini R, Harris AL, Hartley J, Hollander G, Javaid K, Kane M, Kelly D, Kelly D, Knight SJL, Kreins AY, Kvikstad EM, Langman CB, Lester T, Lines KE, Lord SR, Lu X, Mansour S, Manzur A, Maroofian R, Marsden B, Mason J, McGowan SJ, Mei D, Mlcochova H, Murakami Y, Németh AH, Okoli S, Ormondroyd E, Ousager LB, Palace J, Patel SY, Pentony MM, Pugh C, Rad A, Ramesh A, Riva SG, Roberts I, Roy N, Salminen O, Schilling KD, Scott C, Sen A, Smith C, Stevenson M, Thakker RV, Twigg SRF, Uhlig HH, van Wijk R, Vona B, Wall S, Wang J, Watkins H, Zak J, Schuh AH, Kini U, Wilkie AOM, Popitsch N, Taylor JC. Structural and non-coding variants increase the diagnostic yield of clinical whole genome sequencing for rare diseases. Genome Med 2023; 15:94. [PMID: 37946251 PMCID: PMC10636885 DOI: 10.1186/s13073-023-01240-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/14/2022] [Accepted: 09/27/2023] [Indexed: 11/12/2023] Open
Abstract
BACKGROUND Whole genome sequencing (WGS) is increasingly being used for the diagnosis of patients with rare diseases. However, the diagnostic yields of many studies, particularly those conducted in a healthcare setting, are often disappointingly low, at 25-30%. This is in part because although entire genomes are sequenced, analysis is often confined to in silico gene panels or coding regions of the genome.
METHODS We undertook WGS on a cohort of 122 unrelated rare disease patients and their relatives (300 genomes) who had been pre-screened by gene panels or arrays. Patients were recruited from a broad spectrum of clinical specialties. We applied a bioinformatics pipeline that would allow comprehensive analysis of all variant types. We combined established bioinformatics tools for phenotypic and genomic analysis with our novel algorithms (SVRare, ALTSPLICE and GREEN-DB) to detect and annotate structural, splice site and non-coding variants.
RESULTS Our diagnostic yield was 43/122 cases (35%), although 47/122 cases (39%) were considered solved when novel candidate genes with supporting functional data were taken into account. Structural, splice site and deep intronic variants contributed to 20/47 (43%) of our solved cases. Five genes that are novel, or were novel at the time of discovery, were identified, whilst a further three genes are putative novel disease genes with evidence of causality. We identified variants of uncertain significance in a further fourteen candidate genes. The phenotypic spectrum associated with RMND1 was expanded to include polymicrogyria. Two patients with secondary findings in FBN1 and KCNQ1 were confirmed to have previously unidentified Marfan and long QT syndromes, respectively, and were referred for further clinical interventions. Clinical diagnoses were changed in six patients and treatment adjustments made for eight individuals, which for five patients was considered life-saving.
CONCLUSIONS Genome sequencing is increasingly being considered as a first-line genetic test in routine clinical settings and can make a substantial contribution to rapidly identifying a causal aetiology for many patients, shortening their diagnostic odyssey. We have demonstrated that structural, splice site and intronic variants make a significant contribution to diagnostic yield and that comprehensive analysis of the entire genome is essential to maximise the value of clinical genome sequencing.
Affiliation(s)
- Alistair T Pagnamenta
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Carme Camps
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Edoardo Giacopuzzi
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Human Technopole, Viale Rita Levi Montalcini 1, 20157, Milan, Italy
- John M Taylor
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
- Mona Hashim
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Eduardo Calpena
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Pamela J Kaisaki
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Akiko Hashimoto
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Jing Yu
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Edward Sanders
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Ron Schwessinger
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Jim R Hughes
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Gerton Lunter
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
  - University Medical Center Groningen, Groningen University, PO Box 72, 9700 AB, Groningen, The Netherlands
- Helene Dreau
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
- Matteo Ferla
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Lukas Lange
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Yesim Kesim
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Vassilis Ragoussis
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Dimitrios V Vavoulis
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
- Holger Allroggen
  - Neurosciences Department, UHCW NHS Trust, Clifford Bridge Road, Coventry, CV2 2DX, UK
- Olaf Ansorge
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Christian Babbs
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Siddharth Banka
  - Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
  - Manchester Centre for Genomic Medicine, Saint Mary's Hospital, Oxford Road, Manchester, M13 9WL, UK
- Benito Baños-Piñero
  - Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
- David Beeson
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Tal Ben-Ami
  - Pediatric Hematology-Oncology Unit, Kaplan Medical Center, Rehovot, Israel
- David L Bennett
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Celeste Bento
  - Hematology Department, Hospitais da Universidade de Coimbra, Coimbra, Portugal
- Edward Blair
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
- Charlotte Brasch-Andersen
  - Department of Clinical Genetics, Odense University Hospital and Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- Katherine R Bull
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN, UK
- Holger Cario
  - Department of Pediatrics and Adolescent Medicine, University Medical Center, Eythstrasse 24, 89075, Ulm, Germany
- Deirdre Cilliers
  - Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
- Valerio Conti
  - Neuroscience Department, Meyer Children's Hospital IRCCS, Viale Pieraccini 24, 50139, Florence, Italy
- E Graham Davies
  - Department of Immunology, Great Ormond Street Hospital for Children NHS Trust and UCL Great Ormond Street Institute of Child Health, Zayed Centre for Research, 2nd Floor, 20C Guilford Street, London, WC1N 1DZ, UK
- Fatima Dhalla
  - Department of Paediatrics, Institute of Developmental and Regenerative Medicine, IMS-Tetsuya Nakamura Building, Old Road Campus, Roosevelt Drive, Oxford, OX3 7TY, UK
- Beatriz Diez Dacal
  - Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
- Yin Dong
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- James E Dunford
  - Oxford NIHR Musculoskeletal BRC and Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Old Road, Oxford, OX3 7HE, UK
- Renzo Guerrini
  - Neuroscience Department, Meyer Children's Hospital IRCCS, Viale Pieraccini 24, 50139, Florence, Italy
- Adrian L Harris
  - Department of Oncology, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, UK
- Jane Hartley
  - Liver Unit, Birmingham Women's & Children's Hospital and University of Birmingham, Steelhouse Lane, Birmingham, B4 6NH, UK
- Georg Hollander
  - Department of Paediatrics, University of Oxford, Level 2, Children's Hospital, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Kassim Javaid
  - Oxford NIHR Musculoskeletal BRC and Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Old Road, Oxford, OX3 7HE, UK
- Maureen Kane
  - Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Pharmacy Hall North, Room 731, 20 N. Pine Street, Baltimore, MD, 21201, USA
- Deirdre Kelly
  - Liver Unit, Birmingham Women's & Children's Hospital and University of Birmingham, Steelhouse Lane, Birmingham, B4 6NH, UK
- Dominic Kelly
  - Children's Hospital, OUH NHS Foundation Trust, NIHR Oxford BRC, Headley Way, Oxford, OX3 9DU, UK
- Samantha J L Knight
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Alexandra Y Kreins
  - Department of Immunology, Great Ormond Street Hospital for Children NHS Trust and UCL Great Ormond Street Institute of Child Health, Zayed Centre for Research, 2nd Floor, 20C Guilford Street, London, WC1N 1DZ, UK
- Erika M Kvikstad
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Craig B Langman
  - Feinberg School of Medicine, Northwestern University, 211 E Chicago Avenue, Chicago, IL, MS37, USA
- Tracy Lester
  - Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
- Kate E Lines
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - University of Oxford, Academic Endocrine Unit, OCDEM, Churchill Hospital, Oxford, OX3 7LJ, UK
- Simon R Lord
  - Early Phase Clinical Trials Unit, Department of Oncology, University of Oxford, Cancer and Haematology Centre, Level 2 Administration Area, Churchill Hospital, Oxford, OX3 7LJ, UK
- Xin Lu
  - Nuffield Department of Clinical Medicine, Ludwig Institute for Cancer Research, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, UK
- Sahar Mansour
  - St George's University Hospitals NHS Foundation Trust, Blackshore Road, Tooting, London, SW17 0QT, UK
- Adnan Manzur
  - MRC Centre for Neuromuscular Diseases, National Hospital for Neurology and Neurosurgery, Queen Square, London, WC1N 3BG, UK
- Reza Maroofian
  - Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, WC1N 3BG, UK
- Brian Marsden
  - Nuffield Department of Medicine, Kennedy Institute, University of Oxford, Oxford, OX3 7BN, UK
- Joanne Mason
  - Yourgene Health Headquarters, Skelton House, Lloyd Street North, Manchester Science Park, Manchester, M15 6SH, UK
- Simon J McGowan
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Davide Mei
  - Neuroscience Department, Meyer Children's Hospital IRCCS, Viale Pieraccini 24, 50139, Florence, Italy
- Hana Mlcochova
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Yoshiko Murakami
  - Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Osaka, 565-0871, Japan
- Andrea H Németh
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
  - Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
- Steven Okoli
  - Imperial College NHS Trust, Department of Haematology, Hammersmith Hospital, Du Cane Road, London, W12 0HS, UK
- Elizabeth Ormondroyd
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - University of Oxford, Level 6 West Wing, Oxford, OX3 9DU, JR, UK
- Lilian Bomme Ousager
  - Department of Clinical Genetics, Odense University Hospital and Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- Jacqueline Palace
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Smita Y Patel
  - Clinical Immunology, John Radcliffe Hospital, Level 4A, Oxford, OX3 9DU, UK
- Melissa M Pentony
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Chris Pugh
  - Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN, UK
- Aboulfazl Rad
  - Department of Otolaryngology-Head & Neck Surgery, Tübingen Hearing Research Centre, Eberhard Karls University, Elfriede-Aulhorn-Str. 5, 72076, Tübingen, Germany
- Archana Ramesh
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Simone G Riva
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Irene Roberts
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
  - Department of Paediatrics, University of Oxford, Level 2, Children's Hospital, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Noémi Roy
  - Department of Haematology, Oxford University Hospitals NHS Foundation Trust, Level 4, Haematology, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Outi Salminen
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
- Kyleen D Schilling
  - Ann & Robert H. Lurie Children's Hospital of Chicago, 225 E Chicago Avenue, Chicago, IL, 60611, USA
- Caroline Scott
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Arjune Sen
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Conrad Smith
  - Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
- Mark Stevenson
  - University of Oxford, Academic Endocrine Unit, OCDEM, Churchill Hospital, Oxford, OX3 7LJ, UK
- Rajesh V Thakker
  - University of Oxford, Academic Endocrine Unit, OCDEM, Churchill Hospital, Oxford, OX3 7LJ, UK
- Stephen R F Twigg
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Holm H Uhlig
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Department of Paediatrics, University of Oxford, Level 2, Children's Hospital, John Radcliffe Hospital, Oxford, OX3 9DU, UK
  - Translational Gastroenterology Unit, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Richard van Wijk
  - UMC Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
- Barbara Vona
  - Department of Otolaryngology-Head & Neck Surgery, Tübingen Hearing Research Centre, Eberhard Karls University, Elfriede-Aulhorn-Str. 5, 72076, Tübingen, Germany
  - Institute of Human Genetics, University Medical Center Göttingen, Heinrich-Düker-Weg 12, 37073, Göttingen, Germany
  - Institute for Auditory Neuroscience and InnerEarLab, University Medical Center Göttingen, Robert-Koch-Str. 40, 37075, Göttingen, Germany
- Steven Wall
  - Oxford Craniofacial Unit, John Radcliffe Hospital, Level LG1, West Wing, Oxford, OX3 9DU, UK
- Jing Wang
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Hugh Watkins
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - University of Oxford, Level 6 West Wing, Oxford, OX3 9DU, JR, UK
- Jaroslav Zak
  - Nuffield Department of Clinical Medicine, Ludwig Institute for Cancer Research, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, UK
  - Department of Immunology and Microbiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
- Anna H Schuh
  - Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
- Usha Kini
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
- Andrew O M Wilkie
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Niko Popitsch
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
  - Department of Biochemistry and Cell Biology, Max Perutz Labs, University of Vienna, Vienna BioCenter (VBC), Dr.-Bohr-Gasse 9, 1030, Vienna, Austria
- Jenny C Taylor
  - Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
  - NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
20
Sun KY, Bai X, Chen S, Bao S, Kapoor M, Zhang C, Backman J, Joseph T, Maxwell E, Mitra G, Gorovits A, Mansfield A, Boutkov B, Gokhale S, Habegger L, Marcketta A, Locke A, Kessler MD, Sharma D, Staples J, Bovijn J, Gelfman S, Gioia AD, Rajagopal V, Lopez A, Varela JR, Alegre J, Berumen J, Tapia-Conyer R, Kuri-Morales P, Torres J, Emberson J, Collins R, Cantor M, Thornton T, Kang HM, Overton J, Shuldiner AR, Cremona ML, Nafde M, Baras A, Abecasis G, Marchini J, Reid JG, Salerno W, Balasubramanian S. A deep catalog of protein-coding variation in 985,830 individuals. bioRxiv 2023:2023.05.09.539329. [PMID: 37214792 PMCID: PMC10197621 DOI: 10.1101/2023.05.09.539329] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/24/2023]
Abstract
Coding variants that have significant impact on function can provide insights into the biology of a gene but are typically rare in the population. Identifying and ascertaining the frequency of such rare variants requires very large sample sizes. Here, we present the largest catalog of human protein-coding variation to date, derived from exome sequencing of 985,830 individuals of diverse ancestry to serve as a rich resource for studying rare coding variants. Individuals of African, Admixed American, East Asian, Middle Eastern, and South Asian ancestry account for 20% of this exome dataset. Our catalog of variants includes approximately 10.5 million missense (54% novel) and 1.1 million predicted loss-of-function (pLOF) variants (65% novel, 53% observed only once). We identified individuals with rare homozygous pLOF variants in 4,874 genes, and for 1,838 of these this work is the first to document at least one pLOF homozygote. Additional insights from the RGC-ME dataset include 1) improved estimates of selection against heterozygous loss-of-function and identification of 3,459 genes intolerant to loss-of-function, 83 of which were previously assessed as tolerant to loss-of-function and 1,241 that lack disease annotations; 2) identification of regions depleted of missense variation in 457 genes that are tolerant to loss-of-function; 3) functional interpretation for 10,708 variants of unknown or conflicting significance reported in ClinVar as cryptic splice sites using splicing score thresholds based on empirical variant deleteriousness scores derived from RGC-ME; and 4) an observation that approximately 3% of sequenced individuals carry a clinically actionable genetic variant in the ACMG SF 3.1 list of genes. We make this important resource of coding variation available to the public through a variant allele frequency browser. We anticipate that this report and the RGC-ME dataset will serve as a valuable reference for understanding rare coding variation and help advance precision medicine efforts.
Affiliation(s)
- Siying Chen
  - Regeneron Genetics Center, Tarrytown, NY, USA
- Suying Bao
  - Regeneron Genetics Center, Tarrytown, NY, USA
- Adam Locke
  - Regeneron Genetics Center, Tarrytown, NY, USA
- Jesus Alegre
  - Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Jaime Berumen
  - Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Roberto Tapia-Conyer
  - Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Pablo Kuri-Morales
  - Experimental Research Unit from the Faculty of Medicine (UIME), National Autonomous University of Mexico (UNAM)
- Jason Torres
  - Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Jonathan Emberson
  - Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
  - MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Rory Collins
  - Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Mona Nafde
  - Regeneron Genetics Center, Tarrytown, NY, USA
- Aris Baras
  - Regeneron Genetics Center, Tarrytown, NY, USA
21
Cheloor Kovilakam S, Gu M, Dunn WG, Marando L, Barcena C, Nik-Zainal S, Mohorianu I, Kar SP, Fabre MA, Quiros PM, Vassiliou GS. Prevalence and significance of DDX41 gene variants in the general population. Blood 2023; 142:1185-1192. [PMID: 37506341 DOI: 10.1182/blood.2023020209] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/28/2023] [Revised: 06/26/2023] [Accepted: 07/10/2023] [Indexed: 07/30/2023] Open
Abstract
Germ line variants in the DDX41 gene have been linked to myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML) development. However, the risks associated with different variants remain unknown, as do the basis of their leukemogenic properties, impact on steady-state hematopoiesis, and links to other cancers. Here, we investigate the frequency and significance of DDX41 variants in 454 792 United Kingdom Biobank (UKB) participants and identify 452 unique nonsynonymous DNA variants in 3538 (1/129) individuals. Many were novel, and the prevalence of most varied markedly by ancestry. Among the 1059 individuals with germ line pathogenic variants (DDX41-GPV) 34 developed MDS/AML (odds ratio, 12.3 vs noncarriers). Of these, 7 of 218 had start-lost, 22 of 584 had truncating, and 5 of 257 had missense (odds ratios: 12.9, 15.1, and 7.5, respectively). Using multivariate logistic regression, we found significant associations of DDX41-GPV with MDS, AML, and family history of leukemia but not lymphoma, myeloproliferative neoplasms, or other cancers. We also report that DDX41-GPV carriers do not have an increased prevalence of clonal hematopoiesis (CH). In fact, CH was significantly more common before sporadic vs DDX41-mutant MDS/AML, revealing distinct evolutionary paths. Furthermore, somatic mutation rates did not differ between sporadic and DDX41-mutant AML genomes, ruling out genomic instability as a driver of the latter. Finally, we found that higher mean red cell volume (MCV) and somatic DDX41 mutations in blood DNA identify DDX41-GPV carriers at increased MDS/AML risk. Collectively, our findings give new insights into the prevalence and cognate risks associated with DDX41 variants, as well as the clonal evolution and early detection of DDX41-mutant MDS/AML.
Affiliation(s)
- Sruthi Cheloor Kovilakam
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Muxin Gu
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- William G Dunn
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, Cambridge University Hospitals NHS Trust, Cambridge, United Kingdom
- Ludovica Marando
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Clea Barcena
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
  - Department of Biochemistry and Molecular Biology, Universidad de Oviedo, Oviedo, Spain
- Serena Nik-Zainal
  - Early Cancer Institute, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
- Irina Mohorianu
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
- Siddhartha P Kar
  - Early Cancer Institute, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
  - Section of Translational Epidemiology, Division of Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Margarete A Fabre
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals Research and Development, AstraZeneca, Cambridge, United Kingdom
- Pedro M Quiros
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
  - Instituto de Investigación Sanitaria del Principado de Asturias, Oviedo, Spain
- George S Vassiliou
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, University of Cambridge, Cambridge, United Kingdom
  - Department of Haematology, Cambridge University Hospitals NHS Trust, Cambridge, United Kingdom
|
22
|
Guhlin J, Le Lec MF, Wold J, Koot E, Winter D, Biggs PJ, Galla SJ, Urban L, Foster Y, Cox MP, Digby A, Uddstrom LR, Eason D, Vercoe D, Davis T, Howard JT, Jarvis ED, Robertson FE, Robertson BC, Gemmell NJ, Steeves TE, Santure AW, Dearden PK. Species-wide genomics of kākāpō provides tools to accelerate recovery. Nat Ecol Evol 2023; 7:1693-1705. [PMID: 37640765 DOI: 10.1038/s41559-023-02165-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/26/2022] [Accepted: 07/11/2023] [Indexed: 08/31/2023]
Abstract
The kākāpō is a critically endangered, intensively managed, long-lived nocturnal parrot endemic to Aotearoa New Zealand. We generated and analysed whole-genome sequence data for nearly all individuals living in early 2018 (169 individuals) to generate a high-quality species-wide genetic variant callset. We leverage extensive long-term metadata to quantify genome-wide diversity of the species over time and present new approaches using probabilistic programming, combined with a phenotype dataset spanning five decades, to disentangle phenotypic variance into environmental and genetic effects while quantifying uncertainty in small populations. We find associations for growth, disease susceptibility, clutch size and egg fertility within genic regions previously shown to influence these traits in other species. Finally, we generate breeding values to predict phenotype and illustrate that active management over the past 45 years has maintained both genome-wide diversity and diversity in breeding values and, hence, evolutionary potential. We provide new pathways for informing future conservation management decisions for kākāpō, including prioritizing individuals for translocation and monitoring individuals with poor growth or high disease risk. Overall, by explicitly addressing the challenge of the small sample size, we provide a template for the inclusion of genomic data that will be transformational for species recovery efforts around the globe.
Affiliation(s)
- Joseph Guhlin
  - Genomics Aotearoa, Biochemistry Department, School of Biomedical Sciences, University of Otago, Dunedin, Aotearoa New Zealand
- Marissa F Le Lec
  - Genomics Aotearoa, Biochemistry Department, School of Biomedical Sciences, University of Otago, Dunedin, Aotearoa New Zealand
- Jana Wold
  - School of Biological Sciences, University of Canterbury, Christchurch, Aotearoa New Zealand
- Emily Koot
  - The New Zealand Institute for Plant and Food Research Ltd, Palmerston North, Aotearoa New Zealand
- David Winter
  - School of Natural Sciences, Massey University, Palmerston North, Aotearoa New Zealand
- Patrick J Biggs
  - School of Natural Sciences, Massey University, Palmerston North, Aotearoa New Zealand
  - School of Veterinary Science, Massey University, Palmerston North, Aotearoa New Zealand
- Stephanie J Galla
  - School of Biological Sciences, University of Canterbury, Christchurch, Aotearoa New Zealand
  - Department of Biological Sciences, Boise State University, Boise, ID, USA
- Lara Urban
  - Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, Aotearoa New Zealand
  - Helmholtz Pioneer Campus, Helmholtz Zentrum Muenchen, Neuherberg, Germany
  - Helmholtz AI, Helmholtz Zentrum Muenchen, Neuherberg, Germany
  - School of Life Sciences, Technical University of Munich, Freising, Germany
- Yasmin Foster
  - Department of Zoology, University of Otago, Dunedin, Aotearoa New Zealand
- Murray P Cox
  - School of Natural Sciences, Massey University, Palmerston North, Aotearoa New Zealand
  - Department of Statistics, University of Auckland, Auckland, Aotearoa New Zealand
- Andrew Digby
  - Kākāpō Recovery Programme, Department of Conservation, Invercargill, Aotearoa New Zealand
- Lydia R Uddstrom
  - Kākāpō Recovery Programme, Department of Conservation, Invercargill, Aotearoa New Zealand
- Daryl Eason
  - Kākāpō Recovery Programme, Department of Conservation, Invercargill, Aotearoa New Zealand
- Deidre Vercoe
  - Kākāpō Recovery Programme, Department of Conservation, Invercargill, Aotearoa New Zealand
- Tāne Davis
  - Rakiura Tītī Islands Administering Body, Invercargill, Aotearoa New Zealand
- Jason T Howard
  - Neurogenetics of Language Lab, The Rockefeller University, New York, NY, USA
  - Mirxes, Cambridge, MA, USA
- Erich D Jarvis
  - The Rockefeller University, New York, NY, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Fiona E Robertson
  - Department of Zoology, University of Otago, Dunedin, Aotearoa New Zealand
- Bruce C Robertson
  - Department of Zoology, University of Otago, Dunedin, Aotearoa New Zealand
- Neil J Gemmell
  - Department of Anatomy, School of Biomedical Sciences, University of Otago, Dunedin, Aotearoa New Zealand
- Tammy E Steeves
  - School of Biological Sciences, University of Canterbury, Christchurch, Aotearoa New Zealand
- Anna W Santure
  - School of Biological Sciences, University of Auckland, Auckland, Aotearoa New Zealand
- Peter K Dearden
  - Genomics Aotearoa, Biochemistry Department, School of Biomedical Sciences, University of Otago, Dunedin, Aotearoa New Zealand
|
23
|
Naithani S, Deng CH, Sahu SK, Jaiswal P. Exploring Pan-Genomes: An Overview of Resources and Tools for Unraveling Structure, Function, and Evolution of Crop Genes and Genomes. Biomolecules 2023; 13:1403. [PMID: 37759803 PMCID: PMC10527062 DOI: 10.3390/biom13091403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 08/29/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023] Open
Abstract
The availability of multiple sequenced genomes from a single species made it possible to explore intra- and inter-specific genomic comparisons at higher resolution and build clade-specific pan-genomes of several crops. The pan-genomes of crops constructed from various cultivars, accessions, landraces, and wild ancestral species represent a compendium of genes and structural variations and allow researchers to search for the novel genes and alleles that were inadvertently lost in domesticated crops during the historical process of crop domestication or in the process of extensive plant breeding. Fortunately, many valuable genes and alleles associated with desirable traits like disease resistance, abiotic stress tolerance, plant architecture, and nutrition qualities exist in landraces, ancestral species, and crop wild relatives. The novel genes from the wild ancestors and landraces can be introduced back to high-yielding varieties of modern crops by implementing classical plant breeding, genomic selection, and transgenic/gene editing approaches. Thus, pan-genomics represents a great leap in plant research and offers new avenues for targeted breeding to mitigate the impact of global climate change. Here, we summarize the tools used for pan-genome assembly and annotation, web portals hosting plant pan-genomes, etc. Furthermore, we highlight a few discoveries made in crops using the pan-genomic approach and the future potential of this emerging field of study.
Affiliation(s)
- Sushma Naithani
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Cecilia H. Deng
  - Molecular & Digital Breeding Group, New Cultivar Innovation, The New Zealand Institute for Plant and Food Research Limited, Private Bag 92169, Auckland 1142, New Zealand
- Sunil Kumar Sahu
  - State Key Laboratory of Agricultural Genomics, Key Laboratory of Genomics, Ministry of Agriculture, BGI Research, Shenzhen 518083, China
- Pankaj Jaiswal
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
|
24
|
Al-Jumaan M, Chu H, Alsulaiman A, Camp SY, Han S, Gillani R, Al Marzooq Y, Almulhim F, Vatte C, Al Nemer A, Almuhanna A, Van Allen EM, Al-Ali A, AlDubayan SH. Interplay of Mendelian and polygenic risk factors in Arab breast cancer patients. Genome Med 2023; 15:65. [PMID: 37658461 PMCID: PMC10474689 DOI: 10.1186/s13073-023-01220-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 08/09/2023] [Indexed: 09/03/2023] Open
Abstract
BACKGROUND Breast cancer patients from the indigenous Arab population present much earlier than patients from Western countries and have traditionally been underrepresented in cancer genomics studies. The contribution of polygenic and Mendelian risk toward the earlier onset of breast cancer in the population remains elusive. METHODS We performed low-pass whole genome sequencing (lpWGS) and whole-exome sequencing (WES) from 220 female breast cancer patients unselected for positive family history from the indigenous Arab population. Using publicly available resources, we imputed population-specific variants and calculated breast cancer burden-sensitive polygenic risk scores (PRS). Variant pathogenicity was also evaluated on exome variants with high coverage. RESULTS Variants imputed from lpWGS showed high concordance with paired exome (median dosage correlation: 0.9459, Interquartile range: 0.9410-0.9490). After adjusting the PRS to the Arab population, we found significant associations between PRS performance in risk prediction and first-degree relative breast cancer history prediction (Spearman rho=0.43, p = 0.03), where breast cancer patients in the top PRS decile are 5.53 (95% CI 1.76-17.97, p = 0.003) times more likely also to have a first-degree relative diagnosed with breast cancer compared to those in the middle deciles. In addition, we found evidence for the genetic liability threshold model of breast cancer where among patients with a family history of breast cancer, pathogenic rare variant carriers had significantly lower PRS than non-carriers (p = 0.0205, Mann-Whitney U test) while for non-carriers every standard deviation increase in PRS corresponded to 4.52 years (95% CI 8.88-0.17, p = 0.042) earlier age of presentation. 
CONCLUSIONS Overall, our study provides a framework to assess polygenic risk in an understudied population using lpWGS and identifies common variant risk as a factor independent of pathogenic variant carrier status for earlier age of onset of breast cancer among indigenous Arab breast cancer patients.
Affiliation(s)
- Mohammed Al-Jumaan
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Hoyin Chu
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Abdullah Alsulaiman
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Sabrina Y Camp
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Seunghun Han
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Riaz Gillani
  - Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
  - Boston Children's Hospital, Boston, MA, USA
- Yousef Al Marzooq
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Fatmah Almulhim
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Chittibabu Vatte
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Areej Al Nemer
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Afnan Almuhanna
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Eliezer M Van Allen
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Cancer Genomics, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
- Amein Al-Ali
  - College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Saud H AlDubayan
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Cancer Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA
  - College of Medicine, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
|
25
|
Marr RA, Moore J, Formby S, Martiniuk JT, Hamilton J, Ralli S, Konwar K, Rajasundaram N, Hahn A, Measday V. Whole genome sequencing of Canadian Saccharomyces cerevisiae strains isolated from spontaneous wine fermentations reveals a new Pacific West Coast Wine clade. G3 (Bethesda, Md.) 2023; 13:jkad130. [PMID: 37307358 PMCID: PMC10411583 DOI: 10.1093/g3journal/jkad130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 05/19/2023] [Accepted: 05/22/2023] [Indexed: 06/14/2023]
Abstract
Vineyards in wine regions around the world are reservoirs of yeast with oenological potential. Saccharomyces cerevisiae ferments grape sugars to ethanol and generates flavor and aroma compounds in wine. Wineries place a high value on identifying yeast native to their region to develop a region-specific wine program. Commercial wine strains are genetically very similar due to a population bottleneck and in-breeding compared to the diversity of S. cerevisiae from the wild and other industrial processes. We have isolated and microsatellite-typed hundreds of S. cerevisiae strains from spontaneous fermentations of grapes from the Okanagan Valley wine region in British Columbia, Canada. We chose 75 S. cerevisiae strains, based on our microsatellite clustering data, for whole genome sequencing using Illumina paired-end reads. Phylogenetic analysis shows that British Columbian S. cerevisiae strains cluster into 4 clades: Wine/European, Transpacific Oak, Beer 1/Mixed Origin, and a new clade that we have designated as Pacific West Coast Wine. The Pacific West Coast Wine clade has high nucleotide diversity and shares genomic characteristics with wild North American oak strains but also has gene flow from Wine/European and Ecuadorian clades. We analyzed gene copy number variations to find evidence of domestication and found that strains in the Wine/European and Pacific West Coast Wine clades have gene copy number variation reflective of adaptations to the wine-making environment. The "wine circle/Region B", a cluster of 5 genes acquired by horizontal gene transfer into the genome of commercial wine strains, is also present in the majority of the British Columbian strains in the Wine/European clade but in a minority of the Pacific West Coast Wine clade strains. Previous studies have shown that S. cerevisiae strains isolated from Mediterranean Oak trees may be the living ancestors of European wine yeast strains. This study is the first to isolate S. cerevisiae strains with genetic similarity to nonvineyard North American Oak strains from spontaneous wine fermentations.
Affiliation(s)
- R Alexander Marr
  - Genome Science and Technology Graduate Program, University of British Columbia, Vancouver, BC V5Z 4S6, Canada
  - Department of Food Science, Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, 2205 East Mall, Vancouver, BC V6T 1Z4, Canada
- Jackson Moore
  - Genome Science and Technology Graduate Program, University of British Columbia, Vancouver, BC V5Z 4S6, Canada
  - Department of Food Science, Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, 2205 East Mall, Vancouver, BC V6T 1Z4, Canada
- Sean Formby
  - Koonkie Canada Inc., 321 Water Street Suite 501, Vancouver, BC V6B 1B8, Canada
- Jonathan T Martiniuk
  - Department of Food Science, Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, 2205 East Mall, Vancouver, BC V6T 1Z4, Canada
  - Food Science Graduate Program, Faculty of Land and Food Systems, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Jonah Hamilton
  - Department of Food Science, Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, 2205 East Mall, Vancouver, BC V6T 1Z4, Canada
- Sneha Ralli
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, 675 West 10th Avenue, Vancouver, BC V5Z 1L3, Canada
  - Department of Biomedical Physiology and Kinesiology, Simon Fraser University, 8888 University Drive East K9625, Burnaby, BC V5A 1S6, Canada
- Kishori Konwar
  - Koonkie Canada Inc., 321 Water Street Suite 501, Vancouver, BC V6B 1B8, Canada
- Nisha Rajasundaram
  - Koonkie Canada Inc., 321 Water Street Suite 501, Vancouver, BC V6B 1B8, Canada
- Aria Hahn
  - Koonkie Canada Inc., 321 Water Street Suite 501, Vancouver, BC V6B 1B8, Canada
- Vivien Measday
  - Department of Food Science, Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, 2205 East Mall, Vancouver, BC V6T 1Z4, Canada
|
26
|
Niiya A, Hamaguchi Y, Mishima H, Miura S, Komatsu N, Nagata K, Hasegawa Y, Miura K, Yoshiura KI. Four conserved amino acids on human papillomavirus E6 predict clinical high-risk types. J Med Virol 2023; 95:e29049. [PMID: 37621086 DOI: 10.1002/jmv.29049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 08/09/2023] [Accepted: 08/12/2023] [Indexed: 08/26/2023]
Abstract
Human papillomavirus (HPV) types included in the genus alpha papillomavirus (alpha-HPVs) are subdivided into high- and low-risk HPVs associated with tumorigenicity. According to conventional risk classification, over 30 alpha-HPVs remain unclassified and HPV groups phylogenetically classified using the L1 gene do not exactly correspond to the conventional risk classification groups. Here, we propose a novel cervical lesion progression risk classification strategy. Using four E6 risk distinguishable amino acids (E6-RDAAs), we successfully expanded the conventional classification to encompass alpha-HPVs and resolve discrepancies. We validated our classification system using alpha-HPV-targeted sequence data of 325 cervical swab specimens from participants in Japan. Clinical outcomes significantly correlated with the E6-RDAA classification. Four of five HPV types in the data set that were not conventionally classified (HPV30, 34, 67, and 69) were high-risk according to our classification criteria. This report sheds light on the carcinogenicity of rare genital HPV types using a novel risk classification strategy.
Affiliation(s)
- Akari Niiya
  - Department of Obstetrics and Gynecology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Yo Hamaguchi
  - Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Hiroyuki Mishima
  - Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
  - Leading Medical Research Core Unit, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Shoko Miura
  - Department of Obstetrics and Gynecology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Nahoko Komatsu
  - Department of Obstetrics and Gynecology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Koh Nagata
  - Department of Obstetrics and Gynecology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Yuri Hasegawa
  - Department of Obstetrics and Gynecology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Kiyonori Miura
  - Department of Obstetrics and Gynecology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
  - Leading Medical Research Core Unit, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Koh-Ichiro Yoshiura
  - Department of Human Genetics, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
  - Leading Medical Research Core Unit, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
|
27
|
Wang X, Huang M, Budowle B, Ge J. TRcaller: a novel tool for precise and ultrafast tandem repeat variant genotyping in massively parallel sequencing reads. Front Genet 2023; 14:1227176. [PMID: 37533432 PMCID: PMC10390829 DOI: 10.3389/fgene.2023.1227176] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/22/2023] [Accepted: 06/13/2023] [Indexed: 08/04/2023] Open
Abstract
Calling tandem repeat (TR) variants from DNA sequences is of both theoretical and practical significance. Some bioinformatics tools have been developed for detecting or genotyping TRs. However, little work has been done on genotyping TR alleles from long-read sequencing data, and the accuracy of genotyping TR alleles from next-generation sequencing data still needs to be improved. Herein, a novel algorithm is described to retrieve TR regions from sequence alignment, and a software program, TRcaller, has been developed and integrated into a web portal to call TR alleles from both short- and long-read sequences, both whole genome and targeted sequences generated from multiple sequencing platforms. All TR alleles are genotyped as haplotypes and the robust alleles will be reported, even multiple alleles in a DNA mixture. TRcaller could provide substantially higher accuracy (>99% in 289 human individuals) in detecting TR alleles while running orders of magnitude faster (e.g., ∼2 s for 300x human sequence data) than the mainstream software tools. The web portal includes 119 preselected TR loci drawn from forensic, genealogy, and disease-related TR loci. TRcaller is validated to be scalable in various applications, such as DNA forensics and disease diagnosis, and can be expanded into other fields like breeding programs. Availability: TRcaller is available at https://www.trcaller.com/SignIn.aspx.
Affiliation(s)
- Xuewen Wang
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Meng Huang
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
- Bruce Budowle
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
  - Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
- Jianye Ge
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States
  - Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, Fort Worth, TX, United States
|
28
|
Wang N, Cao S, Liu Z, Xiao H, Hu J, Xu X, Chen P, Ma Z, Ye J, Chai L, Guo W, Larkin RM, Xu Q, Morrell PL, Zhou Y, Deng X. Genomic conservation of crop wild relatives: A case study of citrus. PLoS Genet 2023; 19:e1010811. [PMID: 37339133 DOI: 10.1371/journal.pgen.1010811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 06/01/2023] [Indexed: 06/22/2023] Open
Abstract
Conservation of crop wild relatives is critical for plant breeding and food security. The lack of clarity on the genetic factors that lead to endangered status or extinction creates difficulties when attempting to develop concrete recommendations for conserving a citrus wild relative: the wild relatives of crops. Here, we evaluate the conservation of wild kumquat (Fortunella hindsii) using genomic, geographical, environmental, and phenotypic data, and forward simulations. Genome resequencing data from 73 accessions from the Fortunella genus were combined to investigate population structure, demography, inbreeding, introgression, and genetic load. Population structure was correlated with reproductive type (i.e., sexual and apomictic) and with a significant differentiation within the sexually reproducing population. The effective population size for one of the sexually reproducing subpopulations has recently declined to ~1,000, resulting in high levels of inbreeding. In particular, we found that 58% of the ecological niche overlapped between wild and cultivated populations and that there was extensive introgression into wild samples from cultivated populations. Interestingly, the introgression pattern and accumulation of genetic load may be influenced by the type of reproduction. In wild apomictic samples, the introgressed regions were primarily heterozygous, and genome-wide deleterious variants were hidden in the heterozygous state. In contrast, wild sexually reproducing samples carried a higher recessive deleterious burden. Furthermore, we also found that sexually reproducing samples were self-incompatible, which prevented the reduction of genetic diversity by selfing. Our population genomic analyses provide specific recommendations for distinct reproductive types and monitoring during conservation. This study highlights the genomic landscape of a wild relative of citrus and provides recommendations for the conservation of crop wild relatives.
Affiliation(s)
- Nan Wang
- National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Shuo Cao
- National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Zhongjie Liu
  - State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Hua Xiao
  - State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Jianbing Hu
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
- Xiaodong Xu
  - State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Peng Chen
  - Institute of Horticultural Research, Hunan Academy of Agricultural Sciences, Changsha, China
- Zhiyao Ma
  - State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Junli Ye
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
- Lijun Chai
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
- Wenwu Guo
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Robert M Larkin
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Qiang Xu
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Peter L Morrell
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota, United States of America
- Yongfeng Zhou
  - State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
  - State Key Laboratory of Tropical Crop Breeding, Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
- Xiuxin Deng
  - National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
29
Wlodzimierz P, Rabanal FA, Burns R, Naish M, Primetis E, Scott A, Mandáková T, Gorringe N, Tock AJ, Holland D, Fritschi K, Habring A, Lanz C, Patel C, Schlegel T, Collenberg M, Mielke M, Nordborg M, Roux F, Shirsekar G, Alonso-Blanco C, Lysak MA, Novikova PY, Bousios A, Weigel D, Henderson IR. Cycles of satellite and transposon evolution in Arabidopsis centromeres. Nature 2023:10.1038/s41586-023-06062-z. [PMID: 37198485 DOI: 10.1038/s41586-023-06062-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/17/2022] [Accepted: 04/06/2023] [Indexed: 05/19/2023]
Abstract
Centromeres are critical for cell division, loading CENH3 or CENPA histone variant nucleosomes, directing kinetochore formation and allowing chromosome segregation [1,2]. Despite their conserved function, centromere size and structure are diverse across species. To understand this centromere paradox [3,4], it is necessary to know how centromeric diversity is generated and whether it reflects ancient trans-species variation or, instead, rapid post-speciation divergence. To address these questions, we assembled 346 centromeres from 66 Arabidopsis thaliana and 2 Arabidopsis lyrata accessions, which exhibited a remarkable degree of intra- and inter-species diversity. A. thaliana centromere repeat arrays are embedded in linkage blocks, despite ongoing internal satellite turnover, consistent with roles for unidirectional gene conversion or unequal crossover between sister chromatids in sequence diversification. Additionally, centrophilic ATHILA transposons have recently invaded the satellite arrays. To counter ATHILA invasion, chromosome-specific bursts of satellite homogenization generate higher-order repeats and purge transposons, in line with cycles of repeat evolution. Centromeric sequence changes are even more extreme in comparison between A. thaliana and A. lyrata. Together, our findings identify rapid cycles of transposon invasion and purging through satellite homogenization, which drive centromere evolution and ultimately contribute to speciation.
Affiliation(s)
- Piotr Wlodzimierz
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Fernando A Rabanal
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Robin Burns
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Matthew Naish
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Elias Primetis
  - School of Life Sciences, University of Sussex, Brighton, UK
- Alison Scott
  - Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Terezie Mandáková
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Nicola Gorringe
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Andrew J Tock
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Daniel Holland
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Katrin Fritschi
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Anette Habring
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Christa Lanz
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Christie Patel
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Theresa Schlegel
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Maximilian Collenberg
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Miriam Mielke
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Magnus Nordborg
  - Gregor Mendel Institute, Vienna, Austrian Academy of Sciences, Vienna BioCenter, Vienna, Austria
- Fabrice Roux
  - LIPME, INRAE, CNRS, Université de Toulouse, Castanet-Tolosan, France
- Gautam Shirsekar
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Carlos Alonso-Blanco
  - Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- Martin A Lysak
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Polina Y Novikova
  - Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Detlef Weigel
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany.
- Ian R Henderson
  - Department of Plant Sciences, University of Cambridge, Cambridge, UK.
30
Pagadala M, Sears TJ, Wu VH, Pérez-Guijarro E, Kim H, Castro A, Talwar JV, Gonzalez-Colin C, Cao S, Schmiedel BJ, Goudarzi S, Kirani D, Au J, Zhang T, Landi T, Salem RM, Morris GP, Harismendy O, Patel SP, Alexandrov LB, Mesirov JP, Zanetti M, Day CP, Fan CC, Thompson WK, Merlino G, Gutkind JS, Vijayanand P, Carter H. Germline modifiers of the tumor immune microenvironment implicate drivers of cancer risk and immunotherapy response. Nat Commun 2023; 14:2744. [PMID: 37173324 PMCID: PMC10182072 DOI: 10.1038/s41467-023-38271-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/12/2022] [Accepted: 04/24/2023] [Indexed: 05/15/2023] Open
Abstract
With the continued promise of immunotherapy for treating cancer, understanding how host genetics contributes to the tumor immune microenvironment (TIME) is essential to tailoring cancer screening and treatment strategies. Here, we study 1084 eQTLs affecting the TIME found through analysis of The Cancer Genome Atlas and literature curation. These TIME eQTLs are enriched in areas of active transcription, and associate with gene expression in specific immune cell subsets, such as macrophages and dendritic cells. Polygenic score models built with TIME eQTLs reproducibly stratify cancer risk, survival and immune checkpoint blockade (ICB) response across independent cohorts. To assess whether an eQTL-informed approach could reveal potential cancer immunotherapy targets, we inhibit CTSS, a gene implicated by cancer risk and ICB response-associated polygenic models; CTSS inhibition results in slowed tumor growth and extended survival in vivo. These results validate the potential of integrating germline variation and TIME characteristics for uncovering potential targets for immunotherapy.
Affiliation(s)
- Meghana Pagadala
  - Biomedical Sciences Program, University of California San Diego, La Jolla, CA, 92093, USA
- Timothy J Sears
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Victoria H Wu
  - Department of Pharmacology, UCSD Moores Cancer Center, La Jolla, CA, 92093, USA
- Eva Pérez-Guijarro
  - Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Hyo Kim
  - Undergraduate Bioengineering Program, Jacobs School of Engineering, University of California San Diego, La Jolla, CA, 92093, USA
- Andrea Castro
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- James V Talwar
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Steven Cao
  - Division of Epidemiology, Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, 92093, USA
- Divya Kirani
  - Undergraduate Biology and Bioinformatics Program, University of California San Diego, La Jolla, CA, 92093, USA
- Jessica Au
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Tongwu Zhang
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Teresa Landi
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Rany M Salem
  - Division of Epidemiology, Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, 92093, USA
- Gerald P Morris
  - Department of Pathology, University of California San Diego, La Jolla, CA, 92093, USA
- Olivier Harismendy
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
  - Division of Biomedical Informatics, Department of Medicine, University of California San Diego School of Medicine, La Jolla, CA, 92093, USA
- Sandip Pravin Patel
  - Center for Personalized Cancer Therapy, Division of Hematology and Oncology, UC San Diego Moores Cancer Center, San Diego, CA, 92037, USA
- Ludmil B Alexandrov
  - Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, 92093, USA
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Jill P Mesirov
  - Moores Cancer Center, University of California San Diego, La Jolla, CA, 92093, USA
  - Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA, 92093, USA
- Maurizio Zanetti
  - Moores Cancer Center, University of California San Diego, La Jolla, CA, 92093, USA
  - The Laboratory of Immunology and Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Chi-Ping Day
  - Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- Chun Chieh Fan
  - Center for Population Neuroscience and Genetics, Laureate Institute for Brain Research, Tulsa, OK, 74136, USA
  - Department of Radiology, University of California San Diego, La Jolla, CA, 92093, USA
- Wesley K Thompson
  - Division of Biostatistics, Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, 92093, USA
- Glenn Merlino
  - Laboratory of Cancer Biology and Genetics, National Cancer Institute, National Institutes of Health (NIH), Bethesda, MD, 20892, USA
- J Silvio Gutkind
  - Department of Pharmacology, UCSD Moores Cancer Center, La Jolla, CA, 92093, USA
- Hannah Carter
  - Moores Cancer Center, University of California San Diego, La Jolla, CA, 92093, USA.
  - Department of Medicine, Division of Medical Genetics, University of California San Diego, La Jolla, CA, 92093, USA.
31
Chen NC, Kolesnikov A, Goel S, Yun T, Chang PC, Carroll A. Improving variant calling using population data and deep learning. BMC Bioinformatics 2023; 24:197. [PMID: 37173615 PMCID: PMC10182612 DOI: 10.1186/s12859-023-05294-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2022] [Accepted: 04/17/2023] [Indexed: 05/15/2023] Open
Abstract
Large-scale population variant data is often used to filter and aid interpretation of variant calls in a single sample. These approaches do not incorporate population information directly into the process of variant calling, and are often limited to filtering, which trades recall for precision. In this study, we develop population-aware DeepVariant models with a new channel encoding allele frequencies from the 1000 Genomes Project. This model reduces variant calling errors, improving both precision and recall in single samples, and reduces rare homozygous and pathogenic ClinVar calls cohort-wide. We assess the use of population-specific or diverse reference panels, finding the greatest accuracy with diverse panels, suggesting that large, diverse panels are preferable to individual populations, even when the population matches sample ancestry. Finally, we show that this benefit generalizes to samples with different ancestry from the training data even when the ancestry is also excluded from the reference panel.
Affiliation(s)
- Nae-Chyun Chen
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA.
32
Lloret-Villas A, Pausch H, Leonard AS. The size and composition of haplotype reference panels impact the accuracy of imputation from low-pass sequencing in cattle. Genet Sel Evol 2023; 55:33. [PMID: 37170101 PMCID: PMC10173671 DOI: 10.1186/s12711-023-00809-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/26/2023] [Accepted: 05/02/2023] [Indexed: 05/13/2023] Open
Abstract
BACKGROUND Low-pass sequencing followed by sequence variant genotype imputation is an alternative to the routine microarray-based genotyping in cattle. However, the impact of haplotype reference panels and their interplay with the coverage of low-pass whole-genome sequencing data have not been sufficiently explored in typical livestock settings where only a small number of reference samples is available. METHODS Sequence variant genotyping accuracy was compared between two variant callers, GATK and DeepVariant, in 50 Brown Swiss cattle with sequencing coverages ranging from 4- to 63-fold. Haplotype reference panels of varying sizes and composition were built with DeepVariant based on 501 individuals from nine breeds. High-coverage sequence data for 24 Brown Swiss cattle were downsampled to between 0.01- and 4-fold to mimic low-pass sequencing. GLIMPSE was used to infer sequence variant genotypes from the low-pass sequencing data using different haplotype reference panels. The accuracy of the sequence variant genotypes that were inferred from low-pass sequencing data was compared with sequence variant genotypes called from high-coverage data. RESULTS DeepVariant was used to establish bovine haplotype reference panels because it outperformed GATK in all evaluations. Within-breed haplotype reference panels were more accurate and efficient to impute sequence variant genotypes from low-pass sequencing than equally-sized multibreed haplotype reference panels for all target sample coverages and allele frequencies. F1 scores greater than 0.9, which indicate high harmonic means of recall and precision of called genotypes, were achieved with 0.25-fold sequencing coverage when large breed-specific haplotype reference panels (n = 150) were used. 
In the absence of such large within-breed haplotype panels, variant genotyping accuracy from low-pass sequencing could be increased either by adding non-related samples to the haplotype reference panel or by increasing the coverage of the low-pass sequencing data. Sequence variant genotyping from low-pass sequencing was substantially less accurate when the reference panel lacked individuals from the target breed. CONCLUSIONS Variant genotyping is more accurate with DeepVariant than GATK. DeepVariant is therefore suitable to establish bovine haplotype reference panels. Medium-sized breed-specific haplotype reference panels and large multibreed haplotype reference panels enable accurate imputation of low-pass sequencing data in a typical cattle breed.
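The abstract above reports F1 scores as the harmonic mean of recall and precision of called genotypes. As a reading aid only, here is a minimal Python sketch of that metric. The function names and the example counts are hypothetical and are not taken from the study, which may compute F1 per genotype class or with dedicated benchmarking tools.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive and
    false-negative counts; an undefined ratio is reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    total = precision + recall
    if total == 0:
        return 0.0
    return 2 * precision * recall / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the study).
    p, r = precision_recall(tp=90, fp=10, fn=10)
    print(f"precision={p:.3f} recall={r:.3f} F1={f1_score(p, r):.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, an F1 above 0.9 requires both precision and recall to be high; one cannot compensate for the other.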
Affiliation(s)
- Hubert Pausch
  - Animal Genomics, ETH Zürich, Universitätstrasse 2, Zürich, 8092, Switzerland
- Alexander S Leonard
  - Animal Genomics, ETH Zürich, Universitätstrasse 2, Zürich, 8092, Switzerland
33
Delpuech E, Vandeputte M, Morvezen R, Bestin A, Besson M, Brunier J, Bajek A, Imarazene B, François Y, Bouchez O, Cousin X, Poncet C, Morin T, Bruant JS, Chatain B, Haffray P, Phocas F, Allal F. Whole-genome sequencing identifies interferon-induced protein IFI6/IFI27-like as a strong candidate gene for VNN resistance in European sea bass. Genet Sel Evol 2023; 55:30. [PMID: 37143017 PMCID: PMC10161657 DOI: 10.1186/s12711-023-00805-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Accepted: 04/18/2023] [Indexed: 05/06/2023] Open
Abstract
BACKGROUND Viral nervous necrosis (VNN) is a major disease that affects European sea bass, and understanding the biological mechanisms that underlie VNN resistance is important for the welfare of farmed fish and sustainability of production systems. The aim of this study was to identify genomic regions and genes that are associated with VNN resistance in sea bass. RESULTS We generated a dataset of 838,451 single nucleotide polymorphisms (SNPs) identified from whole-genome sequencing (WGS) in the parental generation of two commercial populations (A: 2371 individuals and B: 3428 individuals) of European sea bass with phenotypic records for binary survival in a VNN challenge. For each population, three cohorts were submitted to a red-spotted grouper nervous necrosis virus (RGNNV) challenge by immersion and genotyped on a 57K SNP chip. After imputation of WGS SNPs from their parents, quantitative trait loci (QTL) were mapped using a Bayesian sparse linear mixed model (BSLMM). We found several QTL regions that were specific to one of the populations on different linkage groups (LG), and one 127-kb QTL region on LG12 that was shared by both populations and included the genes ZDHHC14, which encodes a palmitoyltransferase, and IFI6/IFI27-like, which encodes an interferon-alpha induced protein. The most significant SNP in this QTL region was only 1.9 kb downstream of the coding sequence of the IFI6/IFI27-like gene. An unrelated population of four large families was used to validate the effect of the QTL. Survival rates of susceptible genotypes were 40.6% and 45.4% in populations A and B, respectively, while that of the resistant genotype was 66.2% in population B and 78% in population A. CONCLUSIONS We have identified a genomic region that carries a major QTL for resistance to VNN and includes the ZDHHC14 and IFI6/IFI27-like genes. 
The potential involvement of the interferon pathway, a well-known anti-viral defense mechanism in several organisms (chicken, human, or fish), in survival to VNN infection is of particular interest. Our results can lead to major improvements for sea bass breeding programs through marker-assisted genomic selection to obtain more resistant fish.
Affiliation(s)
- Emilie Delpuech
  - MARBEC, Univ. Montpellier, CNRS, Ifremer, IRD, INRAE, 34250, Palavas-Les-Flots, France.
  - Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France.
- Marc Vandeputte
  - MARBEC, Univ. Montpellier, CNRS, Ifremer, IRD, INRAE, 34250, Palavas-Les-Flots, France
  - Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
- Romain Morvezen
  - SYSAAF, Station LPGP/INRAE, Campus de Beaulieu, 35042, Rennes, France
- Anastasia Bestin
  - SYSAAF, Station LPGP/INRAE, Campus de Beaulieu, 35042, Rennes, France
- Mathieu Besson
  - SYSAAF, Station LPGP/INRAE, Campus de Beaulieu, 35042, Rennes, France
- Joseph Brunier
  - Ecloserie Marine de Gravelines-Ichtus, Gloria Maris Group, 59273, Gravelines, France
- Aline Bajek
  - Ecloserie Marine de Gravelines-Ichtus, Gloria Maris Group, 59273, Gravelines, France
- Yoannah François
  - SYSAAF, Station LPGP/INRAE, Campus de Beaulieu, 35042, Rennes, France
  - ANSES, Unit Virology, Immunology and Ecotoxicology of Fish, Technopôle Brest-Iroise, 29280, Plouzané, France
- Olivier Bouchez
  - US 1426, GeT-PlaGe, INRAE, Genotoul, Castanet-Tolosan, France
- Xavier Cousin
  - MARBEC, Univ. Montpellier, CNRS, Ifremer, IRD, INRAE, 34250, Palavas-Les-Flots, France
  - Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
- Charles Poncet
  - INRAE-UCA, UMR 1095 GDEC, 63000, Clermont-Ferrand, France
- Thierry Morin
  - ANSES, Unit Virology, Immunology and Ecotoxicology of Fish, Technopôle Brest-Iroise, 29280, Plouzané, France
- Béatrice Chatain
  - MARBEC, Univ. Montpellier, CNRS, Ifremer, IRD, INRAE, 34250, Palavas-Les-Flots, France
- Pierrick Haffray
  - SYSAAF, Station LPGP/INRAE, Campus de Beaulieu, 35042, Rennes, France
- Florence Phocas
  - Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
- François Allal
  - MARBEC, Univ. Montpellier, CNRS, Ifremer, IRD, INRAE, 34250, Palavas-Les-Flots, France
34
Jurgens SJ, Pirruccello JP, Choi SH, Morrill VN, Chaffin M, Lubitz SA, Lunetta KL, Ellinor PT. Adjusting for common variant polygenic scores improves yield in rare variant association analyses. Nat Genet 2023; 55:544-548. [PMID: 36959364 PMCID: PMC11078202 DOI: 10.1038/s41588-023-01342-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Accepted: 02/22/2023] [Indexed: 03/25/2023]
Abstract
With the emergence of large-scale sequencing data, methods for improving power in rare variant association tests are needed. Here we show that adjusting for common variant polygenic scores improves yield in gene-based rare variant association tests across 65 quantitative traits in the UK Biobank (up to 20% increase at α = 2.6 × 10⁻⁶), without marked increases in false-positive rates or genomic inflation. Benefits were seen for various models, with the largest improvements seen for efficient sparse mixed-effects models. Our results illustrate how polygenic score adjustment can efficiently improve power in rare variant association discovery.
Affiliation(s)
- Sean J Jurgens
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Department of Experimental Cardiology, Heart Centre, Amsterdam UMC location University of Amsterdam, Amsterdam, the Netherlands
  - Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, Amsterdam, the Netherlands
- James P Pirruccello
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Cardiology, University of California, San Francisco, San Francisco, CA, USA
- Seung Hoan Choi
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Valerie N Morrill
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mark Chaffin
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Steven A Lubitz
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA
- Kathryn L Lunetta
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
  - NHLBI and Boston University's Framingham Heart Study, Framingham, MA, USA
- Patrick T Ellinor
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA.
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA.
35
Nartisa I, Kirsteina R, Neiburga KD, Zigure S, Ozola L, Grantina I, Micule I, Murmane D, Slisere B, Gailite L, Vilne B, Rots D, Taurina G, Kurjane N. Clinical and genetic characterization of Netherton syndrome due to SPINK5 founder variant in Latvian population. Pediatr Allergy Immunol 2023; 34:e13937. [PMID: 37102386 DOI: 10.1111/pai.13937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 02/24/2023] [Accepted: 02/27/2023] [Indexed: 04/28/2023]
Abstract
OBJECTIVE Netherton syndrome (NS) (OMIM:256500) is a very rare autosomal recessive multisystem disorder mostly affecting ectodermal derivatives (skin and hair) and the immune system. It is caused by biallelic loss-of-function variants in the SPINK5 gene, encoding the protease inhibitor lymphoepithelial Kazal-type-related inhibitor (LEKTI). MATERIAL, METHODS AND RESULTS Here, we describe the clinical and genetic features of NS in a homogeneous patient group: 9 individuals from 7 families with a similar ethnic background who carry the same SPINK5 variant (NM_006846.4: c.1048C>T, p.(Arg350*)) in the homozygous or compound heterozygous state, suggesting that it is a common founder variant in the Latvian population. Indeed, we were able to show that the variant is common in the general Latvian population and that it lies on a shared haplotype among the NS individuals. It is estimated that the variant arose >1000 years ago. Clinically, all patients but one exhibited typical NS skin changes (scaly erythroderma, ichthyosis linearis circumflexa, itchy skin); the remaining patient had a different skin manifestation, epidermodysplasia. Additionally, we show that developmental delay, previously underrecognized in NS, is a common feature among these patients. CONCLUSIONS This study shows that the phenotype of NS individuals with the same genotype is highly homogeneous.
Affiliation(s)
- Inga Nartisa
  - Riga Stradins University, Riga, Latvia
  - Children's Clinical University Hospital, Riga, Latvia
- Rasa Kirsteina
  - Clinic for Medical Genetics and Prenatal Diagnosis, Children's Clinical University Hospital, Riga, Latvia
- Sanita Zigure
  - Riga Stradins University, Riga, Latvia
  - Children's Clinical University Hospital, Riga, Latvia
- Lota Ozola
  - Children's Clinical University Hospital, Riga, Latvia
- Ieva Micule
  - Clinic for Medical Genetics and Prenatal Diagnosis, Children's Clinical University Hospital, Riga, Latvia
- Daiga Murmane
  - Clinic for Medical Genetics and Prenatal Diagnosis, Children's Clinical University Hospital, Riga, Latvia
- Baiba Slisere
  - Riga Stradins University, Riga, Latvia
  - Pauls Stradins Clinical University Hospital, Riga, Latvia
- Dmitrijs Rots
  - Riga Stradins University, Riga, Latvia
  - Children's Clinical University Hospital, Riga, Latvia
  - Radboudumc, Nijmegen, The Netherlands
- Gita Taurina
  - Clinic for Medical Genetics and Prenatal Diagnosis, Children's Clinical University Hospital, Riga, Latvia
- Natalja Kurjane
  - Riga Stradins University, Riga, Latvia
  - Clinic for Medical Genetics and Prenatal Diagnosis, Children's Clinical University Hospital, Riga, Latvia
  - Pauls Stradins Clinical University Hospital, Riga, Latvia
36
Wu Y, Bayrak CS, Dong B, He S, Stenson PD, Cooper DN, Itan Y, Chen L. Identifying shared genetic factors underlying epilepsy and congenital heart disease in Europeans. Hum Genet 2023; 142:275-288. [PMID: 36352240 DOI: 10.1007/s00439-022-02502-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 10/24/2022] [Indexed: 11/11/2022]
Abstract
Epilepsy (EP) and congenital heart disease (CHD) are two apparently unrelated diseases that nevertheless display substantial mutual comorbidity. Thus, while congenital heart defects are associated with an elevated risk of developing epilepsy, the incidence of epilepsy in CHD patients correlates with CHD severity. Although genetic determinants have been postulated to underlie the comorbidity of EP and CHD, the precise genetic etiology is unknown. We performed variant and gene association analyses on EP and CHD patients separately, using whole exomes of genetically identified Europeans from the UK Biobank and Mount Sinai BioMe Biobank. We prioritized biologically plausible candidate genes and investigated the enriched pathways and other identified comorbidities by biological proximity calculation, pathway analyses, and gene-level phenome-wide association studies. Our variant- and gene-level results point to the Voltage-Gated Calcium Channels (VGCC) pathway as being a unifying framework for EP and CHD comorbidity. Additionally, pathway-level analyses indicated that the functions of disease-associated genes partially overlap between the two disease entities. Finally, phenome-wide association analyses of prioritized candidate genes revealed that cerebral blood flow and ulcerative colitis constitute the two main traits associated with both EP and CHD.
Affiliation(s)
- Yiming Wu
  - Department of Neurology, West China Hospital of Sichuan University, Chengdu, Sichuan, People's Republic of China
- Cigdem Sevim Bayrak
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bosi Dong
  - Department of Neurology, West China Hospital of Sichuan University, Chengdu, Sichuan, People's Republic of China
- Shixu He
  - Department of Neurology, West China Hospital of Sichuan University, Chengdu, Sichuan, People's Republic of China
- Peter D Stenson
  - Institute of Medical Genetics, Cardiff University, Cardiff, UK
- David N Cooper
  - Institute of Medical Genetics, Cardiff University, Cardiff, UK
- Yuval Itan
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
  - Icahn School of Medicine at Mount Sinai, The Charles Bronfman Institute for Personalized Medicine, New York, NY, USA.
- Lei Chen
  - Department of Neurology, West China Hospital of Sichuan University, Chengdu, Sichuan, People's Republic of China.
37
Secomandi S, Gallo GR, Sozzoni M, Iannucci A, Galati E, Abueg L, Balacco J, Caprioli M, Chow W, Ciofi C, Collins J, Fedrigo O, Ferretti L, Fungtammasan A, Haase B, Howe K, Kwak W, Lombardo G, Masterson P, Messina G, Møller AP, Mountcastle J, Mousseau TA, Ferrer Obiol J, Olivieri A, Rhie A, Rubolini D, Saclier M, Stanyon R, Stucki D, Thibaud-Nissen F, Torrance J, Torroni A, Weber K, Ambrosini R, Bonisoli-Alquati A, Jarvis ED, Gianfranceschi L, Formenti G. A chromosome-level reference genome and pangenome for barn swallow population genomics. Cell Rep 2023; 42:111992. [PMID: 36662619 PMCID: PMC10044405 DOI: 10.1016/j.celrep.2023.111992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 07/20/2022] [Accepted: 01/04/2023] [Indexed: 01/20/2023] Open
Abstract
Insights into the evolution of non-model organisms are limited by the lack of reference genomes of high accuracy, completeness, and contiguity. Here, we present a chromosome-level, karyotype-validated reference genome and pangenome for the barn swallow (Hirundo rustica). We complement these resources with a reference-free multialignment of the reference genome with other bird genomes and with the most comprehensive catalog of genetic markers for the barn swallow. We identify potentially conserved and accelerated genes using the multialignment and estimate genome-wide linkage disequilibrium using the catalog. We use the pangenome to infer core and accessory genes and to detect variants using it as a reference. Overall, these resources will foster population genomics studies in the barn swallow, enable detection of candidate genes in comparative genomics studies, and help reduce bias toward a single reference genome.
Affiliation(s)
- Simona Secomandi
  - Department of Biosciences, University of Milan, Milan, Italy
  - Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Guido R Gallo
  - Department of Biosciences, University of Milan, Milan, Italy
- Alessio Iannucci
  - Department of Biology, University of Florence, Sesto Fiorentino (FI), Italy
- Elena Galati
  - Department of Biosciences, University of Milan, Milan, Italy
- Linelle Abueg
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Jennifer Balacco
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Manuela Caprioli
  - Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
- Claudio Ciofi
  - Department of Biology, University of Florence, Sesto Fiorentino (FI), Italy
- Olivier Fedrigo
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Luca Ferretti
  - Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
- Bettina Haase
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Woori Kwak
  - Department of Medical and Biological Sciences, The Catholic University of Korea, Bucheon 14662, Korea
- Gianluca Lombardo
  - Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
- Patrick Masterson
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Anders P Møller
  - Ecologie Systématique Evolution, Université Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Orsay Cedex, France
- Timothy A Mousseau
  - Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
- Joan Ferrer Obiol
  - Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
- Anna Olivieri
  - Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Diego Rubolini
  - Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
- Roscoe Stanyon
  - Department of Biology, University of Florence, Sesto Fiorentino (FI), Italy
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Antonio Torroni
  - Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
- Roberto Ambrosini
  - Department of Environmental Sciences and Policy, University of Milan, Milan, Italy
- Andrea Bonisoli-Alquati
  - Department of Biological Sciences, California State Polytechnic University - Pomona, Pomona, CA, USA
- Erich D Jarvis
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
  - The Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Giulio Formenti
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
38
Ng JK, Turner TN. HAT: de novo variant calling for highly accurate short-read and long-read sequencing data. bioRxiv 2023:2023.01.27.525940. [PMID: 36747667 PMCID: PMC9900919 DOI: 10.1101/2023.01.27.525940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2023]
Abstract
Motivation: De novo variant (DNV) calling from parent-child sequenced trio data is challenging. We developed Hare And Tortoise (HAT) to work as an automated workflow to detect DNVs in highly accurate short-read and long-read sequencing data. Reliable detection of DNVs is important for human genetics studies (e.g., autism, epilepsy). Results: HAT is a workflow to detect DNVs from short-read and long-read sequencing data. This workflow begins with aligned read data (i.e., CRAM or BAM) from a parent-child sequenced trio and outputs DNVs. HAT detects high-quality DNVs from short-read whole-exome sequencing, short-read whole-genome sequencing, and highly accurate long-read sequencing data.
Affiliation(s)
- Jeffrey K. Ng
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Tychele N. Turner
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
39
Marszalek-Zenczak M, Satyr A, Wojciechowski P, Zenczak M, Sobieszczanska P, Brzezinski K, Iefimenko T, Figlerowicz M, Zmienko A. Analysis of Arabidopsis non-reference accessions reveals high diversity of metabolic gene clusters and discovers new candidate cluster members. Front Plant Sci 2023; 14:1104303. [PMID: 36778696 PMCID: PMC9909608 DOI: 10.3389/fpls.2023.1104303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 01/11/2023] [Indexed: 06/18/2023]
Abstract
Metabolic gene clusters (MGCs) are groups of genes involved in a common biosynthetic pathway. They are frequently formed in dynamic chromosomal regions, which may lead to intraspecies variation and cause phenotypic diversity. We examined copy number variations (CNVs) in four Arabidopsis thaliana MGCs in over one thousand accessions with experimental and bioinformatic approaches. Tirucalladienol and marneral gene clusters showed little variation, and the latter was fixed in the population. Thalianol and especially arabidiol/baruol gene clusters displayed substantial diversity. The compact version of the thalianol gene cluster was predominant and more conserved than the noncontiguous version. In the arabidiol/baruol cluster, we found a large genomic insertion containing divergent duplicates of the CYP705A2 and BARS1 genes. The BARS1 paralog, which we named BARS2, encoded a novel oxidosqualene synthase. The expression of the entire arabidiol/baruol gene cluster was altered in the accessions with the duplication. Moreover, they presented different root growth dynamics and were associated with warmer climates compared to the reference-like accessions. In the entire genome, paired genes encoding terpene synthases and cytochrome P450 oxidases were more variable than their nonpaired counterparts. Our study highlights the role of dynamically evolving MGCs in plant adaptation and phenotypic diversity.
Affiliation(s)
- Anastasiia Satyr
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Pawel Wojciechowski
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Faculty of Computing and Telecommunications, Poznan University of Technology, Poznan, Poland
- Michal Zenczak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Tetiana Iefimenko
  - Department of Biology, National University of Kyiv-Mohyla Academy, Kyiv, Ukraine
- Marek Figlerowicz
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Agnieszka Zmienko
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
40
López-Cortegano E, Craig RJ, Chebib J, Balogun EJ, Keightley PD. Rates and spectra of de novo structural mutations in Chlamydomonas reinhardtii. Genome Res 2023; 33:45-60. [PMID: 36617667 PMCID: PMC9977147 DOI: 10.1101/gr.276957.122] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/23/2022] [Accepted: 12/06/2022] [Indexed: 12/14/2022]
Abstract
Genetic variation originates from several types of spontaneous mutation, including single-nucleotide substitutions, short insertions and deletions (indels), and larger structural changes. Structural mutations (SMs) drive genome evolution and are thought to play major roles in evolutionary adaptation, speciation, and genetic disease, including cancers. Sequencing of mutation accumulation (MA) lines has provided estimates of rates and spectra of single-nucleotide and indel mutations in many species, yet the rate of new SMs is largely unknown. Here, we use long-read sequencing to determine the full mutation spectrum in MA lines derived from two strains (CC-1952 and CC-2931) of the green alga Chlamydomonas reinhardtii. The SM rate is highly variable between strains and between MA lines, and SMs represent a substantial proportion of all mutations in both strains (CC-1952 6%; CC-2931 12%). The SM spectra differ considerably between the two strains, with almost all inversions and translocations occurring in CC-2931 MA lines. This variation is associated with heterogeneity in the number and type of active transposable elements (TEs), which comprise major proportions of SMs in both strains (CC-1952 22%; CC-2931 38%). In CC-2931, a Crypton and a previously undescribed type of DNA element have caused 71% of chromosomal rearrangements, whereas in CC-1952, a Dualen LINE is associated with 87% of duplications. Other SMs, notably large duplications in CC-2931, are likely products of various double-strand break repair pathways. Our results show that diverse types of SMs occur at substantial rates, and support prominent roles for SMs and TEs in evolution.
Affiliation(s)
- Eugenio López-Cortegano
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Rory J. Craig
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
  - California Institute for Quantitative Biosciences, UC Berkeley, Berkeley, California 94720, USA
- Jobran Chebib
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Eniolaye J. Balogun
  - Department of Ecology and Evolutionary Biology, University of Toronto, Ontario ON M5S 3B2, Canada
  - Department of Biology, University of Toronto Mississauga, Mississauga ON L5L 1C6, Canada
- Peter D. Keightley
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
41
Redei EE, Udell ME, Solberg Woods LC, Chen H. The Wistar Kyoto Rat: A Model of Depression Traits. Curr Neuropharmacol 2023; 21:1884-1905. [PMID: 36453495 PMCID: PMC10514523 DOI: 10.2174/1570159x21666221129120902] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/04/2022] [Revised: 09/19/2022] [Accepted: 10/21/2022] [Indexed: 12/05/2022]
Abstract
There is an ongoing debate about the value of animal research in psychiatry with valid lines of reasoning stating the limits of individual animal models compared to human psychiatric illnesses. Human depression is not a homogenous disorder; therefore, one cannot expect a single animal model to reflect depression heterogeneity. This limited review presents arguments that the Wistar Kyoto (WKY) rats show intrinsic depression traits. The phenotypes of WKY do not completely mirror those of human depression but clearly indicate characteristics that are common with it. WKYs present despair-like behavior, passive coping with stress, comorbid anxiety, and enhanced drug use compared to other routinely used inbred or outbred strains of rats. The commonly used tests identifying these phenotypes reflect exploratory, escape-oriented, and withdrawal-like behaviors. The WKYs consistently choose withdrawal or avoidance in novel environments and freezing behaviors in response to a challenge in these tests. The physiological response to a stressful environment is exaggerated in WKYs. Selective breeding generated two WKY substrains that are nearly isogenic but show clear behavioral differences, including that of depression-like behavior. WKY and its substrains may share characteristics of subgroups of depressed individuals with social withdrawal, low energy, weight loss, sleep disturbances, and specific cognitive dysfunction. The genomes of the WKY and WKY substrains contain variations that impact the function of many genes identified in recent human genetic studies of depression. Thus, these strains of rats share characteristics of human depression at both phenotypic and genetic levels, making them a model of depression traits.
Affiliation(s)
- Eva E. Redei
  - Department of Psychiatry and Behavioral Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Mallory E. Udell
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
- Leah C. Solberg Woods
  - Section on Molecular Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Hao Chen
  - Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
42
Betschart RO, Thiéry A, Aguilera-Garcia D, Zoche M, Moch H, Twerenbold R, Zeller T, Blankenberg S, Ziegler A. Comparison of calling pipelines for whole genome sequencing: an empirical study demonstrating the importance of mapping and alignment. Sci Rep 2022; 12:21502. [PMID: 36513709 PMCID: PMC9748128 DOI: 10.1038/s41598-022-26181-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Accepted: 12/12/2022] [Indexed: 12/14/2022]
Abstract
Rapid advances in high-throughput DNA sequencing technologies have enabled the conduct of whole genome sequencing (WGS) studies, and several bioinformatics pipelines have become available. The aim of this study was the comparison of 6 WGS data pre-processing pipelines, involving two mapping and alignment approaches (GATK utilizing BWA-MEM2 2.2.1, and DRAGEN 3.8.4) and three variant calling pipelines (GATK 4.2.4.1, DRAGEN 3.8.4 and DeepVariant 1.1.0). We sequenced one Genome in a Bottle (GIAB) sample 70 times in different runs, and one GIAB trio in triplicate. The truth set of the GIABs was used for comparison, and performance was assessed by computation time, F1 score, precision, and recall. In the mapping and alignment step, the DRAGEN pipeline was faster than the GATK with BWA-MEM2 pipeline. DRAGEN showed systematically higher F1 score, precision, and recall values than GATK for single nucleotide variations (SNVs) and indels in simple-to-map, complex-to-map, coding and non-coding regions. In the variant calling step, DRAGEN was fastest. In terms of accuracy, DRAGEN and DeepVariant performed similarly and were both superior to GATK, with slight advantages for DRAGEN for indels and for DeepVariant for SNVs. The DRAGEN pipeline showed the lowest Mendelian inheritance error fraction for the GIAB trios. Mapping and alignment played a key role in variant calling of WGS, with DRAGEN outperforming GATK.
Affiliation(s)
- Raphael O. Betschart
  - Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
- Alexandre Thiéry
  - Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
- Domingo Aguilera-Garcia
  - Institute of Pathology and Molecular Pathology, University Hospital Zurich, Schmelzbergstrasse 12, 8091 Zurich, Switzerland
- Martin Zoche
  - Institute of Pathology and Molecular Pathology, University Hospital Zurich, Schmelzbergstrasse 12, 8091 Zurich, Switzerland
- Holger Moch
  - Institute of Pathology and Molecular Pathology, University Hospital Zurich, Schmelzbergstrasse 12, 8091 Zurich, Switzerland
- Raphael Twerenbold
  - Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
  - University Center of Cardiovascular Research Hamburg, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany
- Tanja Zeller
  - Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
  - University Center of Cardiovascular Research Hamburg, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany
- Stefan Blankenberg
  - Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
  - Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
  - University Center of Cardiovascular Research Hamburg, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany
- Andreas Ziegler
  - Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
  - Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
  - School of Mathematics, Statistics and Computer Science, Scottsville, Private Bag X01, Pietermaritzburg, 3209 South Africa
43
Woerner AE, Mandape S, Kapema KB, Duque TM, Smuts A, King JL, Crysup B, Wang X, Huang M, Ge J, Budowle B. Optimized variant calling for estimating kinship. Forensic Sci Int Genet 2022; 61:102785. [DOI: 10.1016/j.fsigen.2022.102785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 08/07/2022] [Accepted: 09/29/2022] [Indexed: 11/16/2022]
44
Cavazos TB, Kachuri L, Graff RE, Nierenberg JL, Thai KK, Alexeeff S, Van Den Eeden S, Corley DA, Kushi LH, Hoffmann TJ, Ziv E, Habel LA, Jorgenson E, Sakoda LC, Witte JS. Assessment of genetic susceptibility to multiple primary cancers through whole-exome sequencing in two large multi-ancestry studies. BMC Med 2022; 20:332. [PMID: 36199081 PMCID: PMC9535845 DOI: 10.1186/s12916-022-02535-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2022] [Accepted: 08/17/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Up to one of every six individuals diagnosed with one cancer will be diagnosed with a second primary cancer in their lifetime. Genetic factors contributing to the development of multiple primary cancers, beyond known cancer syndromes, have been underexplored. METHODS: To characterize genetic susceptibility to multiple cancers, we conducted a pan-cancer, whole-exome sequencing study of individuals drawn from two large multi-ancestry populations (6429 cases, 165,853 controls). We created two groupings of individuals diagnosed with multiple primary cancers: (1) an overall combined set with at least two cancers across any of 36 organ sites and (2) cancer-specific sets defined by an index cancer at one of 16 organ sites with at least 50 cases from each study population. We then investigated whether variants identified from exome sequencing were associated with these sets of multiple cancer cases in comparison to individuals with one and, separately, no cancers. RESULTS: We identified 22 variant-phenotype associations, 10 of which have not been previously discovered and were significantly overrepresented among individuals with multiple cancers, compared to those with a single cancer. CONCLUSIONS: Overall, we describe variants and genes that may play a fundamental role in the development of multiple primary cancers and improve our understanding of shared mechanisms underlying carcinogenesis.
Affiliation(s)
- Taylor B Cavazos
  - Biological and Medical Informatics, University of California San Francisco, San Francisco, CA, 94158, USA
- Linda Kachuri
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, 94158, USA
  - Department of Epidemiology and Population Health, Stanford University, Alway Building, 300 Pasteur Drive, Stanford, CA, 94305, USA
- Rebecca E Graff
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, 94158, USA
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
- Jovia L Nierenberg
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, 94158, USA
  - Regeneron Genetics Center, Tarrytown, NY, 10591, USA
- Khanh K Thai
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
- Stacey Alexeeff
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
- Stephen Van Den Eeden
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
- Douglas A Corley
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
- Lawrence H Kushi
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
- Thomas J Hoffmann
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, 94158, USA
- Elad Ziv
  - Regeneron Genetics Center, Tarrytown, NY, 10591, USA
- Laurel A Habel
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
- Eric Jorgenson
  - Department of Medicine, University of California San Francisco, San Francisco, CA, 94158, USA
- Lori C Sakoda
  - Division of Research, Kaiser Permanente Northern California, Oakland, CA, 94612, USA
  - Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA, 91101, USA
- John S Witte
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, 94158, USA
  - Department of Epidemiology and Population Health, Stanford University, Alway Building, 300 Pasteur Drive, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
45
Crysup B, Woerner AE. A genotype likelihood function for DNA mixtures. Forensic Sci Int Genet 2022; 61:102776. [PMID: 36152508 DOI: 10.1016/j.fsigen.2022.102776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2022] [Revised: 08/10/2022] [Accepted: 09/14/2022] [Indexed: 11/04/2022]
Abstract
The recent advent of genetic genealogy has brought about a renewed interest in genome-scale forensic analyses, of which kinship estimation is a critical component. Most genomic kinship estimators consider SNPs (single nucleotide polymorphisms), often leveraging the co-inheritance of shared alleles to inform their analyses. While current estimators cannot directly evaluate mixed samples, there exist well-established SNP-based kinship estimators tailored to considering challenged samples, including low-pass whole genome sequencing. As an example, several studies have shown remarkable success in imputing genotype posterior probabilities in low template samples when linked sites are considered. Critical to these approaches is the ability to account for genotype uncertainty; the lack of an expression for a genotype likelihood in imbalanced mixtures has prevented direct application. This work develops such an expression. The formulation is fully compatible with genotype imputation software, suggesting a genomic pipeline that estimates genotype likelihoods, performs imputation, and then estimates kinship when the sample is a mixture. Further, when framed as an imbalanced mixture, the problem of mixture deconvolution is reducible to the problem of genotyping mixed samples. Herein, the ability to genotype two-person mixtures is assessed through example and in silico settings. While certain mixture scenarios and classes of sites are inherently inseparable, simulations of read depths between 60 and 190 appear to produce likelihoods of sufficient magnitude to deconvolve two-person mixtures whenever the mixture fraction is moderately imbalanced. The described approach and results suggest a path forward for estimating the kinship coefficient (and similar inferences on relatedness) when the sample is a mixture.
Affiliation(s)
- Benjamin Crysup
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA
- August E Woerner
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA
  - Department of Microbiology, Immunology and Genetics, University of North Texas Health Science, USA
46
Savatt JM, Shimelis H, Moreno-De-Luca A, Strande NT, Oetjens MT, Ledbetter DH, Martin CL, Myers SM, Finucane BM. Frequency of truncating FLCN variants and Birt-Hogg-Dubé-associated phenotypes in a health care system population. Genet Med 2022; 24:1857-1866. [PMID: 35639097 PMCID: PMC9703446 DOI: 10.1016/j.gim.2022.05.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/22/2021] [Revised: 05/05/2022] [Accepted: 05/06/2022] [Indexed: 02/05/2023]
Abstract
PURPOSE: Penetrance estimates of Birt-Hogg-Dubé syndrome (BHD)-associated cutaneous, pulmonary, and kidney manifestations are based on clinically ascertained families. In a health care system population, we used a genetics-first approach to estimate the prevalence of pathogenic/likely pathogenic (P/LP) truncating variants in FLCN, which cause BHD, and the penetrance of BHD-related phenotypes. METHODS: Exomes from 135,990 patient-participants in Geisinger's MyCode cohort were assessed for P/LP truncating FLCN variants. BHD-related phenotypes were evaluated from electronic health records. Association between P/LP FLCN variants and BHD-related phenotypes was assessed using Firth's logistic regression. RESULTS: P/LP truncating FLCN variants were identified in 35 individuals (1 in 3234 unrelated individuals), 68.6% of whom had BHD-related phenotype(s), including cystic lung disease (65.7%), pneumothoraces (17.1%), cutaneous manifestations (8.6%), and kidney cancer (2.9%). A total of 4 (11.4%) individuals had prior clinical BHD diagnoses. CONCLUSION: In this health care population, the frequency of P/LP truncating FLCN variants is 60 times higher than the previously reported prevalence. Although most variant-positive individuals had BHD-related phenotypes, a minority were previously clinically diagnosed, likely because cutaneous manifestations, pneumothoraces, and kidney cancer were observed at lower frequencies than in clinical cohorts. Improved clinical recognition of cystic lung disease and education concerning its association with FLCN variants could prompt evaluation for BHD.
Affiliation(s)
- Juliann M. Savatt
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
  - Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
- Hermela Shimelis
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
- Andres Moreno-De-Luca
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
  - Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
  - Department of Radiology, Geisinger, Danville, Pennsylvania
- Natasha T. Strande
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
  - Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
- Matthew T. Oetjens
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
- David H. Ledbetter
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
  - Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
- Christa Lese Martin
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
  - Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
- Scott M. Myers
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
- Brenda M. Finucane
  - Autism & Developmental Medicine Institute, Geisinger, Lewisburg, Pennsylvania
47
Zhao Y, Gardner EJ, Tuke MA, Zhang H, Pietzner M, Koprulu M, Jia RY, Ruth KS, Wood AR, Beaumont RN, Tyrrell J, Jones SE, Lango Allen H, Day FR, Langenberg C, Frayling TM, Weedon MN, Perry JRB, Ong KK, Murray A. Detection and characterization of male sex chromosome abnormalities in the UK Biobank study. Genet Med 2022; 24:1909-1919. [PMID: 35687092 DOI: 10.1016/j.gim.2022.05.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 01/31/2022] [Revised: 05/15/2022] [Accepted: 05/16/2022] [Indexed: 11/21/2022]
Abstract
PURPOSE The study aimed to systematically ascertain male sex chromosome abnormalities, 47,XXY (Klinefelter syndrome [KS]) and 47,XYY, and characterize their risks of adverse health outcomes. METHODS We analyzed genotyping array or exome sequence data in 207,067 men of European ancestry aged 40 to 70 years from the UK Biobank and related these to extensive routine health record data. RESULTS Only 49 of 213 (23%) of men whom we identified with KS and only 1 of 143 (0.7%) with 47,XYY had a diagnosis of abnormal karyotype on their medical records or self-report. We observed expected associations for KS with reproductive dysfunction (late puberty: risk ratio [RR] = 2.7; childlessness: RR = 4.2; testosterone concentration: RR = -3.8 nmol/L, all P < 2 × 10-8), whereas XYY men appeared to have normal reproductive function. Despite this difference, we identified several higher disease risks shared across both KS and 47,XYY, including type 2 diabetes (RR = 3.0 and 2.6, respectively), venous thrombosis (RR = 6.4 and 7.4, respectively), pulmonary embolism (RR = 3.3 and 3.7, respectively), and chronic obstructive pulmonary disease (RR = 4.4 and 4.6, respectively) (all P < 7 × 10-6). CONCLUSION KS and 47,XYY were mostly unrecognized but conferred substantially higher risks for metabolic, vascular, and respiratory diseases, which were only partially explained by higher levels of body mass index, deprivation, and smoking.
Affiliation(s)
- Yajie Zhao
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Eugene J Gardner
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Marcus A Tuke
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- Huairen Zhang
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- Maik Pietzner
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom; Computational Medicine, Berlin Institute of Health (BIH) at Charité, Universitätsmedizin Berlin, Berlin, Germany
- Mine Koprulu
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Raina Y Jia
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Katherine S Ruth
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- Andrew R Wood
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- Robin N Beaumont
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- Jessica Tyrrell
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- Samuel E Jones
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom; Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- Hana Lango Allen
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Felix R Day
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Claudia Langenberg
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom; Computational Medicine, Berlin Institute of Health (BIH) at Charité, Universitätsmedizin Berlin, Berlin, Germany
- Timothy M Frayling
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- Michael N Weedon
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
- John R B Perry
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Ken K Ong
- MRC Epidemiology Unit, Institute of Metabolic Science, School of Clinical Medicine, Cambridge University, Cambridge, United Kingdom
- Anna Murray
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Royal Devon & Exeter Hospital, Exeter, United Kingdom
48
Akbari P, Sosina OA, Bovijn J, Landheer K, Nielsen JB, Kim M, Aykul S, De T, Haas ME, Hindy G, Lin N, Dinsmore IR, Luo JZ, Hectors S, Geraghty B, Germino M, Panagis L, Parasoglou P, Walls JR, Halasz G, Atwal GS, Jones M, LeBlanc MG, Still CD, Carey DJ, Giontella A, Orho-Melander M, Berumen J, Kuri-Morales P, Alegre-Díaz J, Torres JM, Emberson JR, Collins R, Rader DJ, Zambrowicz B, Murphy AJ, Balasubramanian S, Overton JD, Reid JG, Shuldiner AR, Cantor M, Abecasis GR, Ferreira MAR, Sleeman MW, Gusarova V, Altarejos J, Harris C, Economides AN, Idone V, Karalis K, Della Gatta G, Mirshahi T, Yancopoulos GD, Melander O, Marchini J, Tapia-Conyer R, Locke AE, Baras A, Verweij N, Lotta LA. Multiancestry exome sequencing reveals INHBE mutations associated with favorable fat distribution and protection from diabetes. Nat Commun 2022; 13:4844. [PMID: 35999217 PMCID: PMC9399235 DOI: 10.1038/s41467-022-32398-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/08/2022] [Accepted: 07/28/2022] [Indexed: 12/13/2022]
Abstract
Body fat distribution is a major, heritable risk factor for cardiometabolic disease, independent of overall adiposity. Using exome sequencing in 618,375 individuals (including 160,058 non-Europeans) from the UK, Sweden and Mexico, we identify 16 genes associated with fat distribution at exome-wide significance. We show a 6-fold larger effect for fat-distribution-associated rare coding variants compared with fine-mapped common alleles, enrichment for genes expressed in adipose tissue and causal genes for partial lipodystrophies, and evidence of sex dimorphism. We describe an association with favorable fat distribution (p = 1.8 × 10⁻⁹), favorable metabolic profile and protection from type 2 diabetes (~28% lower odds; p = 0.004) for heterozygous protein-truncating mutations in INHBE, which encodes a circulating growth factor of the activin family, highly and specifically expressed in hepatocytes. Our results suggest that inhibin βE is a liver-expressed negative regulator of adipose storage whose blockade may be beneficial in fat distribution-associated metabolic disease.
Affiliation(s)
- Parsa Akbari
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Olukayode A. Sosina
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Jonas Bovijn
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Karl Landheer
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Jonas B. Nielsen
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Minhee Kim
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Senem Aykul
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Tanima De
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Mary E. Haas
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- George Hindy
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Nan Lin
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Ian R. Dinsmore
- Department of Molecular and Functional Genomics, Geisinger Health System, Danville, PA, USA
- Jonathan Z. Luo
- Department of Molecular and Functional Genomics, Geisinger Health System, Danville, PA, USA
- Stefanie Hectors
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Benjamin Geraghty
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Mary Germino
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Lampros Panagis
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Prodromos Parasoglou
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Johnathon R. Walls
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Gabor Halasz
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Gurinder S. Atwal
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Marcus Jones
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Michelle G. LeBlanc
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Christopher D. Still
- Geisinger Obesity Institute, Geisinger Health System, Danville, PA, USA
- David J. Carey
- Geisinger Obesity Institute, Geisinger Health System, Danville, PA, USA
- Alice Giontella
- Department of Clinical Sciences Malmö, Lund University, Malmö, Sweden; Department of Medicine, University of Verona, Verona, Italy
- Marju Orho-Melander
- Department of Clinical Sciences Malmö, Lund University, Malmö, Sweden
- Jaime Berumen
- Unidad de Medicina Experimental de la Facultad de Medicina de la Universidad Nacional Autónoma de México, Mexico City, Mexico
- Pablo Kuri-Morales
- Unidad de Medicina Experimental de la Facultad de Medicina de la Universidad Nacional Autónoma de México, Mexico City, Mexico; Instituto Tecnológico y de Estudios Superiores de Monterrey, Monterrey, Mexico
- Jesus Alegre-Díaz
- Unidad de Medicina Experimental de la Facultad de Medicina de la Universidad Nacional Autónoma de México, Mexico City, Mexico
- Jason M. Torres
- MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK; Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Jonathan R. Emberson
- MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK; Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Rory Collins
- Clinical Trial Service Unit & Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Daniel J. Rader
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Brian Zambrowicz
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Andrew J. Murphy
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Suganthi Balasubramanian
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- John D. Overton
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Jeffrey G. Reid
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Alan R. Shuldiner
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Michael Cantor
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Goncalo R. Abecasis
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Manuel A. R. Ferreira
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Mark W. Sleeman
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Viktoria Gusarova
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Judith Altarejos
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Charles Harris
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Aris N. Economides
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA; Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Vincent Idone
- Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Katia Karalis
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Giusy Della Gatta
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Tooraj Mirshahi
- Geisinger Obesity Institute, Geisinger Health System, Danville, PA, USA
- Olle Melander
- Department of Clinical Sciences Malmö, Lund University, Malmö, Sweden; Department of Emergency and Internal Medicine, Skåne University Hospital, Malmö, Sweden
- Jonathan Marchini
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Roberto Tapia-Conyer
- Instituto Tecnológico y de Estudios Superiores de Monterrey, Monterrey, Mexico
- Adam E. Locke
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Aris Baras
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Niek Verweij
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
- Luca A. Lotta
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc, Tarrytown, NY, USA
49
Carpi G, Gorenstein L, Harkins TT, Samadi M, Vats P. A GPU-accelerated compute framework for pathogen genomic variant identification to aid genomic epidemiology of infectious disease: a malaria case study. Brief Bioinform 2022; 23:6658853. [PMID: 35945154 PMCID: PMC9487672 DOI: 10.1093/bib/bbac314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Revised: 06/03/2022] [Accepted: 07/12/2022] [Indexed: 11/15/2022]
Abstract
As recently demonstrated by the COVID-19 pandemic, large-scale pathogen genomic data are crucial to characterize transmission patterns of human infectious diseases. Yet, current methods to process raw sequence data into analysis-ready variants remain slow to scale, hampering rapid surveillance efforts and epidemiological investigations for disease control. Here, we introduce an accelerated, scalable, reproducible, and cost-effective framework for pathogen genomic variant identification and present an evaluation of its performance and accuracy across benchmark datasets of Plasmodium falciparum malaria genomes. We demonstrate superior performance of the GPU framework relative to standard pipelines with mean execution time and computational costs reduced by 27× and 4.6×, respectively, while delivering 99.9% accuracy at enhanced reproducibility.
Affiliation(s)
- Giovanna Carpi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA; Purdue Institute for Inflammation, Immunology, & Infectious Disease, Purdue University, West Lafayette, IN, USA; W. Harry Feinstone Department of Molecular Microbiology and Immunology, Johns Hopkins Malaria Research Institute, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Lev Gorenstein
- Rosen Center for Advanced Computing, Purdue University, West Lafayette, IN, USA
- Pankaj Vats
- NVIDIA, 2788 San Tomas, Santa Clara, CA, USA
50
Stenløkk K, Saitou M, Rud-Johansen L, Nome T, Moser M, Árnyasi M, Kent M, Barson NJ, Lien S. The emergence of supergenes from inversions in Atlantic salmon. Philos Trans R Soc Lond B Biol Sci 2022; 377:20210195. [PMID: 35694753 PMCID: PMC9189505 DOI: 10.1098/rstb.2021.0195] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022]
Abstract
Supergenes link allelic combinations into non-recombining units known to play an essential role in maintaining adaptive genetic variation. However, because supergenes can be maintained over millions of years by balancing selection and typically exhibit strong recombination suppression, both the underlying functional variants and how the supergenes are formed are largely unknown. Particularly, questions remain over the importance of inversion breakpoint sequences and whether supergenes capture pre-existing adaptive variation or accumulate this following recombination suppression. To investigate the process of supergene formation, we identified inversion polymorphisms in Atlantic salmon by assembling eleven genomes with nanopore long-read sequencing technology. A genome assembly from the sister species, brown trout, was used to determine the standard state of the inversions. We found evidence for adaptive variation through genotype-environment associations, but not for the accumulation of deleterious mutations. One young 3 Mb inversion segregating in North American populations has captured adaptive variation that is still segregating within the standard arrangement of the inversion, while some adaptive variation has accumulated after the inversion. This inversion and two others had breakpoints disrupting genes. Three multigene inversions with matched repeat structures at the breakpoints did not show any supergene signatures, suggesting that shared breakpoint repeats may obstruct supergene formation. This article is part of the theme issue 'Genomic architecture of supergenes: causes and evolutionary consequences'.
Affiliation(s)
- Kristina Stenløkk
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Marie Saitou
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Live Rud-Johansen
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Torfinn Nome
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Michel Moser
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Mariann Árnyasi
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Matthew Kent
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Nicola Jane Barson
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Sigbjørn Lien
- Centre for Integrative Genetics (CIGENE) and Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway